### Q1. NLP Processing Steps

Write Python code to perform the following steps:

1. Segment into tokens  
2. Remove stopwords  
3. Apply lemmatization (not stemming)  
4. Keep only verbs and nouns (use POS tags)  

**Input text:**  
"John enjoys playing football while Mary loves reading books in the library."

In [8]:
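# Requires NLTK's tokenizer, stopword, WordNet, and POS-tagger data
# (a one-time nltk.download of each resource)
import nltk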
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk import word_tokenize, pos_tag

text = "John enjoys playing football while Mary loves reading books in the library."

# 1. Tokenize
tokens = word_tokenize(text)

# 2. Remove stopwords
stop_words = set(stopwords.words('english'))
filtered_tokens = [w for w in tokens if w.lower() not in stop_words]

# 3. Lemmatization: every token is lemmatized as a verb (pos='v'),
#    which reduces 'enjoys' to 'enjoy' and 'playing' to 'play'
lemmatizer = WordNetLemmatizer()
lemmatized = [lemmatizer.lemmatize(w, pos='v') for w in filtered_tokens]

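# 4. POS-tag the lemmas and keep only common nouns and verbs.
#    Proper-noun tags (NNP, NNPS) are not in the set, which is why
#    'John' and 'Mary' do not appear in the result.
pos = pos_tag(lemmatized)
allowed_pos = {'NN', 'NNS', 'VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ'}

final_output = [word for word, tag in pos if tag in allowed_pos]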

print("Final:", final_output)

Final: ['enjoy', 'play', 'football', 'love', 'read', 'book', 'library']
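
The cell above lemmatizes every token as a verb and only then tags it. An alternative ordering is to tag first and pass each token's own part of speech to the lemmatizer. The sketch below shows only the tag-mapping and filtering step, using plain Python so it runs without NLTK data. The helper names `penn_to_wordnet` and `nouns_and_verbs` are hypothetical, and the tags in `tagged` are written by hand for illustration rather than taken from a tagger.

```python
# WordNet part-of-speech codes, as accepted by WordNetLemmatizer.lemmatize(pos=...)
NOUN, VERB, ADJ, ADV = "n", "v", "a", "r"


def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (e.g. 'VBZ') to a WordNet POS code, or None."""
    if tag.startswith("NN"):
        return NOUN
    if tag.startswith("VB"):
        return VERB
    if tag.startswith("JJ"):
        return ADJ
    if tag.startswith("RB"):
        return ADV
    return None


def nouns_and_verbs(tagged_tokens, include_proper_nouns=False):
    """Yield (word, wordnet_pos) for tokens tagged as nouns or verbs."""
    for word, tag in tagged_tokens:
        # Proper nouns are skipped by default, matching the cell above
        if tag in {"NNP", "NNPS"} and not include_proper_nouns:
            continue
        wn_pos = penn_to_wordnet(tag)
        if wn_pos in (NOUN, VERB):
            yield word, wn_pos


# Hand-written tags for illustration only (an assumption, not tagger output)
tagged = [("John", "NNP"), ("enjoys", "VBZ"), ("playing", "VBG"),
          ("football", "NN"), ("books", "NNS"), ("in", "IN"), (".", ".")]

selected = list(nouns_and_verbs(tagged))
print(selected)
# [('enjoys', 'v'), ('playing', 'v'), ('football', 'n'), ('books', 'n')]
```

Each selected pair could then be passed to `lemmatizer.lemmatize(word, pos=wn_pos)`, so that nouns are lemmatized as nouns and verbs as verbs.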


### Q2. Named Entity Recognition (NER) and Pronoun Ambiguity Detection

Use Python and any NLP model to perform:

1. Named Entity Recognition (NER)  
2. Pronoun ambiguity detection using the following rule:  
   - If the text contains a pronoun ("he", "she", "they"), print:  
     **"Warning: Possible pronoun ambiguity detected!"**

**Input text:**  
"Chris met Alex at Apple headquarters in California. He told him about the new iPhone launch."


In [7]:
import spacy

# Load the small English pipeline
# (install once with: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")

text = "Chris met Alex at Apple headquarters in California. He told him about the new iPhone launch."

# 1. Named Entity Recognition (NER)
doc = nlp(text)

print("Named Entities:")
for ent in doc.ents:
    print(f"{ent.text}  -->  {ent.label_}")

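# 2. Pronoun ambiguity detection
# Only the pronouns named in the rule are checked ("him" is not one of them).
pronouns = {"he", "she", "they"}
# Lowercase each token so that sentence-initial "He" matches "he"
words = [token.text.lower() for token in doc]

# Membership is tested against whole tokens, so "the" does not match "he"
if any(p in words for p in pronouns):
    print("\nWarning: Possible pronoun ambiguity detected!")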


Named Entities:
Chris  -->  PERSON
Alex  -->  PERSON
Apple  -->  ORG
California  -->  GPE
iPhone  -->  ORG

Warning: Possible pronoun ambiguity detected!
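
The pronoun rule itself does not need a model, since it only asks whether one of three words occurs. A minimal sketch using a regular expression for word splitting is shown below. The helper `find_pronouns` is a hypothetical name, and the simple letter-based tokenization is an assumption that stands in for spaCy's tokenizer.

```python
import re

# Pronouns named in the rule; "him" and other forms are deliberately not included
PRONOUNS = frozenset({"he", "she", "they"})


def find_pronouns(text, pronouns=PRONOUNS):
    """Return the listed pronouns found in text as whole words, in order."""
    # Split into lowercase alphabetic words so "He" matches and "the" does not
    words = re.findall(r"[A-Za-z']+", text.lower())
    return [w for w in words if w in pronouns]


text = ("Chris met Alex at Apple headquarters in California. "
        "He told him about the new iPhone launch.")

found = find_pronouns(text)
if found:
    print("Warning: Possible pronoun ambiguity detected!")
```

Returning the matched pronouns rather than a bare boolean makes it easy to report which word triggered the warning.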

