# 6. A) Apply a log-linear model for sentiment analysis
# B) Implement named entity recognition (NER) and extract the named entities from the given text:
'''Deepak Jasani, Head of retail research, HDFC Securities, said: “Investors will look to the European Central Bank later Thursday for reassurance that surging prices are just transitory, and not about to spiral out of control. In addition to the ECB policy meeting, investors are awaiting a report later Thursday on US economic growth, which is likely to show a cooling recovery, as well as weekly jobs data.”.'''


A) Apply a Log-Linear Model for Sentiment Analysis
We’ll use scikit-learn's Logistic Regression model (a type of log-linear model) on a small sample dataset for demonstration:

In [5]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Sample Data
texts = [
    "I love this product, it's amazing!",
    "Terrible experience, I hate it.",
    "Really good quality and fast delivery.",
    "Worst service ever.",
    "I'm very happy with the purchase.",
    "It's bad, don't buy it."
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

# Convert text to bag-of-words features
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

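# Train Logistic Regression (a log-linear model), holding out about 30% of the samples.
# No random_state is set, so the split and the fitted weights can differ between runs.
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3)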
model = LogisticRegression()
model.fit(X_train, y_train)

# Test on new sentence
test_text = ["I am not happy with the product"]
test_vec = vectorizer.transform(test_text)
pred = model.predict(test_vec)

print("Sentiment:", "Positive 😊" if pred[0] == 1 else "Negative 😞")


Sentiment: Positive 😊
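
The test sentence is negative, yet the model labels it positive. A minimal sketch of why, assuming the model is refit on all six sentences (no train/test split) so the weights are reproducible: it checks that the log-odds of the positive class equal the linear score (the property that makes logistic regression log-linear), then prints the learned weight of each test token the vectorizer knows. The names `index_to_token` and `known` are illustrative and not part of the original cell.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "I love this product, it's amazing!",
    "Terrible experience, I hate it.",
    "Really good quality and fast delivery.",
    "Worst service ever.",
    "I'm very happy with the purchase.",
    "It's bad, don't buy it."
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Assumption: fit on every sample (no split) so the result is deterministic
model = LogisticRegression()
model.fit(X, labels)

test_vec = vectorizer.transform(["I am not happy with the product"])

# Log-linear property: log(p / (1 - p)) equals the linear score w.x + b
proba = model.predict_proba(test_vec)[0, 1]
log_odds = np.log(proba / (1 - proba))
score = model.decision_function(test_vec)[0]
print("log-odds:", round(float(log_odds), 4), "linear score:", round(float(score), 4))

# Weight of each test token that exists in the training vocabulary
index_to_token = {j: tok for tok, j in vectorizer.vocabulary_.items()}
weights = model.coef_[0]
known = {index_to_token[j]: float(weights[j]) for j in test_vec.nonzero()[1]}
for token, w in sorted(known.items()):
    print(f"{token:>8}: {w:+.3f}")

# "not" never occurs in the training texts, so the vectorizer drops it
print("'not' in vocabulary:", "not" in vectorizer.vocabulary_)
```

Every test token the vectorizer recognises ("happy", "with", "the", "product") occurs only in positive training sentences, while "not" is absent from the vocabulary, so a bag-of-words model trained on this sample has no way to register the negation.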


B) Named Entity Recognition (NER) 

In [7]:
import nltk
from nltk import word_tokenize, pos_tag, ne_chunk

nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('maxent_ne_chunker_tab')
nltk.download('words')

text = '''Deepak Jasani, Head of retail research, HDFC Securities, said: “Investors will look to the European Central Bank later Thursday for reassurance that surging prices are just transitory, and not about to spiral out of control. In addition to the ECB policy meeting, investors are awaiting a report later Thursday on US economic growth, which is likely to show a cooling recovery, as well as weekly jobs data.”'''

# Tokenize, POS-tag, and chunk the text into a named-entity tree
tokens = word_tokenize(text)
tags = pos_tag(tokens)
tree = ne_chunk(tags)

# Extract named entities
print("Named Entities:")
for subtree in tree:
    if hasattr(subtree, 'label'):
        print(f"{subtree.label()}: {' '.join(c[0] for c in subtree)}")


[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Gauri\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\Gauri\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package maxent_ne_chunker to
[nltk_data]     C:\Users\Gauri\AppData\Roaming\nltk_data...
[nltk_data]   Package maxent_ne_chunker is already up-to-date!
[nltk_data] Downloading package maxent_ne_chunker_tab to
[nltk_data]     C:\Users\Gauri\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping chunkers\maxent_ne_chunker_tab.zip.
[nltk_data] Downloading package words to
[nltk_data]     C:\Users\Gauri\AppData\Roaming\nltk_data...
[nltk_data]   Package words is already up-to-date!


Named Entities:
PERSON: Deepak
ORGANIZATION: Jasani
ORGANIZATION: HDFC Securities
ORGANIZATION: European Central Bank
ORGANIZATION: ECB
GSP: US
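
Note that the chunker splits "Deepak Jasani" into a PERSON and an ORGANIZATION, although the text identifies him as a single person. For downstream use it is often handier to collect the entities than to print them. A minimal sketch, assuming the (label, text) pairs are copied from the output above; `group_entities` is a hypothetical helper, not an NLTK function. With the `tree` from the cell above, the same pairs could be built as `[(st.label(), ' '.join(tok for tok, _ in st)) for st in tree if hasattr(st, 'label')]`.

```python
from collections import defaultdict

# (label, text) pairs copied from the chunker output shown above
entities = [
    ("PERSON", "Deepak"),
    ("ORGANIZATION", "Jasani"),
    ("ORGANIZATION", "HDFC Securities"),
    ("ORGANIZATION", "European Central Bank"),
    ("ORGANIZATION", "ECB"),
    ("GSP", "US"),
]

def group_entities(pairs):
    """Group entity strings by label, keeping first-seen order and dropping repeats."""
    grouped = defaultdict(list)
    for label, text in pairs:
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)

grouped = group_entities(entities)
for label, names in grouped.items():
    print(f"{label}: {names}")
```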
