# Sentiment Analysis & Text Classification

Polarity - The emotional tone of the text, scored from -1 to 1

    1. [-1, 0) - Negative Sentiment
    
    2. 0 - Neutral
    
    3. (0, 1] - Positive Sentiment

Subjectivity - The degree of personalisation and opinion in the text, scored from 0 to 1
 
    1. 1 - Highly Subjective - Personalised and Opinionated
    
    2. 0 - Highly Factual - Not Personalised
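
The score ranges above can be turned into coarse labels with a small helper. This is only a minimal sketch: `label_polarity` and `label_subjectivity` are hypothetical helpers, not part of TextBlob, and the 0.5 cut-off for subjectivity is an assumed threshold chosen for illustration. In practice the two scores would come from `TextBlob(text).sentiment`; the values used here are made up.

```python
def label_polarity(polarity):
    """Map a polarity score in [-1, 1] to a coarse sentiment label."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1, 1]")
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def label_subjectivity(subjectivity, threshold=0.5):
    """Map a subjectivity score in [0, 1] to a label (threshold is an assumption)."""
    if not 0.0 <= subjectivity <= 1.0:
        raise ValueError("subjectivity must lie in [0, 1]")
    return "subjective" if subjectivity >= threshold else "objective"


# Hypothetical (polarity, subjectivity) pairs, not real TextBlob output
for polarity, subjectivity in [(0.8, 1.0), (-0.3, 0.6), (0.0, 0.0)]:
    print(label_polarity(polarity), label_subjectivity(subjectivity))
```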

In [None]:
# Import TextBlob

from textblob import TextBlob

In [None]:
text_1 = 'Vinod is very happy today'

In [None]:
blob_1 = TextBlob(text_1)

In [None]:
blob_1.sentiment

In [None]:
text_2 = 'The Movie was not on expected line. I did not enjoy the film at all. It was a waste of time'

In [None]:
blob_2 = TextBlob(text_2)

In [None]:
blob_2.sentiment

In [None]:
text_3 = 'The Sun is going to set at 6pm'

In [None]:
blob_3 = TextBlob(text_3)

In [None]:
blob_3.sentiment

In [None]:
# text_3 is neutral in polarity and factual in subjectivity

In [None]:
text = '''Ever since Finance Minister Nirmala Sitharaman presented her fifth and last full budget of the Modi government’s second term on February 1, 2023, every aspect of the budget has been analysed thread-bare by stakeholders, experts, and members of the commentariat. This author, himself, starting with the analysis of the economic survey, analysed the budget story through his framework of a five-part series.
Make no mistake- along with the finance minister’s budget speech and demands for grants, the outcome budget (OB) forms the trinity of the most important budget document and still, only a few have decided to deep dive into it. And the reasons are as follows-
The outcome budget, in its current avatar, is of recent origin and has not yet received the proper attention of the commentariat.
It is a complex, confusing and lengthy document that is difficult to decode. The FY24 outcome budget is 280 pages long, while FY23 one was 297 pages long.
Unlike demands of grants which provides actuals of previous, revised estimate of current year and the budget estimate for next year, the outcome budget is a standalone document that just enumerates the budgeted output and outcome targets for selected schemes under ministries and departments. It throws no light on what was the target of the previous year, nor does it throw light on past years’ achievements against the target.
It is natural then that the outcome budget does not get the attention of the analysts it deserves. And this in itself was reason enough for this member of the commentariat to deep dive into the outcome budget of FY24 and to arrive at meaningful insight to also examine the outcome budget of FY23 critically. And here goes my analysis. As two years’ outcome budget documents are so voluminous, this analysis covers only outcome budget of two Ministries with an attempt to grasp whether the budgetary outlay for FY24 is properly aligned with monitorable outputs and outcomes or whether the outcome budget has turned into a standalone document falling short on its basic premise, losing its rigour and seriousness.
The key aspect that I seek to address is whether output, outcome and key milestones to be achieved in the financial year are synced seamlessly and explained synchronously because, in its absence, the budget outlays remain what they are— simple annual expenditure targets defeating the very purpose of having an outcome budget.
THE OUTCOME BUDGET
For the uninitiated, I begin with a primer on- what is the outcome budget.
Till recently, before FY2017-18, as part of the Union budget, only the financial outlays of schemes of various ministries were part of the budget document while the expected outputs and outcomes of schemes were prepared and presented separately by each ministry (initiated by P. Chidambaram as Finance Minister in FY2008.)'''

In [None]:
blob_4 = TextBlob(text)

In [None]:
blob_4.sentiment

# Text Classification

In [None]:
import numpy as np
import pandas as pd 
from sklearn.datasets import fetch_20newsgroups

In [None]:
train = fetch_20newsgroups(subset='train')

In [None]:
test = fetch_20newsgroups(subset='test')

In [None]:
train

In [None]:
test

In [None]:
train.keys()

In [None]:
test.keys()

In [None]:
train['target_names']

In [None]:
np.unique(train['target'])
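
Beyond listing the distinct labels, it can help to check how evenly the classes are represented before training. The sketch below counts labels with `np.unique(..., return_counts=True)`. Note that `toy_names` and `toy_target` are made-up stand-ins for `train['target_names']` and `train['target']`, so the counts are not those of the real dataset.

```python
import numpy as np

# Made-up stand-ins for train['target_names'] and train['target']
toy_names = ["alt.atheism", "comp.graphics", "sci.space"]
toy_target = np.array([0, 2, 1, 2, 2, 0, 1, 2])

# Count how many documents carry each integer label
labels, counts = np.unique(toy_target, return_counts=True)

# Map each integer label back to its name for readability
class_counts = {toy_names[label]: int(count) for label, count in zip(labels, counts)}
print(class_counts)
```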

In [None]:
print(train['data'][1])

In [None]:
print(train['data'][10])

In [None]:
print(train['data'][1000])

# Building a Text Classification Model 

In [None]:
# Feature set: the raw text of each post

train['data']

In [None]:
# Target: the integer index of each post's newsgroup

train['target']

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer

from sklearn.naive_bayes import MultinomialNB

In [None]:
from sklearn.pipeline import make_pipeline

In [None]:
mnb = make_pipeline(TfidfVectorizer(), MultinomialNB())
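
To see what the two pipeline steps do, the same construction can be tried on a tiny corpus. This is a sketch under stated assumptions: `toy_docs` and `toy_labels` are invented for illustration and are not part of the 20 newsgroups data. The vectorizer turns each document into a row of TF-IDF weights, and the Naive Bayes classifier is then fitted on those rows.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy corpus (assumption, for illustration only)
toy_docs = [
    "the team won the football match",
    "the player scored a goal",
    "the election results were announced",
    "the minister presented the budget",
]
toy_labels = ["sports", "sports", "politics", "politics"]

toy_model = make_pipeline(TfidfVectorizer(), MultinomialNB())
toy_model.fit(toy_docs, toy_labels)

# Step 1: each document becomes one row of TF-IDF weights
# (make_pipeline names each step after its lowercased class name)
tfidf_matrix = toy_model.named_steps["tfidfvectorizer"].transform(toy_docs)
print(tfidf_matrix.shape)  # (number of documents, vocabulary size)

# Step 2: the classifier predicts a label from the TF-IDF row of a new document
toy_pred = toy_model.predict(["football goal"])
print(toy_pred[0])
```

Wrapping both steps in one pipeline means raw strings can be passed to `fit` and `predict` directly, and the vocabulary learned during training is reused at prediction time.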

In [None]:
# Train the pipeline on the training posts

mnb.fit(train['data'], train['target'])

In [None]:
pred = mnb.predict(test['data'])

In [None]:
pred

# Evaluating the Performance

In [None]:
from sklearn.metrics import classification_report, confusion_matrix

In [None]:
report = classification_report(test['target'], pred)
cm = confusion_matrix(test['target'], pred)

print('The report:\n', report)
print('\n\n')
print('The confusion matrix:\n', cm)
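
The per-class figures in the classification report can be derived from the confusion matrix itself. A minimal sketch, assuming a made-up 2-class matrix `toy_cm` (not the 20-class matrix computed above): in scikit-learn's layout, rows are true labels and columns are predicted labels, so the diagonal holds the correct predictions.

```python
import numpy as np

# Made-up 2-class confusion matrix: rows = true labels, columns = predictions
toy_cm = np.array([[8, 2],
                   [1, 9]])

correct = np.diag(toy_cm)                 # correctly classified count per class

recall = correct / toy_cm.sum(axis=1)     # correct / all documents truly in the class
precision = correct / toy_cm.sum(axis=0)  # correct / all documents predicted as the class
accuracy = correct.sum() / toy_cm.sum()   # correct / all documents

print("recall:   ", recall)
print("precision:", precision)
print("accuracy: ", accuracy)
```

Large off-diagonal entries point to pairs of classes the model tends to confuse with each other.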

# Predicting the Topic of New Text

In [None]:
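def predict_news_group(doc):
    """Return the predicted newsgroup name for a single document string."""
    group_pred = mnb.predict([doc])  # predict expects a list of documents
    # Map the predicted integer label back to its newsgroup name
    return train['target_names'][group_pred[0]]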

In [None]:
text = 'Nowadays, there is a lot of mixing of politics and religion. We need to relook at the impact of this. Will it help the politics in the long run?'

In [None]:
predict_news_group(text)

In [None]:
text_2 = 'Sports is more of an entertainment these days'

In [None]:
predict_news_group(text_2)

In [None]:
text_3 = 'Windows came early was difficult to understand. As there was release of Windows, it became more user friendly'

In [None]:
predict_news_group(text_3)
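
A single predicted label hides how confident the model is. One possible extension, sketched here on an invented toy corpus rather than the real data, ranks the classes by `predict_proba`. The helper `top_k_groups` and the names `demo_docs`, `demo_target` and `demo_names` are hypothetical and introduced only for this illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def top_k_groups(model, doc, target_names, k=2):
    """Return the k most probable (class name, probability) pairs for one document."""
    probs = model.predict_proba([doc])[0]
    top = np.argsort(probs)[::-1][:k]  # indices of the k highest probabilities
    # model.classes_ maps each probability column to its integer label
    return [(target_names[model.classes_[i]], float(probs[i])) for i in top]


# Invented toy data with integer targets, mirroring the target/target_names layout
demo_names = ["sports", "politics"]
demo_docs = [
    "the team won the football match",
    "the player scored a goal",
    "the election results were announced",
    "the minister presented the budget",
]
demo_target = [0, 0, 1, 1]

demo_model = make_pipeline(TfidfVectorizer(), MultinomialNB())
demo_model.fit(demo_docs, demo_target)

ranked = top_k_groups(demo_model, "football goal", demo_names)
for name, prob in ranked:
    print(f"{name}: {prob:.3f}")
```

With the notebook's own objects, the equivalent call would be `top_k_groups(mnb, text, train['target_names'], k=3)`.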