# Sentiment analysis

### Sentiment Analysis, or Opinion Mining, is a sub-field of Natural Language Processing (NLP) that tries to identify and extract opinions within a given text.

- Sentiment analysis matters because businesses today depend heavily on data, and most of that data is unstructured text from sources such as emails, chats, social media, surveys, articles, and documents. Micro-blogging content from Twitter and Facebook is especially challenging, not only because of the volume of data involved but also because of the language used to express sentiment: short forms, memes, and emoticons.

- Sentiment analysis is used all over the world to analyse how people react.
- Companies use sentiment analysis to analyse product reviews.
- Companies then make decisions on the basis of those reviews.

### Importing the library

In [1]:
import pandas as pd
import tweepy

## VADER Sentiment Analysis

- VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media.

- VADER relies on a sentiment lexicon: a list of lexical features (e.g., words) that are generally labelled according to their semantic orientation as either positive or negative.

- VADER has been found to be quite successful on social media texts, NY Times editorials, movie reviews, and product reviews. This is because VADER reports not only whether a text is positive or negative but also how positive or negative it is.

- VADER performs very well with emojis, slang, and acronyms in sentences.
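
To make the "lexicon plus rules" idea concrete, here is a minimal toy sketch. It is not VADER: the word list, the valence numbers, and the two rules (boosters and negation) are invented for illustration, and names such as `TOY_LEXICON` and `toy_valence` are hypothetical.

```python
# Toy lexicon: word -> valence. These values are invented for illustration
# and are NOT taken from VADER's real lexicon.
TOY_LEXICON = {"good": 2.0, "great": 3.0, "bad": -2.0, "terrible": -3.0}
NEGATIONS = {"not", "never", "no"}
BOOSTERS = {"very": 1.5, "really": 1.5}  # hypothetical multipliers


def toy_valence(text):
    """Sum word valences with two simple rules: a booster scales the next
    sentiment word, and a negation flips the sign of the next sentiment word."""
    total = 0.0
    scale = 1.0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?")
        if word in NEGATIONS:
            negate = True
        elif word in BOOSTERS:
            scale *= BOOSTERS[word]
        elif word in TOY_LEXICON:
            value = TOY_LEXICON[word] * scale
            total += -value if negate else value
            # Reset the rules once they have been applied to a sentiment word
            scale, negate = 1.0, False
    return total


print(toy_valence("The phone is good"))       # 2.0
print(toy_valence("The phone is very good"))  # 3.0
print(toy_valence("The phone is not good"))   # -2.0
```

The real tool works with a far larger, human-rated lexicon and more rules, but the shape is the same: look each word up, adjust its rating by context, and add the ratings together.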

In [2]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

In [3]:
# Twitter API authentication keys (replace each placeholder with your own value)
consumer_key = 'Enter Here'
consumer_secret = 'Enter Here'
access_token = 'Enter Here'
access_token_secret = 'Enter Here'
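
Hard-coded keys are easy to leak when a notebook is shared. One possible alternative, sketched below, reads the four values from environment variables. The variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_twitter_credentials` are assumptions made for this example, not something Twitter or Tweepy prescribes.

```python
import os


def load_twitter_credentials(env=os.environ):
    """Read the four credentials from a mapping (the process environment by default).

    The environment variable names below are hypothetical; use whatever
    names you actually export in your shell.
    """
    names = {
        "consumer_key": "TWITTER_CONSUMER_KEY",
        "consumer_secret": "TWITTER_CONSUMER_SECRET",
        "access_token": "TWITTER_ACCESS_TOKEN",
        "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
    }
    # Fail early with a clear message if any variable is missing
    missing = [var for var in names.values() if var not in env]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}


# Usage (after exporting the variables in your shell):
# creds = load_twitter_credentials()
# consumer_key = creds["consumer_key"]
```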

In [4]:
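# Authenticate with Twitter using the keys above
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

# Tweepy 3.x call; Tweepy 4.x renamed this method to api.search_tweets.
# The standard search endpoint returns at most 100 tweets per request, even with count=200.
tweets = api.search('Machine Learning', count=200)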


data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])

display(data.head(10))


print(tweets[0].created_at)

| | Tweets |
|---|---|
| 0 | RT @KirkDBorne: Learning Math For #MachineLear... |
| 1 | Buen articulo sobre la IA en estos momentos: h... |
| 2 | @radiobotics’ machine learning technology, tha... |
| 3 | Impact of a combination of quantitative indice... |
| 4 | Topological Properties of Resting-State fMRI F... |
| 5 | Performance and clinical impact of machine lea... |
| 6 | Handmade trileaflet valve design and validatio... |
| 7 | Prediction of breast cancer risk using a machi... |
| 8 | RT @IntelBusiness: Improve business processes ... |
| 9 | RT @ElliottSaslow: We are in a digital arms ra... |


2019-01-29 13:28:06


In [5]:
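import nltk
# The lexicon counts as a corpus, since a list of words is also a body of text.
# The NLTK package ID is 'vader_lexicon' (plain 'lexicon' is not in the NLTK index).
# It is only needed for NLTK's own VADER (nltk.sentiment.vader); the vaderSentiment
# package imported above bundles its own copy of the lexicon.
nltk.download('vader_lexicon')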

In [6]:
# Create the analyzer object used to score polarity
sid = SentimentIntensityAnalyzer()

# polarity_scores returns a dict with 'neg', 'neu', 'pos' and 'compound' keys;
# apply it to every tweet and store the resulting dict in a new column
data['polarity'] = data['Tweets'].apply(sid.polarity_scores)

display(data.head(100))

| | Tweets | polarity |
|---|---|---|
| 0 | RT @KirkDBorne: Learning Math For #MachineLear... | {'neg': 0.0, 'neu': 0.807, 'pos': 0.193, 'comp... |
| 1 | Buen articulo sobre la IA en estos momentos: h... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
| 2 | @radiobotics’ machine learning technology, tha... | {'neg': 0.0, 'neu': 0.791, 'pos': 0.209, 'comp... |
| 3 | Impact of a combination of quantitative indice... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
| 4 | Topological Properties of Resting-State fMRI F... | {'neg': 0.0, 'neu': 0.805, 'pos': 0.195, 'comp... |
| 5 | Performance and clinical impact of machine lea... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
| 6 | Handmade trileaflet valve design and validatio... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
| 7 | Prediction of breast cancer risk using a machi... | {'neg': 0.317, 'neu': 0.683, 'pos': 0.0, 'comp... |
| 8 | RT @IntelBusiness: Improve business processes ... | {'neg': 0.0, 'neu': 0.573, 'pos': 0.427, 'comp... |
| 9 | RT @ElliottSaslow: We are in a digital arms ra... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |


### The compound score is computed by summing the lexicon ratings of the words in a text and then normalizing that sum to lie between -1 (most extreme negative) and +1 (most extreme positive).

![1_G8yV2iaqqfaGfmRPRem2Fw.png](attachment:1_G8yV2iaqqfaGfmRPRem2Fw.png)
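
The normalization step can be sketched in a few lines. This assumes the form used in the vaderSentiment reference implementation, `x / sqrt(x**2 + alpha)` with `alpha = 15`, where `x` is the summed rating. The `label` helper and its ±0.05 cut-offs follow the convention suggested in the VADER documentation; they are not part of this notebook, and the raw sums in the loop are made-up inputs rather than scores from the tweets above.

```python
import math


def normalize(score, alpha=15):
    """Squash an unbounded summed rating into the interval (-1, 1)."""
    return score / math.sqrt(score * score + alpha)


def label(compound, threshold=0.05):
    """Turn a compound score into a coarse class (threshold is a convention)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Illustrative raw sums only; these are not taken from real tweets
for raw in (-4.0, 0.0, 1.0, 4.0):
    compound = normalize(raw)
    print(raw, round(compound, 4), label(compound))
```

Because the denominator grows with the raw sum, very long or very emotional texts approach -1 or +1 without ever exceeding them, which keeps compound scores comparable across texts of different lengths.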