# 1.1 Introduction:
Sentiment analysis is the process of determining computationally whether a piece of writing (here, microblog posts on Twitter) is positive, negative or neutral. It is mainly used to determine the attitude of a speaker or writer and is also known as opinion mining.


# 1.2 Practical uses of sentiment analysis:
1. Inform and make operational improvements or capital expenditures.
2. Evaluate guest likes and dislikes for your property AND your competitors’ properties.
3. Apply sentiment analysis to surveys.
4. Understand your guests better ;-)


# 1.3 Objective:
To perform sentiment analysis on 100 random tweets related to each of:
1. Donald Trump
2. IPL
3. Avengers


# 1.4 Requirements: 
1. Python
2. Python libraries including collections, tweepy and csv
3. AYLIEN Text Analysis API


# 1.5 Preprocessing

####  1.5.1  Hashtags

A hashtag is a word or an unspaced phrase prefixed with the hash symbol (#).
Hashtags are used both to name subjects and to mark phrases that are currently
trending. For example, #iPad, #news

Regular Expression: `#(\w+)`

Replace Expression: `HASH_\1`
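
The substitution above can be sketched with Python's `re` module. This is a minimal illustration, not the original project's code; the function name `replace_hashtags` is hypothetical.

```python
import re

# Pattern and replacement exactly as listed above.
HASHTAG_RE = re.compile(r"#(\w+)")


def replace_hashtags(text):
    """Rewrite each #tag as HASH_tag so it survives as a single token."""
    return HASHTAG_RE.sub(r"HASH_\1", text)


print(replace_hashtags("Loving my new #iPad, see #news"))
# -> Loving my new HASH_iPad, see HASH_news
```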

####  1.5.2  Handles

Every Twitter user has a unique username. Anything directed towards that user
can be indicated by writing their username preceded by ‘@’. Handles thus behave
like proper nouns. For example, @Apple

Regular Expression: `@(\w+)`

Replace Expression: `HNDL_\1`
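
A minimal sketch of the handle substitution, assuming the same `re`-based approach; the function name `replace_handles` is hypothetical.

```python
import re

# Pattern and replacement exactly as listed above.
HANDLE_RE = re.compile(r"@(\w+)")


def replace_handles(text):
    """Rewrite each @user mention as HNDL_user."""
    return HANDLE_RE.sub(r"HNDL_\1", text)


print(replace_handles("@ZekeJMiller Trump is a liar."))
# -> HNDL_ZekeJMiller Trump is a liar.
```

Note that the pattern matches any `@` followed by word characters, so an e-mail address in a tweet would be rewritten as well.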

####  1.5.3  URLs

Users often share hyperlinks in their tweets. Twitter shortens them using its
in-house URL shortening service, producing links like https://t.co/1uoxlp8FwN.
Such links also enable Twitter to alert users if the link leads out of its
domain. From the point of view of text classification, the particular URL is
not important, but the presence of a URL can be an important feature. A regular
expression that detects every kind of URL is fairly complex, but because of
Twitter’s shortening service we can use a relatively simple one.

Regular Expression: `(http|https|ftp)://[a-zA-Z0-9\\./]+`

Replace Expression: `URL`
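
A minimal sketch of the URL substitution, using the pattern as written above inside a raw string; the function name `replace_urls` is hypothetical.

```python
import re

# Pattern exactly as listed above. Inside a raw string the character class
# accepts letters, digits, backslash, dot and slash.
URL_RE = re.compile(r"(http|https|ftp)://[a-zA-Z0-9\\./]+")


def replace_urls(text):
    """Replace every shortened link with the placeholder token URL."""
    return URL_RE.sub("URL", text)


print(replace_urls("Will We Stop Trump? https://t.co/1uoxlp8FwN"))
# -> Will We Stop Trump? URL
```

The character class has no `?`, `-` or `=`, so a full-length URL with a query string would only be replaced up to the first such character. That is acceptable for `t.co` links, which is the case the text targets.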

####  1.5.4  Emoticons

Use of emoticons is very prevalent throughout the web, more so on
microblogging sites. We identify the following emoticons and replace each with
a single word. Table 4 lists the emoticons we currently detect. All other
emoticons are ignored.

<div style="text-align:center"> 
<table border="1">
<tr><td colspan="1" align="center">Emoticons </td><td colspan="6" align="center">Examples </td></tr>
<tr><td align="left"><tt>EMOT_SMILEY</tt>   </td><td align="left"><tt>:-)</tt>  </td><td align="left"><tt>:)</tt>   </td><td align="left"><tt>(:</tt>   </td><td align="left"><tt>(-:</tt>  </td><td align="left"><tt></tt>     </td><td align="left"><tt></tt> </td></tr>
<tr><td align="left"><tt>EMOT_LAUGH</tt>    </td><td align="left"><tt>:-D</tt>  </td><td align="left"><tt>:D</tt>   </td><td align="left"><tt>X-D</tt>  </td><td align="left"><tt>XD</tt>   </td><td align="left"><tt>xD</tt>   </td><td align="left"><tt></tt> </td></tr>
<tr><td align="left"><tt>EMOT_LOVE</tt>     </td><td align="left"><tt>&lt;3</tt>    </td><td align="left"><tt>:*</tt>   </td><td align="left"><tt></tt>     </td><td align="left"><tt></tt>     </td><td align="left"><tt></tt>     </td><td align="left"><tt></tt> </td></tr>
<tr><td align="left"><tt>EMOT_WINK</tt>     </td><td align="left"><tt>;-)</tt>  </td><td align="left"><tt>;)</tt>   </td><td align="left"><tt>;-D</tt>  </td><td align="left"><tt>;D</tt>   </td><td align="left"><tt>(;</tt>   </td><td align="left"><tt>(-;</tt> </td></tr>
<tr><td align="left"><tt>EMOT_FROWN</tt>    </td><td align="left"><tt>:-(</tt>  </td><td align="left"><tt>:(</tt>   </td><td align="left"><tt>):</tt>   </td><td align="left"><tt>)-:</tt>  </td><td align="left"><tt></tt>     </td><td align="left"><tt></tt> </td></tr>
<tr><td align="left"><tt>EMOT_CRY</tt>  </td><td align="left"><tt>:,(</tt>  </td><td align="left"><tt>:'(</tt>  </td><td align="left"><tt>:"(</tt>  </td><td align="left"><tt>:((</tt>  </td><td align="left"><tt></tt>     </td><td align="left"><tt></tt> </td></tr></table>


<div style="text-align:center">Table 4: List of Emoticons</div>
<a id="tab:emot">
</a>
</div>
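
One way to implement the emoticon replacement is a lookup table compiled into a single regular expression. This is a sketch under stated assumptions, not the original project's code: the names `EMOTICONS`, `LOOKUP` and `replace_emoticons` are hypothetical, and the mirrored frowns are taken to be `):` and `)-:`, since `(:` and `(-:` already map to `EMOT_SMILEY`.

```python
import re

# Emoticon classes from Table 4 (mirrored frowns assumed to be "):" and ")-:").
EMOTICONS = {
    "EMOT_SMILEY": [":-)", ":)", "(:", "(-:"],
    "EMOT_LAUGH": [":-D", ":D", "X-D", "XD", "xD"],
    "EMOT_LOVE": ["<3", ":*"],
    "EMOT_WINK": [";-)", ";)", ";-D", ";D", "(;", "(-;"],
    "EMOT_FROWN": [":-(", ":(", "):", ")-:"],
    "EMOT_CRY": [":,(", ":'(", ':"(', ":(("],
}

# Reverse lookup: emoticon -> class name.
LOOKUP = {emo: tag for tag, emos in EMOTICONS.items() for emo in emos}

# Longest emoticons first, so ":((" is tried before ":(".
PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(LOOKUP, key=len, reverse=True))
)


def replace_emoticons(text):
    """Replace each known emoticon with its class name; others are left alone."""
    return PATTERN.sub(lambda m: LOOKUP[m.group(0)], text)


print(replace_emoticons("Great game :-) but we lost :("))
# -> Great game EMOT_SMILEY but we lost EMOT_FROWN
```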

####  1.5.5  Punctuations

Not all punctuation marks matter for classification, but some, such as the
question mark and the exclamation mark, can carry information about the
sentiment of the text. We replace every word boundary with a list of the
relevant punctuation marks present at that point. Table 5 lists the marks
currently identified. We also remove any single quotes that might exist in the
text.

<div style="text-align:center"> 
<table border="1">
<tr><td colspan="1" align="center">Punctuations </td><td colspan="2" align="center">Examples </td></tr>
<tr><td align="left"><tt>PUNC_DOT</tt> </td><td align="left"><tt>.</tt> </td><td align="left"><tt></tt> </td></tr>
<tr><td align="left"><tt>PUNC_EXCL</tt> </td><td align="left"><tt>!</tt> </td><td align="left"><tt>¡</tt> </td></tr>
<tr><td align="left"><tt>PUNC_QUES</tt> </td><td align="left"><tt>?</tt> </td><td align="left"><tt>¿</tt> </td></tr>
<tr><td align="left"><tt>PUNC_ELLP</tt> </td><td align="left"><tt>...</tt> </td><td align="left"><tt>…</tt> </td></tr></table>


<div style="text-align:center">Table 5: List of Punctuations</div>
<a id="tab:punc">
</a>
</div>
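
The punctuation step can be approximated as follows. This is a simplified sketch, not the original implementation: it tags each mark from Table 5 wherever it occurs rather than only at word boundaries, and the names `PUNCTUATION` and `replace_punctuation` are hypothetical.

```python
import re

# Classes from Table 5, ordered so the ellipsis is tried before the single dot.
PUNCTUATION = [
    ("PUNC_ELLP", r"\.\.\.|…"),
    ("PUNC_DOT", r"\."),
    ("PUNC_EXCL", r"[!¡]"),
    ("PUNC_QUES", r"[?¿]"),
]

# One named group per class; the name of the group that matched is the tag.
PUNC_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in PUNCTUATION))


def replace_punctuation(text):
    """Drop ASCII single quotes and replace listed marks with their class names."""
    text = text.replace("'", "")
    text = PUNC_RE.sub(lambda m: f" {m.lastgroup} ", text)
    return " ".join(text.split())  # collapse the padding spaces


print(replace_punctuation("Trump is a liar. That is all."))
# -> Trump is a liar PUNC_DOT That is all PUNC_DOT
```

Padding each tag with spaces keeps it a separate token even when the mark was attached to a word, as in `liar.`.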

####  1.5.6  Repeating Characters

People often repeat characters in colloquial language, as in
"I’m in a hurrryyyyy" or "We won, yaaayyyyy!" As our final pre-processing step,
we replace any character repeated more than twice in a row with exactly two
occurrences, so "hurrryyyyy" becomes "hurryy".

Regular Expression: `(.)\1{1,}`

Replace Expression: `\1\1`
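
A minimal sketch of this last step, using the pattern and replacement listed above; the function name `squeeze_repeats` is hypothetical.

```python
import re

# A character followed by one or more copies of itself.
REPEAT_RE = re.compile(r"(.)\1{1,}")


def squeeze_repeats(text):
    """Collapse any run of a repeated character down to two occurrences."""
    return REPEAT_RE.sub(r"\1\1", text)


print(squeeze_repeats("We won, yaaayyyyy!"))
# -> We won, yaayy!
```

Runs of exactly two characters also match but are replaced by themselves, so ordinary double letters such as the "oo" in "good" are left unchanged.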

##### source for 1.5: https://github.com/ayushoriginal/Sentiment-Analysis-Twitter/blob/master/README.md


In [43]:
import pandas as pd
from IPython.display import display
from IPython.display import Image
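# Load the tweets that have already been labelled with a sentiment.
sample = pd.read_csv('Sentiment_Analysis_of_100_Tweets_About_trump.csv')
# Tally how many tweets fall into each sentiment class.
neutral = 0
positive = 0
negative = 0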
for i in sample['Sentiment']:
    if i == 'neutral':
        neutral+=1
    elif i == 'positive':
        positive+=1
    elif i == 'negative':
        negative+=1
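
The same tally can be written more compactly with `collections.Counter`, which is among the listed requirements. This is an alternative sketch, not the notebook's own code; the helper name `count_sentiments` is hypothetical, and the example labels are the first seven rows of the Trump sample shown in section 1.6.1.

```python
from collections import Counter


def count_sentiments(labels):
    """Return the number of positive, negative and neutral labels."""
    counts = Counter(labels)
    # Report all three classes, even those that never occur.
    return {label: counts.get(label, 0)
            for label in ("positive", "negative", "neutral")}


labels = ["negative", "neutral", "neutral", "neutral",
          "negative", "neutral", "neutral"]
print(count_sentiments(labels))
# -> {'positive': 0, 'negative': 2, 'neutral': 5}
```

In the notebook it could be called as `count_sentiments(sample['Sentiment'])`, since iterating over a pandas Series yields its values.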

# 1.6 Sentiment Analysis of 100 Tweets about Donald Trump
### 1.6.1 CSV File Sample:

In [44]:
pd.options.display.max_columns = None
sample.head(7)

Unnamed: 0,Tweet,Sentiment
0,@TomiLahren Jay Z poster boy of scum and ghett...,negative
1,"Texas elected official gets misdemeanor, accus...",neutral
2,White House chief of staff urges Trump to remo...,neutral
3,@Ronsmithracing2 @FlashResists @realDonaldTrum...,neutral
4,@ZekeJMiller Trump is a liar. \nThat is all.,negative
5,Mexico Just Made a MAJOR Threat to President T...,neutral
6,Will We Stop Trump Before Its Too Late? https:...,neutral


### 1.6.2 Analysis on Pie Chart:

In [45]:
import plotly
import plotly.offline as py
import plotly.graph_objs as go
py.init_notebook_mode(connected=True)

labels = ['Positive', 'Negative', 'Neutral']
values = [positive,negative,neutral]
colors = ['#30a832', '#fc2c19', '#fcd219']
# Plotly domains are fractions of the plot area in [0, 1]; use the full area.
domains = [
    {'x': [0.0, 1.0], 'y': [0.0, 1.0]}
]
traces = []

for domain in domains:
    trace = go.Pie(labels = labels,
                   values = values,
                   domain = domain,
                   hoverinfo = 'label+percent',
                  marker=dict(colors=colors, 
                           line=dict(color='#000000', width=3)))
    traces.append(trace)

layout = go.Layout(
                   autosize = True,
                   title = 'Analysis For 100 Tweets Related to Donald Trump')
fig = go.Figure(data = traces, layout = layout)
py.iplot(fig, show_link = False)  # py is already plotly.offline

# 1.7 Sentiment Analysis of 100 Tweets about IPL
### 1.7.1 CSV File Sample:

In [46]:
sample = pd.read_csv('Sentiment_Analysis_of_100_Tweets_About_IPL.csv')
neutral = 0
positive = 0
negative = 0
for i in sample['Sentiment']:
    if i == 'neutral':
        neutral+=1
    elif i == 'positive':
        positive+=1
    elif i == 'negative':
        negative+=1
pd.options.display.max_columns = None
sample.head(7)

Unnamed: 0,Tweet,Sentiment
0,"@Sanam_Official Noo, sorry I was actually pre-...",neutral
1,Congratulations team kkr #ipl,positive
2,Pics-9: #ShahRukhKhan @iamsrk with #AbRam and ...,positive
3,A 50 off 19 balls by @SunilPNarine74 helped th...,neutral
4,"Virat Kohli - ""I think we were 15 short with t...",negative
5,The hero tonight!! Take a bow sir We won!!!!!!...,positive
6,"Virat Kohli - ""Me and AB getting out to a part...",neutral


### 1.7.2 Analysis on Pie Chart:

In [47]:
values = [positive,negative,neutral]
traces = []  # reset so the previous topic's pie is not drawn again
for domain in domains:
    trace = go.Pie(labels = labels,
                   values = values,
                   domain = domain,
                   hoverinfo = 'label+percent',
                  marker=dict(colors=colors, 
                           line=dict(color='#000000', width=3)))
    traces.append(trace)

layout = go.Layout(
                   autosize = True,
                   title = 'Analysis For 100 Tweets Related to IPL')
fig = go.Figure(data = traces, layout = layout)
py.iplot(fig, show_link = False)  # py is already plotly.offline

# 1.8 Sentiment Analysis of 100 Tweets about Avengers
### 1.8.1 CSV File Sample:

In [48]:
sample = pd.read_csv('Sentiment_Analysis_of_100_Tweets_About_Avengers.csv')
neutral = 0
positive = 0
negative = 0
for i in sample['Sentiment']:
    if i == 'neutral':
        neutral+=1
    elif i == 'positive':
        positive+=1
    elif i == 'negative':
        negative+=1
pd.options.display.max_columns = None
sample.head(7)

Unnamed: 0,Tweet,Sentiment
0,Latest Avengers: Infinity War Trailer Puts Spo...,neutral
1,Women have made it so on social media they are...,neutral
2,@sweeet_caroline chris evans already said aven...,positive
3,We explain what the @Avengers #InfinityWar tra...,neutral
4,#BlackWidow #CaptainAmerica\n#Romanogers #Capw...,neutral
5,@Marvel @Avengers In summation. https://t.co/...,neutral
6,The crazy part bout the new avengers is #4 com...,neutral


### 1.8.2 Analysis on Pie Chart:

In [49]:
values = [positive,negative,neutral]
traces = []  # reset so the previous topics' pies are not drawn again
for domain in domains:
    trace = go.Pie(labels = labels,
                   values = values,
                   domain = domain,
                   hoverinfo = 'label+percent',
                  marker=dict(colors=colors, 
                           line=dict(color='#000000', width=3)))
    traces.append(trace)

layout = go.Layout(
                   autosize = True,
                   title = 'Analysis For 100 Tweets Related to Avengers')
fig = go.Figure(data = traces, layout = layout)
py.iplot(fig, show_link = False)  # py is already plotly.offline

## 1.9 Drawbacks
The text analysis API used here can analyze only a limited number of tweets per request.

## 1.10 Conclusion
The classifier achieved an accuracy of around 80%, with a peak accuracy of 87%.