# Sentiment Analysis

## 1.1 Introduction

When it comes to text data, there are a few popular techniques that we'll be going through in the next few notebooks, starting with sentiment analysis. Here are a few key points to remember about sentiment analysis:

1. **TextBlob Module:** Linguistic researchers have labeled the sentiment of words based on their domain expertise. The sentiment of a word can vary depending on where it appears in a sentence. The TextBlob module allows us to take advantage of these labels.
2. **Sentiment Labels:** Each word in a corpus is labeled in terms of polarity and subjectivity (there are more labels as well, but we're going to ignore them for now). A corpus's sentiment is the average of these word-level labels.
   * **Polarity**: How positive or negative a word is. -1 is very negative; +1 is very positive.
   * **Subjectivity**: How subjective, or opinionated, a word is. 0 is purely factual; +1 is very much an opinion.
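
To make the averaging idea concrete, here is a minimal sketch in plain Python. It is not TextBlob's implementation: the lexicon `TOY_LEXICON`, its scores, and the helper `toy_sentiment` are all hypothetical, and TextBlob's real scoring also handles things like negation and intensifiers that this sketch ignores.

```python
# Toy illustration only: this lexicon and its scores are made up,
# they are NOT TextBlob's data.
TOY_LEXICON = {
    # word: (polarity, subjectivity)
    "great": (0.8, 0.75),
    "terrible": (-1.0, 1.0),
    "fine": (0.4, 0.5),
}

def toy_sentiment(text):
    """Average the (polarity, subjectivity) labels of the words found in the lexicon."""
    words = text.lower().split()
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not scores:
        # No labeled words: treat the text as neutral and factual
        return 0.0, 0.0
    polarity = sum(p for p, _ in scores) / len(scores)
    subjectivity = sum(s for _, s in scores) / len(scores)
    return polarity, subjectivity

# "great" and "terrible" pull in opposite directions, so the polarity
# lands slightly below zero while the subjectivity stays high.
print(toy_sentiment("the food was great but the service was terrible"))
```

Only the labeled words contribute to the average, which is why long texts with mixed opinions tend to end up with polarity values close to zero.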

Let's take a look at the sentiment of the posts for each personality type.

## 1.2 Sentiment of Chats

In [1]:
# We'll start by reading in the corpus, which preserves word order
import pandas as pd

datas = pd.read_pickle('corpus.pkl')
datas.index.name = None  # clear the index label (`del datas.index.name` fails on recent pandas versions)
datas

Unnamed: 0,posts
INFJ,'http://www.youtube.com/watch?v=qsXHcwe3krw|||...
ENTP,'I'm finding the lack of me in these posts ver...
INTP,'Good one _____ https://www.youtube.com/wat...
INTJ,"'Dear INTP, I enjoyed our conversation the o..."
ENTJ,'You're fired.|||That's another silly misconce...
INTJ,'18/37 @.@|||Science is not perfect. No scien...
INFJ,"'No, I can't draw on my own nails (haha). Thos..."
INTJ,'I tend to build up a collection of things on ...
INFJ,"I'm not sure, that's a good question. The dist..."
INTP,'https://www.youtube.com/watch?v=w8-egj0y8Qs||...


In [2]:
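# Merge rows that share a personality type by concatenating their posts.
# Grouping on the index directly avoids the transpose round trip and the
# deprecated `axis=1` argument of groupby.
datas = datas.groupby(level=0).sum()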
datas

Unnamed: 0,posts
ENFJ,'https://www.youtube.com/watch?v=PLAaiKvHvZs||...
ENFP,"'He doesn't want to go on the trip without me,..."
ENTJ,'You're fired.|||That's another silly misconce...
ENTP,'I'm finding the lack of me in these posts ver...
ESFJ,'Why not?|||Any other ESFJs originally mistype...
ESFP,'Edit: I forgot what board this was on.|||I am...
ESTJ,"this is such a catch 22 |||I'm here! Although,..."
ESTP,Splinter Cell Blacklist for Xbox 360.|||ESTPs ...
INFJ,'http://www.youtube.com/watch?v=qsXHcwe3krw|||...
INFP,'I think we do agree. I personally don't consi...
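
For intuition, the grouping step above can be sketched in plain Python rather than pandas: rows with the same personality type are collected under one key, and "summing" strings means concatenating them. The `rows` data here is a hypothetical toy sample, not taken from the corpus.

```python
from collections import defaultdict

# Hypothetical toy rows: (personality type, posts string)
rows = [
    ("INFJ", "first|||"),
    ("ENTP", "second|||"),
    ("INFJ", "third|||"),
]

grouped = defaultdict(str)
for ptype, posts in rows:
    grouped[ptype] += posts  # "summing" strings concatenates them

# Sort by type so the result is ordered like the grouped table above
merged = dict(sorted(grouped.items()))
print(merged)  # {'ENTP': 'second|||', 'INFJ': 'first|||third|||'}
```

After this step there is exactly one row per personality type, holding every post written by users of that type.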


In [3]:
# Let's add the full personality names as well
personality_Names = [
    'Extraverted iNtuitive Feeling Judging', 'Extraverted iNtuitive Feeling Perceiving',
    'Extraverted iNtuitive Thinking Judging', 'Extraverted iNtuitive Thinking Perceiving',
    'Extraverted Sensing Feeling Judging', 'Extraverted Sensing Feeling Perceiving',
    'Extraverted Sensing Thinking Judging', 'Extraverted Sensing Thinking Perceiving',
    'Introverted iNtuitive Feeling Judging', 'Introverted iNtuitive Feeling Perceiving',
    'Introverted iNtuitive Thinking Judging', 'Introverted iNtuitive Thinking Perceiving',
    'Introverted Sensing Feeling Judging', 'Introverted Sensing Feeling Perceiving',
    'Introverted Sensing Thinking Judging', 'Introverted Sensing Thinking Perceiving',
]

datas['personality_Names'] = personality_Names
datas

Unnamed: 0,posts,personality_Names
ENFJ,'https://www.youtube.com/watch?v=PLAaiKvHvZs||...,Extraverted iNtuitive Feeling Judging
ENFP,"'He doesn't want to go on the trip without me,...",Extraverted iNtuitive Feeling Perceiving
ENTJ,'You're fired.|||That's another silly misconce...,Extraverted iNtuitive Thinking Judging
ENTP,'I'm finding the lack of me in these posts ver...,Extraverted iNtuitive Thinking Perceiving
ESFJ,'Why not?|||Any other ESFJs originally mistype...,Extraverted Sensing Feeling Judging
ESFP,'Edit: I forgot what board this was on.|||I am...,Extraverted Sensing Feeling Perceiving
ESTJ,"this is such a catch 22 |||I'm here! Although,...",Extraverted Sensing Thinking Judging
ESTP,Splinter Cell Blacklist for Xbox 360.|||ESTPs ...,Extraverted Sensing Thinking Perceiving
INFJ,'http://www.youtube.com/watch?v=qsXHcwe3krw|||...,Introverted iNtuitive Feeling Judging
INFP,'I think we do agree. I personally don't consi...,Introverted iNtuitive Feeling Perceiving


In [4]:
# Create quick lambda functions to find the polarity and subjectivity of each chat
from textblob import TextBlob 

pol = lambda x: TextBlob(x).sentiment.polarity
sub = lambda x: TextBlob(x).sentiment.subjectivity

datas['polarity'] = datas['posts'].apply(pol)
datas['subjectivity'] = datas['posts'].apply(sub)
datas

Unnamed: 0,posts,personality_Names,polarity,subjectivity
ENFJ,'https://www.youtube.com/watch?v=PLAaiKvHvZs||...,Extraverted iNtuitive Feeling Judging,0.15736,0.550461
ENFP,"'He doesn't want to go on the trip without me,...",Extraverted iNtuitive Feeling Perceiving,0.151955,0.554108
ENTJ,'You're fired.|||That's another silly misconce...,Extraverted iNtuitive Thinking Judging,0.123833,0.530783
ENTP,'I'm finding the lack of me in these posts ver...,Extraverted iNtuitive Thinking Perceiving,0.121166,0.534339
ESFJ,'Why not?|||Any other ESFJs originally mistype...,Extraverted Sensing Feeling Judging,0.147629,0.538649
ESFP,'Edit: I forgot what board this was on.|||I am...,Extraverted Sensing Feeling Perceiving,0.127564,0.543597
ESTJ,"this is such a catch 22 |||I'm here! Although,...",Extraverted Sensing Thinking Judging,0.125126,0.531198
ESTP,Splinter Cell Blacklist for Xbox 360.|||ESTPs ...,Extraverted Sensing Thinking Perceiving,0.122432,0.53449
INFJ,'http://www.youtube.com/watch?v=qsXHcwe3krw|||...,Introverted iNtuitive Feeling Judging,0.13331,0.537131
INFP,'I think we do agree. I personally don't consi...,Introverted iNtuitive Feeling Perceiving,0.130098,0.541689


In [6]:
# Let's plot the results
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = [12, 8]

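for Personality_type in datas.index:
    x = datas.polarity.loc[Personality_type]
    y = datas.subjectivity.loc[Personality_type]
    plt.scatter(x, y, color='blue')
    # Look the name up by label; positional [] access on a labeled Series is deprecated
    plt.text(x+.001, y+.001, datas['personality_Names'].loc[Personality_type], fontsize=10)

# Set the x-axis range once, after all points are drawn
plt.xlim(-.1, .3)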
plt.title('Sentiment Analysis', fontsize=20)
plt.xlabel('<-- Negative -------- Positive -->', fontsize=15)
plt.ylabel('<-- Facts -------- Opinions -->', fontsize=15)

plt.show()

<Figure size 1200x800 with 1 Axes>

## 1.3 Sentiment of Text Chat

Instead of looking at the overall sentiment, let's see whether anything interesting shows up in how each personality type's sentiment changes over the course of its posts.

In [7]:
# Split each chat into 8 parts
import numpy as np
import math

def split_text(text, n=8):
    '''Takes in a string of text and splits it into n equal-sized parts (default 8).
    Leftover characters at the end that do not fill a whole part are dropped.'''

    # Calculate the length of the text, the size of each chunk and the starting point of each chunk
    length = len(text)
    size = math.floor(length / n)
    start = np.arange(0, length, size)

    # Pull out equally sized pieces of text and put them into a list
    split_list = []
    for piece in range(n):
        split_list.append(text[start[piece]:start[piece]+size])
    return split_list
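
To see what this kind of chunking does on a small input, here is a self-contained sketch. The helper `split_evenly` is a hypothetical stand-in that follows the same idea as `split_text` without NumPy; it is meant only to show the behavior at the boundaries.

```python
def split_evenly(text, n=8):
    """Split text into n chunks of equal length, dropping any leftover characters."""
    size = len(text) // n
    return [text[i * size:(i + 1) * size] for i in range(n)]

sample = "abcdefghij"  # 10 characters
pieces = split_evenly(sample, n=3)
print(pieces)  # ['abc', 'def', 'ghi']; the trailing 'j' is dropped
```

Because the cut points are counted in characters, a chunk boundary can fall in the middle of a word or a post. For long texts split into only 8 parts, this should have little effect on the averaged polarity of each part.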

In [8]:
# Let's take a look at our data again
datas

Unnamed: 0,posts,personality_Names,polarity,subjectivity
ENFJ,'https://www.youtube.com/watch?v=PLAaiKvHvZs||...,Extraverted iNtuitive Feeling Judging,0.15736,0.550461
ENFP,"'He doesn't want to go on the trip without me,...",Extraverted iNtuitive Feeling Perceiving,0.151955,0.554108
ENTJ,'You're fired.|||That's another silly misconce...,Extraverted iNtuitive Thinking Judging,0.123833,0.530783
ENTP,'I'm finding the lack of me in these posts ver...,Extraverted iNtuitive Thinking Perceiving,0.121166,0.534339
ESFJ,'Why not?|||Any other ESFJs originally mistype...,Extraverted Sensing Feeling Judging,0.147629,0.538649
ESFP,'Edit: I forgot what board this was on.|||I am...,Extraverted Sensing Feeling Perceiving,0.127564,0.543597
ESTJ,"this is such a catch 22 |||I'm here! Although,...",Extraverted Sensing Thinking Judging,0.125126,0.531198
ESTP,Splinter Cell Blacklist for Xbox 360.|||ESTPs ...,Extraverted Sensing Thinking Perceiving,0.122432,0.53449
INFJ,'http://www.youtube.com/watch?v=qsXHcwe3krw|||...,Introverted iNtuitive Feeling Judging,0.13331,0.537131
INFP,'I think we do agree. I personally don't consi...,Introverted iNtuitive Feeling Perceiving,0.130098,0.541689


In [9]:
# Let's create a list to hold all of the pieces of text
list_pieces = []
for t in datas.posts:
    split = split_text(t)
    list_pieces.append(split)
    
list_pieces

In [None]:
# The list has one element per personality type
len(list_pieces)

In [None]:
# Each personality type's posts have been split into 8 pieces of text
len(list_pieces[0])

In [None]:
# Calculate the polarity for each piece of text

polarity_posts = []
for lp in list_pieces:
    polarity_piece = []
    for p in lp:
        polarity_piece.append(TextBlob(p).sentiment.polarity)
    polarity_posts.append(polarity_piece)
    
polarity_posts

In [None]:
# Plot the polarity across the 8 pieces for one personality type
plt.plot(polarity_posts[0])
plt.title(datas['personality_Names'].iloc[0])
plt.show()

In [None]:
# Show the plot for all personality types
plt.rcParams['figure.figsize'] = [16, 12]

for index, Personality_type in enumerate(datas.index):
    plt.subplot(4, 4, index+1)
    plt.plot(polarity_posts[index])
    plt.plot(np.arange(0,10), np.zeros(10))  # horizontal zero line for reference
    plt.title(datas['personality_Names'].iloc[index])
    plt.ylim(-.2, .3)  # the ymin/ymax keywords were removed in newer matplotlib
    
plt.show()