# Sentiment Analysis
---
This notebook computes VADER sentiment scores for each post and uses the coefficients from a logistic regression model to update the lexicon weights of anxiety-related words.

## Libraries Used
---

In [1]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import numpy as np
import pickle

from nltk.sentiment.vader import SentimentIntensityAnalyzer

## Import Data
---

In [2]:
df = pd.read_csv('../../data/clean/anx_writing.csv').drop(columns=['Unnamed: 0'])
df.head()

| | author | link_flair_text | num_comments | subreddit | created_utc | text |
|---|---|---|---|---|---|---|
| 0 | JackW357 | DAE Questions | 9 | Anxiety | 1606687976 | Anyone else scared of dying and scared of when... |
| 1 | belladoll1021 | Health | 1 | Anxiety | 1606687615 | Tight throat Can a tight throat and gagging fe... |
| 2 | ashwinderegg | Advice Needed | 3 | Anxiety | 1606687588 | Anxiety overriding my intuition. Does anyone e... |
| 3 | ashwinderegg | Advice Needed | 7 | Anxiety | 1606687588 | Anxiety overriding my intuition. Does anyone e... |
| 4 | lachapoxxx | Advice Needed | 1 | Anxiety | 1606687488 | hey friends! i need some advice my anxiety has... |


In [3]:
#Instantiate Sentiment Analyzer
vader = SentimentIntensityAnalyzer()

In [4]:
#Create a word -> coefficient dictionary from the coefficient csv file
#(the 'Unnamed: 0' column holds the words)
word = pd.read_csv('../data/word_1_gram.csv')
word = dict(zip(word['Unnamed: 0'], word['coefficient']))
#Pattern from https://cmdlinetips.com/2021/04/convert-two-column-values-from-pandas-dataframe-to-a-dictionary/
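
The `dict(zip(...))` pattern above pairs the two columns row by row. A minimal sketch of its behaviour, using invented words and values rather than the contents of `word_1_gram.csv`, shows one detail worth keeping in mind: if a word appears more than once, the last row silently wins.

```python
# Toy stand-ins for the two csv columns (hypothetical words and values,
# not taken from word_1_gram.csv).
words = ["panic", "calm", "panic"]
coefficients = [-1.2, 0.8, -0.5]

# zip pairs the columns row by row; dict keeps a single entry per key,
# so the second "panic" row overwrites the first.
word_demo = dict(zip(words, coefficients))
print(word_demo)  # {'panic': -0.5, 'calm': 0.8}
```

If the coefficient file could contain repeated words, checking for duplicates before building the dictionary would avoid losing rows unnoticed.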

In [5]:
#Update lexicon with coefficient dictionary
vader.lexicon.update(word)
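
Since `vader.lexicon` is a plain dictionary, `update` overwrites the valence of any word that already exists and adds the words that do not. The sketch below uses a hypothetical miniature lexicon with invented values in place of the real one. It also lowercases the keys first, on the assumption that VADER looks tokens up in lowercase; that step is my addition and not part of the notebook.

```python
# Hypothetical miniature lexicon standing in for vader.lexicon (values invented).
lexicon = {"good": 2.0, "scared": -2.0}

# Hypothetical coefficients; the mixed-case key is deliberate.
coefficients = {"Scared": -3.0, "overthinking": -1.1}

# Assumption: lowercase the keys so they match lowercase token lookups.
custom = {w.lower(): float(c) for w, c in coefficients.items()}

# Existing entries are replaced, new entries are added.
lexicon.update(custom)
print(lexicon)  # {'good': 2.0, 'scared': -3.0, 'overthinking': -1.1}
```

VADER's stock valences sit on a scale of roughly -4 to +4, so raw regression coefficients on a very different scale may dominate or barely register next to the built-in words.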

In [6]:
#Create a list of score dictionaries, one per text
score = [vader.polarity_scores(str(sent)) for sent in df['text']]
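
Each dictionary returned by `polarity_scores` contains a `compound` value in [-1, 1]. To my knowledge, VADER obtains it by summing the adjusted word valences and squashing the sum with x / sqrt(x² + α), where α = 15 by default. The sketch below reimplements only that normalization step; it ignores VADER's heuristics for negation, intensifiers and punctuation, and the input sums are made up.

```python
import math

def normalize(valence_sum, alpha=15):
    """Squash an unbounded valence sum into [-1, 1] (simplified sketch)."""
    norm = valence_sum / math.sqrt(valence_sum * valence_sum + alpha)
    return max(-1.0, min(1.0, norm))

# Made-up valence sums: a short mildly negative text and a long, heavily negative one.
print(round(normalize(-4), 4))   # -0.7184
print(round(normalize(-20), 4))  # -0.9818
```

Because the function saturates quickly, long posts with many negatively weighted words will tend to cluster close to -1.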

In [7]:
#Split each score type into its own list
negative_score = [sub['neg'] for sub in score]
neutral_score = [sub['neu'] for sub in score]
positive_score = [sub['pos'] for sub in score]
compound_score = [sub['compound'] for sub in score]

In [8]:
#Create new columns for the scores
df['negative_score'] = negative_score
df['neutral_score'] = neutral_score
df['positive_score'] = positive_score
df['compound_score'] = compound_score
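
The two cells above extract each score type with its own comprehension and then assign each list to a column. As a possible alternative, the same reshaping can be done in one pass with a key-to-column mapping. This is a sketch with invented scores, and the `names` mapping is a helper I introduced for illustration.

```python
# Invented scores in the shape returned by polarity_scores.
score_demo = [
    {"neg": 0.8, "neu": 0.2, "pos": 0.0, "compound": -0.9},
    {"neg": 0.1, "neu": 0.5, "pos": 0.4, "compound": 0.6},
]

# Hypothetical helper: VADER key -> dataframe column name used in this notebook.
names = {
    "neg": "negative_score",
    "neu": "neutral_score",
    "pos": "positive_score",
    "compound": "compound_score",
}

# Turn the list of dictionaries into one list per column.
columns = {col: [s[key] for s in score_demo] for key, col in names.items()}
print(columns["compound_score"])  # [-0.9, 0.6]
```

In the notebook, each entry of `columns` could then be assigned to `df` in a short loop instead of four separate statements.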

In [9]:
df.head(3)

| | author | link_flair_text | num_comments | subreddit | created_utc | text | negative_score | neutral_score | positive_score | compound_score |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | JackW357 | DAE Questions | 9 | Anxiety | 1606687976 | Anyone else scared of dying and scared of when... | 0.802 | 0.172 | 0.026 | -0.9888 |
| 1 | belladoll1021 | Health | 1 | Anxiety | 1606687615 | Tight throat Can a tight throat and gagging fe... | 0.894 | 0.106 | 0.0 | -0.9783 |
| 2 | ashwinderegg | Advice Needed | 3 | Anxiety | 1606687588 | Anxiety overriding my intuition. Does anyone e... | 0.646 | 0.285 | 0.069 | -0.999 |


In [10]:
df.tail(3)

| | author | link_flair_text | num_comments | subreddit | created_utc | text | negative_score | neutral_score | positive_score | compound_score |
|---|---|---|---|---|---|---|---|---|---|---|
| 5997 | Midget_Cowboy | | 30 | writing | 1631761085 | Proper Time to Introduce the "Big Incident" So... | 0.212 | 0.485 | 0.303 | 0.0712 |
| 5998 | throwaway5820175 | Advice | 5 | writing | 1631759726 | Possible copyright issues? I'm wanting to writ... | 0.192 | 0.323 | 0.485 | 0.9855 |
| 5999 | Longjumping-Celery54 | Other | 1 | writing | 1604277033 | A poem I wrote called "Another World" She wake... | 0.282 | 0.471 | 0.247 | -0.7789 |


## Save Dataframe with Scores
---

In [11]:
df.to_csv('../data/anx_writing_sentiment_scores.csv')

## Save Dictionary to Pickle File
---

In [12]:
with open('../chatbot_code/chatbot/pickles/word_score.pkl', 'wb') as f:
    pickle.dump(word, f)
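
Any code that consumes the pickle would need to load it back into a dictionary. The round trip below is a minimal sketch that writes to a temporary directory instead of the project path and uses invented entries.

```python
import os
import pickle
import tempfile

# Invented entries standing in for the word -> coefficient dictionary.
word_demo = {"panic": -1.2, "calm": 0.8}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "word_score.pkl")

    # Write the dictionary; the with block closes the file afterwards.
    with open(path, "wb") as fh:
        pickle.dump(word_demo, fh)

    # Read it back the way a consumer of the file would.
    with open(path, "rb") as fh:
        restored = pickle.load(fh)

print(restored == word_demo)  # True
```

Pickle files can execute arbitrary code when loaded, so only load pickles that come from a trusted source.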

## Recap
---
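This notebook scored every post with the updated VADER lexicon and saved both the scored dataframe and the word-coefficient dictionary. The updated lexicon will be used in later sentiment analysis to categorize different levels of anxiety.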
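
The notebook does not specify how the levels are defined. Purely as an illustration, a categorization based on the compound score might look like the sketch below. The function name, the level labels and both thresholds are hypothetical and would need to be tuned against labelled data.

```python
def anxiety_level(compound, mild=-0.05, severe=-0.6):
    """Map a compound score to a coarse level (hypothetical thresholds)."""
    if compound <= severe:
        return "high"
    if compound <= mild:
        return "moderate"
    return "low"

# Compound scores taken from the dataframe previews above.
print(anxiety_level(-0.9888))  # high
print(anxiety_level(0.9855))   # low
```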