**Machine Learning to predict public sentiment from text data**

In [120]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [121]:
data=pd.read_csv('/content/judge-1377884607_tweet_product_company.csv',encoding='ISO-8859-1')

In [122]:
data.head()

Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product
0,.@wesley83 I have a 3G iPhone. After 3 hrs twe...,iPhone,Negative emotion
1,@jessedee Know about @fludapp ? Awesome iPad/i...,iPad or iPhone App,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,iPad,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Google,Positive emotion


In [123]:
data.drop('emotion_in_tweet_is_directed_at',axis=1,inplace=True) #dropping the column

In [124]:
data.head()

Unnamed: 0,tweet_text,is_there_an_emotion_directed_at_a_brand_or_product
0,.@wesley83 I have a 3G iPhone. After 3 hrs twe...,Negative emotion
1,@jessedee Know about @fludapp ? Awesome iPad/i...,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Positive emotion


In [125]:
messages=data[['tweet_text','is_there_an_emotion_directed_at_a_brand_or_product']].copy() # copy, so later column assignments do not act on a view of data
messages.columns=['text','response']

In [126]:
messages.head()

Unnamed: 0,text,response
0,.@wesley83 I have a 3G iPhone. After 3 hrs twe...,Negative emotion
1,@jessedee Know about @fludapp ? Awesome iPad/i...,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Positive emotion


In [127]:
messages.isna().sum() # counting the missing values in each column

text        1
response    0
dtype: int64

In [128]:
messages[1:10]

Unnamed: 0,text,response
1,@jessedee Know about @fludapp ? Awesome iPad/i...,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Positive emotion
5,@teachntech00 New iPad Apps For #SpeechTherapy...,No emotion toward brand or product
6,,No emotion toward brand or product
7,"#SXSW is just starting, #CTIA is around the co...",Positive emotion
8,Beautifully smart and simple idea RT @madebyma...,Positive emotion
9,Counting down the days to #sxsw plus strong Ca...,Positive emotion


In [129]:
messages=messages.dropna() # dropping NaN and resetting the index
messages=messages.reset_index(drop=True)

In [130]:
messages[1:10]

Unnamed: 0,text,response
1,@jessedee Know about @fludapp ? Awesome iPad/i...,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Positive emotion
5,@teachntech00 New iPad Apps For #SpeechTherapy...,No emotion toward brand or product
6,"#SXSW is just starting, #CTIA is around the co...",Positive emotion
7,Beautifully smart and simple idea RT @madebyma...,Positive emotion
8,Counting down the days to #sxsw plus strong Ca...,Positive emotion
9,Excited to meet the @samsungmobileus at #sxsw ...,Positive emotion


In [131]:
messages.response.unique() # finding the classes in the target column

array(['Negative emotion', 'Positive emotion',
       'No emotion toward brand or product', "I can't tell"], dtype=object)

In [132]:
from sklearn.preprocessing import LabelEncoder # Encoding the target column
le=LabelEncoder()
messages['response']=le.fit_transform(messages['response'])

In [133]:
messages[1:10]

Unnamed: 0,text,response
1,@jessedee Know about @fludapp ? Awesome iPad/i...,3
2,@swonderlin Can not wait for #iPad 2 also. The...,3
3,@sxsw I hope this year's festival isn't as cra...,1
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,3
5,@teachntech00 New iPad Apps For #SpeechTherapy...,2
6,"#SXSW is just starting, #CTIA is around the co...",3
7,Beautifully smart and simple idea RT @madebyma...,3
8,Counting down the days to #sxsw plus strong Ca...,3
9,Excited to meet the @samsungmobileus at #sxsw ...,3


The target column contains four classes. LabelEncoder assigns integer codes in alphabetical order of the labels: I can't tell (0), Negative emotion (1), No emotion toward brand or product (2), and Positive emotion (3).
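
The integer codes come from sorting the distinct labels, which is how `LabelEncoder` orders its classes. The snippet below is a minimal pure-Python stand-in for that step (it does not call scikit-learn), using the four label strings printed by `messages.response.unique()` above.

```python
# Pure-Python stand-in for LabelEncoder: sort the distinct labels, then number them.
labels = [
    "Negative emotion",
    "Positive emotion",
    "No emotion toward brand or product",
    "I can't tell",
]

classes = sorted(set(labels))
label_to_id = {label: idx for idx, label in enumerate(classes)}

for label, idx in label_to_id.items():
    print(idx, label)
```

In the notebook itself, inspecting `le.classes_` after fitting should show the same order, with the position of each label being its code.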



In [134]:
#PRE_PROCESSING
from keras.preprocessing import text
tokenizer=text.Tokenizer()

In [135]:
tokenizer.fit_on_texts(list(messages['text'])) #tokenizing

In [136]:
tokenized_text=tokenizer.texts_to_sequences(messages['text'])
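
To make the tokenizing step less opaque, here is a simplified pure-Python approximation of what the Keras `Tokenizer` does with its default settings: lowercase the text, replace a fixed set of punctuation characters with spaces, rank words by frequency, and map each word to its rank. The helper names (`split_words`, `fit_word_index`, `texts_to_sequences`) and the toy tweets are illustrative assumptions, not part of the notebook or the dataset.

```python
from collections import Counter

# Characters the Keras Tokenizer strips by default (the apostrophe is kept).
FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def split_words(text):
    """Lowercase, replace filtered characters with spaces, split on whitespace."""
    table = str.maketrans({ch: " " for ch in FILTERS})
    return text.lower().translate(table).split()


def fit_word_index(texts):
    """Give each word an integer id, most frequent first, starting at 1."""
    counts = Counter(word for t in texts for word in split_words(t))
    # sorted() is stable, so words with equal counts keep first-seen order.
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


def texts_to_sequences(texts, word_index):
    """Map words to ids; words not seen during fitting are dropped."""
    return [[word_index[w] for w in split_words(t) if w in word_index] for t in texts]


# Toy tweets for illustration only.
corpus = ["Can not wait for #iPad 2", "I love my iPad", "@sxsw iPad line is long"]
word_index = fit_word_index(corpus)
print(word_index["ipad"])
print(texts_to_sequences(["love the iPad"], word_index))
```

Two consequences are worth noting for tweets: `#` and `@` are filtered out, so `#iPad` and `iPad` share one id, and any word in a new tweet that was not in the fitted vocabulary is silently skipped unless an `oov_token` is configured.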

In [137]:
from keras.utils import pad_sequences #padding

In [138]:
X=pad_sequences(tokenized_text,maxlen=100)
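
With its default arguments, `pad_sequences` pads and truncates at the front of each sequence. The sketch below reproduces that behaviour in plain Python on toy id lists; `pad_pre` is a hypothetical helper written for illustration and returns lists rather than a NumPy array.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences with `value`; keep only the last `maxlen` ids of long ones."""
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]
        padded.append([value] * (maxlen - len(kept)) + kept)
    return padded


print(pad_pre([[5, 2], [7, 1, 4, 9, 3, 8]], maxlen=4))
```

Because id 0 is used only for padding, the embedding layer needs one more row than there are words, which is why the model is built with `len(tokenizer.word_index)+1`.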

In [139]:
Y=messages['response']

In [140]:
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test=train_test_split(X,Y,test_size=0.2)

In [141]:
# modelling with an LSTM; softmax output with sparse_categorical_crossentropy, since this is multiclass classification with integer-encoded labels
from keras.models import Sequential
from keras.layers import Dense, LSTM,Embedding,Dropout

In [142]:
model=Sequential()
model.add(Embedding(input_dim=len(tokenizer.word_index)+1,output_dim=128,input_length=100))

In [143]:
model.add(LSTM(100))
model.add(Dropout(0.5))
model.add(Dense(50,activation='relu'))
model.add(Dense(4,activation='softmax'))

In [144]:
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy']) # metrics passed as a list

In [145]:
model.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_4 (Embedding)     (None, 100, 128)          1298944   
                                                                 
 lstm_4 (LSTM)               (None, 100)               91600     
                                                                 
 dropout_4 (Dropout)         (None, 100)               0         
                                                                 
 dense_8 (Dense)             (None, 50)                5050      
                                                                 
 dense_9 (Dense)             (None, 4)                 204       
                                                                 
Total params: 1395798 (5.32 MB)
Trainable params: 1395798 (5.32 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
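
The parameter counts in the summary can be checked by hand. The arithmetic below uses only the layer sizes from the model definition; the vocabulary size is inferred from the embedding row of the summary, since the notebook never prints `len(tokenizer.word_index)`.

```python
# Rows in the embedding table, i.e. len(tokenizer.word_index) + 1 (inferred from the summary).
vocab_size = 1298944 // 128
embed_dim, lstm_units = 128, 100

embedding = vocab_size * embed_dim
# An LSTM has 4 gates, each with input weights, recurrent weights, and a bias vector.
lstm = 4 * ((embed_dim + lstm_units) * lstm_units + lstm_units)
dense_hidden = lstm_units * 50 + 50  # weights + biases of Dense(50)
dense_out = 50 * 4 + 4               # weights + biases of Dense(4)

total = embedding + lstm + dense_hidden + dense_out
print(vocab_size, embedding, lstm, dense_hidden, dense_out, total)
```

The embedding table accounts for the large majority of the parameters, so the vocabulary size is the main lever on model size here.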


In [146]:
history=model.fit(X_train,Y_train,epochs=10,validation_split=0.1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [147]:
y_pred=model.predict(X_test) # model prediction



In [148]:
y_preds = np.argmax(y_pred,axis=1)

In [177]:
y_preds

array([2, 3, 3, ..., 2, 2, 3])

In [178]:
out_response = le.inverse_transform(y_preds) #decoding the prediction response

In [156]:
from sklearn.metrics import accuracy_score
accuracy_score(Y_test,y_preds) # scikit-learn's argument order is (y_true, y_pred)

0.685541506322155
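
A single accuracy figure does not show how the model performs on each of the four classes. One way to look closer is per-class recall, sketched below in plain Python. The labels are toy values chosen for illustration, not results from this notebook, and `per_class_recall` is a hypothetical helper.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in sorted(total)}


# Toy labels for illustration only.
y_true = [2, 2, 2, 2, 3, 3, 1, 0]
y_pred = [2, 2, 2, 2, 2, 3, 2, 2]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)
print(per_class_recall(y_true, y_pred))
```

In this toy case the overall accuracy looks moderate while two classes are never predicted correctly. In the notebook, `sklearn.metrics.classification_report(Y_test, y_preds)` would give the same kind of breakdown on the real test set.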

Testing the model with new input

In [208]:
test_text='After using my iPod as a phone for 3 hours I have once again decided how much I hate apple... #somanycomplaints #wantmyhtcback'
text1=tokenizer.texts_to_sequences([test_text])
text1=pad_sequences(text1,maxlen=100)
output=model.predict(text1)
output=np.argmax(output,axis=1)



In [209]:
out1= le.inverse_transform(output)
print('Response :',out1)

Response : ['Negative emotion']


In [216]:
test_text1='I love the iPhone do not disturb setting. #Apple #Useful #Relax #Quiet #Disconnect'
text1=tokenizer.texts_to_sequences([test_text1])
text1=pad_sequences(text1,maxlen=100)
output=model.predict(text1)
output=np.argmax(output,axis=1)



In [217]:
out1= le.inverse_transform(output)
print('Response:',out1)

Response: ['Positive emotion']


In [214]:
test_text1='The Apple store at town opens at noon and closes early!'
text1=tokenizer.texts_to_sequences([test_text1])
text1=pad_sequences(text1,maxlen=100)
output=model.predict(text1)
output=np.argmax(output,axis=1)



In [215]:
out1= le.inverse_transform(output)
print('Response:',out1)

Response: ['No emotion toward brand or product']
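
The three test cells above repeat the same steps: tokenize, pad, predict, take the argmax, decode. They could be folded into one helper, sketched below. `predict_sentiment` and its `pad_fn` parameter are assumptions made for illustration; in this notebook the arguments would be `tokenizer`, `model`, `le`, and `pad_sequences`.

```python
def predict_sentiment(texts, tokenizer, model, label_encoder, pad_fn, maxlen=100):
    """Return one decoded sentiment label per input text."""
    sequences = tokenizer.texts_to_sequences(texts)
    padded = pad_fn(sequences, maxlen=maxlen)
    probabilities = model.predict(padded)
    # Index of the highest-probability class in each row.
    class_ids = [max(range(len(row)), key=lambda i: row[i]) for row in probabilities]
    return list(label_encoder.inverse_transform(class_ids))


# Intended use in the notebook (not executed here):
# predict_sentiment([test_text, test_text1], tokenizer, model, le, pad_sequences)
```

Passing several texts at once also avoids calling `model.predict` separately for each one.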
