# Sentiment analysis
Tutorial from [Analytics Vidhya](https://www.analyticsvidhya.com/blog/2019/11/comprehensive-guide-attention-mechanism-deep-learning/) by Prodip Hore and Sayan Chatterjee

## Dataset

UCI Machine Learning Repository: Sentiment Labelled Sentences Data Set
('From Group to Individual Labels using Deep Features', Kotzias et al., KDD 2015)

Sentences: 3000 in the full data set (this notebook loads only the Amazon and Yelp files, 2000 sentences)
Labels: Positive (1) - Negative (0)


Example:

* "The mic is great." Positive ->  `The mic is great.	1`

* "What a waste of money and time!." Negative -> `What a waste of money and time!.	0`


## Architecture

Input layer -> Embedding layer -> LSTM -> Dense (sigmoid) -> Label

In [1]:
import numpy as np
with open('data/amazon.txt', mode='r') as f:
    lines = f.readlines()
    
with open('data/yelp.txt', mode='r') as f:
    lines += f.readlines()

sentences = [line.split('\t')[0] for line in lines]
labels = [int(line.split('\t')[1]) for line in lines]
labels = np.asarray(labels)
print(len(labels))

2000


In [2]:
from tensorflow.keras.preprocessing.text import Tokenizer

t = Tokenizer()
t.fit_on_texts(sentences)
text_matrix= t.texts_to_sequences(sentences)

# number of tokens in each sentence
len_mat = [len(seq) for seq in text_matrix]

vocab_size = len(t.word_index) + 1

In [3]:
from tensorflow.keras.preprocessing.sequence import pad_sequences

features = 32

tex_pad = pad_sequences(text_matrix, maxlen=features, padding='post')

x_train = tex_pad[:1600,:]
y_train = labels[:1600]
x_test = tex_pad[1600:,:]
y_test = labels[1600:]

print(len(x_train))
print(len(y_train))
print(len(x_test))
print(len(y_test))

1600
1600
400
400
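
Because `lines` is built by reading `amazon.txt` first and `yelp.txt` second, the slice `[1600:]` takes the whole test set from the end of the Yelp file. A minimal sketch of a shuffled split follows; the helper name `shuffled_split`, the `seed` argument and the stand-in arrays are illustrative assumptions, not part of the tutorial.

```python
import numpy as np

def shuffled_split(x, y, n_train, seed=0):
    """Shuffle x and y with the same permutation, then split at n_train."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    x, y = x[order], y[order]
    return x[:n_train], y[:n_train], x[n_train:], y[n_train:]

# Stand-in data with the same shapes as tex_pad and labels
x_demo = np.arange(2000 * 32).reshape(2000, 32)
y_demo = np.arange(2000) % 2

x_tr, y_tr, x_te, y_te = shuffled_split(x_demo, y_demo, n_train=1600)
print(x_tr.shape, y_tr.shape, x_te.shape, y_te.shape)
# (1600, 32) (1600,) (400, 32) (400,)
```

In the notebook this would be called as `shuffled_split(tex_pad, labels, 1600)`; the validation numbers would then no longer be comparable with the recorded run.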


In [4]:
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l2

inputs = Input(shape=(features,))
embedding = Embedding(input_dim=vocab_size, output_dim=features, input_length=features, embeddings_regularizer=l2(.001))
embd_out = embedding(inputs)
lstm = LSTM(100, dropout=0.3, recurrent_dropout=0.2)
lstm_out = lstm(embd_out)

prob = Dense(1, activation='sigmoid')
outputs = prob(lstm_out)

model = Model(inputs, outputs)

print(model.summary())

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 32)]              0         
_________________________________________________________________
embedding (Embedding)        (None, 32, 32)            104288    
_________________________________________________________________
lstm (LSTM)                  (None, 100)               53200     
_________________________________________________________________
dense (Dense)                (None, 1)                 101       
Total params: 157,589
Trainable params: 157,589
Non-trainable params: 0
_________________________________________________________________
None
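
The parameter counts in the summary can be checked by hand. The sketch below assumes the usual Keras layouts: an embedding stores `vocab_size * dim` weights, and an LSTM has four gates, each with an input kernel, a recurrent kernel and a bias. The vocabulary size is not printed in the notebook; it is inferred here from the embedding count (104288 / 32 = 3259).

```python
def embedding_params(vocab_size, dim):
    # one dim-sized vector per vocabulary entry
    return vocab_size * dim

def lstm_params(input_dim, units):
    # 4 gates, each with input kernel, recurrent kernel and bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # kernel plus bias
    return input_dim * units + units

vocab_size = 104288 // 32  # inferred from the summary, see above
counts = [
    embedding_params(vocab_size, 32),
    lstm_params(32, 100),
    dense_params(100, 1),
]
print(counts, sum(counts))  # [104288, 53200, 101] 157589
```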


In [5]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])


model.fit(x=x_train,y=y_train,
          batch_size=100,
          epochs=10,
          verbose=1,
          shuffle=True,
          validation_data=(x_test,y_test)
         )

Train on 1600 samples, validate on 400 samples
Epoch 1/10


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f7d641dfa58>

In [6]:
print(t.sequences_to_texts(x_test[:10]))
print(y_test[:10])

pred = model.predict(x_test[:10])
print(pred)

['i miss it and wish they had one in philadelphia', 'we got sitting fairly fast but ended up waiting 40 minutes just to place our order another 30 minutes before the food arrived', 'they also have the best cheese crisp in town', 'good value great food great service', "couldn't ask for a more satisfying meal", 'the food is good', 'it was awesome', 'i just wanted to leave', 'we made the drive all the way from north scottsdale and i was not one bit disappointed', 'i will not be eating there again']
[1 0 1 1 1 1 1 0 1 0]
[[0.10233185]
 [0.06516251]
 [0.98587275]
 [0.9905857 ]
 [0.10396948]
 [0.988444  ]
 [0.98555195]
 [0.97717357]
 [0.03623567]
 [0.03208158]]
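
The sigmoid output is the predicted probability of the positive class. A small sketch that turns the ten probabilities printed above into labels and compares them with `y_test[:10]`; the 0.5 cut-off is a conventional assumption, not something the tutorial specifies.

```python
import numpy as np

# Values copied from the output above
probs = np.array([0.10233185, 0.06516251, 0.98587275, 0.9905857, 0.10396948,
                  0.988444, 0.98555195, 0.97717357, 0.03623567, 0.03208158])
y_true = np.array([1, 0, 1, 1, 1, 1, 1, 0, 1, 0])

y_pred = (probs >= 0.5).astype(int)          # threshold at 0.5
accuracy = float((y_pred == y_true).mean())  # fraction of matching labels

print(y_pred)    # [0 0 1 1 0 1 1 1 0 0]
print(accuracy)  # 0.6
```

On these ten sentences the model without attention gets six right; ten samples are far too few to draw conclusions from.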


## Architecture with attention

Input layer -> Embedding layer -> LSTM -> Attention -> Dense (sigmoid) -> Label

### Attention (Bahdanau et al., 2015)
Additive Attention

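1. $\large score(s_t, h_i) = v_a^T \tanh(W_a[s_t;h_i])$ -> with no decoder state $s_t$ in this model, the score reduces to $\large score_i = \tanh(W_a h_i + b_i)$

2. $\large \alpha_{ti}=\frac{\exp(score_{ti})}{\sum_{k=1}^{N}{\exp(score_{tk})}}$

3. $\large \sum_{i=1}^{N} \alpha_{ti} h_i$ (the attention-weighted sum of the LSTM hidden states)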
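
The three steps above can be traced on a toy sequence with NumPy. This is a sketch under assumptions: the function name `additive_attention`, the sizes (5 time steps, 4 hidden units) and the random inputs are made up for illustration, and the shapes of `W` and `b` follow the simplified score $\tanh(W_a h + b)$.

```python
import numpy as np

def additive_attention(h, W, b):
    """h: (time_steps, units), W: (units, 1), b: (time_steps, 1)."""
    # Step 1: one score per time step
    scores = np.tanh(h @ W + b).squeeze(-1)
    # Step 2: softmax over time steps
    weights = np.exp(scores) / np.exp(scores).sum()
    # Step 3: weighted sum of the hidden states
    context = (weights[:, None] * h).sum(axis=0)
    return context, weights

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 4))   # toy hidden states
W = rng.normal(size=(4, 1))
b = np.zeros((5, 1))

context, weights = additive_attention(h, W, b)
print(weights.shape, context.shape)  # (5,) (4,)
print(weights.sum())                 # sums to 1 up to floating-point rounding
```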

In [7]:
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Layer

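class BahdanauAttention(Layer):
    """Simplified additive attention over the time steps of a sequence."""
    def __init__(self, **kwargs):
        super(BahdanauAttention, self).__init__(**kwargs)
    
    def build(self, input_shape):
        # input_shape: (batch, time_steps, hidden_units)
        self.W=self.add_weight(name="att_weight",shape=(input_shape[-1],1),initializer="normal")
        self.b=self.add_weight(name="att_bias",shape=(input_shape[1],1),initializer="zeros")        
        super(BahdanauAttention, self).build(input_shape)
    
    def call(self, x):
        # score per time step: tanh(x.W + b), shape (batch, time_steps)
        et=K.squeeze(K.tanh(K.dot(x,self.W)+self.b),axis=-1)
        # softmax over the time steps gives the attention weights
        at=K.softmax(et)
        at=K.expand_dims(at,axis=-1)
        # weight each hidden state, then sum over the time steps
        output=x*at
        return K.sum(output,axis=1), at
    
    def compute_output_shape(self, input_shape):
        # one shape per returned tensor: context vector, attention weights
        return [(input_shape[0],input_shape[-1]), (input_shape[0],input_shape[1],1)]
    
    def get_config(self):
        return super(BahdanauAttention, self).get_config()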


In [8]:
from tensorflow.keras.layers import Attention, GlobalAveragePooling1D

inputs1 = Input(shape=(features,))
embedding1 = Embedding(input_dim=vocab_size, output_dim=features, input_length=features, embeddings_regularizer=l2(.001))
embd_out1 = embedding1(inputs1)
lstm1 = LSTM(100, dropout=0.3, recurrent_dropout=0.2, return_sequences=True)
lstm_out1 = lstm1(embd_out1)

# attention = GlobalAveragePooling1D(Attention()([lstm_out1, lstm_out1]))
weighted_values, weights = BahdanauAttention()(lstm_out1)

prob1 = Dense(1, activation='sigmoid')
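# use this model's own output layer; the recorded run reused `prob` from the
# first model, which is why the summary below lists `dense`
outputs1 = prob1(weighted_values)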

model1 = Model(inputs1, outputs1)
attention_model = Model(inputs1, weights)


print(model1.summary())

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 32)]              0         
_________________________________________________________________
embedding_1 (Embedding)      (None, 32, 32)            104288    
_________________________________________________________________
lstm_1 (LSTM)                (None, 32, 100)           53200     
_________________________________________________________________
bahdanau_attention (Bahdanau ((None, 100), (None, 32,  132       
_________________________________________________________________
dense (Dense)                (None, 1)                 101       
Total params: 157,721
Trainable params: 157,721
Non-trainable params: 0
_________________________________________________________________
None


In [9]:
model1.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
model1.fit(x=x_train,y=y_train,
          batch_size=100,
          epochs=10,
          verbose=1,
          shuffle=True,
          validation_data=(x_test,y_test)
          )

Train on 1600 samples, validate on 400 samples
Epoch 1/10


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f7cfd08f080>

In [16]:
print(t.sequences_to_texts(x_test[:10]))
print(x_test[:10])
print(y_test[:10])

['i miss it and wish they had one in philadelphia', 'we got sitting fairly fast but ended up waiting 40 minutes just to place our order another 30 minutes before the food arrived', 'they also have the best cheese crisp in town', 'good value great food great service', "couldn't ask for a more satisfying meal", 'the food is good', 'it was awesome', 'i just wanted to leave', 'we made the drive all the way from north scottsdale and i was not one bit disappointed', 'i will not be eating there again']
[[   3 2866    5    2 1101   37   25   40   14 2867    0    0    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0]
 [  32  108  819  756  331   28  661   52  425  727  124   50    6   26
    78  198  209  592  124  205    1   24  364    0    0    0    0    0
     0    0    0    0]
 [  37   63   22    1   53  912  854   14  444    0    0    0    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0

In [18]:
pred = model1.predict(x_test[:10])
attention_pred = attention_model.predict(x_test[:10])

print(attention_pred.shape)
print(np.argmax(attention_pred, axis=1))
print(pred)

(10, 32, 1)
[[31]
 [31]
 [20]
 [18]
 [31]
 [23]
 [24]
 [31]
 [31]
 [22]]
[[0.9543078 ]
 [0.07817101]
 [0.9978306 ]
 [0.9991669 ]
 [0.01837063]
 [0.99657476]
 [0.99649674]
 [0.06141832]
 [0.04383131]
 [0.005243  ]]
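
Several of the argmax values above are 31, the last position. With `padding='post'`, that position is padding (token id 0) for any sentence shorter than 32 tokens; the first test sentence, for example, has only 10 tokens. The layer's softmax runs over all 32 positions, so padded steps can receive weight. One possible remedy, which is not part of the tutorial, is to mask padded positions before the softmax. A NumPy sketch, where `masked_softmax` and `pad_id` are hypothetical names:

```python
import numpy as np

def masked_softmax(scores, token_ids, pad_id=0):
    """Softmax over the last axis that ignores positions equal to pad_id.

    Assumes every row contains at least one non-padding token.
    """
    mask = token_ids != pad_id
    scores = np.where(mask, scores, -np.inf)               # padded scores -> -inf
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    exp = np.exp(scores)                                   # exp(-inf) == 0
    return exp / exp.sum(axis=-1, keepdims=True)

# Toy example: three real tokens followed by three padding tokens
token_ids = np.array([[3, 2866, 5, 0, 0, 0]])
scores = np.array([[0.1, 0.2, 0.3, 0.9, 0.9, 0.9]])

weights = masked_softmax(scores, token_ids)
print(weights.round(3))
print(weights.argmax(axis=-1))  # [2]
```

The padded positions get exactly zero weight, so the peak falls on a real token even though the raw padding scores were higher.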


In [19]:
print(attention_pred.reshape(10,1,32))

[[[0.02454229 0.02461432 0.02473488 0.02483217 0.02492744 0.02502763
   0.02515558 0.02524929 0.02535192 0.02551333 0.02570811 0.02591711
   0.02617419 0.02654193 0.02705272 0.0276044  0.02817859 0.02858539
   0.02883467 0.02894229 0.02918516 0.02973128 0.03076462 0.03247773
   0.03503154 0.03831901 0.04183935 0.0448849  0.04706173 0.04841043
   0.04918556 0.04962051]]

 [[0.02423973 0.02435366 0.02445743 0.0245304  0.02461175 0.0247054
   0.02480593 0.02488237 0.02498384 0.02517285 0.02540021 0.02566215
   0.02598485 0.02635182 0.02687965 0.02740537 0.02796045 0.02833679
   0.02855904 0.02867477 0.02896723 0.02950063 0.03050787 0.03215694
   0.03459605 0.03778385 0.04144003 0.04501927 0.048005   0.05015386
   0.05153949 0.05237128]]

 [[0.01947381 0.01953141 0.01964355 0.01972592 0.01983152 0.01992946
   0.02006838 0.02023746 0.02054298 0.02104525 0.0218302  0.02299102
   0.02472147 0.02721487 0.03044251 0.03386971 0.03684956 0.03887039
   0.0400018  0.04045329 0.04058119 0.04055452 0