## <div style ="font-size:25px; border-radius:25px; border:3666; padding:10px;  background-color:#fffafa; text-align:center; color:#000000;">Long Short-Term Memory</div>


<b>Long Short-Term Memory:</b> An LSTM (Long Short-Term Memory) network is a type of RNN that uses memory cells and gating mechanisms (input, output, and forget gates) to capture long-term dependencies and mitigate the vanishing-gradient problem.

<img src="https://databasecamp.de/wp-content/uploads/lstm-architecture-1024x709.png">

<div style="font-size:25px; color:#000000; background-color:#ffffff; padding:25px; ">


<p><b>Cell State: </b>
This is the key feature of LSTMs, responsible for maintaining information across long sequences. It serves as a kind of "memory" that can persist or be modified across time steps.
</p>
<p><b>Forget Gate:
 </b>Decides what information from the previous cell state should be discarded. It takes the previous hidden state and the current input, applies a sigmoid activation function, and outputs a value between 0 and 1 for each element of the cell state: values near 0 discard that element, values near 1 keep it.
 </p>
<p><b>Input Gate:
 </b>Decides what new information should be stored in the cell state. It has two parts: a sigmoid function that controls which values to update, and a tanh function that creates a vector of new candidate values for the cell state.
</p>
<p><b>Cell State Update:
 </b>The cell state is updated by multiplying the previous cell state element-wise by the forget gate's output and adding the candidate values scaled by the input gate's output.
</p>
<p><b>Output Gate:
 </b>Decides what part of the cell state should be output as the new hidden state. The hidden state is computed by applying a tanh function to the updated cell state and multiplying the result element-wise by the output gate's sigmoid activation.
</p>
</div>
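
The gate descriptions above can be sketched as a single LSTM time step in NumPy. This is a minimal illustration, not the Keras implementation: the function name `lstm_step`, the stacked weight layout (forget, input, candidate, output) and the toy dimensions are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4*hidden, input_dim) input weights
    U: (4*hidden, hidden)    recurrent weights
    b: (4*hidden,)           biases
    Assumed block order: forget, input, candidate, output.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[:hidden])                 # forget gate
    i = sigmoid(z[hidden:2 * hidden])       # input gate
    g = np.tanh(z[2 * hidden:3 * hidden])   # candidate values
    o = sigmoid(z[3 * hidden:])             # output gate
    c_t = f * c_prev + i * g                # cell state update
    h_t = o * np.tanh(c_t)                  # new hidden state
    return h_t, c_t

# Run five random time steps through a toy cell (3 inputs, 4 hidden units)
rng = np.random.default_rng(0)
input_dim, hidden = 3, 4
W = rng.normal(size=(4 * hidden, input_dim))
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(5, input_dim)):
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```

The cell state `c` is only ever scaled and added to, which is what lets information persist across many time steps.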

In [8]:
import pandas as pd
import numpy as np


from sklearn.feature_extraction.text import CountVectorizer
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, SpatialDropout1D,LSTM
from sklearn.model_selection import train_test_split
import re

In [9]:
# Raw string so the backslashes in the Windows path are not treated as escape sequences
df = pd.read_csv(r"D:\Projects\Deep Learning\Deep-Learning\Sentiment.csv")
df.columns


Index(['id', 'candidate', 'candidate_confidence', 'relevant_yn',
       'relevant_yn_confidence', 'sentiment', 'sentiment_confidence',
       'subject_matter', 'subject_matter_confidence', 'candidate_gold', 'name',
       'relevant_yn_gold', 'retweet_count', 'sentiment_gold',
       'subject_matter_gold', 'text', 'tweet_coord', 'tweet_created',
       'tweet_id', 'tweet_location', 'user_timezone'],
      dtype='object')

In [10]:
# Keep only the necessary columns (copy so later edits do not touch a view of df)
data = df[['text','sentiment']].copy()

In [11]:
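# Drop neutral tweets so the task becomes binary (Positive vs. Negative)
data = data[data.sentiment != "Neutral"]
# Lowercase, then strip everything except letters, digits and whitespace
data['text'] = data['text'].apply(lambda x: x.lower())
data['text'] = data['text'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))

# .size counts cells (rows x 2 columns), so each value is twice the number of tweets
print(data[data['sentiment'] == 'Positive'].size)
print(data[data['sentiment'] == 'Negative'].size)

# Note: iterrows() generally yields copies of the rows, so this loop
# does not reliably modify `data`
for idx, row in data.iterrows():
    row.iloc[0] = row.iloc[0].replace('rt', ' ')

# Limit the vocabulary to roughly the 2000 most frequent words,
# map each tweet to a list of word indices, and pad to equal length
max_fatures = 2000
tokenizer = Tokenizer(num_words=max_fatures, split=' ')
tokenizer.fit_on_texts(data['text'].values)
X = tokenizer.texts_to_sequences(data['text'].values)
X = pad_sequences(X)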


4472
16986
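
The cleaning steps in this cell can also be written as one function that returns the cleaned string, which avoids relying on in-place edits inside `iterrows()`. The sketch below is an alternative, not what produced the outputs in this notebook: `clean_tweet` is a hypothetical helper, and removing `rt` only as a standalone word and collapsing repeated whitespace are assumptions about the intent.

```python
import re

def clean_tweet(text):
    """Lowercase, strip punctuation, and drop the standalone retweet marker 'rt'."""
    text = text.lower()
    text = re.sub(r'[^a-z0-9\s]', '', text)   # keep letters, digits, whitespace
    text = re.sub(r'\brt\b', ' ', text)       # remove 'rt' only as a whole word
    return ' '.join(text.split())             # collapse repeated whitespace

print(clean_tweet("RT @user: Great start to the debate!"))
# A plain str.replace would also cut 'rt' out of ordinary words:
print("start".replace('rt', ' '))
```

With pandas this could be applied as `data['text'] = data['text'].apply(clean_tweet)`, which assigns the result back to the column.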


In [12]:
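embed_dim = 128
lstm_out = 196

model = Sequential()
# Map each word index to a dense 128-dimensional vector
model.add(Embedding(max_fatures, embed_dim, input_length=X.shape[1]))
# Drop entire embedding channels rather than individual elements
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
# Two output units, one per sentiment class
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# summary() prints the table itself and returns None, so it is not wrapped in print()
model.summary()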
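
Since the summary table is not shown here, the trainable parameter counts can be worked out by hand from the layer sizes. The sketch below uses the standard formulas and assumes the default Keras LSTM layout with bias terms; the helper function names are made up for this example.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary index
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # Four blocks (forget, input, candidate, output), each with
    # input weights, recurrent weights, and a bias vector
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    # Weight matrix plus bias
    return input_dim * units + units

counts = {
    "embedding": embedding_params(2000, 128),
    "lstm": lstm_params(128, 196),
    "dense": dense_params(196, 2),
}
total = sum(counts.values())
print(counts)
print("total:", total)
```

`SpatialDropout1D` contributes no parameters, so it is left out of the count.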


In [13]:
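# One-hot encode the labels; get_dummies orders the columns alphabetically: [Negative, Positive]
Y = pd.get_dummies(data['sentiment']).values
# Hold out 33% of the samples for testing; the fixed seed makes the split reproducible
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.33, random_state = 42)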
print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)

(7188, 28) (7188, 2)
(3541, 28) (3541, 2)
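
The second dimension of `X` (28 here) is the padded sequence length: `pad_sequences` without `maxlen` pads every tweet to the length of the longest one. The pure-Python sketch below illustrates the idea behind `Tokenizer` and `pad_sequences` under their default settings (indices assigned by descending frequency starting at 1, zeros padded on the left). It is an illustration with hypothetical helper names, not the Keras implementation.

```python
from collections import Counter

def fit_vocab(texts, num_words):
    """Assign index 1 to the most frequent word, 2 to the next, and so on.
    Only indices below num_words are kept; 0 is reserved for padding."""
    counts = Counter(word for t in texts for word in t.split())
    ranked = [w for w, _ in counts.most_common()]
    return {w: i for i, w in enumerate(ranked, start=1) if i < num_words}

def texts_to_padded(texts, vocab):
    """Convert texts to index lists and left-pad with zeros to a common length."""
    seqs = [[vocab[w] for w in t.split() if w in vocab] for t in texts]
    max_len = max(len(s) for s in seqs)
    return [[0] * (max_len - len(s)) + s for s in seqs]

texts = ["bad debate", "bad bad debate tonight", "bad"]
vocab = fit_vocab(texts, num_words=10)
padded = texts_to_padded(texts, vocab)
print(vocab)
for row in padded:
    print(row)
```

Words outside the vocabulary are simply dropped from the sequence, which is why a small `num_words` shortens some tweets.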


In [14]:
batch_size = 32
model.fit(X_train, Y_train, epochs = 7, batch_size=batch_size, verbose = 2)

Epoch 1/7
225/225 - 10s - 43ms/step - accuracy: 0.8086 - loss: 0.4506
Epoch 2/7
225/225 - 8s - 35ms/step - accuracy: 0.8620 - loss: 0.3278
Epoch 3/7
225/225 - 8s - 33ms/step - accuracy: 0.8787 - loss: 0.2873
Epoch 4/7
225/225 - 6s - 26ms/step - accuracy: 0.8944 - loss: 0.2601
Epoch 5/7
225/225 - 7s - 31ms/step - accuracy: 0.9075 - loss: 0.2342
Epoch 6/7
225/225 - 5s - 23ms/step - accuracy: 0.9169 - loss: 0.2110
Epoch 7/7
225/225 - 5s - 24ms/step - accuracy: 0.9203 - loss: 0.1949


<keras.src.callbacks.history.History at 0x211031c0590>

In [15]:
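# Set aside the last 1500 test samples as a validation set (not used in this cell)
validation_size = 1500

X_validate = X_test[-validation_size:]
Y_validate = Y_test[-validation_size:]
X_test = X_test[:-validation_size]
Y_test = Y_test[:-validation_size]
# evaluate() returns the loss followed by the compiled metrics (here: accuracy)
score,acc = model.evaluate(X_test, Y_test, verbose = 2, batch_size = batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))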

64/64 - 1s - 14ms/step - accuracy: 0.8422 - loss: 0.3886
score: 0.39
acc: 0.84
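
Because Negative tweets outnumber Positive ones by roughly four to one in this dataset, a single accuracy figure can hide weak performance on the smaller class. One way to check is to compute accuracy per class on the held-out `X_validate` / `Y_validate` split. The sketch below assumes the one-hot column order [Negative, Positive]; the `probs` array is made-up data standing in for `model.predict(X_validate)`, and `per_class_accuracy` is a hypothetical helper.

```python
import numpy as np

def per_class_accuracy(probs, y_onehot, class_names=("Negative", "Positive")):
    """Fraction of correctly predicted samples within each true class."""
    pred = np.argmax(probs, axis=1)
    true = np.argmax(y_onehot, axis=1)
    result = {}
    for k, name in enumerate(class_names):
        mask = true == k
        result[name] = float(np.mean(pred[mask] == k)) if mask.any() else float("nan")
    return result

# Made-up probabilities for illustration only
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.7, 0.3], [0.2, 0.8]])
y = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]])
acc_by_class = per_class_accuracy(probs, y)
print(acc_by_class)
```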
