Use Jupyter notebook
Gensim version = 4.3.0

In [1]:
import pandas as pd
import numpy as np
import nltk
nltk.download('wordnet')
import re
import contractions
from bs4 import BeautifulSoup
#!pip install gensim
import gensim.downloader as api
from gensim.models import Word2Vec
import multiprocessing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Perceptron
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score, precision_score, recall_score  
#!pip install tensorflow
import tensorflow as tf
import tensorflow.keras as keras

#remove warnings in output
import warnings
warnings.filterwarnings('ignore')


[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\dipal\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


## 1. Dataset Generation

Getting dataset from input file

In [2]:
df = pd.read_table("amazon_reviews_us_Beauty_v1_00.tsv", on_bad_lines='skip')
df = df[["star_rating", "review_body"]]

Created a new column `class` that maps the star rating to Class 1 (ratings 1 and 2), Class 2 (rating 3) or Class 3 (ratings 4 and 5). Removed the garbage data, i.e. rows with any other rating value.
Sampled 20000 instances of each class (with replacement) to get a dataset of 60000 instances.

In [3]:
df["class"] = df["star_rating"].apply(lambda x : 3 if str(x) == '4' or str(x) == '5' 
                                            else 2 if str(x) == '3' 
                                            else 1 if str(x) == '2' or str(x) == '1'
                                            else 0)

df.drop(df[(df['class'] == 0)].index, inplace=True)
df = df.groupby('class').sample(n=20000, replace=True)
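As a quick check of the mapping rule, the same logic can be written as a small standalone function. This is only a sketch: the name `rating_to_class` is hypothetical and does not appear in the notebook.

```python
def rating_to_class(star_rating):
    """Map a star rating to a sentiment class; 0 marks an invalid rating."""
    rating = str(star_rating)
    if rating in ("4", "5"):
        return 3
    if rating == "3":
        return 2
    if rating in ("1", "2"):
        return 1
    return 0  # anything else is treated as garbage and dropped afterwards


print([rating_to_class(r) for r in [5, "4", 3, 2, "1", "N/A"]])
```

Values that do not convert to one of the strings '1' to '5' fall into class 0, which is the class the notebook drops before sampling.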


<b>Preprocessing of data</b> -
Removed HTML tags, URLs, non-alphabetic characters and extra spaces from the reviews. Expanded contractions in every review.

In [4]:
#remove html tags
df['review_body'] = df['review_body'].apply(lambda x: BeautifulSoup(str(x), 'html.parser').get_text())

#remove url
df['review_body'] = df['review_body'].apply(lambda x: re.split(r'https://.*', str(x))[0])

#remove non-alphabetical characters
df['review_body'] = df['review_body'].replace('[^a-zA-Z ]', '', regex=True)

#remove extra spaces
df['review_body'] = df['review_body'].str.strip()

#perform contractions
df['review_body'] = df['review_body'].apply(lambda x: contractions.fix(str(x)))
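To see what the cleaning steps do to a single review, here is a minimal sketch that applies them in the same order to one string. It uses only the standard library: the regex tag stripper is a simplified stand-in for BeautifulSoup, the contraction step is left out, and the function name `clean_review` is hypothetical.

```python
import re


def clean_review(text):
    """Apply the cleaning steps to one review, in the notebook's order."""
    text = re.sub(r"<[^>]+>", "", text)      # crude tag stripper (assumption: stand-in for BeautifulSoup)
    text = re.split(r"https://.*", text)[0]  # keep only the text before the first https:// URL
    text = re.sub(r"[^a-zA-Z ]", "", text)   # drop everything except letters and spaces
    return text.strip()                      # trim leading and trailing spaces


print(clean_review("<p>Great product!!</p> See https://example.com/item"))
```

Two details are worth noting: everything after the first `https://` on the line is discarded, and `strip()` only trims the ends of the string. Internal runs of spaces are collapsed later, when the review is split into words with `split()`.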

Created a new column that stores each review as a list of words.

In [9]:
df['review_body1'] = df['review_body'].apply(lambda x: x.split())

Removed the stop words

In [5]:
from nltk.corpus import stopwords 
nltk.download('stopwords')

stopwords = stopwords.words('english')

df['review_body'] = df['review_body'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stopwords)]))

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\dipal\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
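The stop-word filter can be sketched on a single sentence as follows. The tiny stop list here is made up for illustration; the notebook uses NLTK's English list instead, and `remove_stop_words` is a hypothetical helper name.

```python
# Hypothetical mini stop list; the notebook uses NLTK's English list instead.
stop_words = {"the", "is", "a", "and"}


def remove_stop_words(text, stop_words):
    """Drop every whitespace-separated token that appears in stop_words."""
    return " ".join(word for word in text.split() if word not in stop_words)


print(remove_stop_words("The cream is a great buy", stop_words))
```

The comparison is case-sensitive, so the capitalized "The" survives in this example. NLTK's English list is lowercase, so the same applies to reviews that have not been lowercased first.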


## 2. Word Embedding

### Q.2 a) Pretrained Word2Vec model

Used the existing api.load() method to load the pretrained model.

In [6]:
wv = api.load('word2vec-google-news-300')

In [55]:
result = wv.most_similar(positive=['King', 'Woman'], negative=['Man'], topn = 5)
print("Similar words:")
result

Similar words:


[('Queen', 0.4929387867450714),
 ('Tupou_V.', 0.45174285769462585),
 ('Oprah_BFF_Gayle', 0.4422132968902588),
 ('Jackson', 0.440250426530838),
 ('NECN_Alison', 0.4331282675266266)]

In [54]:
result = wv.similarity('excellent', 'outstanding')
print("Similarity between \'excellent\' and \'outstanding\' :" + str(result))
result = wv.most_similar(positive=['excellent', 'outstanding'], topn = 5)
print("Similar words:")
result

Similarity between 'excellent' and 'outstanding' :0.55674857
Similar words:


[('oustanding', 0.750198483467102),
 ('exceptional', 0.7280517220497131),
 ('terrific', 0.7081279158592224),
 ('superb', 0.6691538095474243),
 ('exemplary', 0.6476037502288818)]

In [56]:
result = wv.similarity('Amazing', 'Good')
print("Similarity between \'Amazing\' and \'Good\' :" + str(result))
result = wv.most_similar(positive=['Amazing', 'Good'], topn = 5)
print("Similar words:")
result

Similarity between 'Amazing' and 'Good' :0.38268512
Similar words:


[('Awesome', 0.6103745102882385),
 ('Terrific', 0.5929716229438782),
 ('Awful', 0.5901114344596863),
 ('Wonderful', 0.5890116095542908),
 ('Bad', 0.582302987575531)]

In [57]:
result = wv.similarity('hair', 'straightener')
print("Similarity between \'hair\' and \'straightener\' :" + str(result))
result = wv.most_similar(positive=['hair', 'straightener'], topn = 5)
print("Similar words:")
result

Similarity between 'hair' and 'straightener' :0.45512608
Similar words:


[('tresses', 0.7045788168907166),
 ('straightening_iron', 0.6670017838478088),
 ('blowdry', 0.6644049286842346),
 ('straighteners', 0.6463422179222107),
 ('curly_hair', 0.6419693827629089)]

In [84]:
result = wv.most_similar(positive=['sleep', 'morning'], negative=['night'], topn = 5)
print("Similar words:")
result

Similar words:


[('nap', 0.5254397392272949),
 ('sleeping', 0.5071698427200317),
 ('naps', 0.5010553002357483),
 ('restful_sleep', 0.5006630420684814),
 ('doze', 0.4848100543022156)]
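The scores returned by `similarity` and `most_similar` are cosine similarities. The following sketch reproduces the idea behind the King - Man + Woman query on made-up 2-D vectors. The vectors and the `cosine` helper are assumptions for illustration only, and gensim additionally normalizes the vectors, a detail this toy version leaves out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy 2-D vectors (made up): axis 0 ~ royalty, axis 1 ~ gender.
vectors = {
    "king": [0.9, 0.9],
    "queen": [0.9, -0.9],
    "man": [0.1, 0.9],
    "woman": [0.1, -0.9],
    "apple": [-0.5, 0.2],
}

# king - man + woman, compared against every word not used in the query.
query = [k - m + w for k, m, w in zip(vectors["king"], vectors["man"], vectors["woman"])]
candidates = {
    word: cosine(query, vec)
    for word, vec in vectors.items()
    if word not in ("king", "man", "woman")
}
best = max(candidates, key=candidates.get)
print(best, round(candidates[best], 3))
```

In this toy space the query vector points in the same direction as "queen", so its cosine similarity is 1. With real 300-dimensional embeddings the match is never this clean, which is why the scores above stay well below 1.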

### Q.2 b) Trained Word2Vec model from local dataset

Built the vocabulary from the reviews that are split into words, using the parameters given in the question. Then trained the model to learn a 300-dimensional vector for each word in that vocabulary.

In [17]:
num_workers = multiprocessing.cpu_count()

model = Word2Vec(vector_size=300, window=13, min_count=9, workers=num_workers)
model.build_vocab(df.review_body1, progress_per=10000)
model.train(df.review_body1, total_examples=model.corpus_count, epochs=model.epochs)

(7014953, 8649930)

In [12]:
# result = model.wv.most_similar(positive=['King', 'Woman'], negative=['Man'])
# result
print('Word \'Woman\' does not exist in our vocabulary. Hence cannot find the similarities')

Word 'Woman' does not exist in our vocabulary. Hence cannot find the similarities
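One likely reason a word such as 'Woman' is missing: with `min_count=9` every word seen fewer than 9 times is dropped from the vocabulary, and since the reviews are not lowercased, the capitalized form is counted separately from 'woman'. The sketch below shows the thresholding on a made-up corpus; `build_vocab` is a hypothetical helper, not the gensim method of the same name.

```python
from collections import Counter


def build_vocab(tokenized_reviews, min_count):
    """Return the words that occur at least min_count times in the corpus."""
    counts = Counter(word for review in tokenized_reviews for word in review)
    return {word for word, count in counts.items() if count >= min_count}


# Toy corpus (made up); tokens are case-sensitive, as in the notebook.
reviews = [["great", "hair", "product"], ["great", "for", "hair"], ["Woman", "great"]]
vocab = build_vocab(reviews, min_count=2)

print(sorted(vocab))
print("Woman" in vocab)
```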


In [72]:
result = model.wv.similarity('excellent', 'outstanding')
print("Similarity between \'excellent\' and \'outstanding\' :" + str(result))
result = model.wv.most_similar(positive=['excellent', 'outstanding'], topn = 5)
print("Similar words:")
result

Similarity between 'excellent' and 'outstanding' :0.6133924
Similar words:


[('Quick', 0.8102356195449829),
 ('Wonderful', 0.7927247881889343),
 ('par', 0.779434323310852),
 ('speedy', 0.7787030339241028),
 ('competitive', 0.7593490481376648)]

In [73]:
result = model.wv.similarity('Amazing', 'Good')
print("Similarity between \'Amazing\' and \'Good\' :" + str(result))
result = model.wv.most_similar(positive=['Amazing', 'Good'], topn = 5)
print("Similar words:")
result

Similarity between 'Amazing' and 'Good' :0.51051456
Similar words:


[('Excellent', 0.8870605826377869),
 ('Great', 0.8442416191101074),
 ('Wonderful', 0.8369162678718567),
 ('Awesome', 0.8105168342590332),
 ('Decent', 0.8068863153457642)]

In [75]:
result = model.wv.similarity('hair', 'straightener')
print("Similarity between \'hair\' and \'straightener\' :" + str(result))
result = model.wv.most_similar(positive=['hair', 'straightener'], topn = 5)
print("Similar words:")
result

Similarity between 'hair' and 'straightener' :0.5208654
Similar words:


[('straighten', 0.8750419616699219),
 ('wavy', 0.8607210516929626),
 ('straightening', 0.8529598116874695),
 ('curly', 0.8519555330276489),
 ('straight', 0.851128339767456)]

In [83]:
result = model.wv.most_similar(positive=['sleep', 'morning'], negative=['night'], topn = 5)
print("Similar words:")
result

Similar words:


[('sleeping', 0.80987948179245),
 ('wake', 0.803053617477417),
 ('woke', 0.7903323173522949),
 ('bed', 0.7244474291801453),
 ('arms', 0.7103226780891418)]

<b>Q. What do you conclude from comparing vectors generated by yourself and the pretrained model? </b><br>
A. The pretrained model has a much larger vocabulary than our own model. Hence it captures better word similarities in most of the cases. <br>
As the vocabulary size increases, the similarity results improve.<br><br>
<b>Q. Which of the Word2Vec models seems to encode semantic similarities between words better? </b><br>
A. The pretrained model seems to encode semantic similarities between words better.




## 3. Simple models

Created average vectors for each review. For this, first collected the vectors of all words in the review in a list. Then calculated the mean of those vectors using np.mean(). If a review has no known word, we append a vector of 300 zeroes as a replacement, since every word vector has 300 values.

In [97]:
avg_vector = []

for review in df['review_body1']:
    vectors = []
    for word in review:
        if word in wv.key_to_index:
            vectors.append(wv.get_vector(word))
    
    if len(vectors) > 0:
        avg_vector.append(np.mean(vectors, axis = 0))
    else:
        avg_vector.append(np.zeros(300))
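The averaging step, including the zero-vector fallback, can be checked on a toy example. This is a sketch with made-up 3-D embeddings held in a plain dict; the notebook uses the 300-D vectors from `wv`, and `average_vector` is a hypothetical helper name.

```python
import numpy as np


def average_vector(tokens, embeddings, dim):
    """Mean of the vectors of known tokens; a zero vector if none is known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if vectors:
        return np.mean(vectors, axis=0)
    return np.zeros(dim)


# Toy 3-D embeddings (made up); the notebook uses 300-D vectors from `wv`.
toy = {"good": np.array([1.0, 0.0, 2.0]), "hair": np.array([3.0, 2.0, 0.0])}

print(average_vector(["good", "hair", "zzz"], toy, dim=3))  # unknown token "zzz" is skipped
print(average_vector(["zzz"], toy, dim=3))                  # no known token: zero vector
```

Every review ends up with a vector of the same length regardless of how many words it has, which is what the classifiers below require.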

Split the data into training and test sets

In [None]:
X_train, X_test, Y_train, Y_test = train_test_split(avg_vector, df['class'], stratify=df['class'], 
                                                    test_size=0.2, random_state=42)

<b> Perceptron model </b><br>

In [16]:
model_p = Perceptron(random_state=5)
model_p.fit(X_train, Y_train)

#Testing the model
Y_pred = model_p.predict(X_test)

precision_score_p = precision_score(Y_test, Y_pred, average=None)
recall_score_p = recall_score(Y_test, Y_pred, average=None)
f1_score_p = f1_score(Y_test, Y_pred, average=None)

print("Perceptron model output:")
print("class1: ", precision_score_p[0], ", ", recall_score_p[0], ", ", f1_score_p[0])
print("class2: ", precision_score_p[1], ", ", recall_score_p[1], ", ", f1_score_p[1])
print("class3: ", precision_score_p[2], ", ", recall_score_p[2], ", ", f1_score_p[2])
print("average:", precision_score(Y_test, Y_pred, average='weighted'),", ", recall_score(Y_test, Y_pred, average='weighted'),
     ", ", f1_score(Y_test, Y_pred, average='weighted'))

Perceptron model output:
class1:  0.6569439840901558 ,  0.4955 ,  0.5649137808180134
class2:  0.4934014474244359 ,  0.5795 ,  0.532996091055415
class3:  0.6413068844807468 ,  0.687 ,  0.6633675316837658
average: 0.5972174386651129 ,  0.5873333333333334 ,  0.5870924678523981
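To make the reported numbers easier to read, the sketch below computes per-class precision, recall and F1 from raw counts and then the support-weighted average, which is what `average=None` and `average='weighted'` return in scikit-learn. The labels are made up and both helper functions are hypothetical.

```python
def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1, support)} from raw counts."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[label] = (precision, recall, f1, tp + fn)
    return scores


def weighted_average(scores, index):
    """Average one metric over classes, weighting each class by its support."""
    total = sum(s[3] for s in scores.values())
    return sum(s[index] * s[3] for s in scores.values()) / total


# Toy labels (made up), two samples per class.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
scores = per_class_scores(y_true, y_pred, labels=[1, 2, 3])

print(scores[1][:3])                # precision, recall, F1 for class 1
print(weighted_average(scores, 1))  # weighted recall
```

With support weighting, the weighted recall equals the overall accuracy (here 4 correct predictions out of 6), so the recall value in the `average` row can be read as the accuracy of the model.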


In [93]:
print("Perceptron model output with TF-IDF (Taken from HW1) : \n",
"class1:  0.6009032564772997 ,  0.632 ,  0.6160594614353601 \n",
"class2:  0.5062782521346058 ,  0.504 ,  0.5051365572538211 \n",
"class3:  0.6688533193387562 ,  0.63725 ,  0.652669312508001 \n",
"average: 0.5920116093168872 ,  0.5910833333333333 ,  0.5912884437323942")

Perceptron model output with TF-IDF (Taken from HW1) : 
 class1:  0.6009032564772997 ,  0.632 ,  0.6160594614353601 
 class2:  0.5062782521346058 ,  0.504 ,  0.5051365572538211 
 class3:  0.6688533193387562 ,  0.63725 ,  0.652669312508001 
 average: 0.5920116093168872 ,  0.5910833333333333 ,  0.5912884437323942


<b> SVM Model </b>

In [17]:
model_s = LinearSVC(random_state=5) 
model_s.fit(X_train, Y_train)

#Testing the model
Y_pred = model_s.predict(X_test)

precision_score_s = precision_score(Y_test, Y_pred, average=None)
recall_score_s = recall_score(Y_test, Y_pred, average=None)
f1_score_s = f1_score(Y_test, Y_pred, average=None)

print("SVM model output:")
print("class1: ", precision_score_s[0], ", ", recall_score_s[0], ", ", f1_score_s[0])
print("class2: ", precision_score_s[1], ", ", recall_score_s[1], ", ", f1_score_s[1])
print("class3: ", precision_score_s[2], ", ", recall_score_s[2], ", ", f1_score_s[2])
print("average:", precision_score(Y_test, Y_pred, average='weighted'),", ", recall_score(Y_test, Y_pred, average='weighted'),
     ", ", f1_score(Y_test, Y_pred, average='weighted'))

SVM model output:
class1:  0.6340898564150069 ,  0.6845 ,  0.6583313296465496
class2:  0.5646145313366612 ,  0.509 ,  0.5353668156718381
class3:  0.6874386653581943 ,  0.7005 ,  0.6939078751857355
average: 0.6287143510366208 ,  0.6313333333333333 ,  0.6292020068347077


In [95]:
print("SVM model output with TF-IDF (Taken from HW1): \n",
"class1:  0.6601401982112642 ,  0.68275 ,  0.67125476219737 \n",
"class2:  0.5764705882352941 ,  0.539 ,  0.5571059431524549 \n",
"class3:  0.7215619694397284 ,  0.74375 ,  0.7324879970454267 \n", 
"average: 0.6527242519620956 ,  0.6551666666666667 ,  0.6536162341317505")

SVM model output with TF-IDF (Taken from HW1): 
 class1:  0.6601401982112642 ,  0.68275 ,  0.67125476219737 
 class2:  0.5764705882352941 ,  0.539 ,  0.5571059431524549 
 class3:  0.7215619694397284 ,  0.74375 ,  0.7324879970454267 
 average: 0.6527242519620956 ,  0.6551666666666667 ,  0.6536162341317505


<b>Q. What do you conclude from comparing performances for the models trained using the two different feature types (TF-IDF and your trained Word2Vec features)?</b><br>
A. For the Perceptron, the TF-IDF and Word2Vec features give almost the same scores (weighted F1 of about 0.591 vs. 0.587). <br>
However, TF-IDF gives better scores than the Word2Vec features for both models (for the SVM, weighted F1 of about 0.654 vs. 0.629). A likely reason is that TF-IDF weights are computed from this review dataset itself, whereas Word2Vec vectors encode word relationships learned across an entire corpus, and averaging them over a review discards word-level detail. This would explain why TF-IDF works better with a smaller dataset.


## 4. Feedforward Neural Networks

### Q. 4a) FNN with average vectors

1. Created Sequential model using Keras. Then added 2 hidden layers with parameters described in question. <br>
2. Created the final output layer with 3 units since it is a classification with 3 classes. <br>
3. Compiled the model using 'categorical_crossentropy' loss. <br>
4. Converted all training and testing data of average vectors from previous steps into numpy array to use it in further steps. <br>
5. Converted output data into categorical data with mention of 3 classes. <br>
6. Fit the model with 15 epochs and batch_size as 32. <br>
7. Finally, evaluated the test accuracy of data.
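Step 5 in the list above relies on one-hot encoding. The sketch below is a NumPy stand-in for `keras.utils.to_categorical`, assuming integer class labels; the helper name `one_hot` is hypothetical. The labels are shifted by 1 because the classes are numbered 1 to 3, while the encoding expects indices starting at 0.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer labels 0..num_classes-1 into one-hot rows."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


classes = np.array([1, 3, 2])  # class labels as stored in df['class']
print(one_hot(classes - 1, num_classes=3))
```

Each row has a single 1 in the column of its class, which matches the 3-unit softmax output layer and the 'categorical_crossentropy' loss.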

In [99]:
model_fnn = keras.models.Sequential()

model_fnn.add(keras.layers.Dense(units=100, activation='relu', input_dim = 300))
model_fnn.add(keras.layers.Dense(units=10, activation='relu'))
model_fnn.add(keras.layers.Dense(units=3, activation='softmax'))

model_fnn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

X_train_array_fnn = np.array(X_train)
X_test_array_fnn = np.array(X_test)
Y_train_array_fnn = np.array(Y_train)
Y_test_array_fnn = np.array(Y_test)

Y_train_onehot_fnn = keras.utils.to_categorical(Y_train_array_fnn - 1, num_classes = 3)
Y_test_onehot_fnn = keras.utils.to_categorical(Y_test_array_fnn - 1, num_classes = 3)

model_fnn.fit(X_train_array_fnn, Y_train_onehot_fnn, epochs = 15, verbose = 1, validation_split = 0, 
         validation_data = (X_test_array_fnn, Y_test_onehot_fnn))

test_loss, test_acc = model_fnn.evaluate(X_test_array_fnn, Y_test_onehot_fnn)

print("Test accuracy for FNN with average vectors: ", test_acc)
#predictions = model.predict(new_data)


Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15
Test accuracy for FNN with average vectors:  0.6524999737739563


### Q. 4b) FNN with concatenated vectors

Created concatenated vectors for each review. For this, first collected the vectors of all words in the review in a list. Then, if the list has fewer than 10 vectors, padded it with 300-dimensional zero vectors to reach 10 vectors. Finally, kept only the first 10 vectors of each review.

In [100]:
concat_vector = []

for review in df['review_body1']:
    vectors = []
    for word in review:
        if word in wv.key_to_index:
            vectors.append(wv.get_vector(word))

    # pad with zero vectors until there are at least 10 vectors
    while len(vectors) < 10:
        vectors.append(np.zeros(300))
           
    concat_vector.append(vectors[:10])
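The padding, truncation and later flattening can be sketched on a smaller scale. The sizes here (4 words of 2 dimensions) are toy values chosen for readability, while the notebook keeps 10 words of 300 dimensions; `concat_features` is a hypothetical helper name.

```python
import numpy as np


def concat_features(vectors, max_words, dim):
    """Pad with zero vectors or truncate to max_words, then flatten to 1-D."""
    padded = list(vectors[:max_words])
    while len(padded) < max_words:
        padded.append(np.zeros(dim))
    return np.concatenate(padded)


# Toy setting: keep 4 words of 2 dimensions each (the notebook keeps 10 x 300).
short_review = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
long_review = [np.array([float(i), float(i)]) for i in range(6)]

print(concat_features(short_review, max_words=4, dim=2))       # padded with zeros
print(concat_features(long_review, max_words=4, dim=2).shape)  # truncated to 4 words
```

The flattened length is always `max_words * dim`, which is where the input dimension of 3000 (10 x 300) in the model below comes from.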

1. Split the concatenated vectors into training and testing data. <br>
2. Created Sequential model using Keras. Then added 2 hidden layers with parameters described in question. <br>
3. Created the final output layer with 3 units since it is a classification with 3 classes. <br>
4. Compiled the model using 'categorical_crossentropy' loss. <br>
5. Converted all training and testing data into numpy array to use it in further steps. <br>
6. Reshaped the input arrays as per the need of model input. <br>
7. Converted output data into categorical data with mention of 3 classes. <br>
8. Fit the model with 10 epochs and batch_size as 32. <br>
9. Finally, evaluated the test accuracy of data.

In [101]:
X_train_fnn, X_test_fnn, Y_train_fnn, Y_test_fnn = train_test_split(concat_vector, df['class'], stratify=df['class'], 
                                                    test_size=0.2, random_state=42)


model_fnn = keras.models.Sequential()

model_fnn.add(keras.layers.Dense(units=100, activation='relu', input_dim = 3000))
model_fnn.add(keras.layers.Dense(units=10, activation='relu'))
model_fnn.add(keras.layers.Dense(units=3, activation='softmax'))

model_fnn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

X_train_array_fnn = np.array(X_train_fnn)
X_test_array_fnn = np.array(X_test_fnn)
Y_train_array_fnn = np.array(Y_train_fnn)
Y_test_array_fnn = np.array(Y_test_fnn)

# flatten each review from (10, 300) to 3000 values; -1 infers the number of rows
X_train_array_fnn = X_train_array_fnn.reshape(-1, 3000)
X_test_array_fnn = X_test_array_fnn.reshape(-1, 3000)

Y_train_onehot_fnn = keras.utils.to_categorical(Y_train_array_fnn - 1, num_classes = 3)
Y_test_onehot_fnn = keras.utils.to_categorical(Y_test_array_fnn - 1, num_classes = 3)


model_fnn.fit(X_train_array_fnn, Y_train_onehot_fnn, epochs = 10, verbose = 1, validation_split = 0, 
         validation_data = (X_test_array_fnn, Y_test_onehot_fnn))

test_loss, test_acc = model_fnn.evaluate(X_test_array_fnn, Y_test_onehot_fnn)

print("Test accuracy for FNN with concatenated vectors: ", test_acc)
#predictions = model.predict(new_data)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy for FNN with concatenated vectors:  0.5435000061988831


<b> Q. What do you conclude by comparing accuracy values you obtain with those obtained in the “Simple Models” section? </b><br>
A. The FNN with average vectors gives better accuracy (0.652) than the simple models, but the FNN with concatenated vectors (0.544) does not reach the accuracy of the simple models.
This is because FNNs can model more complex relationships than simple models. But with concatenated vectors, we are considering only the first 10 words of each review and dropping the rest. Hence, it affects the accuracy of the model.


## 5. Recurrent Neural Networks 

Created index sequences for each review. For this, first collected the vocabulary indices of all words in the review in a list. Then, if the list has fewer than 20 indices, padded it with zeros to reach 20 entries. Finally, kept only the first 20 indices of each review.

In [22]:
rnn_vector = []

for review in df['review_body1']:
    vectors = []
    for word in review:
        if word in wv.key_to_index:
            vectors.append(wv.key_to_index[word])

    # pad with index 0 until there are at least 20 indices
    while len(vectors) < 20:
        vectors.append(0)
           
    rnn_vector.append(vectors[:20])
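The index sequences can be illustrated with a made-up four-word vocabulary in place of `wv.key_to_index`. This is only a sketch; `to_index_sequence` is a hypothetical helper and the maximum length is shortened to 5 for readability.

```python
def to_index_sequence(tokens, key_to_index, max_len, pad_index=0):
    """Map known tokens to vocabulary indices, then pad or truncate to max_len."""
    indices = [key_to_index[t] for t in tokens if t in key_to_index]
    indices = indices[:max_len]
    return indices + [pad_index] * (max_len - len(indices))


# Toy vocabulary (made up); in the notebook this is `wv.key_to_index`.
toy_vocab = {"the": 0, "hair": 1, "dryer": 2, "works": 3}

print(to_index_sequence(["hair", "dryer", "rocks"], toy_vocab, max_len=5))
print(to_index_sequence(["the", "hair", "dryer", "works", "the", "hair"], toy_vocab, max_len=5))
```

Because index 0 also belongs to a real vocabulary entry, padded positions cannot be told apart from that word. Whether this matters for the models below is not examined in the notebook.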

Split the index sequences into training and testing data.

In [37]:
X_train_rnn, X_test_rnn, Y_train_rnn, Y_test_rnn = train_test_split(rnn_vector, df['class'], stratify=df['class'], 
                                                    test_size=0.2, random_state=42)


### Q. 5a) Simple RNN

1. Created Sequential model using Keras. 
2. Added an embedding layer to the RNN to give vector embedding input. Then added the RNN layer. <br>
3. Created the final output layer with 3 units since it is a classification with 3 classes. <br>
4. Compiled the model using 'categorical_crossentropy' loss. <br>
5. Converted all training and testing data into numpy array to use it in further steps. <br>
6. Converted output data into categorical data with mention of 3 classes. <br>
7. Fit the model with 10 epochs and batch_size as 32. <br>
8. Finally, evaluated the test accuracy of data.
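Conceptually, the embedding layer looks up one vector per index, and the simple RNN layer updates a hidden state once per word. The NumPy sketch below illustrates this with random weights and toy sizes. It is not the Keras implementation, and all names and sizes in it are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; the notebook uses 300-D embeddings, 20 units and sequences of length 20.
vocab_size, embed_dim, hidden_units = 6, 4, 3
embedding = rng.normal(size=(vocab_size, embed_dim))  # stands in for wv.vectors
W_x = rng.normal(size=(embed_dim, hidden_units))      # input-to-hidden weights
W_h = rng.normal(size=(hidden_units, hidden_units))   # hidden-to-hidden weights
b = np.zeros(hidden_units)                            # bias


def simple_rnn(indices):
    """Run a ReLU recurrence over the embedded sequence; return the last state."""
    inputs = embedding[indices]  # lookup: one row per index, shape (seq_len, embed_dim)
    h = np.zeros(hidden_units)
    for x_t in inputs:
        h = np.maximum(0.0, x_t @ W_x + h @ W_h + b)
    return h


sequence = np.array([1, 2, 3, 0, 0])  # a padded index sequence
print(embedding[sequence].shape)
print(simple_rnn(sequence).shape)
```

Only the final hidden state is passed on, which is the input the 3-unit softmax layer receives in the model below.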

In [7]:
model_rnn = keras.models.Sequential()
model_rnn.add(keras.layers.Embedding(input_dim = len(wv.key_to_index), output_dim = 300, input_length = 20, weights = [wv.vectors]))
model_rnn.add(keras.layers.SimpleRNN(units=20, activation='relu'))
model_rnn.add(keras.layers.Dense(units=3, activation='softmax'))

model_rnn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
X_train_array_rnn = np.array(X_train_rnn)
X_test_array_rnn = np.array(X_test_rnn)
Y_train_array_rnn = np.array(Y_train_rnn)
Y_test_array_rnn = np.array(Y_test_rnn)

Y_train_onehot_rnn = keras.utils.to_categorical(Y_train_array_rnn - 1, num_classes = 3)
Y_test_onehot_rnn = keras.utils.to_categorical(Y_test_array_rnn - 1, num_classes = 3)

In [None]:
model_rnn.fit(X_train_array_rnn, Y_train_onehot_rnn, epochs = 10, validation_data = (X_test_array_rnn, Y_test_onehot_rnn))

test_loss, test_acc = model_rnn.evaluate(X_test_array_rnn, Y_test_onehot_rnn)

print("Test accuracy for Simple RNN: ", test_acc)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy for Simple RNN:  0.4659166634082794


<b> Q. What do you conclude by comparing accuracy values you obtain with those obtained with feedforward neural network models? </b><br>
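A. Both FNN models (accuracy of 0.652 with average vectors and 0.544 with concatenated vectors) give better accuracy than the simple RNN (0.466). The likely cause is the greater complexity of the RNN model, which makes it harder to train on this data.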

### Q. 5b) GRU

1. Created Sequential model using Keras. 
2. Added embedding layer into model to give vector embedding input. Then added GRU layer. <br>
3. Created the final output layer with 3 units since it is a classification with 3 classes. <br>
4. Compiled the model using 'categorical_crossentropy' loss. <br>
5. Fit the model with 10 epochs and batch_size as 32. <br>
6. Finally, evaluated the test accuracy of data.

In [None]:
model_gru = keras.models.Sequential()
model_gru.add(keras.layers.Embedding(input_dim = len(wv.key_to_index), output_dim = 300, input_length = 20, weights = [wv.vectors]))
model_gru.add(keras.layers.GRU(units=20, activation='relu'))
model_gru.add(keras.layers.Dense(units=3, activation='softmax'))

model_gru.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])



In [None]:
model_gru.fit(X_train_array_rnn, Y_train_onehot_rnn, epochs = 10, validation_data = (X_test_array_rnn, Y_test_onehot_rnn))

test_loss, test_acc = model_gru.evaluate(X_test_array_rnn, Y_test_onehot_rnn)

print("Test accuracy for GRU: ", test_acc)

# #predictions = model.predict(new_data)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy for GRU:  0.503333330154419


### Q. 5c) LSTM

1. Created Sequential model using Keras. 
2. Added embedding layer into model to give vector embedding input. Then added LSTM layer. <br>
3. Created the final output layer with 3 units since it is a classification with 3 classes. <br>
4. Compiled the model using 'categorical_crossentropy' loss. <br>
5. Fit the model with 10 epochs and batch_size as 32. <br>
6. Finally, evaluated the test accuracy of data.

In [None]:
model_lstm = keras.models.Sequential()
model_lstm.add(keras.layers.Embedding(input_dim = len(wv.key_to_index), output_dim = 300, input_length = 20, weights = [wv.vectors]))
model_lstm.add(keras.layers.LSTM(units=20, activation='relu'))
model_lstm.add(keras.layers.Dense(units=3, activation='softmax'))

model_lstm.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])



In [None]:
model_lstm.fit(X_train_array_rnn, Y_train_onehot_rnn, epochs = 10, validation_data = (X_test_array_rnn, Y_test_onehot_rnn))

test_loss, test_acc = model_lstm.evaluate(X_test_array_rnn, Y_test_onehot_rnn)

print(test_loss, " ", test_acc)

# #predictions = model.predict(new_data)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
0.9778949618339539   0.500249981880188


<b> Q. What do you conclude by comparing accuracy values you obtain by GRU, LSTM, and simple RNN? </b><br>
A. GRU and LSTM give better accuracy than the simple RNN (0.503 and 0.500 vs. 0.466). This is because GRU and LSTM handle long-term dependencies better than simple RNNs. Also, GRUs are less prone to overfitting while LST