In [3]:
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from sklearn.model_selection import train_test_split
import re

In [5]:
data = pd.read_csv('archive/Sentiment.csv')
# Keeping only the necessary columns
data = data[['text', 'sentiment']]

We will create a function that uses regular expressions to remove unwanted characters from the tweets.

In [7]:
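def preProcess_data(text):
    # Lowercase, then drop everything except letters, digits, and whitespace
    text = text.lower()
    new_text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    # Remove the standalone retweet marker "rt" without touching words such as "party"
    new_text = re.sub(r'\brt\b', '', new_text)
    return new_text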

data['text'] = data['text'].apply(preProcess_data)
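
When stripping the retweet marker, the choice of pattern matters. The short sketch below compares a plain substring pattern with a word-boundary pattern on a made-up sample string; the helper names are hypothetical and only serve this illustration.

```python
import re

def strip_rt_substring(text):
    # Plain substring pattern: removes every "rt", even inside words
    return re.sub('rt', '', text)

def strip_rt_word(text):
    # Word-boundary pattern: removes only the standalone token "rt"
    return re.sub(r'\brt\b', '', text)

sample = "rt party starts"
print(repr(strip_rt_substring(sample)))  # ' pay stas'
print(repr(strip_rt_word(sample)))       # ' party starts'
```

The substring version damages ordinary words that happen to contain "rt", which changes the vocabulary the tokenizer sees later on.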

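We will use the Tokenizer from tensorflow.keras to turn each tweet into a sequence of word indices, restricted to the most frequent words (num_words=max_fatures), and pad_sequences to pad or truncate every sequence to a fixed length of 28.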

In [44]:
max_fatures = 2000

tokenizer = Tokenizer(num_words=max_fatures, split=' ')
tokenizer.fit_on_texts(data['text'].values)
X = tokenizer.texts_to_sequences(data['text'].values)
# print(X[:5])
X = pad_sequences(X, maxlen=28)
# print(X[:5])

Y = pd.get_dummies(data['sentiment']).values
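
To make the padding step concrete, here is a small pure-Python sketch of what pad_sequences does to a single sequence with its default settings (padding='pre', truncating='pre', value=0). The function pad_pre is a hypothetical stand-in for illustration, not part of Keras.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic the pad_sequences defaults for one sequence (maxlen must be >= 1)."""
    # truncating='pre': keep only the last maxlen items
    seq = list(seq)[-maxlen:]
    # padding='pre': fill the front with the padding value
    return [value] * (maxlen - len(seq)) + seq

print(pad_pre([5, 9, 2], 5))              # [0, 0, 5, 9, 2]
print(pad_pre([1, 2, 3, 4, 5, 6, 7], 5))  # [3, 4, 5, 6, 7]
```

Short tweets are therefore left-padded with zeros, and tweets longer than 28 tokens lose their earliest tokens.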


Now we will split the dataset into training and testing portions.

In [45]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2)

In [47]:
X.shape[1]

28

It is now time to design and create the deep learning model. We will use an embedding layer followed by two stacked LSTM layers with dropout, and a softmax output layer with one unit per sentiment class.

In [48]:
embed_dim = 128
lstm_out = 196

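model = Sequential()
# Map each word index to an embed_dim-dimensional vector
model.add(Embedding(max_fatures, embed_dim, input_length=X.shape[1]))
# Drop entire embedding channels to reduce overfitting
model.add(SpatialDropout1D(0.4))
# The first LSTM returns the full sequence so the second LSTM can consume it
model.add(LSTM(lstm_out, dropout=0.3, recurrent_dropout=0.2, return_sequences=True))
model.add(LSTM(128, recurrent_dropout=0.2))
# One softmax output per sentiment class
model.add(Dense(3, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])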
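
As a rough sanity check on the model's size, the trainable parameter count can be worked out by hand. The sketch below uses the standard formulas for Embedding, LSTM (with bias), and Dense layers and the sizes from the cell above; the helper functions are hypothetical, and the result should agree with what model.summary() reports, assuming default layer settings.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit
    return input_dim * units + units

total = (
    embedding_params(2000, 128)   # 256000
    + lstm_params(128, 196)       # 254800
    + lstm_params(196, 128)       # 166400
    + dense_params(128, 3)        # 387
)
print(total)  # 677587
```

SpatialDropout1D adds no parameters, so it does not appear in the sum.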

We will now fit the model.

In [49]:
batch_size = 512

model.fit(X_train, Y_train, 
          epochs = 10, 
          batch_size=batch_size, 
          validation_data=(X_test, Y_test))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fa5d4a7e460>

Now that the deep learning model is trained, we will save it so that we do not have to retrain every time we restart our server. Instead, we simply load the trained model. Note that I have not done much hyperparameter tuning or model improvement; you can do that yourself if you want to deploy a better model.

In [50]:
model.save('sentiment.h5')

### Step 3: Creating a REST API using FastAPI

We will create a REST API using FastAPI. Create a new file named app.py and start with the necessary imports.

In [76]:
import numpy as np
from fastapi import FastAPI, Form
import pandas as pd
from starlette.responses import HTMLResponse
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import tensorflow as tf
import re
import uvicorn

Here we have imported FastAPI and Form from the fastapi library, which we will use to create an input form and an endpoint for our API. We have also imported HTMLResponse from starlette.responses, which lets a route return an HTML page for the input form.

We will start by creating an input form so that users can enter data, i.e., a test string whose sentiment we want to predict.

We have created our FastAPI app in the first line and used the get method on the /predict route. It returns an HTML response, so the user sees a real HTML page with a form, and the form submits the entered data using the post method. We will run our prediction on that data.

You can run your app now by running the following command.