<a href="https://colab.research.google.com/github/OmkarModi/Text_classification/blob/main/text_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Multi Class Text Classification

In [1]:
import pandas as pd
import numpy as np
import re
import nltk
from sklearn import preprocessing

In [2]:
#paths to various files used in projects

data_path = "/content/root2ai - Data.csv"
word_embedding_path = 'glove.6B.300d.txt'

## Data preprocessing

In [3]:
data = pd.read_csv(data_path)
print(data.head())
print(data.info())

                                                Text      Target
0  reserve bank forming expert committee based in...  Blockchain
1          director could play role financial system  Blockchain
2  preliminary discuss secure transaction study r...  Blockchain
3  security indeed prove essential transforming f...  Blockchain
4  bank settlement normally take three days based...  Blockchain
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 22704 entries, 0 to 22703
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Text    22701 non-null  object
 1   Target  22704 non-null  object
dtypes: object(2)
memory usage: 354.9+ KB
None


The `info()` output shows that the `Text` column has only 22701 non-null values out of 22704 entries, so a few cells are empty or null and need to be handled.

In [4]:
#drop the rows that have NaN or empty cells
#so that they don't raise errors while performing classification
data.dropna(inplace=True)

### Train-test splitting and label encoding

The labels are categorical, so they must be encoded as numbers before a model can be trained on them.

In [5]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

X = data['Text']
y = data['Target']
encoder = preprocessing.LabelEncoder()
y = np.array(encoder.fit_transform(y))

X_train, X_test, y_train , y_test = train_test_split(X,y, test_size = 0.2 , random_state = 0)

#dense output; newer scikit-learn versions name this argument sparse_output instead of sparse
enc = OneHotEncoder(sparse=False)
onehot_train_y = y_train.reshape(len(y_train),1)  #reshaping it to 2d array as OneHotEncoder requires 2d array as parameter
onehot_train_y = enc.fit_transform(onehot_train_y)
onehot_test_y = y_test.reshape(len(y_test),1)
onehot_test_y = enc.transform(onehot_test_y)  #reuse the encoder fitted on the training labels
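
As a small illustration of the encoding step, the sketch below applies `LabelEncoder` to a few made-up labels and builds the one-hot matrix with `np.eye`. The toy labels and the `toy_*` names are assumptions for illustration only; they are not taken from the dataset.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels, used only for illustration.
toy_labels = ["FinTech", "Blockchain", "FinTech", "Bigdata"]

toy_encoder = LabelEncoder()
toy_ids = toy_encoder.fit_transform(toy_labels)  # classes are sorted alphabetically

# One row per sample, one column per class, a single 1 in the column of the class id.
toy_onehot = np.eye(len(toy_encoder.classes_))[toy_ids]

print(list(toy_encoder.classes_))  # ['Bigdata', 'Blockchain', 'FinTech']
print(toy_ids)                     # [2 1 2 0]
print(toy_onehot)
```

The integer ids are what the scikit-learn classifiers consume, while the one-hot rows are the form expected by a `categorical_crossentropy` loss.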

The class labels can be read from the fitted encoder. The labels fall into 11 classes.

In [7]:
class_names = list(encoder.classes_)
print(class_names)

['Bigdata', 'Blockchain', 'Cyber Security', 'Data Security', 'FinTech', 'Microservices', 'Neobanks', 'Reg Tech', 'Robo Advising', 'Stock Trading', 'credit reporting']


NOTE: The preprocessing step normally cleans the text by removing non-word characters, HTML tags, stopwords, and other noise. The data provided here is already cleaned and lowercased, so this step is skipped.

## Feature Extraction

The raw text is transformed into numeric feature vectors that the models can use.

### Count vectors

A count vector representation is a matrix in which every row represents a text from the data, every column represents a term from the vocabulary of the corpus, and every cell holds the frequency count of that term in that document.

In [8]:
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
count_vect.fit(X_train)
count_vect_xtrain = count_vect.transform(X_train)
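
To make the row/column layout concrete, here is a minimal sketch on two made-up sentences. The `toy_*` names and the sentences are assumptions for illustration only.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two hypothetical documents.
toy_docs = ["bank secure transaction", "bank bank loan"]

toy_vect = CountVectorizer()
toy_counts = toy_vect.fit_transform(toy_docs).toarray()

# Column order follows the index stored in vocabulary_ (alphabetical by term).
toy_vocab = sorted(toy_vect.vocabulary_, key=toy_vect.vocabulary_.get)

print(toy_vocab)   # ['bank', 'loan', 'secure', 'transaction']
print(toy_counts)  # [[1 0 1 1]
                   #  [2 1 0 0]]
```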

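### Word-level TF-IDF

The representation is similar to that of count vectors, but each cell holds a TF-IDF score rather than a raw frequency. The score weights how often a term occurs in the document against how many documents contain it, so it reflects the relative importance of the term in that document.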

In [9]:
from sklearn.feature_extraction.text import TfidfVectorizer
word_tfid_vect = TfidfVectorizer()
word_tfid_vect_xtrain = word_tfid_vect.fit_transform(X_train)
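
The sketch below recomputes the scores by hand for a tiny made-up corpus and compares them with `TfidfVectorizer`. It assumes the vectorizer's default settings (smoothed IDF, `ln((1 + n) / (1 + df)) + 1`, followed by L2 normalization of each row); the corpus and variable names are illustrative only.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical corpus.
tfidf_docs = ["bank secure transaction", "bank bank loan", "secure ledger"]

tfidf_vect = TfidfVectorizer()
tfidf_matrix = tfidf_vect.fit_transform(tfidf_docs).toarray()
vocab = tfidf_vect.vocabulary_

# Term counts per document, laid out with the same column order as the vectorizer.
counts = np.zeros_like(tfidf_matrix)
for row, doc in enumerate(tfidf_docs):
    for tok in doc.split():
        counts[row, vocab[tok]] += 1

n_docs = len(tfidf_docs)
df = (counts > 0).sum(axis=0)                # number of documents containing each term
idf = np.log((1 + n_docs) / (1 + df)) + 1    # smoothed inverse document frequency
manual = counts * idf
manual /= np.linalg.norm(manual, axis=1, keepdims=True)  # L2-normalize each row

print(np.allclose(tfidf_matrix, manual))
```

Terms that occur in fewer documents (here `ledger`, `loan`, `transaction`) receive a larger IDF than terms shared by several documents (`bank`, `secure`).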

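### N-gram-level TF-IDF

Here the unit is a group of n adjacent words (an n-gram) instead of a single word, because a group of words can carry information that the individual words do not. Bigrams and trigrams are used (`ngram_range = (2,3)`).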

In [10]:
ngram_tfid_vect = TfidfVectorizer(ngram_range = (2,3))
ngram_tfid_vect_xtrain = ngram_tfid_vect.fit_transform(X_train)
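
A short sketch of which features `ngram_range = (2,3)` produces for a single made-up sentence. The sentence and the `toy_*` names are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

toy_ngram_vect = TfidfVectorizer(ngram_range=(2, 3))
toy_ngram_vect.fit(["bank settlement takes three days"])

# Only 2-word and 3-word groups become features; single words are not included.
toy_ngrams = sorted(toy_ngram_vect.vocabulary_)
for gram in toy_ngrams:
    print(gram)
```

A five-word sentence yields four bigrams and three trigrams, so the feature space grows much faster than with single words.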

### Character-level TF-IDF

Here the TF-IDF scores are computed over character n-grams (sequences of 2 and 3 characters) instead of words.

In [11]:
char_tfid_vect = TfidfVectorizer(analyzer = 'char',ngram_range=(2,3))
char_tfid_vect_xtrain = char_tfid_vect.fit_transform(X_train)
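
For comparison with the word-level features, this sketch lists the character n-grams extracted from a single made-up word. The `toy_*` names are illustrative only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

toy_char_vect = TfidfVectorizer(analyzer="char", ngram_range=(2, 3))
toy_char_vect.fit(["bank"])

toy_char_grams = sorted(toy_char_vect.vocabulary_)
print(toy_char_grams)  # ['an', 'ank', 'ba', 'ban', 'nk']
```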

### Word2Vec

A word embedding is a form of representing words and documents using a dense vector representation. The position of a word within the vector space is learned from text and is based on the words that surround the word when it is used.

In [12]:
import gensim.models 
from nltk.tokenize import word_tokenize,sent_tokenize
nltk.download('punkt')
sentence = data['Text'].tolist()
sent_token = [word_tokenize(sent) for sent in sentence]
model = gensim.models.Word2Vec(sentences=sent_token)

model.wv.init_sims()

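#averaging the word vectors of a text gives a simple and useful document feature
#this code targets the gensim 3.x API (wv.vocab, wv.syn0norm)

def word_averaging(wv, words):
    mean = []
    
    for word in words:
        if isinstance(word, np.ndarray):
            #already a vector: use it as-is
            mean.append(word)
        elif word in wv.vocab:
            #look up the L2-normalized vector of a known word
            mean.append(wv.syn0norm[wv.vocab[word].index])

    if not mean:
        #no known words in this text: fall back to a zero vector
        return np.zeros(wv.vector_size,)

    #average the vectors and rescale the result to unit length
    mean = gensim.matutils.unitvec(np.array(mean).mean(axis=0)).astype(np.float32)
    return mean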

def  word_averaging_list(wv, text_list):
    return np.vstack([word_averaging(wv, text) for text in text_list ])

def w2v_tokenize_text(text):
    tokens = []
    for sent in sent_tokenize(text, language='english'):
        for word in word_tokenize(sent, language='english'):
            if len(word) < 2:
                continue
            tokens.append(word)
    return tokens

train_tokenized = X_train.apply(w2v_tokenize_text)
test_tokenized = X_test.apply(w2v_tokenize_text)

X_train_word_average = word_averaging_list(model.wv,train_tokenized)
X_test_word_average = word_averaging_list(model.wv,test_tokenized)

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
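
The averaging idea can be shown without a trained model. The sketch below uses a hand-written dictionary of 3-dimensional vectors in place of Word2Vec; the vectors, the function name `average_vector`, and the tokens are all made up for illustration.

```python
import numpy as np

# Hypothetical word vectors standing in for a trained embedding.
toy_word_vectors = {
    "bank": np.array([1.0, 0.0, 0.0]),
    "secure": np.array([0.0, 1.0, 0.0]),
}

def average_vector(tokens, vectors, dim=3):
    """Mean of the known word vectors, rescaled to unit length."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return np.zeros(dim)          # no known word: fall back to zeros
    mean = np.mean(known, axis=0)
    norm = np.linalg.norm(mean)
    return mean / norm if norm > 0 else mean

doc_vec = average_vector(["bank", "secure", "unknownword"], toy_word_vectors)
empty_vec = average_vector(["unknownword"], toy_word_vectors)

print(doc_vec)    # [0.70710678 0.70710678 0.        ]
print(empty_vec)  # [0. 0. 0.]
```

Unknown words are simply skipped, so a text made up only of out-of-vocabulary words maps to the zero vector.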




In [13]:
from keras.preprocessing import text, sequence

#this maps the word to index 
token = text.Tokenizer()
token.fit_on_texts(data['Text'])
word_index = token.word_index

#padding sequences to further feed as input to models
train_seq_x = sequence.pad_sequences(token.texts_to_sequences(X_train), maxlen=50)
test_seq_x = sequence.pad_sequences(token.texts_to_sequences(X_test), maxlen=50)

#creating embedding matrix that stores vector representation of words
#(row 0 stays all-zero and is used for padding; 100 is the default Word2Vec vector size)
embedding_matrix = np.zeros((len(word_index) + 1, 100))
for word, i in word_index.items():
  if word in model.wv.vocab:   #dictionary lookup; avoids rebuilding a list on every iteration
    embedding_matrix[i] = model.wv[word]
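
The layout of the embedding matrix can be sketched on toy data. The word index below mimics the Keras tokenizer, whose indices start at 1, and the 2-dimensional vectors are made up; all `toy_*` names are assumptions for illustration.

```python
import numpy as np

# Hypothetical tokenizer output: indices start at 1, index 0 is left for padding.
toy_word_index = {"bank": 1, "secure": 2, "ledger": 3}

# Hypothetical pre-trained vectors; "ledger" has none.
toy_pretrained = {"bank": np.array([0.1, 0.2]), "secure": np.array([0.3, 0.4])}
toy_dim = 2

toy_matrix = np.zeros((len(toy_word_index) + 1, toy_dim))
for word, i in toy_word_index.items():
    vec = toy_pretrained.get(word)
    if vec is not None:
        toy_matrix[i] = vec   # row i holds the vector of the word with index i

print(toy_matrix)
# [[0.  0. ]
#  [0.1 0.2]
#  [0.3 0.4]
#  [0.  0. ]]
```

Words without a pre-trained vector keep an all-zero row, the same as the padding index.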


### Doc2Vec

This is similar to Word2Vec, but instead of individual words, each whole text is mapped to a vector.

In [None]:
from tqdm import tqdm
tqdm.pandas(desc="progress-bar")
from gensim.models import Doc2Vec
from sklearn import utils
import gensim
from gensim.models.doc2vec import TaggedDocument
import re

def label_sentences(corpus, label_type):
  labeled = []
  for i, v in enumerate(corpus):
      label = label_type + '_' + str(i)
      labeled.append(TaggedDocument(v.split(), [label]))
  return labeled

X_train_d2v = label_sentences(X_train, 'Train')
X_test_d2v = label_sentences(X_test, 'Test')
all_data = X_train_d2v + X_test_d2v

model_dbow = Doc2Vec(dm=0, vector_size=300, negative=5, min_count=1, alpha=0.065, min_alpha=0.065)
model_dbow.build_vocab([x for x in tqdm(all_data)])

for epoch in range(30):
    model_dbow.train(utils.shuffle([x for x in tqdm(all_data)]), total_examples=len(all_data), epochs=1)
    model_dbow.alpha -= 0.002
    model_dbow.min_alpha = model_dbow.alpha

def get_vectors(model, corpus_size, vectors_size, vectors_type):
    """
    Get vectors from trained doc2vec model
    :param model: Trained Doc2Vec model
    :param corpus_size: Size of the data
    :param vectors_size: Size of the embedding vectors
    :param vectors_type: Tag prefix of the vectors to fetch ('Train' or 'Test')
    :return: array of vectors with shape (corpus_size, vectors_size)
    """
    vectors = np.zeros((corpus_size, vectors_size))
    for i in range(0, corpus_size):
        prefix = vectors_type + '_' + str(i)
        vectors[i] = model.docvecs[prefix]
    return vectors
    
train_vectors_dbow = get_vectors(model_dbow, len(X_train_d2v), 300, 'Train')
test_vectors_dbow = get_vectors(model_dbow, len(X_test_d2v), 300, 'Test')

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
train_vectors_dbow = scaler.fit_transform(train_vectors_dbow)
test_vectors_dbow = scaler.transform(test_vectors_dbow)
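
The scaler above is fitted on the training vectors only and then reused for the test vectors. A minimal sketch of what that means, on made-up numbers (the `toy_*` names are illustrative):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

toy_train = np.array([[1.0, 10.0], [3.0, 30.0]])
toy_test = np.array([[2.0, 40.0]])

toy_scaler = StandardScaler().fit(toy_train)   # statistics come from the training data only
scaled_test = toy_scaler.transform(toy_test)

# Same result by hand: subtract the training mean, divide by the training standard deviation.
manual_test = (toy_test - toy_train.mean(axis=0)) / toy_train.std(axis=0)

print(scaled_test)  # [[0. 2.]]
```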

## Model Building

A dictionary that will store the evaluation metrics for each model and each feature representation.

In [15]:
model = {}

Function to fit a model on the training features and evaluate it on the test features.

In [16]:
from sklearn import metrics
def model_fit(model,X_train,y_train,X_test,y_test):
  #fit the classifier on the training features and score it on the test features
  classifier = model
  classifier.fit(X_train,y_train)
  y_pred = classifier.predict(X_test)
  #note: recall and precision are weighted averages, while f1_score is a macro average
  metric = {
      'accuracy' : metrics.accuracy_score(y_test,y_pred),
      'recall' : metrics.recall_score(y_test,y_pred,average = 'weighted',zero_division=0),
      'precision' : metrics.precision_score(y_test,y_pred, average = 'weighted',zero_division=0),
      'f1_score' : metrics.f1_score(y_test, y_pred, average='macro',zero_division=0),
  }
  return metric
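
The function reports weighted precision and recall next to a macro F1 score. The two averaging modes can differ noticeably on imbalanced classes, as this small sketch on made-up predictions shows (the labels and names are illustrative only).

```python
from sklearn import metrics

# Hypothetical imbalanced example: four samples of class 0, two of class 1.
toy_true = [0, 0, 0, 0, 1, 1]
toy_pred = [0, 0, 0, 0, 0, 1]

# Per-class F1: class 0 -> 8/9, class 1 -> 2/3.
f1_macro = metrics.f1_score(toy_true, toy_pred, average="macro")        # plain mean of the two
f1_weighted = metrics.f1_score(toy_true, toy_pred, average="weighted")  # mean weighted by class size

print(round(f1_macro, 4))     # 0.7778
print(round(f1_weighted, 4))  # 0.8148
```

Macro averaging treats every class equally, so weak results on a small class pull the score down more than they do under weighted averaging.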

### Logistic Regression

In [None]:
from sklearn.linear_model import LogisticRegression
LR={}
LR['count_vector'] = model_fit(LogisticRegression(max_iter=250),count_vect_xtrain,y_train,count_vect.transform(X_test),y_test)
LR['word_tfid'] = model_fit(LogisticRegression(max_iter=250),word_tfid_vect_xtrain,y_train,word_tfid_vect.transform(X_test),y_test)
LR['ngram_tfid'] = model_fit(LogisticRegression(max_iter=250),ngram_tfid_vect_xtrain,y_train,ngram_tfid_vect.transform(X_test),y_test)
LR['char_tfid'] = model_fit(LogisticRegression(max_iter=250),char_tfid_vect_xtrain,y_train,char_tfid_vect.transform(X_test),y_test)
LR['word2v'] = model_fit(LogisticRegression(max_iter=250),X_train_word_average,y_train,X_test_word_average,y_test)
LR['doc2v'] = model_fit(LogisticRegression(max_iter=250),train_vectors_dbow,y_train,test_vectors_dbow,y_test)

model['LogisticRegression'] = LR

### Naive Bayes

In [18]:
from sklearn.naive_bayes import MultinomialNB
NB={}
NB['count_vector'] = model_fit(MultinomialNB(),count_vect_xtrain,y_train,count_vect.transform(X_test),y_test)
NB['word_tfid'] = model_fit(MultinomialNB(),word_tfid_vect_xtrain,y_train,word_tfid_vect.transform(X_test),y_test)
NB['ngram_tfid'] = model_fit(MultinomialNB(),ngram_tfid_vect_xtrain,y_train,ngram_tfid_vect.transform(X_test),y_test)
NB['char_tfid'] = model_fit(MultinomialNB(),char_tfid_vect_xtrain,y_train,char_tfid_vect.transform(X_test),y_test)

model['NaiveBayes'] = NB

### SVM

In [19]:
from sklearn.linear_model import SGDClassifier
SVM = {}
SVM['count_vector'] = model_fit(SGDClassifier(),count_vect_xtrain,y_train,count_vect.transform(X_test),y_test)
SVM['word_tfid'] = model_fit(SGDClassifier(),word_tfid_vect_xtrain,y_train,word_tfid_vect.transform(X_test),y_test)
SVM['ngram_tfid'] = model_fit(SGDClassifier(),ngram_tfid_vect_xtrain,y_train,ngram_tfid_vect.transform(X_test),y_test)
SVM['char_tfid'] = model_fit(SGDClassifier(),char_tfid_vect_xtrain,y_train,char_tfid_vect.transform(X_test),y_test)
SVM['word2v'] = model_fit(SGDClassifier(),X_train_word_average,y_train,X_test_word_average,y_test)
SVM['doc2v'] = model_fit(SGDClassifier(),train_vectors_dbow,y_train,test_vectors_dbow,y_test)

model['SVM'] = SVM

### Random Forest Classifier

In [20]:
from sklearn.ensemble import RandomForestClassifier
RF={}
RF['count_vector'] = model_fit(RandomForestClassifier(),count_vect_xtrain,y_train,count_vect.transform(X_test),y_test)
RF['word_tfid'] = model_fit(RandomForestClassifier(),word_tfid_vect_xtrain,y_train,word_tfid_vect.transform(X_test),y_test)
RF['ngram_tfid'] = model_fit(RandomForestClassifier(),ngram_tfid_vect_xtrain,y_train,ngram_tfid_vect.transform(X_test),y_test)
RF['char_tfid'] = model_fit(RandomForestClassifier(),char_tfid_vect_xtrain,y_train,char_tfid_vect.transform(X_test),y_test)
RF['word2v'] = model_fit(RandomForestClassifier(),X_train_word_average,y_train,X_test_word_average,y_test)
RF['doc2v'] = model_fit(RandomForestClassifier(),train_vectors_dbow,y_train,test_vectors_dbow,y_test)

model['RandomForest'] = RF

### Extreme Gradient Boosting(XGB)

In [21]:
import xgboost
XGB={}
XGB['count_vector'] = model_fit(xgboost.XGBClassifier(),count_vect_xtrain,y_train,count_vect.transform(X_test),y_test)
XGB['word_tfid'] = model_fit(xgboost.XGBClassifier(),word_tfid_vect_xtrain,y_train,word_tfid_vect.transform(X_test),y_test)
XGB['ngram_tfid'] = model_fit(xgboost.XGBClassifier(),ngram_tfid_vect_xtrain,y_train,ngram_tfid_vect.transform(X_test),y_test)
XGB['char_tfid'] = model_fit(xgboost.XGBClassifier(),char_tfid_vect_xtrain,y_train,char_tfid_vect.transform(X_test),y_test)
XGB['word2v'] = model_fit(xgboost.XGBClassifier(),X_train_word_average,y_train,X_test_word_average,y_test)
XGB['doc2v'] = model_fit(xgboost.XGBClassifier(),train_vectors_dbow,y_train,test_vectors_dbow,y_test)

model['XGB'] = XGB

### Neural Network

In [38]:
import tensorflow as tf
from keras import layers, models, optimizers
def create_model_architecture(input_size):
    # create input layer 
    input_layer = layers.Input((input_size,), sparse=True)
    
    # create hidden layer
    hidden_layer = layers.Dense(100, activation="relu")(input_layer)
    
    # create output layer
    output_layer = layers.Dense(11, activation="sigmoid")(hidden_layer)

    classifier = models.Model(inputs = input_layer, outputs = output_layer)
    classifier.compile(optimizer=optimizers.Adam(), loss='categorical_crossentropy',metrics=['accuracy'])
    print(classifier.summary())
    return classifier 
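
The output layer above uses a sigmoid activation together with `categorical_crossentropy`. For single-label problems with mutually exclusive classes, a softmax output is the more common pairing, because it yields a probability distribution over the classes. The NumPy sketch below only illustrates that difference on made-up scores; it is not part of the original notebook.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function: each score is squashed independently."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Normalized exponentials: the outputs are non-negative and sum to 1."""
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

toy_scores = np.array([2.0, 1.0, 0.1])   # hypothetical pre-activation scores for 3 classes

sig_out = sigmoid(toy_scores)
soft_out = softmax(toy_scores)

print(sig_out.sum())    # not constrained to 1
print(soft_out.sum())   # sums to 1 (up to floating-point rounding)
```

Both activations rank the classes in the same order here, so `argmax` predictions agree; the difference lies in how the outputs are normalized for the loss.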


In [29]:
def NN_model(X_train,y_train,X_test,y_test):
  classifier = create_model_architecture(X_train.shape[1])
  classifier.fit(X_train,y_train,epochs=25)
  y_pred = classifier.predict(X_test)
  y_pred = y_pred.argmax(axis =-1)
  metric = {'accuracy' : metrics.accuracy_score(y_test,y_pred), 'recall' : metrics.recall_score(y_test,y_pred,average = 'weighted'), 'precision' : metrics.precision_score(y_test,y_pred, average = 'weighted') , 'f1_score' : metrics.f1_score(y_test, y_pred, average='macro') }
  return metric

In [None]:
NN = {}
NN['count_vector'] = NN_model(count_vect_xtrain, onehot_train_y,count_vect.transform(X_test),y_test)
NN['word_tfid'] = NN_model(word_tfid_vect_xtrain.toarray(),onehot_train_y,word_tfid_vect.transform(X_test).toarray(),y_test)
NN['ngram_tfid'] = NN_model(ngram_tfid_vect_xtrain.toarray(),onehot_train_y,ngram_tfid_vect.transform(X_test).toarray(),y_test)
NN['char_tfid'] = NN_model(char_tfid_vect_xtrain.toarray(),onehot_train_y,char_tfid_vect.transform(X_test).toarray(),y_test)
NN['word2v'] = NN_model(X_train_word_average,onehot_train_y,X_test_word_average,y_test)
NN['doc2v'] = NN_model(train_vectors_dbow,onehot_train_y,test_vectors_dbow,y_test)

model['NN'] = NN

Model: "model_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_8 (InputLayer)         [(None, 10506)]           0         
_________________________________________________________________
dense_18 (Dense)             (None, 100)               1050700   
_________________________________________________________________
dense_19 (Dense)             (None, 11)                1111      
Total params: 1,051,811
Trainable params: 1,051,811
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Model: "model_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_9 (InputLayer)         [(None, 10506)]           0         
_________________________________________________________________
dense_20 (Dense)             (None, 100)               1050700   
_________________________________________________________________
dense_21 (Dense)             (None, 11)                1111      
Total params: 1,051,811
Trainable params: 1,051,811
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 

In [30]:
from sklearn import metrics
def DNN_model(classify,X_train,y_train,X_test,y_test):
  classifier = classify
  classifier.fit(X_train,y_train,epochs=100,verbose=2)
  y_pred = classifier.predict(X_test)
  y_pred = np.argmax(y_pred,axis =-1)
  metric = {'accuracy' : metrics.accuracy_score(y_test,y_pred), 'recall' : metrics.recall_score(y_test,y_pred,average = 'weighted'), 'precision' : metrics.precision_score(y_test,y_pred, average = 'weighted') , 'f1_score' : metrics.f1_score(y_test, y_pred, average='macro') }
  return metric

### Convolutional Neural Network

In [31]:
from keras import layers
def create_cnn():
    # Add an Input Layer
    input_layer = layers.Input((50, ))

    # Add the word embedding Layer
    embedding_layer = layers.Embedding(len(word_index) + 1, 100, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)

    # Add the convolutional Layer
    conv_layer = layers.Convolution1D(100, 3, activation="relu")(embedding_layer)

    # Add the pooling Layer
    pooling_layer = layers.GlobalMaxPool1D()(conv_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(pooling_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(11, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    
    return model

classifier = create_cnn()
cnn = {}
cnn = DNN_model(classifier,train_seq_x,onehot_train_y,test_seq_x,y_test)
model['CNN'] = cnn

Epoch 1/100
568/568 - 7s - loss: 1.9808 - accuracy: 0.3964
Epoch 2/100
568/568 - 6s - loss: 1.8993 - accuracy: 0.4117
Epoch 3/100
568/568 - 7s - loss: 1.8744 - accuracy: 0.4183
Epoch 4/100
568/568 - 9s - loss: 1.8505 - accuracy: 0.4263
Epoch 5/100
568/568 - 9s - loss: 1.8380 - accuracy: 0.4284
Epoch 6/100
568/568 - 9s - loss: 1.8193 - accuracy: 0.4308
Epoch 7/100
568/568 - 9s - loss: 1.8125 - accuracy: 0.4314
Epoch 8/100
568/568 - 10s - loss: 1.7977 - accuracy: 0.4351
Epoch 9/100
568/568 - 10s - loss: 1.7870 - accuracy: 0.4386
Epoch 10/100
568/568 - 10s - loss: 1.7737 - accuracy: 0.4470
Epoch 11/100
568/568 - 10s - loss: 1.7702 - accuracy: 0.4473
Epoch 12/100
568/568 - 10s - loss: 1.7566 - accuracy: 0.4472
Epoch 13/100
568/568 - 10s - loss: 1.7536 - accuracy: 0.4494
Epoch 14/100
568/568 - 10s - loss: 1.7464 - accuracy: 0.4503
Epoch 15/100
568/568 - 10s - loss: 1.7455 - accuracy: 0.4487
Epoch 16/100
568/568 - 10s - loss: 1.7407 - accuracy: 0.4496
Epoch 17/100
568/568 - 10s - loss: 1.738


In [None]:
print(cnn)

### LSTM

In [33]:
def create_rnn_lstm():
    # Add an Input Layer
    input_layer = layers.Input((50, ))

    # Add the word embedding Layer
    embedding_layer = layers.Embedding(len(word_index) + 1, 100, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)

    # Add the LSTM Layer
    lstm_layer = layers.LSTM(100)(embedding_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(lstm_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(11, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(), loss='categorical_crossentropy')
    
    return model

classifier = create_rnn_lstm()
lstm = {}
lstm = DNN_model(classifier,train_seq_x,onehot_train_y,test_seq_x,y_test)

model['LSTM'] = lstm

Epoch 1/100
568/568 - 26s - loss: 1.9502
Epoch 2/100
568/568 - 20s - loss: 1.8985
Epoch 3/100
568/568 - 21s - loss: 1.8780
Epoch 4/100
568/568 - 21s - loss: 1.8657
Epoch 5/100
568/568 - 20s - loss: 1.8541
Epoch 6/100
568/568 - 21s - loss: 1.8430
Epoch 7/100
568/568 - 20s - loss: 1.8338
Epoch 8/100
568/568 - 20s - loss: 1.8214
Epoch 9/100
568/568 - 21s - loss: 1.8087
Epoch 10/100
568/568 - 21s - loss: 1.8038
Epoch 11/100
568/568 - 21s - loss: 1.7956
Epoch 12/100
568/568 - 21s - loss: 1.7855
Epoch 13/100
568/568 - 21s - loss: 1.7714
Epoch 14/100
568/568 - 21s - loss: 1.7712
Epoch 15/100
568/568 - 21s - loss: 1.7610
Epoch 16/100
568/568 - 20s - loss: 1.7617
Epoch 17/100
568/568 - 20s - loss: 1.7460
Epoch 18/100
568/568 - 21s - loss: 1.7408
Epoch 19/100
568/568 - 21s - loss: 1.7328
Epoch 20/100
568/568 - 21s - loss: 1.7263
Epoch 21/100
568/568 - 21s - loss: 1.7158
Epoch 22/100
568/568 - 20s - loss: 1.7153
Epoch 23/100
568/568 - 20s - loss: 1.7011
Epoch 24/100
568/568 - 20s - loss: 1.6994
E

In [41]:
print(lstm['accuracy'])

0.41554723629156576


### Gated Recurrent Unit

In [34]:
def create_rnn_gru():
    # Add an Input Layer
    input_layer = layers.Input((50, ))

    # Add the word embedding Layer
    embedding_layer = layers.Embedding(len(word_index) + 1, 100, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)

    # Add the GRU Layer
    gru_layer = layers.GRU(100)(embedding_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(gru_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(11, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(), loss='categorical_crossentropy',metrics= ['accuracy'])
    
    return model

classifier = create_rnn_gru()
rnn_gru = {}
rnn_gru = DNN_model(classifier,train_seq_x,onehot_train_y,test_seq_x,y_test)
model['GATED RNN'] = rnn_gru

Epoch 1/100
568/568 - 23s - loss: 1.9694 - accuracy: 0.4013
Epoch 2/100
568/568 - 19s - loss: 1.8943 - accuracy: 0.4155
Epoch 3/100
568/568 - 19s - loss: 1.8797 - accuracy: 0.4195
Epoch 4/100
568/568 - 19s - loss: 1.8668 - accuracy: 0.4202
Epoch 5/100
568/568 - 19s - loss: 1.8560 - accuracy: 0.4222
Epoch 6/100
568/568 - 19s - loss: 1.8396 - accuracy: 0.4269
Epoch 7/100
568/568 - 19s - loss: 1.8165 - accuracy: 0.4322
Epoch 8/100
568/568 - 19s - loss: 1.8072 - accuracy: 0.4347
Epoch 9/100
568/568 - 19s - loss: 1.7957 - accuracy: 0.4351
Epoch 10/100
568/568 - 19s - loss: 1.7796 - accuracy: 0.4397
Epoch 11/100
568/568 - 19s - loss: 1.7708 - accuracy: 0.4421
Epoch 12/100
568/568 - 19s - loss: 1.7578 - accuracy: 0.4432
Epoch 13/100
568/568 - 19s - loss: 1.7471 - accuracy: 0.4482
Epoch 14/100
568/568 - 19s - loss: 1.7381 - accuracy: 0.4531
Epoch 15/100
568/568 - 19s - loss: 1.7277 - accuracy: 0.4532
Epoch 16/100
568/568 - 19s - loss: 1.7269 - accuracy: 0.4549
Epoch 17/100
568/568 - 19s - loss

### Recurrent CNN

In [None]:
def create_rcnn():
    # Add an Input Layer
    input_layer = layers.Input((50, ))

    # Add the word embedding Layer
    embedding_layer = layers.Embedding(len(word_index) + 1, 100, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)
    
    # Add the recurrent layer
    rnn_layer = layers.Bidirectional(layers.GRU(50, return_sequences=True))(embedding_layer)
    
    # Add the convolutional Layer on top of the recurrent layer's output sequence
    conv_layer = layers.Convolution1D(100, 3, activation="relu")(rnn_layer)

    # Add the pooling Layer
    pooling_layer = layers.GlobalMaxPool1D()(conv_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(pooling_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(11, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(), loss='categorical_crossentropy',metrics=['accuracy'])
    
    return model

classifier = create_rcnn()
rcnn = {}
rcnn = DNN_model(classifier,train_seq_x,onehot_train_y,test_seq_x,y_test)

model['RCNN'] = rcnn

In [37]:
from keras.models import Sequential
from keras.layers import Dense, Flatten, Embedding, Conv1D, MaxPooling1D

vocab_size=len(word_index)+1
# named cnn_model so that the `model` dictionary of results is not overwritten
cnn_model = Sequential()
# note: no pre-trained weights are passed here, so the frozen embedding keeps its random initial values
cnn_model.add(Embedding(vocab_size, 100, input_length=50,trainable=False))
cnn_model.add(Conv1D(filters=32, kernel_size=8, activation='relu'))
cnn_model.add(MaxPooling1D(pool_size=2))
cnn_model.add(Flatten())
cnn_model.add(Dense(25, activation='relu'))
cnn_model.add(Dense(11, activation='sigmoid'))
print(cnn_model.summary())
# compile network
cnn_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
cnn_model.fit(train_seq_x, onehot_train_y, epochs=30, verbose=2)
# evaluate
loss, acc = cnn_model.evaluate(test_seq_x, onehot_test_y, verbose=0)
print('Test Accuracy: %f' % (acc*100))

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_7 (Embedding)      (None, 50, 100)           1141000   
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 43, 32)            25632     
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 21, 32)            0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 672)               0         
_________________________________________________________________
dense_14 (Dense)             (None, 25)                16825     
_________________________________________________________________
dense_15 (Dense)             (None, 11)                286       
Total params: 1,183,743
Trainable params: 42,743
Non-trainable params: 1,141,000
_______________________________________
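
The shapes and parameter counts in the summary above can be checked with a little arithmetic. The sketch below assumes the Keras defaults used in the cell (stride 1 and no padding for `Conv1D`, non-overlapping windows for `MaxPooling1D`); the vocabulary size of 11410 is inferred from the 1,141,000 embedding parameters and is not stated in the notebook.

```python
def conv1d_out_len(length, kernel_size):
    """Output length of a 1D convolution with stride 1 and no padding."""
    return length - kernel_size + 1

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

vocab_size, embed_dim, seq_len = 11410, 100, 50    # vocab_size is an inferred value

embedding_params = vocab_size * embed_dim           # frozen, so non-trainable
conv_len = conv1d_out_len(seq_len, kernel_size=8)   # 50 - 8 + 1
conv_params = 8 * embed_dim * 32 + 32               # kernel * input channels * filters + biases
pool_len = conv_len // 2                            # pool_size=2
flat_size = pool_len * 32
dense1_params = dense_params(flat_size, 25)
dense2_params = dense_params(25, 11)

trainable = conv_params + dense1_params + dense2_params
total = trainable + embedding_params

print(conv_len, pool_len, flat_size)                # 43 21 672
print(conv_params, dense1_params, dense2_params)    # 25632 16825 286
print(trainable, total)                             # 42743 1183743
```

The same `dense_params` formula also reproduces the 1,050,700 and 1,111 parameters of the two dense layers in the feed-forward network (10506 inputs to 100 units, then 100 to 11).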