## **Women's E-Commerce Clothing Reviews**

In this notebook we solve a text classification task using deep learning (DL). The data set is available on Kaggle Datasets: https://www.kaggle.com/nicapotato/womens-ecommerce-clothing-reviews

**Context:**

This is a Women’s Clothing E-Commerce dataset revolving around the reviews written by customers. Its nine supportive features offer a great environment to parse out the text through its multiple dimensions. Because this is real commercial data, it has been anonymized, and references to the company in the review text and body have been replaced with “retailer”.

**Content:**

This dataset includes 23486 rows and 10 feature variables. Each row corresponds to a customer review, and includes the variables:

- Clothing ID: Integer categorical variable that refers to the specific piece being reviewed.
- Age: Positive integer variable of the reviewer's age.
- Title: String variable for the title of the review.
- Review Text: String variable for the review body.
- Rating: Positive ordinal integer variable for the product score granted by the customer, from 1 (worst) to 5 (best).
- Recommended IND: Binary variable stating whether the customer recommends the product, where 1 is recommended and 0 is not recommended.
- Positive Feedback Count: Positive integer documenting the number of other customers who found this review positive.
- Division Name: Categorical name of the product's high-level division.
- Department Name: Categorical name of the product department.
- Class Name: Categorical name of the product class.

**Goal:**

To build a DL classification model for NLP that predicts the rating of a review from its text.

1. Importing the data
Data:

1.1. Preparing environment and importing libraries

In [0]:
# With some code from Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2 and Tensorflow tutorials.

In [2]:
try:
    %tensorflow_version 2.x
except Exception:
    pass

TensorFlow 2.x selected.


In [0]:
SETUP = True

In [4]:
if SETUP:
    !pip install -q -U toai
    !pip install -q -U nb_black
    !pip install -q -U tensorflow-datasets
    !pip install -q -U --no-deps tensorflow-addons~=0.6
    print(__import__("toai").__version__)
    print(__import__("tensorflow").__version__)

0.2.5
2.0.0


In [0]:
# %load_ext nb_black

In [0]:
import os

# os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
# os.environ["CUDA_VISIBLE_DEVICES"] = "0"

In [0]:
from toai.utils import save_file, load_file

In [0]:
from toai.data.utils import balance_df_labels

In [0]:
from sklearn.metrics import accuracy_score, classification_report, precision_score, f1_score

In [10]:
from toai.imports import *
from toai.data import DataBundle, DataParams, DataContainer
from toai.metrics import sparse_top_2_categorical_accuracy
import tensorflow as tf
from tensorflow import keras
import tensorflow_addons as tfa
import tensorflow_datasets as tfds



In [0]:
import matplotlib
import matplotlib.pyplot as plt # to run these lines few times

%matplotlib inline

In [12]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [0]:
DATA_DIR = Path("drive/My Drive/Kiti/AI/sprint12/project/data/womens-ecommerce-clothing-reviews")
DATA_DIR.mkdir(parents=True, exist_ok=True)

TEMP_DIR = Path("drive/My Drive/Kiti/AI/sprint12/project/temp/womens-ecommerce-clothing-reviews")
TEMP_DIR.mkdir(parents=True, exist_ok=True)

In [0]:
def setup_kaggle():
    x = !ls kaggle.json
    assert x == ['kaggle.json'], 'Upload kaggle.json'
    !mkdir /root/.kaggle
    !mv kaggle.json /root/.kaggle
    !chmod 600 /root/.kaggle/kaggle.json

setup_kaggle()

mkdir: cannot create directory ‘/root/.kaggle’: File exists


In [0]:
import kaggle

In [0]:
if SETUP:
    shutil.rmtree(str(DATA_DIR))
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    TEMP_DIR.mkdir(parents=True, exist_ok=True)
    kaggle.api.authenticate()
    kaggle.api.dataset_download_files(
        dataset="nicapotato/womens-ecommerce-clothing-reviews", path=DATA_DIR, unzip=True
    )

In [0]:
BATCH_SIZE = 32

In [0]:
raw_data = pd.read_csv(
    DATA_DIR / "Womens Clothing E-Commerce Reviews.csv", low_memory=False
)

In [81]:
raw_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23486 entries, 0 to 23485
Data columns (total 11 columns):
Unnamed: 0                 23486 non-null int64
Clothing ID                23486 non-null int64
Age                        23486 non-null int64
Title                      19676 non-null object
Review Text                22641 non-null object
Rating                     23486 non-null int64
Recommended IND            23486 non-null int64
Positive Feedback Count    23486 non-null int64
Division Name              23472 non-null object
Department Name            23472 non-null object
Class Name                 23472 non-null object
dtypes: int64(6), object(5)
memory usage: 2.0+ MB


In [82]:
raw_data.describe(include="all")

,Unnamed: 0,Clothing ID,Age,Title,Review Text,Rating,Recommended IND,Positive Feedback Count,Division Name,Department Name,Class Name
count,23486.0,23486.0,23486.0,19676,22641,23486.0,23486.0,23486.0,23472,23472,23472
unique,,,,13993,22634,,,,3,6,20
top,,,,Love it!,Perfect fit and i've gotten so many compliment...,,,,General,Tops,Dresses
freq,,,,136,3,,,,13850,10468,6319
mean,11742.5,918.118709,43.198544,,,4.196032,0.822362,2.535936,,,
std,6779.968547,203.29898,12.279544,,,1.110031,0.382216,5.702202,,,
min,0.0,0.0,18.0,,,1.0,0.0,0.0,,,
25%,5871.25,861.0,34.0,,,4.0,1.0,0.0,,,
50%,11742.5,936.0,41.0,,,5.0,1.0,1.0,,,
75%,17613.75,1078.0,52.0,,,5.0,1.0,3.0,,,


In [83]:
raw_data.head()

,Unnamed: 0,Clothing ID,Age,Title,Review Text,Rating,Recommended IND,Positive Feedback Count,Division Name,Department Name,Class Name
0,0,767,33,,Absolutely wonderful - silky and sexy and comf...,4,1,0,Initmates,Intimate,Intimates
1,1,1080,34,,Love this dress! it's sooo pretty. i happene...,5,1,4,General,Dresses,Dresses
2,2,1077,60,Some major design flaws,I had such high hopes for this dress and reall...,3,0,0,General,Dresses,Dresses
3,3,1049,50,My favorite buy!,"I love, love, love this jumpsuit. it's fun, fl...",5,1,0,General Petite,Bottoms,Pants
4,4,847,47,Flattering shirt,This shirt is very flattering to all due to th...,5,1,6,General,Tops,Blouses


In [84]:
raw_data["Rating"].value_counts()

5    13131
4     5077
3     2871
2     1565
1      842
Name: Rating, dtype: int64

We drop rows with empty review text.

In [0]:
def drop_values(df, col_name, values):
    return df.loc[~df[col_name].isin(values), :].reset_index(drop=True)

In [0]:
def drop_rare_values(df, col_name, threshold):
    counts = df[col_name].value_counts(normalize=True)
    return df.loc[df[col_name].isin(counts[counts > threshold].index), :].reset_index(
        drop=True
    )

In [0]:
def make_category_map(labels):
    return {x: i for i, x in enumerate(sorted(set(labels)))}
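
As a quick illustration of what `make_category_map` produces, the sketch below applies it to a handful of made-up ratings (the sample list is an assumption, not taken from the data set). Because the unique labels are sorted first, ratings 1 to 5 end up as the contiguous class indices 0 to 4, which is the label format a sparse categorical loss expects.

```python
def make_category_map(labels):
    # Same logic as the notebook helper: sorted unique labels -> 0-based ids.
    return {x: i for i, x in enumerate(sorted(set(labels)))}


sample_ratings = [5, 3, 1, 4, 2, 5, 5]  # hypothetical sample, not real data
category_map = make_category_map(sample_ratings)
print(category_map)  # {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
```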

In [0]:
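def init_category_map(filename, labels):
    # Reuse a previously saved map if one exists; otherwise build and save it.
    try:
        category_map = load_file(filename)
    except Exception:
        category_map = make_category_map(labels)
        save_file(category_map, filename)
    # Return the map in both cases (the cached path previously returned None).
    return category_map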

In [0]:
df = raw_data

In [0]:
category_map = make_category_map(df["Rating"].values)

In [0]:
category_map = init_category_map(
    TEMP_DIR / "category_map.pickle", df["Rating"].values
)

In [0]:
category_map

In [0]:
category_map = make_category_map(df["Rating"].values)

In [0]:
n_categories = len(category_map)

In [0]:
df["Rating"] = df["Rating"].map(category_map)

In [96]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23486 entries, 0 to 23485
Data columns (total 11 columns):
Unnamed: 0                 23486 non-null int64
Clothing ID                23486 non-null int64
Age                        23486 non-null int64
Title                      19676 non-null object
Review Text                22641 non-null object
Rating                     23486 non-null int64
Recommended IND            23486 non-null int64
Positive Feedback Count    23486 non-null int64
Division Name              23472 non-null object
Department Name            23472 non-null object
Class Name                 23472 non-null object
dtypes: int64(6), object(5)
memory usage: 2.0+ MB


In [0]:
df = df[~df["Review Text"].isna()]

In [98]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 22641 entries, 0 to 23485
Data columns (total 11 columns):
Unnamed: 0                 22641 non-null int64
Clothing ID                22641 non-null int64
Age                        22641 non-null int64
Title                      19675 non-null object
Review Text                22641 non-null object
Rating                     22641 non-null int64
Recommended IND            22641 non-null int64
Positive Feedback Count    22641 non-null int64
Division Name              22628 non-null object
Department Name            22628 non-null object
Class Name                 22628 non-null object
dtypes: int64(6), object(5)
memory usage: 2.1+ MB


In [99]:
df.tail()

,Unnamed: 0,Clothing ID,Age,Title,Review Text,Rating,Recommended IND,Positive Feedback Count,Division Name,Department Name,Class Name
23481,23481,1104,34,Great dress for many occasions,I was very happy to snag this dress at such a ...,4,1,0,General Petite,Dresses,Dresses
23482,23482,862,48,Wish it was made of cotton,"It reminds me of maternity clothes. soft, stre...",2,1,0,General Petite,Tops,Knits
23483,23483,1104,31,"Cute, but see through","This fit well, but the top was very see throug...",2,0,1,General Petite,Dresses,Dresses
23484,23484,1084,28,"Very cute dress, perfect for summer parties an...",I bought this dress for a wedding i have this ...,2,1,2,General,Dresses,Dresses
23485,23485,1104,52,Please make more like this one!,This dress in a lovely platinum is feminine an...,4,1,22,General Petite,Dresses,Dresses


In [0]:
data_container = DataContainer(
    *DataBundle.split(
        data_bundle=DataBundle.from_dataframe(
            dataframe=df, x_col="Review Text", y_col="Rating"
        ),
        fracs=[0.8, 0.1, 0.1],
    )
)
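
The idea behind a fractional split such as `fracs=[0.8, 0.1, 0.1]` can be sketched in plain Python. The helper `split_indices` below is hypothetical and is not the `toai` implementation; the exact split sizes it yields may differ by a row or two from what `DataBundle.split` returns, depending on how rounding is handled there.

```python
import random


def split_indices(n_rows, fracs, seed=0):
    """Shuffle row indices and cut them into consecutive chunks by fraction."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    splits, start = [], 0
    for i, frac in enumerate(fracs):
        # The last split takes whatever is left so that no row is dropped.
        end = n_rows if i == len(fracs) - 1 else start + int(n_rows * frac)
        splits.append(indices[start:end])
        start = end
    return splits


# 22641 is the number of reviews left after dropping empty review texts.
train_idx, valid_idx, test_idx = split_indices(22641, [0.8, 0.1, 0.1])
print(len(train_idx), len(valid_idx), len(test_idx))
```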

In [0]:
data_container.train = DataBundle.from_unbalanced_data_bundle(data_container.train)

We check whether the data set has been balanced by comparing the rating counts of the full raw data set with those of the training set after balancing.

In [102]:
raw_data["Rating"].value_counts()

4    13131
3     5077
2     2871
1     1565
0      842
Name: Rating, dtype: int64

In [103]:
data_container.train.value_counts()

{0: 9780, 1: 9928, 2: 9048, 3: 7806, 4: 10055}
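
One common way to balance a training set is random oversampling of the smaller classes. The sketch below shows that generic technique on a toy label list; it is an assumption for illustration and not necessarily what `DataBundle.from_unbalanced_data_bundle` does (the counts above are only roughly equal, which suggests a different scheme). The helper name `oversample` is hypothetical.

```python
import random
from collections import Counter


def oversample(labels, seed=0):
    """Return row indices in which every class appears as often as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    target = max(len(rows) for rows in by_class.values())
    balanced = []
    for rows in by_class.values():
        balanced.extend(rows)
        # Draw extra rows with replacement until the class reaches the target size.
        balanced.extend(rng.choices(rows, k=target - len(rows)))
    rng.shuffle(balanced)
    return balanced


labels = [4] * 6 + [3] * 3 + [0] * 1  # toy, imbalanced label list
balanced_labels = [labels[i] for i in oversample(labels)]
print(Counter(balanced_labels))
```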

In [0]:
class_weights = dict(
    enumerate(
        # Keyword arguments keep this call valid on newer scikit-learn releases.
        sk.utils.class_weight.compute_class_weight(
            class_weight="balanced",
            classes=np.unique(data_container.train.y),
            y=data_container.train.y,
        )
    )
)

In [105]:
class_weights

{0: 0.9533128834355828,
 1: 0.9391015310233682,
 2: 1.0304376657824934,
 3: 1.1943889315910838,
 4: 0.9272401790154152}
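
The `"balanced"` weights can be reproduced by hand: each class gets `n_samples / (n_classes * class_count)`. The sketch below recomputes them from the training counts printed earlier; the variable names are illustrative.

```python
# Training-set counts per class, copied from the value_counts() output above.
train_counts = {0: 9780, 1: 9928, 2: 9048, 3: 7806, 4: 10055}

n_samples = sum(train_counts.values())  # 46617
n_classes = len(train_counts)           # 5

# "balanced" heuristic: n_samples / (n_classes * count_of_class)
manual_weights = {
    label: n_samples / (n_classes * count) for label, count in train_counts.items()
}

for label, weight in manual_weights.items():
    print(label, round(weight, 4))
```

Since the training set has already been balanced, all weights are close to 1 and have only a small effect on the loss.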

In [106]:
len(data_container.train), len(data_container.validation), len(data_container.test)

(46617, 2265, 2263)

In [0]:
def make_dataset2(data_bundle):
    x_dataset = tf.data.Dataset.from_tensor_slices(data_bundle.x)
    y_dataset = tf.data.Dataset.from_tensor_slices(data_bundle.y)
    return tf.data.Dataset.zip((x_dataset, y_dataset))

In [0]:
data_container.train.dataset = make_dataset2(data_container.train)
data_container.validation.dataset = make_dataset2(data_container.validation)
data_container.test.dataset = make_dataset2(data_container.test)

In [109]:
data_container.train.x[0]

'Ii loved the idea of this summer top flowing in the breeze... then i got it and the selves did nit give as much slack as i would have hoped #spandex so #returned!'

In [110]:
data_container.train.y[0]

0

In [0]:
def preprocess(x, y, max_length=100):
    x = tf.strings.regex_replace(x, rb"<br\s*/?>", b" ")
    x = tf.strings.regex_replace(x, b"[^a-zA-Z']", b" ")
    x = tf.strings.split(x)
    x = x[:, :max_length]
    return x.to_tensor(default_value=b"<pad>"), y

In [112]:
for x, y in data_container.train.dataset.batch(BATCH_SIZE).map(preprocess).take(1):
    print(x.shape)
    print(y.shape)
    print(x[2])
    print(y[2])

(32, 100)
(32,)
tf.Tensor(
[b'Although' b'i' b'love' b'retailer' b'this' b'product' b"isn't" b'the'
 b'quality' b'or' b'presentation' b'i' b'expect' b'from' b'them' b'the'
 b'dress' b'arrived' b'stuffed' b'into' b'a' b'small' b'plastic' b'bag'
 b'and' b'was' b'a' b'crumpled' b'mess' b'the' b'fabric' b"wasn't" b'soft'
 b'and' b'fluid' b'as' b'i' b'expected' b'but' b'rather' b'was' b'stiff'
 b'and' b'a' b'little' b'scratchy' b'i' b'purchased' b'the' b'ecru'
 b'version' b'which' b'was' b'also' b'slightly' b'darker' b'than' b'the'
 b'photo' b'showed' b'i' b'returned' b'the' b'dress' b'and' b'will'
 b'keep' b'looking' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>'
 b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>'
 b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>'
 b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>' b'<pad>'
 b'<pad>' b'<pad>'], shape=(100,), dtype=string)
tf.Tensor(0, shape=(), dtype=int64)
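
To make the steps inside `preprocess` easier to follow, here is a pure-Python sketch of the same pipeline using `re` instead of `tf.strings`. It is only an illustration (the function name `preprocess_batch` and the sample texts are made up): HTML line breaks and every character other than letters and apostrophes are replaced by spaces, the text is split on whitespace, truncated to `max_length` tokens, and shorter reviews are padded to the longest review in the batch.

```python
import re

PAD = "<pad>"


def preprocess_batch(texts, max_length=100):
    """Pure-Python sketch of the tokenisation steps, for illustration only."""
    tokenized = []
    for text in texts:
        text = re.sub(r"<br\s*/?>", " ", text)   # drop HTML line breaks
        text = re.sub(r"[^a-zA-Z']", " ", text)  # keep letters and apostrophes
        tokenized.append(text.split()[:max_length])
    # Pad every review to the longest one in this batch.
    width = max(len(tokens) for tokens in tokenized)
    return [tokens + [PAD] * (width - len(tokens)) for tokens in tokenized]


batch = preprocess_batch(["Love it!<br/>Fits well.", "Too small"])
for tokens in batch:
    print(tokens)
# ['Love', 'it', 'Fits', 'well']
# ['Too', 'small', '<pad>', '<pad>']
```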


In [0]:
def make_vocabulary(dataset):
    vocabulary = Counter()
    for x, _ in dataset.batch(BATCH_SIZE).map(preprocess):
        for review in x:
            vocabulary.update(review.numpy().tolist())
    return vocabulary

In [0]:
VOCABULARY_SIZE = 60000

In [0]:
vocabulary = make_vocabulary(data_container.train.dataset)

In [116]:
vocabulary.most_common()[:10]

[(b'<pad>', 1835306),
 (b'the', 166244),
 (b'i', 102478),
 (b'and', 95688),
 (b'it', 90775),
 (b'a', 83429),
 (b'is', 59600),
 (b'to', 50679),
 (b'this', 45335),
 (b'in', 42087)]

In [117]:
len(vocabulary)

13310

In [0]:
truncated_vocabulary = [
    word for word, count in vocabulary.most_common()[:VOCABULARY_SIZE]
]

In [119]:
len(truncated_vocabulary)

13310

In [0]:
word_to_id = {word: index for index, word in enumerate(truncated_vocabulary)}

In [121]:
for word in b"the service is very good".split():
    print(word_to_id.get(word, VOCABULARY_SIZE))

1
2061
6
23
114


In [0]:
words = tf.constant(truncated_vocabulary)

In [0]:
word_ids = tf.range(len(truncated_vocabulary), dtype=tf.int64)

In [0]:
vocab_init = tf.lookup.KeyValueTensorInitializer(words, word_ids)

In [0]:
n_oov_buckets = 1000

In [0]:
table = tf.lookup.StaticVocabularyTable(vocab_init, n_oov_buckets)

In [127]:
table.lookup(tf.constant([b"the service is very good".split()]))

<tf.Tensor: id=466285, shape=(1, 5), dtype=int64, numpy=array([[   1, 2061,    6,   23,  114]])>
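
`StaticVocabularyTable` returns the stored id for known words and hashes unknown words into one of the `n_oov_buckets` extra ids that follow the vocabulary. The sketch below mimics that behaviour in plain Python with a toy vocabulary; `zlib.crc32` is a stand-in hash chosen for illustration, so the bucket an unknown word lands in will not match TensorFlow's.

```python
import zlib

vocabulary = [b"<pad>", b"the", b"i", b"and"]  # toy vocabulary, most frequent first
n_oov_buckets = 1000

word_to_id = {word: index for index, word in enumerate(vocabulary)}


def lookup(word):
    if word in word_to_id:
        return word_to_id[word]
    # Unknown word: hash it into one of the extra buckets after the vocabulary.
    # zlib.crc32 is a stand-in; TensorFlow uses its own fingerprint hash.
    return len(vocabulary) + zlib.crc32(word) % n_oov_buckets


ids = [lookup(word) for word in b"the zzzunknown and".split()]
print(ids)
```

This is also why the embedding layers later in the notebook are sized `VOCABULARY_SIZE + n_oov_buckets`: every id the table can emit needs its own embedding row.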

In [0]:
def encode_words(x, y):
    return table.lookup(x), y

In [0]:
def encode_categories(x, y):
    return table.lookup(x), y

In [0]:
train_dataset = (
    data_container.train.dataset.repeat()
    .batch(BATCH_SIZE)
    .map(preprocess)
    .map(encode_words)
    .prefetch(1)
)

In [0]:
validation_dataset = (
    data_container.validation.dataset.batch(BATCH_SIZE)
    .map(preprocess)
    .map(encode_words)
)

In [132]:
for x, y in train_dataset.take(1):
    print(x.shape)
    print(y.shape)
    print(x[0])
    print(y[0])

(32, 100)
(32,)
tf.Tensor(
[4206  147    1  480   13    8  202   25 1087    9    1 3160  205    2
  117    4    3    1 4060  118 3934  334   32   67 4207   32    2   33
   24  943 1303   15  196    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0], shape=(100,), dtype=int64)
tf.Tensor(0, shape=(), dtype=int64)


In [133]:
for x, y in validation_dataset.take(1):
    print(x.shape)
    print(y.shape)
    print(x[0])
    print(y[0])

(32, 100)
(32,)
tf.Tensor(
[   21   258   583 12718  1802   126   747  1822     8    59   970    19
   708    65    13   250   105   405     3    65  1136   281    13     1
   747   144   273     4   312  1448   277   130   179    20   118     2
  1142     1   101   318    12     1   140     1   146   535     6    31
   173    14     5   304   404   306    32     8    10     1   140   150
   195    49    14     4     1   112     6    50    52    14    19   459
     3   795   169   428   102     5    52   297    14   446     9   339
  3830    20  3156    28   729    32   968     0     0     0     0     0
     0     0     0     0], shape=(100,), dtype=int64)
tf.Tensor(3, shape=(), dtype=int64)


In [0]:
train_dataset_steps = len(data_container.train) // BATCH_SIZE

In [0]:
validation_dataset_steps = len(data_container.validation) // BATCH_SIZE
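
Floor division counts only full batches, so a final partial batch is not included in the step count. A small worked example with the validation set size reported earlier (2265 examples) shows the effect:

```python
import math

BATCH_SIZE = 32
n_validation = 2265  # len(data_container.validation) from the output above

full_batches = n_validation // BATCH_SIZE            # batches of exactly 32 examples
all_batches = math.ceil(n_validation / BATCH_SIZE)   # includes the final partial batch
skipped = n_validation - full_batches * BATCH_SIZE   # examples outside the full batches

print(full_batches, all_batches, skipped)  # 70 71 25
```

With `validation_steps=70`, validation during training therefore covers 2240 of the 2265 examples, whereas a `predict` call over the whole validation dataset covers all of them.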

## **Building models**

With unbalanced data set

In [0]:
def make_sequential_lstm_model(
    n_categories, embedding_size, lstm_size, lstm_dropout, dropout
):
    return keras.models.Sequential(
        [
            keras.layers.Embedding(
                VOCABULARY_SIZE + n_oov_buckets,
                embedding_size,
                mask_zero=True,
                input_shape=[None],
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.GlobalMaxPool1D(),
            keras.layers.Dropout(dropout),
            keras.layers.Dense(n_categories, activation=keras.activations.softmax),
        ]
    )

In [0]:
model_lstm_unbalanced = make_sequential_lstm_model(
    n_categories=n_categories,
    embedding_size=64,
    lstm_size=64,
    lstm_dropout=0.3,
    dropout=0.5,
)

In [0]:
model_lstm_unbalanced.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(lr=3e-4),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [0]:
history = model_lstm_unbalanced.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=10,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 566 steps, validate for 70 steps
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
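
The `ReduceLROnPlateau(patience=2, factor=0.3)` callback multiplies the learning rate by 0.3 each time the monitored validation loss has stopped improving for two epochs. The sketch below only works out the resulting sequence of learning rates for a starting value of 3e-4; it does not claim how many reductions actually happened in the run above, and the helper name is hypothetical.

```python
def reduced_learning_rates(initial_lr, factor, n_reductions):
    """Learning rates after 0, 1, 2, ... plateau-triggered reductions."""
    return [initial_lr * factor ** k for k in range(n_reductions + 1)]


schedule = reduced_learning_rates(initial_lr=3e-4, factor=0.3, n_reductions=3)
for lr in schedule:
    print(f"{lr:.2e}")
# 3.00e-04
# 9.00e-05
# 2.70e-05
# 8.10e-06
```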


In [0]:
model_lstm_unbalanced.evaluate(validation_dataset, steps=validation_dataset_steps)



[0.914159221308572, 0.61383927, 0.8455357]

In [0]:
print(
    classification_report(
        data_container.validation.y, model_lstm_unbalanced.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        80
           1       0.00      0.00      0.00       175
           2       0.34      0.59      0.43       291
           3       0.31      0.17      0.22       478
           4       0.76      0.91      0.83      1241

    accuracy                           0.61      2265
   macro avg       0.28      0.34      0.30      2265
weighted avg       0.52      0.61      0.56      2265
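
The two average rows differ only in how the per-class scores are combined: the macro average treats every class equally, while the weighted average weights each class by its support. Recomputing both from the rounded per-class f1-scores in the report above gives the same figures (up to rounding):

```python
f1_scores = [0.00, 0.00, 0.43, 0.22, 0.83]  # per-class f1 from the report above
support = [80, 175, 291, 478, 1241]         # number of validation examples per class

macro_f1 = sum(f1_scores) / len(f1_scores)
weighted_f1 = sum(f * n for f, n in zip(f1_scores, support)) / sum(support)

print(round(macro_f1, 2), round(weighted_f1, 2))  # 0.3 0.56
```

The gap between the two shows how much the overall score depends on class 4, which makes up more than half of the validation set.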





**Sequential_lstm with more layers**

In [0]:
def make_sequential_lstm_model(
    n_categories, embedding_size, lstm_size, lstm_dropout, dropout
):
    return keras.models.Sequential(
        [
            keras.layers.Embedding(
                VOCABULARY_SIZE + n_oov_buckets,
                embedding_size,
                mask_zero=True,
                input_shape=[None],
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.GlobalMaxPool1D(),
            keras.layers.Dropout(dropout),
            keras.layers.Dense(n_categories, activation=keras.activations.softmax),
        ]
    )

In [0]:
model_s_lstm_unabalanced = make_sequential_lstm_model(
    n_categories=n_categories,
    embedding_size=64,
    lstm_size=64,
    lstm_dropout=0.2,
    dropout=0.5,
)

In [0]:
model_s_lstm_unabalanced.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(lr=1e-4),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [0]:
history = model_s_lstm_unabalanced.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=30,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 566 steps, validate for 70 steps
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30


In [0]:
model_s_lstm_unabalanced.evaluate(validation_dataset, steps=validation_dataset_steps)



[0.856380865403584, 0.64107144, 0.8602679]

In [0]:
print(
    classification_report(
        data_container.validation.y, model_s_lstm_unabalanced.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        76
           1       0.50      0.04      0.08       156
           2       0.36      0.70      0.48       281
           3       0.37      0.25      0.30       455
           4       0.81      0.87      0.84      1297

    accuracy                           0.64      2265
   macro avg       0.41      0.37      0.34      2265
weighted avg       0.62      0.64      0.61      2265





**LSTM with embedding layer**

In [0]:
def make_lstm_model(n_categories, embedding_size, lstm_size, lstm_dropout, dropout):
    input_layer = keras.layers.Input(shape=[None])
    mask = keras.layers.Lambda(lambda inputs: keras.backend.not_equal(inputs, 0))(
        input_layer
    )
    embedding_layer = keras.layers.Embedding(
        VOCABULARY_SIZE + n_oov_buckets, embedding_size
    )(input_layer)
    lstm1_layer = keras.layers.Bidirectional(
        keras.layers.LSTM(lstm_size, dropout=lstm_dropout, return_sequences=True)
    )(embedding_layer, mask=mask)
    lstm2_layer = keras.layers.Bidirectional(
        keras.layers.LSTM(lstm_size, dropout=lstm_dropout, return_sequences=True)
    )(lstm1_layer, mask=mask)
    max_pool_layer = keras.layers.GlobalMaxPool1D()(lstm2_layer)
    dropout_layer = keras.layers.Dropout(dropout)(max_pool_layer)
    output_layer = keras.layers.Dense(
        n_categories, activation=keras.activations.softmax
    )(dropout_layer)
    return keras.Model(inputs=input_layer, outputs=output_layer)
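
The `Lambda` layer in `make_lstm_model` builds a boolean mask that is `True` for real tokens and `False` for padding. This works because `<pad>` is the most frequent token and therefore has id 0 in the vocabulary. A plain-Python sketch of the same comparison, on made-up id sequences:

```python
# Two encoded reviews padded to the same length (ids are illustrative).
token_ids = [
    [21, 258, 583, 0, 0],     # 3 real tokens, 2 padding positions
    [4206, 147, 1, 480, 13],  # no padding
]

# Equivalent of keras.backend.not_equal(inputs, 0), element by element.
mask = [[token != 0 for token in row] for row in token_ids]
lengths = [sum(row) for row in mask]  # number of real tokens per review

print(mask[0])   # [True, True, True, False, False]
print(lengths)   # [3, 5]
```

Passing this mask to the LSTM layers tells them to skip the padded time steps instead of treating them as input.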

In [0]:
model = make_lstm_model(
    n_categories=n_categories,
    embedding_size=64,
    lstm_size=64,
    lstm_dropout=0.3,
    dropout=0.5,
)

In [0]:
model.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(lr=3e-4),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [0]:
history = model.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=10,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 566 steps, validate for 70 steps
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10


In [0]:
model.evaluate(validation_dataset, steps=validation_dataset_steps)



[0.9232704775674002, 0.6267857, 0.8513393]

In [0]:
print(
    classification_report(
        data_container.validation.y, model.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        80
           1       0.39      0.22      0.28       175
           2       0.37      0.41      0.39       291
           3       0.33      0.25      0.28       478
           4       0.76      0.92      0.83      1241

    accuracy                           0.62      2265
   macro avg       0.37      0.36      0.36      2265
weighted avg       0.57      0.62      0.59      2265





**Changing Learning rate to 1e-4 and epochs to 30**

In [0]:
def make_lstm_model(n_categories, embedding_size, lstm_size, lstm_dropout, dropout):
    input_layer = keras.layers.Input(shape=[None])
    mask = keras.layers.Lambda(lambda inputs: keras.backend.not_equal(inputs, 0))(
        input_layer
    )
    embedding_layer = keras.layers.Embedding(
        VOCABULARY_SIZE + n_oov_buckets, embedding_size
    )(input_layer)
    lstm1_layer = keras.layers.Bidirectional(
        keras.layers.LSTM(lstm_size, dropout=lstm_dropout, return_sequences=True)
    )(embedding_layer, mask=mask)
    lstm2_layer = keras.layers.Bidirectional(
        keras.layers.LSTM(lstm_size, dropout=lstm_dropout, return_sequences=True)
    )(lstm1_layer, mask=mask)
    max_pool_layer = keras.layers.GlobalMaxPool1D()(lstm2_layer)
    dropout_layer = keras.layers.Dropout(dropout)(max_pool_layer)
    output_layer = keras.layers.Dense(
        n_categories, activation=keras.activations.softmax
    )(dropout_layer)
    return keras.Model(inputs=input_layer, outputs=output_layer)

In [0]:
model = make_lstm_model(
    n_categories=n_categories,
    embedding_size=64,
    lstm_size=64,
    lstm_dropout=0.2,
    dropout=0.5,
)

In [0]:
model.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(lr=1e-4),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [0]:
history = model.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=30,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 566 steps, validate for 70 steps
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30


In [0]:
model.evaluate(validation_dataset, steps=validation_dataset_steps)



[0.8466829342501504, 0.6473214, 0.86339283]

In [0]:
print(
    classification_report(
        data_container.validation.y, model.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        76
           1       0.00      0.00      0.00       156
           2       0.35      0.65      0.46       281
           3       0.38      0.23      0.29       455
           4       0.80      0.91      0.85      1297

    accuracy                           0.65      2265
   macro avg       0.31      0.36      0.32      2265
weighted avg       0.58      0.65      0.60      2265





**CNN Model**

In [0]:
def make_cnn_model(
    n_categories, embedding_size, conv_size, kernel_size, dropout, stride=2
):
    input_layer = keras.layers.Input(shape=[None])
    mask = keras.layers.Lambda(lambda inputs: keras.backend.not_equal(inputs, 0))(
        input_layer
    )
    embedding_layer = keras.layers.Embedding(
        VOCABULARY_SIZE + n_oov_buckets, embedding_size
    )(input_layer)
    cnn1_layer = keras.layers.Conv1D(
        conv_size,
        kernel_size=kernel_size,
        strides=stride,
        activation=keras.activations.relu,
        kernel_initializer=keras.initializers.he_uniform(),
    )(embedding_layer)
    cnn1_dropout = keras.layers.Dropout(dropout)(cnn1_layer)
    cnn2_layer = keras.layers.Conv1D(
        conv_size,
        kernel_size=kernel_size,
        strides=stride,
        activation=keras.activations.relu,
        kernel_initializer=keras.initializers.he_uniform(),
    )(cnn1_dropout)
    cnn2_dropout = keras.layers.Dropout(dropout)(cnn2_layer)
    max_pool_layer = keras.layers.GlobalMaxPool1D()(cnn2_dropout)
    dropout_layer = keras.layers.Dropout(dropout)(max_pool_layer)
    output_layer = keras.layers.Dense(
        n_categories, activation=keras.activations.softmax
    )(dropout_layer)
    return keras.Model(inputs=input_layer, outputs=output_layer)
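
Each `Conv1D` layer here uses a stride of 2 and Keras' default `"valid"` padding, so it roughly halves the sequence length: the output length is `(input_length - kernel_size) // stride + 1`. The sketch below applies that formula to a batch padded to 100 tokens with `kernel_size=3`; the helper name is hypothetical.

```python
def conv1d_output_length(input_length, kernel_size, stride):
    """Output length of a 1D convolution with no padding ("valid")."""
    return (input_length - kernel_size) // stride + 1


length = 100  # tokens per review in a batch padded to the maximum length
for layer in ("cnn1_layer", "cnn2_layer"):
    length = conv1d_output_length(length, kernel_size=3, stride=2)
    print(layer, length)
# cnn1_layer 49
# cnn2_layer 24
```

`GlobalMaxPool1D` then reduces the remaining 24 positions to a single vector per review, so the dense layer sees a fixed-size input regardless of review length.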

In [0]:
model = make_cnn_model(
    n_categories=n_categories,
    embedding_size=64,
    conv_size=64,
    kernel_size=3,
    dropout=0.5,
)

In [0]:
model.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(lr=1e-4),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [0]:
history = model.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=10,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 566 steps, validate for 70 steps
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [0]:
print(
    classification_report(
        data_container.validation.y, model.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        80
           1       0.38      0.02      0.03       175
           2       0.36      0.33      0.35       291
           3       0.32      0.31      0.32       478
           4       0.73      0.90      0.81      1241

    accuracy                           0.60      2265
   macro avg       0.36      0.31      0.30      2265
weighted avg       0.54      0.60      0.56      2265





**CNN model with embedding and convolution sizes of 128, trained for up to 20 epochs at learning rate 1e-4**

---




In [0]:
def make_cnn_model(
    n_categories, embedding_size, conv_size, kernel_size, dropout, stride=2
):
    input_layer = keras.layers.Input(shape=[None])
    mask = keras.layers.Lambda(lambda inputs: keras.backend.not_equal(inputs, 0))(
        input_layer
    )
    embedding_layer = keras.layers.Embedding(
        VOCABULARY_SIZE + n_oov_buckets, embedding_size
    )(input_layer)
    cnn1_layer = keras.layers.Conv1D(
        conv_size,
        kernel_size=kernel_size,
        strides=stride,
        activation=keras.activations.relu,
        kernel_initializer=keras.initializers.he_uniform(),
    )(embedding_layer)
    cnn1_dropout = keras.layers.Dropout(dropout)(cnn1_layer)
    cnn2_layer = keras.layers.Conv1D(
        conv_size,
        kernel_size=kernel_size,
        strides=stride,
        activation=keras.activations.relu,
        kernel_initializer=keras.initializers.he_uniform(),
    )(cnn1_dropout)
    cnn2_dropout = keras.layers.Dropout(dropout)(cnn2_layer)
    max_pool_layer = keras.layers.GlobalMaxPool1D()(cnn2_dropout)
    dropout_layer = keras.layers.Dropout(dropout)(max_pool_layer)
    output_layer = keras.layers.Dense(
        n_categories, activation=keras.activations.softmax
    )(dropout_layer)
    return keras.Model(inputs=input_layer, outputs=output_layer)

In [0]:
model = make_cnn_model(
    n_categories=n_categories,
    embedding_size=128,
    conv_size=128,
    kernel_size=3,
    dropout=0.5,
)
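
With `kernel_size=3` and `stride=2`, each `Conv1D` layer roughly halves the sequence length. A minimal sketch of that arithmetic, assuming the Keras default `padding="valid"` (the model above does not set `padding`); the helper names are hypothetical and not part of the notebook.

```python
def conv1d_output_length(length, kernel_size, stride):
    """Output length of a 1D convolution without padding ('valid')."""
    if length < kernel_size:
        raise ValueError("sequence is shorter than the kernel")
    return (length - kernel_size) // stride + 1


def cnn_stack_output_length(length, kernel_size=3, stride=2, n_layers=2):
    """Sequence length left after n_layers identical strided convolutions."""
    for _ in range(n_layers):
        length = conv1d_output_length(length, kernel_size, stride)
    return length


for tokens in (7, 50, 100):
    print(tokens, "->", cnn_stack_output_length(tokens))
```

Under these assumptions a batch needs at least 7 tokens per sequence, otherwise nothing is left for the second convolution to slide over.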

In [0]:
model.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [0]:
history = model.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=20,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 566 steps, validate for 70 steps
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [0]:
print(
    classification_report(
        data_container.validation.y, model.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        76
           1       0.38      0.06      0.10       156
           2       0.35      0.41      0.38       281
           3       0.32      0.27      0.29       455
           4       0.77      0.91      0.83      1297

    accuracy                           0.63      2265
   macro avg       0.36      0.33      0.32      2265
weighted avg       0.57      0.63      0.59      2265





**Wavenet Model**

In [0]:
def make_wavenet_model(n_categories, embedding_size, conv_size, dropout):
    model_layers = []
    for rate in (1, 2, 4, 8, 16) * 3:
        model_layers.append(
            keras.layers.Conv1D(
                filters=conv_size,
                kernel_size=2,
                padding="causal",
                dilation_rate=rate,
                activation=keras.activations.relu,
                kernel_initializer=keras.initializers.he_uniform(),
            )
        )
        model_layers.append(keras.layers.Dropout(dropout))
    return keras.models.Sequential(
        [
            keras.layers.Embedding(
                VOCABULARY_SIZE + n_oov_buckets,
                embedding_size,
                mask_zero=True,
                input_shape=[None],
            ),
            *model_layers,
            keras.layers.GlobalMaxPool1D(),
            keras.layers.Dropout(dropout),
            keras.layers.Dense(n_categories, activation=keras.activations.softmax),
        ]
    )

In [0]:
model = make_wavenet_model(
    n_categories=n_categories, embedding_size=64, conv_size=64, dropout=0.5
)
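
The dilation rates `(1, 2, 4, 8, 16) * 3` decide how many tokens a single output position can see. A small sketch of that receptive-field arithmetic for a stack of dilated causal convolutions; `receptive_field` is a hypothetical helper, not part of the notebook.

```python
def receptive_field(kernel_size, dilation_rates):
    """Input time steps visible to one output step of stacked dilated convolutions."""
    return 1 + sum((kernel_size - 1) * rate for rate in dilation_rates)


rates = (1, 2, 4, 8, 16) * 3

print(receptive_field(2, rates))  # kernel_size=2, as in the model above
print(receptive_field(4, rates))  # kernel_size=4, as in the later variant
```

With `kernel_size=2` the stack covers 94 tokens, and with `kernel_size=4` it covers 280, so the larger kernel mostly matters for long reviews.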

In [0]:
model.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(learning_rate=3e-4),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [0]:
history = model.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=10,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 566 steps, validate for 70 steps
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [0]:
print(
    classification_report(
        data_container.validation.y, model.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        80
           1       0.00      0.00      0.00       175
           2       0.00      0.00      0.00       291
           3       0.00      0.00      0.00       478
           4       0.55      1.00      0.71      1241

    accuracy                           0.55      2265
   macro avg       0.11      0.20      0.14      2265
weighted avg       0.30      0.55      0.39      2265





**Changing learning rate to 1e-3, epochs to 20, kernel size to 4, and dropout to 0.6**

In [0]:
def make_wavenet_model(n_categories, embedding_size, conv_size, dropout):
    model_layers = []
    for rate in (1, 2, 4, 8, 16) * 3:
        model_layers.append(
            keras.layers.Conv1D(
                filters=conv_size,
                kernel_size=4,
                padding="causal",
                dilation_rate=rate,
                activation=keras.activations.relu,
                kernel_initializer=keras.initializers.he_uniform(),
            )
        )
        model_layers.append(keras.layers.Dropout(dropout))
    return keras.models.Sequential(
        [
            keras.layers.Embedding(
                VOCABULARY_SIZE + n_oov_buckets,
                embedding_size,
                mask_zero=True,
                input_shape=[None],
            ),
            *model_layers,
            keras.layers.GlobalMaxPool1D(),
            keras.layers.Dropout(dropout),
            keras.layers.Dense(n_categories, activation=keras.activations.softmax),
        ]
    )

In [0]:
model_wavenet = make_wavenet_model(
    n_categories=n_categories, embedding_size=64, conv_size=64, dropout=0.6
)



In [0]:
model_wavenet.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [0]:
history = model_wavenet.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=20,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 566 steps, validate for 70 steps
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
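
The run above ends at epoch 7 of 20, which is consistent with `EarlyStopping(patience=4)` firing after four epochs without improvement. The sketch below is a simplified version of that patience rule, not the Keras implementation, and the validation losses are hypothetical because the log does not show them.

```python
def early_stopping_epoch(val_losses, patience=4):
    """Return the 1-based epoch after which training stops (simplified patience rule)."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0  # improvement: remember it and reset the counter
        else:
            wait += 1  # no improvement this epoch
            if wait >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation losses (NOT taken from the run above): best value at epoch 3.
hypothetical_val_losses = [1.40, 1.36, 1.33, 1.34, 1.35, 1.34, 1.36, 1.30, 1.29]
stopped_at = early_stopping_epoch(hypothetical_val_losses, patience=4)
print(f"stopped after epoch {stopped_at}")
```

With `restore_best_weights=True`, the weights kept in such a case would be those from the best epoch, not from the last one.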


In [0]:
print(
    classification_report(
        data_container.validation.y, model_wavenet.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        76
           1       0.00      0.00      0.00       156
           2       0.00      0.00      0.00       281
           3       0.00      0.00      0.00       455
           4       0.57      1.00      0.73      1297

    accuracy                           0.57      2265
   macro avg       0.11      0.20      0.15      2265
weighted avg       0.33      0.57      0.42      2265





In [0]:
model_wavenet.evaluate(validation_dataset, steps=validation_dataset_steps)



[1.3304282869611468, 0.5754464, 0.7745536]
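
The three numbers returned by `evaluate` are the loss, the accuracy, and the top-2 accuracy. The sketch below shows how a top-k accuracy can be computed, assuming `sparse_top_2_categorical_accuracy` (defined earlier in the notebook) counts a prediction as correct when the true label is among the two highest-scoring classes. The function name and the toy probabilities are illustrative only.

```python
def top_k_accuracy(y_true, y_prob, k=2):
    """Share of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for label, probs in zip(y_true, y_prob):
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        hits += label in ranked[:k]
    return hits / len(y_true)


# Toy example: the model outputs the same ranking (4, then 3) for every review.
y_true = [4, 3, 0]
y_prob = [
    [0.05, 0.05, 0.10, 0.30, 0.50],  # true label 4 is ranked first
    [0.05, 0.05, 0.10, 0.30, 0.50],  # true label 3 is ranked second
    [0.05, 0.05, 0.10, 0.30, 0.50],  # true label 0 is outside the top 2
]

print(top_k_accuracy(y_true, y_prob, k=1))
print(top_k_accuracy(y_true, y_prob, k=2))
```

A model that always ranks classes 4 and 3 first would reach about (1297 + 455) / 2265 ≈ 0.77 top-2 accuracy on this validation split, which is close to the value reported above.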

## **Models with balanced data sets**

**Sequential LSTM model**

In [0]:
def make_sequential_lstm_model(
    n_categories, embedding_size, lstm_size, lstm_dropout, dropout
):
    return keras.models.Sequential(
        [
            keras.layers.Embedding(
                VOCABULARY_SIZE + n_oov_buckets,
                embedding_size,
                mask_zero=True,
                input_shape=[None],
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.Bidirectional(
                keras.layers.LSTM(
                    lstm_size, dropout=lstm_dropout, return_sequences=True
                )
            ),
            keras.layers.GlobalMaxPool1D(),
            keras.layers.Dropout(dropout),
            keras.layers.Dense(n_categories, activation=keras.activations.softmax),
        ]
    )

In [0]:
model_lstm_balanced = make_sequential_lstm_model(
    n_categories=n_categories,
    embedding_size=64,
    lstm_size=64,
    lstm_dropout=0.3,
    dropout=0.5,
)

In [0]:
model_lstm_balanced.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(learning_rate=3e-4),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [139]:
history = model_lstm_balanced.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=10,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 1456 steps, validate for 70 steps
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10


KeyboardInterrupt: ignored

In [140]:
model_lstm_balanced.evaluate(validation_dataset, steps=validation_dataset_steps)



[5.069083390917097, 0.21919642, 0.74866074]

In [141]:
print(
    classification_report(
        data_container.validation.y, model_lstm_balanced.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        82
           1       0.00      0.00      0.00       156
           2       0.00      0.00      0.00       279
           3       0.22      1.00      0.36       501
           4       0.00      0.00      0.00      1247

    accuracy                           0.22      2265
   macro avg       0.04      0.20      0.07      2265
weighted avg       0.05      0.22      0.08      2265





**Tuned hyperparameters**

In [0]:
model_lstm_balanced = make_sequential_lstm_model(
    n_categories=n_categories,
    embedding_size=32,
    lstm_size=32,
    lstm_dropout=0.2,
    dropout=0.5,
)

In [0]:
model_lstm_balanced.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [152]:
history = model_lstm_balanced.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=5,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 1456 steps, validate for 70 steps
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [153]:
model_lstm_balanced.evaluate(validation_dataset, steps=validation_dataset_steps)



[3.878310578210013, 0.55133927, 0.7705357]

In [154]:
print(
    classification_report(
        data_container.validation.y, model_lstm_balanced.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        82
           1       0.00      0.00      0.00       156
           2       0.00      0.00      0.00       279
           3       0.00      0.00      0.00       501
           4       0.55      1.00      0.71      1247

    accuracy                           0.55      2265
   macro avg       0.11      0.20      0.14      2265
weighted avg       0.30      0.55      0.39      2265





In [155]:
model_lstm_balanced_f1 = f1_score(data_container.validation.y, model_lstm_balanced.predict(validation_dataset).argmax(axis=1), average='macro')



In [156]:
print(f'F1_macro score: {model_lstm_balanced_f1:.2%}')

F1_macro score: 14.20%
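
A macro F1 of 14.20% is exactly what a constant predictor that labels every review as class 4 would score on this validation split. The sketch below checks this using only the supports from the classification report above, with no model involved.

```python
# Validation supports per class, copied from the classification report above.
supports = [82, 156, 279, 501, 1247]
total = sum(supports)
majority = max(supports)

# Predicting the majority class for every sample:
precision = majority / total  # also equals the accuracy
recall = 1.0  # every majority-class sample is recovered
f1_majority = 2 * precision * recall / (precision + recall)

# All other classes are never predicted, so their F1 is 0.
macro_f1 = f1_majority / len(supports)

print(f"accuracy: {precision:.2f}")
print(f"F1_macro score: {macro_f1:.2%}")
```

This reading matches the report, where only class 4 has non-zero precision and recall.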


**LSTM with embedding layer**

In [0]:
def make_lstm_model(n_categories, embedding_size, lstm_size, lstm_dropout, dropout):
    input_layer = keras.layers.Input(shape=[None])
    mask = keras.layers.Lambda(lambda inputs: keras.backend.not_equal(inputs, 0))(
        input_layer
    )
    embedding_layer = keras.layers.Embedding(
        VOCABULARY_SIZE + n_oov_buckets, embedding_size
    )(input_layer)
    lstm1_layer = keras.layers.Bidirectional(
        keras.layers.LSTM(lstm_size, dropout=lstm_dropout, return_sequences=True)
    )(embedding_layer, mask=mask)
    lstm2_layer = keras.layers.Bidirectional(
        keras.layers.LSTM(lstm_size, dropout=lstm_dropout, return_sequences=True)
    )(lstm1_layer, mask=mask)
    max_pool_layer = keras.layers.GlobalMaxPool1D()(lstm2_layer)
    dropout_layer = keras.layers.Dropout(dropout)(max_pool_layer)
    output_layer = keras.layers.Dense(
        n_categories, activation=keras.activations.softmax
    )(dropout_layer)
    return keras.Model(inputs=input_layer, outputs=output_layer)

In [0]:
model_lstm_embedd_balanced = make_lstm_model(
    n_categories=n_categories,
    embedding_size=8,
    lstm_size=8,
    lstm_dropout=0.3,
    dropout=0.5,
)

In [0]:
model_lstm_embedd_balanced.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [160]:
history = model_lstm_embedd_balanced.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=10,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 1456 steps, validate for 70 steps
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10


In [161]:
model_lstm_embedd_balanced.evaluate(validation_dataset, steps=validation_dataset_steps)



[1.5911267612661635, 0.55133927, 0.7705357]

In [162]:
print(
    classification_report(
        data_container.validation.y, model_lstm_embedd_balanced.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        82
           1       0.00      0.00      0.00       156
           2       0.00      0.00      0.00       279
           3       0.00      0.00      0.00       501
           4       0.55      1.00      0.71      1247

    accuracy                           0.55      2265
   macro avg       0.11      0.20      0.14      2265
weighted avg       0.30      0.55      0.39      2265





In [163]:
model_lstm_embedd_balanced_f1 = f1_score(data_container.validation.y, model_lstm_embedd_balanced.predict(validation_dataset).argmax(axis=1), average='macro')



In [164]:
print(f'F1_macro score: {model_lstm_embedd_balanced_f1:.2%}')

F1_macro score: 14.20%


**CNN model**

In [0]:
def make_cnn_model(
    n_categories, embedding_size, conv_size, kernel_size, dropout, stride=2
):
    input_layer = keras.layers.Input(shape=[None])
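    # NOTE: this padding mask is built but never passed to a later layer,
    # so it has no effect on the model returned below.
    mask = keras.layers.Lambda(lambda inputs: keras.backend.not_equal(inputs, 0))(
        input_layer
    )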
    embedding_layer = keras.layers.Embedding(
        VOCABULARY_SIZE + n_oov_buckets, embedding_size
    )(input_layer)
    cnn1_layer = keras.layers.Conv1D(
        conv_size,
        kernel_size=kernel_size,
        strides=stride,
        activation=keras.activations.relu,
        kernel_initializer=keras.initializers.he_uniform(),
    )(embedding_layer)
    cnn1_dropout = keras.layers.Dropout(dropout)(cnn1_layer)
    cnn2_layer = keras.layers.Conv1D(
        conv_size,
        kernel_size=kernel_size,
        strides=stride,
        activation=keras.activations.relu,
        kernel_initializer=keras.initializers.he_uniform(),
    )(cnn1_dropout)
    cnn2_dropout = keras.layers.Dropout(dropout)(cnn2_layer)
    max_pool_layer = keras.layers.GlobalMaxPool1D()(cnn2_dropout)
    dropout_layer = keras.layers.Dropout(dropout)(max_pool_layer)
    output_layer = keras.layers.Dense(
        n_categories, activation=keras.activations.softmax
    )(dropout_layer)
    return keras.Model(inputs=input_layer, outputs=output_layer)

In [0]:
model_cnn_balanced = make_cnn_model(
    n_categories=n_categories,
    embedding_size=8,
    conv_size=8,
    kernel_size=3,
    dropout=0.5,
)

In [0]:
model_cnn_balanced.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [168]:
history = model_cnn_balanced.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=5,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 1456 steps, validate for 70 steps
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [169]:
model_cnn_balanced.evaluate(validation_dataset, steps=validation_dataset_steps)



[1.3415424176624844, 0.55133927, 0.5879464]

In [170]:
print(
    classification_report(
        data_container.validation.y, model_cnn_balanced.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        82
           1       0.00      0.00      0.00       156
           2       0.00      0.00      0.00       279
           3       0.00      0.00      0.00       501
           4       0.55      1.00      0.71      1247

    accuracy                           0.55      2265
   macro avg       0.11      0.20      0.14      2265
weighted avg       0.30      0.55      0.39      2265





In [171]:
model_cnn_balanced_f1 = f1_score(data_container.validation.y, model_cnn_balanced.predict(validation_dataset).argmax(axis=1), average='macro')



In [172]:
print(f'F1_macro score: {model_cnn_balanced_f1:.2%}')

F1_macro score: 14.20%


**Wavenet model**

In [0]:
def make_wavenet_model(n_categories, embedding_size, conv_size, dropout):
    model_layers = []
    for rate in (1, 2, 4, 8, 16) * 3:
        model_layers.append(
            keras.layers.Conv1D(
                filters=conv_size,
                kernel_size=4,
                padding="causal",
                dilation_rate=rate,
                activation=keras.activations.relu,
                kernel_initializer=keras.initializers.he_uniform(),
            )
        )
        model_layers.append(keras.layers.Dropout(dropout))
    return keras.models.Sequential(
        [
            keras.layers.Embedding(
                VOCABULARY_SIZE + n_oov_buckets,
                embedding_size,
                mask_zero=True,
                input_shape=[None],
            ),
            *model_layers,
            keras.layers.GlobalMaxPool1D(),
            keras.layers.Dropout(dropout),
            keras.layers.Dense(n_categories, activation=keras.activations.softmax),
        ]
    )

In [0]:
model_wavenet_balanced = make_wavenet_model(
    n_categories=n_categories, embedding_size=64, conv_size=64, dropout=0.5
)

In [0]:
model_wavenet_balanced.compile(
    loss=keras.losses.sparse_categorical_crossentropy,
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    metrics=[
        keras.metrics.sparse_categorical_accuracy,
        sparse_top_2_categorical_accuracy,
    ],
)

In [176]:
history = model_wavenet_balanced.fit(
    train_dataset,
    steps_per_epoch=train_dataset_steps,
    validation_data=validation_dataset,
    validation_steps=validation_dataset_steps,
    epochs=5,
    callbacks=[
        keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.3),
        keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True),
    ],
    class_weight=class_weights,
)

Train for 1456 steps, validate for 70 steps
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [177]:
model_wavenet_balanced.evaluate(validation_dataset, steps=validation_dataset_steps)



[3.390112745761871, 0.55133927, 0.7705357]

In [178]:
print(
    classification_report(
        data_container.validation.y, model_wavenet_balanced.predict(validation_dataset).argmax(axis=1)
    )
)

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        82
           1       0.00      0.00      0.00       156
           2       0.00      0.00      0.00       279
           3       0.00      0.00      0.00       501
           4       0.55      1.00      0.71      1247

    accuracy                           0.55      2265
   macro avg       0.11      0.20      0.14      2265
weighted avg       0.30      0.55      0.39      2265





In [179]:
model_wavenet_balanced_f1 = f1_score(data_container.validation.y, model_wavenet_balanced.predict(validation_dataset).argmax(axis=1), average='macro')

In [0]:
print(f'F1_macro score: {model_wavenet_balanced_f1:.2%}')

## **SUMMARY**

In [0]:
summary_results = pd.DataFrame(columns = ["f1_macro avg"])

In [0]:
summary_results.loc["model_lstm_balanced_f1", 'f1_macro avg'] = model_lstm_balanced_f1
summary_results.loc["model_lstm_embedd_balanced_f1", 'f1_macro avg'] = model_lstm_embedd_balanced_f1
summary_results.loc["model_cnn_balanced_f1", 'f1_macro avg'] = model_cnn_balanced_f1
summary_results.loc["model_wavenet_balanced_f1", 'f1_macro avg'] = model_wavenet_balanced_f1

In [0]:
summary_results