## WordCNN
This model comes from a GitHub repository, where it was originally written in **TensorFlow**.
In this repo, I reimplement the architecture using **Keras**.

**Author:** Lenin G. Falconi

**Date:** May 2020.

**email:** enteatenea@gmail.com, lenin.falconi@epn.edu.ec


In [5]:
pip install wget




In [6]:
pip install nltk



In [7]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

In [8]:
import tensorflow as tf
import os
import numpy as np
from data_utils import *
from sklearn.model_selection import  train_test_split
print(tf.__version__)

2.2.0


The following constants are used throughout the project:

In [0]:
NUM_CLASS = 14
BATCH_SIZE = 64
NUM_EPOCHS = 10
WORD_MAX_LEN = 100
CHAR_MAX_LEN = 1014

## Downloading the dataset
The script `data_utils.py` provides helper functions to download the dataset and to build the word dictionary and the word-level dataset.

In [10]:
if not os.path.exists("dbpedia_csv"):
    print("Downloading dbpedia dataset...")
    download_dbpedia()
print("Creating dataset")
word_dict = build_word_dict()
vocabulary_size = len(word_dict)
x, y = build_word_dataset("train", word_dict, WORD_MAX_LEN)

train_x, valid_x, train_y, valid_y = train_test_split(x, y, test_size=0.15)
train_x = np.array(train_x)
valid_x = np.array(valid_x)
train_y = np.array(train_y)
valid_y = np.array(valid_y)

print("train and valid datasets created ...")
print("train x: {}, x[0]: {}, type:{}".format(train_x.shape, train_x[0], type(train_x[0])))
print("valid x: {}".format(np.shape(valid_x)))
print("train y: {}".format(np.shape(train_y)))

Downloading dbpedia dataset...
Creating dataset
train and valid datasets created ...
train x: (476000, 100), x[0]: [ 14822 256490      7      6    109      5      3     64  14822      4
      3   2116     36    201 256490     11    402      4    718     19
   9659   2029      4     69  18786   1652      2      0      0      0
      0      0      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0      0      0], type:<class 'numpy.ndarray'>
valid x: (84000, 100)
train y: (476000,)
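
The printed sample shows a row of word ids followed by zeros up to a length of 100. The internals of `data_utils.py` are not shown here, so the following is only a minimal sketch of how such fixed-length rows can be produced. The function name `pad_or_truncate` and the choice of `0` as the padding id are assumptions inferred from the output above, not the script's actual code.

```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Return exactly max_len ids: cut long inputs, right-pad short ones.

    Hypothetical helper; pad_id=0 is inferred from the trailing zeros in the
    printed sample, not taken from data_utils.py.
    """
    clipped = list(token_ids[:max_len])
    return clipped + [pad_id] * (max_len - len(clipped))


# A short document is padded on the right, a long one is cut.
short_row = pad_or_truncate([14822, 256490, 7, 6], max_len=10)
long_row = pad_or_truncate(list(range(1, 21)), max_len=10)
print(short_row)
print(long_row)
```

Fixed-length rows are what allow the whole dataset to be stored as a single 2-D array such as the `(476000, 100)` one above.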


## Declaring the ConvNet Model
Using the Keras Functional API, this section implements the
WordCNN model as a function.

In [0]:
def word_cnn_model_create(embedding_size=128,
                          num_filters=100,
                          filter_sizes=(3, 4, 5),
                          num_classes=14,
                          document_max_len=100):
    """Build the WordCNN classifier with the Keras Functional API.

    Relies on the global `vocabulary_size` computed when the dataset was built.
    """
    # Input: one row of `document_max_len` word ids per document.
    x = tf.keras.Input(shape=(document_max_len, ))
    embeddings = tf.keras.layers.Embedding(input_dim=vocabulary_size,
                                           output_dim=embedding_size,
                                           input_length=document_max_len,
                                           embeddings_initializer='uniform')(x)
    # Add a channel axis so Conv2D can treat each document as a 1-channel image.
    x_emb = tf.keras.layers.Reshape((document_max_len, embedding_size, 1))(embeddings)
    pooled_outputs = []
    for filter_size in filter_sizes:
        # Each kernel spans `filter_size` consecutive words and the full embedding width.
        conv = tf.keras.layers.Conv2D(filters=num_filters,
                                      kernel_size=(filter_size, embedding_size),
                                      strides=(1, 1),
                                      padding="valid",
                                      activation="relu")(x_emb)
        # Max-over-time pooling: keep the strongest response of each filter.
        pool = tf.keras.layers.MaxPooling2D(pool_size=(document_max_len - filter_size + 1, 1),
                                            strides=(1, 1),
                                            padding='valid')(conv)
        pooled_outputs.append(pool)

    # Merge the branches, flatten, regularize with dropout, and classify.
    h_pool = tf.keras.layers.concatenate(pooled_outputs)
    h_pool_flat = tf.keras.layers.Flatten()(h_pool)
    h_drop = tf.keras.layers.Dropout(rate=0.5)(h_pool_flat)
    output = tf.keras.layers.Dense(units=num_classes, activation="softmax")(h_drop)

    model = tf.keras.Model(inputs=x, outputs=output)
    return model
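
To make the role of each convolutional branch concrete, here is a minimal pure-Python sketch of what a single filter computes: a sliding window over consecutive word embeddings, a ReLU, and then a maximum over all window positions. The helper name `conv_relu_maxpool` and the toy numbers are illustrative assumptions; the real layers do the same thing for 100 filters per filter size, with learned weights.

```python
def conv_relu_maxpool(embedded, kernel, bias=0.0):
    """Apply one filter to a (num_words x embedding_size) matrix.

    The kernel covers len(kernel) consecutive words and the full embedding
    width. Returns the ReLU activations per window and their maximum.
    """
    window = len(kernel)
    activations = []
    # 'valid' padding with stride 1: num_words - window + 1 positions.
    for start in range(len(embedded) - window + 1):
        total = bias
        for word_vec, kernel_row in zip(embedded[start:start + window], kernel):
            total += sum(e * k for e, k in zip(word_vec, kernel_row))
        activations.append(max(0.0, total))  # ReLU
    return activations, max(activations)     # max-over-time pooling


# Toy document: 4 words, embedding size 2; one filter of size 3.
embedded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, -1.0], [0.5, 0.5], [1.0, 0.0]]
activations, pooled = conv_relu_maxpool(embedded, kernel)
print(activations, pooled)
```

Because only the maximum survives, each filter contributes a single number per document regardless of where in the text its pattern appears.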

The function defined above builds the CNN model. The next cell sets the hyperparameters
the model needs, compiles it, and prints its structure.

In [12]:
embedding_size = 128
num_filters = 100
filter_sizes = [3, 4, 5]
num_class = NUM_CLASS  # reuse the project constant defined earlier (14)
wordCNNModel = word_cnn_model_create(embedding_size=embedding_size,
                                     num_filters=num_filters,
                                     num_classes=num_class,
                                     filter_sizes=filter_sizes,
                                     document_max_len=WORD_MAX_LEN
                                     )
# Labels are integer class ids, hence the sparse categorical cross-entropy loss.
wordCNNModel.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                     loss=tf.keras.losses.sparse_categorical_crossentropy,
                     metrics=['acc'])

wordCNNModel.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 100)]        0                                            
__________________________________________________________________________________________________
embedding (Embedding)           (None, 100, 128)     72109312    input_1[0][0]                    
__________________________________________________________________________________________________
reshape (Reshape)               (None, 100, 128, 1)  0           embedding[0][0]                  
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 98, 1, 100)   38500       reshape[0][0]                    
______________________________________________________________________________________________
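
The numbers in the visible part of the summary can be checked by hand. The sketch below is only an arithmetic cross-check; the helper names are hypothetical. The rows for filter sizes 4 and 5 are cut off in the output above, so the values printed for them come from the formula, not from the notebook.

```python
def conv2d_param_count(filter_size, embedding_size, num_filters, in_channels=1):
    """Kernel weights (filter_size x embedding_size x in_channels) per filter, plus one bias each."""
    return filter_size * embedding_size * in_channels * num_filters + num_filters


def valid_conv_length(document_max_len, filter_size):
    """Output length along the word axis with 'valid' padding and stride 1."""
    return document_max_len - filter_size + 1


embedding_size = 128
num_filters = 100
document_max_len = 100

# Embedding parameters = vocabulary_size * embedding_size, so the summary's
# 72109312 parameters imply this vocabulary size:
implied_vocabulary_size = 72109312 // embedding_size
print("implied vocabulary size:", implied_vocabulary_size)

for k in (3, 4, 5):
    print(k, valid_conv_length(document_max_len, k),
          conv2d_param_count(k, embedding_size, num_filters))
```

For filter size 3 this gives an output length of 98 and 38500 parameters, matching the `conv2d` row of the summary.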

Training starts with a call to the `fit` method.

In [13]:
%%time
# %%time runs the cell once; %%timeit would repeat the whole training run.
print("training started")
wordCNNModel.fit(x=train_x,
                 y=train_y,
                 batch_size=BATCH_SIZE,
                 epochs=NUM_EPOCHS,
                 verbose=1)
print("training finished")

training started
Epoch 1/10

KeyboardInterrupt: ignored
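
The run above was interrupted during the first epoch. To get a rough idea of the size of the job, the number of batches can be estimated from the shapes printed earlier. This sketch assumes that `fit` processes every training sample once per epoch, with a final partial batch when the sample count is not a multiple of the batch size; the helper name is hypothetical.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Batches needed to see every sample once (the last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


train_samples = 476000  # from the printed shape of train_x
batch_size = 64         # BATCH_SIZE
num_epochs = 10         # NUM_EPOCHS

per_epoch = steps_per_epoch(train_samples, batch_size)
total_steps = per_epoch * num_epochs
print(per_epoch, total_steps)
```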