<h1>Content<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Load" data-toc-modified-id="Load-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Load</a></span></li><li><span><a href="#Tokenizing" data-toc-modified-id="Tokenizing-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Tokenizing</a></span></li><li><span><a href="#Building-model" data-toc-modified-id="Building-model-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Building model</a></span></li><li><span><a href="#Training" data-toc-modified-id="Training-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Training</a></span></li><li><span><a href="#Conclusion" data-toc-modified-id="Conclusion-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Conclusion</a></span></li></ul></div>

## Load

In [1]:
import pandas as pd, numpy as np

In [2]:
train = pd.read_csv('data/train.csv', index_col='time')
test = pd.read_csv('data/test.csv', index_col='time')

In [3]:
# every column except the target is a feature
feature_cols = [col for col in train.columns if col != 'severity']
X_train, y_train = train[feature_cols], train['severity']
X_valid, y_valid = test[feature_cols], test['severity']

## Tokenizing

In [4]:
from keras.models import Model
from keras.layers import Dense, Embedding, Input
from keras.layers import Conv1D, MaxPooling1D, GlobalMaxPool1D, Dropout, concatenate
from keras.preprocessing import text as keras_text, sequence as keras_seq
from keras.callbacks import EarlyStopping, ModelCheckpoint

Using TensorFlow backend.


In [5]:
# define network parameters
max_features = 100
maxlen = 200

In [6]:
# character-level tokenizer: each character of a message becomes one token
tokenizer = keras_text.Tokenizer(num_words=10000, filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~',
                                 lower=True, split=' ', char_level=True, oov_token=None)

tokenizer.fit_on_texts(list(X_train['message_encoding']))
# train data
list_tokenized_train = tokenizer.texts_to_sequences(X_train['message_encoding'])
X_t = keras_seq.pad_sequences(list_tokenized_train, maxlen=maxlen)

# test data
list_tokenized_test = tokenizer.texts_to_sequences(X_valid['message_encoding'])
X_te = keras_seq.pad_sequences(list_tokenized_test, maxlen=maxlen)
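
To make the tokenizing step concrete, here is a minimal pure-Python sketch of what character-level indexing and padding do. It is not the Keras implementation: the helper names `fit_char_index`, `encode` and `pad_left` are hypothetical, ties between equally frequent characters are broken alphabetically (an assumption of this sketch), and the padding mirrors the `pad_sequences` defaults of padding and truncating at the front.

```python
from collections import Counter


def fit_char_index(texts):
    """Map each character to an integer; the most frequent character gets 1."""
    counts = Counter(ch for text in texts for ch in text.lower())
    # index 0 is reserved for padding, so real characters start at 1
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return {ch: i + 1 for i, ch in enumerate(ordered)}


def encode(text, char_index):
    """Turn a string into a list of character indices, skipping unknown characters."""
    return [char_index[ch] for ch in text.lower() if ch in char_index]


def pad_left(seq, maxlen):
    """Keep the last `maxlen` items and left-pad with zeros up to `maxlen`."""
    seq = seq[-maxlen:]
    return [0] * (maxlen - len(seq)) + seq


char_index = fit_char_index(["abca", "ab"])   # a: 3 times, b: 2 times, c: once
padded = pad_left(encode("cab", char_index), maxlen=5)
truncated = pad_left([1, 2, 3, 4], maxlen=2)
```

With `maxlen = 200` as above, every message ends up as a row of exactly 200 integers, which is why `X_t` and `X_te` can be fed to the network as dense arrays.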

In [7]:
from keras import backend as K

def precision(y_true, y_pred):
    """Precision metric.

    Only computes a batch-wise average of precision.

    Computes the precision, a metric for multi-label classification of
    how many selected items are relevant.
    """
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def recall(y_true, y_pred):
    """Recall metric.

    Only computes a batch-wise average of recall.

    Computes the recall, a metric for multi-label classification of
    how many relevant items are selected.
    """
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def f1(y_true, y_pred):
    """F1 metric.

    Only computes a batch-wise average of F1, the harmonic mean of the
    precision and recall metrics defined above.
    """
    p = precision(y_true, y_pred)
    r = recall(y_true, y_pred)
    # K.epsilon() keeps the result finite when precision and recall are both 0
    return 2 * ((p * r) / (p + r + K.epsilon()))
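
To see how these metrics behave, the same arithmetic can be sketched in NumPy on a tiny batch. This is only an illustration: `batch_prf` is a hypothetical helper, `EPS` is assumed to match the default value of `K.epsilon()`, the epsilon in the F1 denominator is a guard added in this sketch, and the labels are made up.

```python
import numpy as np

EPS = 1e-7  # assumed to match the default K.epsilon()


def batch_prf(y_true, y_pred):
    """Precision, recall and F1 for one batch, following the metric code above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    true_positives = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    predicted_positives = np.sum(np.round(np.clip(y_pred, 0, 1)))
    possible_positives = np.sum(np.round(np.clip(y_true, 0, 1)))
    p = true_positives / (predicted_positives + EPS)
    r = true_positives / (possible_positives + EPS)
    f1_score = 2 * p * r / (p + r + EPS)
    return p, r, f1_score


# made-up batch: three positives, one negative, and a model that always outputs 1
y_true_demo = [1, 1, 1, 0]
always_one = [1.0, 1.0, 1.0, 1.0]
p_demo, r_demo, f1_demo = batch_prf(y_true_demo, always_one)
```

A model that outputs 1 for every sample gets a recall of 1.0 and a precision equal to the share of positives in the batch, so a high F1 on imbalanced data does not by itself show that the model learned anything.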

## Building model

In [8]:
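def build_model(conv_layers = 1,  # currently unused: each branch below hard-codes two Conv1D layers
                dilation_rates = (2, 4),  # tuple default avoids a shared mutable list
                embed_size = 32):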
    
    inp = Input(shape=(None, ))
    x = Embedding(input_dim = len(tokenizer.word_counts)+1, 
                  output_dim = embed_size)(inp)
    prefilt_x = Dropout(0.25)(x)
    out_conv = []
    # the dilation rate lets each branch cover n-grams and skip-grams of a different width
    for dilation_rate in dilation_rates:
        x = prefilt_x
        for i in range(2):
            if dilation_rate>0:
                x = Conv1D(16*2**(i), 
                           kernel_size = 3, 
                           dilation_rate = dilation_rate,
                          activation = 'relu',
                          name = 'ngram_{}_cnn_{}'.format(dilation_rate, i)
                          )(x)
            else:
                x = Conv1D(16*2**(i), 
                           kernel_size = 1,
                          activation = 'relu',
                          name = 'word_fcl_{}'.format(i))(x)
        out_conv += [Dropout(0.25)(GlobalMaxPool1D()(x))]
    x = concatenate(out_conv, axis = -1)    
    x = Dense(64, activation='relu')(x)
    x = Dropout(0.2)(x)
    x = Dense(32, activation='relu')(x)
    x = Dropout(0.2)(x)
    x = Dense(1, activation='sigmoid')(x)
    model = Model(inputs=inp, outputs=x)
    model.compile(loss='kullback_leibler_divergence',
                  optimizer='adam',
                  metrics=[precision, recall, f1])
    return model
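
As a rough illustration of what the two dilation rates buy us, the sketch below computes how many input characters one output position of each branch can see, and how long the sequence is after two stacked `Conv1D` layers with `kernel_size = 3` and the default `'valid'` padding. The helper names `receptive_field` and `valid_output_length` are hypothetical, and the input length of 200 is the `maxlen` used for padding.

```python
def receptive_field(kernel_size, dilation_rate, n_layers):
    """Number of input positions seen by one output of stacked dilated convolutions."""
    return 1 + n_layers * (kernel_size - 1) * dilation_rate


def valid_output_length(length, kernel_size, dilation_rate, n_layers):
    """Sequence length left after stacked convolutions without padding."""
    return length - n_layers * (kernel_size - 1) * dilation_rate


for rate in (2, 4):
    span = receptive_field(kernel_size=3, dilation_rate=rate, n_layers=2)
    steps = valid_output_length(200, kernel_size=3, dilation_rate=rate, n_layers=2)
    print("dilation", rate, "sees", span, "characters; output length", steps)
```

The branch with dilation rate 2 looks at windows of 9 characters and the branch with rate 4 at windows of 17, and `GlobalMaxPool1D` then keeps the strongest response of each filter regardless of position.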

In [9]:
model = build_model()
model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, None)         0                                            
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, None, 32)     5152        input_1[0][0]                    
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, None, 32)     0           embedding_1[0][0]                
__________________________________________________________________________________________________
ngram_2_cnn_0 (Conv1D)          (None, None, 16)     1552        dropout_1[0][0]                  
__________________________________________________________________________________________________
ngram_4_cn
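
The parameter counts in the visible part of the summary can be checked by hand. The sketch below uses the standard formulas for an embedding table and a 1D convolution with bias; the helper names are hypothetical.

```python
def embedding_params(input_dim, output_dim):
    """One vector of size output_dim per index."""
    return input_dim * output_dim


def conv1d_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


embed_size = 32
# 5152 embedding parameters divided by the embedding size gives input_dim
input_dim = 5152 // embed_size
first_conv = conv1d_params(kernel_size=3, in_channels=embed_size, filters=16)
```

Since `input_dim` is `len(tokenizer.word_counts) + 1`, the value 161 implies that the tokenizer saw 160 distinct characters, and the first convolution accounts for the 1552 parameters listed for `ngram_2_cnn_0`.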

## Training

Unfortunately, my GPU does not have enough memory to one-hot encode the target, so I try to train the model on all classes at once, using the `severity` column as it is.

In [10]:
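def train(X, y, batch_size=16, epochs=100, name='Test'):
    """Fit the global `model` and return it with its best weights restored.

    Note: this definition rebinds the name `train`, which held the training
    DataFrame loaded earlier.
    """
    file_path = name + "best_weights.h5"
    # keep only the weights of the epoch with the highest validation F1
    checkpoint = ModelCheckpoint(file_path, monitor='val_f1',
                                 verbose=1, save_best_only=True, mode='max')
    # stop once val_f1 has not improved for 30 consecutive epochs
    early = EarlyStopping(monitor="val_f1", mode="max", patience=30)

    callbacks_list = [checkpoint, early]
    model.fit(X, y,
              validation_split=0.2,  # the last 20% of X is held out for validation
              batch_size=batch_size,
              epochs=epochs,
              callbacks=callbacks_list)
    model.load_weights(file_path)
    return model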
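
The early-stopping rule used here can be sketched in plain Python to show when training will end. This is a simplified model of the callback's patience logic, not the Keras source; `early_stop_epoch` is a hypothetical helper, and every score after the first is a made-up placeholder.

```python
def early_stop_epoch(scores, patience):
    """Return (last epoch run, best score) for a metric that should be maximised."""
    best = float("-inf")
    wait = 0
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best, wait = score, 0     # improvement: remember it and reset the counter
        else:
            wait += 1                 # no improvement this epoch
            if wait >= patience:
                return epoch, best    # patience exhausted, stop here
    return len(scores), best


# first value taken from a run that peaks in epoch 1; the rest are placeholders
scores = [0.97041] + [0.9] * 99
last_epoch, best_score = early_stop_epoch(scores, patience=30)
```

If the best validation score is reached in the first epoch, `patience=30` means that training runs 30 more epochs and stops after epoch 31, and the checkpoint that gets reloaded is the one saved in epoch 1.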

In [11]:
keras_m = train(X_t, y_train, epochs=100, name='wieghts/second')

Train on 8291 samples, validate on 2073 samples
Epoch 1/100

Epoch 00001: val_f1 improved from -inf to 0.97041, saving model to wieghts/secondbest_weights.h5
Epoch 2/100

Epoch 00002: val_f1 did not improve
Epoch 3/100

Epoch 00003: val_f1 did not improve
Epoch 4/100

Epoch 00004: val_f1 did not improve
Epoch 5/100

Epoch 00005: val_f1 did not improve
Epoch 6/100

Epoch 00006: val_f1 did not improve
Epoch 7/100

Epoch 00007: val_f1 did not improve
Epoch 8/100

Epoch 00008: val_f1 did not improve
Epoch 9/100

Epoch 00009: val_f1 did not improve
Epoch 10/100

Epoch 00010: val_f1 did not improve
Epoch 11/100

Epoch 00011: val_f1 did not improve
Epoch 12/100

Epoch 00012: val_f1 did not improve
Epoch 13/100

Epoch 00013: val_f1 did not improve
Epoch 14/100

Epoch 00014: val_f1 did not improve
Epoch 15/100

Epoch 00015: val_f1 did not improve
Epoch 16/100

Epoch 00016: val_f1 did not improve
Epoch 17/100

Epoch 00017: val_f1 did not improve
Epoch 18/100

Epoch 00018: val_f1 did not improve



Epoch 00031: val_f1 did not improve


In [12]:
keras_m.evaluate(X_te,y_valid)



[4.7640592121629663e-07, 0.93058568329718, 1.0, 0.9629979288759045]
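
`evaluate` returns the loss followed by the metrics in the order they were passed to `compile`, so the list above reads as loss, precision, recall, F1. A quick check, using only the numbers printed above, compares the reported F1 with the harmonic mean of the reported precision and recall (the variable names are mine):

```python
loss, precision_value, recall_value, f1_value = [
    4.7640592121629663e-07, 0.93058568329718, 1.0, 0.9629979288759045]

# F1 recomputed from the averaged precision and recall
harmonic_mean = 2 * precision_value * recall_value / (precision_value + recall_value)
# difference between the recomputed value and the one Keras reported
gap = harmonic_mean - f1_value
print(round(harmonic_mean, 4), round(gap, 4))
```

The recomputed value is about 0.964, slightly above the reported 0.963. The two differ because each metric is averaged batch by batch, and an average of per-batch F1 scores is not the same as the F1 of averaged precision and recall.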

## Conclusion

It seems that we have our best score so far: an F1-score of 0.9630 on the test set. I don't trust it, though: recall is exactly 1.0, and the validation F1 never improved after the first epoch. Let's try another approach in another notebook. I also tried one-hot encoding the target variable, but the kernel dies because of the GPU memory limit.