# Answer Sentence Reranking

In this notebook we reimplement the convolutional network whose variants are used in:

- **Learning to rank short text pairs with convolutional deep neural networks (Severyn and Moschitti 2015)**
- **Modeling relational information in question-answer pairs with convolutional neural networks. (Severyn and Moschitti 2016)**
- **Convolutional Neural Networks vs. Convolution Kernels: Feature Engineering for Answer Sentence Reranking (Tymoshenko, Bonadiman and Moschitti 2016)**
- **Learning to Rank Non-Factoid Answers: Comment Selection in Web Forums (Tymoshenko, Bonadiman and Moschitti 2016)**

![alt text](images/qa.png)

The data for this notebook can be found here: https://drive.google.com/open?id=0B8xjf4y9r8jCR2FHSGF5NUgtTDA


In [None]:
import pandas as pd
from nltk.tokenize import word_tokenize

We load the WikiQA train, dev and test splits into pandas DataFrames.

In [4]:
train = pd.read_csv('data/WikiQA-train.tsv', sep='\t')
dev = pd.read_csv('data/WikiQA-dev.tsv', sep='\t')
test = pd.read_csv('data/WikiQA-test.tsv', sep='\t')
train

Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0
5,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-0,"In physics , circular motion is a movement of ...",0
6,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-1,"It can be uniform, with constant angular rate ...",0
7,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-2,The rotation around a fixed axis of a three-di...,0
8,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-3,The equations of motion describe the movement ...,0
9,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-4,Examples of circular motion include: an artifi...,0


Sentences are stripped of non-ASCII characters, lowercased and tokenized.

In [5]:
def preprocess(sent):
    return word_tokenize(sent.decode('utf-8').encode("ascii","ignore").lower())

train['Question_tok'] = train['Question'].map(preprocess)
train['Sentence_tok'] = train['Sentence'].map(preprocess)
dev['Question_tok'] = dev['Question'].map(preprocess)
dev['Sentence_tok'] = dev['Sentence'].map(preprocess)
test['Question_tok'] = test['Question'].map(preprocess)
test['Sentence_tok'] = test['Sentence'].map(preprocess)
train[0:5]

Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0,"[how, are, glacier, caves, formed, ?]","[a, partly, submerged, glacier, cave, on, peri..."
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0,"[how, are, glacier, caves, formed, ?]","[the, ice, facade, is, approximately, 60, m, h..."
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0,"[how, are, glacier, caves, formed, ?]","[ice, formations, in, the, titlis, glacier, cave]"
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1,"[how, are, glacier, caves, formed, ?]","[a, glacier, cave, is, a, cave, formed, within..."
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0,"[how, are, glacier, caves, formed, ?]","[glacier, caves, are, often, called, ice, cave..."
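
The `preprocess` cell above relies on Python 2 byte strings (`sent.decode(...)`). A minimal sketch of the same steps under Python 3 follows, assuming each sentence is already a `str`. The regex tokenizer `simple_tokenize` is a hypothetical stand-in that lets the snippet run without NLTK data; in the notebook it would be `word_tokenize`, whose output can differ on contractions and other edge cases.

```python
import re

def simple_tokenize(text):
    # Hypothetical stand-in for nltk's word_tokenize:
    # runs of word characters, or single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)

def preprocess_py3(sent):
    # Drop non-ASCII characters, lowercase, then tokenize.
    ascii_only = sent.encode("ascii", "ignore").decode("ascii")
    return simple_tokenize(ascii_only.lower())

tokens = preprocess_py3("How are glacier caves formed?")
print(tokens)
```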


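Load the word embeddings and build the dictionary for our dataset. Indexes 0 and 1 are reserved for `PAD` and `UNK`; only tokens that have a pretrained embedding receive their own index.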

In [6]:
from itertools import chain
from gensim.models import Word2Vec

w2v = Word2Vec.load_word2vec_format('data/aquaint+wiki.txt.gz.ndim=50.bin', binary=True)
dictionary = {'PAD':0, 'UNK':1}

toks = (set(chain.from_iterable(train['Question_tok'])) | set(chain.from_iterable(train['Sentence_tok'])) | \
       set(chain.from_iterable(dev['Question_tok'])) | set(chain.from_iterable(dev['Sentence_tok']))     | \
       set(chain.from_iterable(test['Question_tok'])) | set(chain.from_iterable(test['Sentence_tok'])))

i = 2
for _, tok in enumerate(toks):
    if tok in w2v:
        dictionary[tok] = i
        i+=1
len(dictionary)

37662
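
The dictionary cell iterates over a `set`, whose order may change between Python processes, so the ids are not guaranteed to be identical on a rerun. The sketch below shows one way to make the assignment reproducible by sorting the tokens first. The names `build_dictionary`, `has_embedding` and `toy_vectors` are hypothetical and not part of the notebook.

```python
def build_dictionary(token_lists, has_embedding):
    # Ids 0 and 1 are reserved for padding and unknown words.
    dictionary = {"PAD": 0, "UNK": 1}
    vocabulary = set(tok for toks in token_lists for tok in toks)
    # sorted() keeps the id assignment stable across runs.
    for tok in sorted(vocabulary):
        if has_embedding(tok):
            dictionary[tok] = len(dictionary)
    return dictionary

def word2id(sent, dictionary):
    # Tokens without an id fall back to UNK.
    return [dictionary.get(tok, dictionary["UNK"]) for tok in sent]

# Toy stand-in for the word2vec model (hypothetical values).
toy_vectors = {"glacier": [0.1], "cave": [0.2], "ice": [0.3]}
sents = [["glacier", "cave"], ["ice", "titlis", "cave"]]

d = build_dictionary(sents, lambda tok: tok in toy_vectors)
ids = word2id(["ice", "titlis", "cave"], d)
print(d, ids)
```

A stable mapping matters when a saved model such as `qa.h5` is reloaded in a later session, because the embedding rows are tied to these ids.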

Map each token to its index; tokens that are not in the dictionary map to `UNK` (index 1).

In [7]:
def word2id(sent):
    return map(lambda x: dictionary.get(x, 1), sent)

train['Question_'] = train['Question_tok'].map(word2id)
train['Sentence_'] = train['Sentence_tok'].map(word2id)
dev['Question_'] = dev['Question_tok'].map(word2id)
dev['Sentence_'] = dev['Sentence_tok'].map(word2id)
test['Question_'] = test['Question_tok'].map(word2id)
test['Sentence_'] = test['Sentence_tok'].map(word2id)
train[0:5]

Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0,"[how, are, glacier, caves, formed, ?]","[a, partly, submerged, glacier, cave, on, peri...","[28162, 36754, 19112, 30738, 980, 23143]","[31708, 13078, 8603, 19112, 3713, 17358, 35615..."
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0,"[how, are, glacier, caves, formed, ?]","[the, ice, facade, is, approximately, 60, m, h...","[28162, 36754, 19112, 30738, 980, 23143]","[11943, 16722, 24896, 14267, 2527, 26639, 2356..."
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0,"[how, are, glacier, caves, formed, ?]","[ice, formations, in, the, titlis, glacier, cave]","[28162, 36754, 19112, 30738, 980, 23143]","[16722, 1897, 14278, 11943, 36141, 19112, 3713]"
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1,"[how, are, glacier, caves, formed, ?]","[a, glacier, cave, is, a, cave, formed, within...","[28162, 36754, 19112, 30738, 980, 23143]","[31708, 19112, 3713, 14267, 31708, 3713, 980, ..."
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0,"[how, are, glacier, caves, formed, ?]","[glacier, caves, are, often, called, ice, cave...","[28162, 36754, 19112, 30738, 980, 23143]","[19112, 30738, 36754, 34421, 37000, 16722, 307..."


In [8]:
max(len(sent) for sent in train['Question_'])

23

In [9]:
max(len(sent) for sent in train['Sentence_'])

320

Pad (or truncate) sentences to a fixed length

In [10]:
import numpy as np
from keras.preprocessing.sequence import pad_sequences

max_len_q = 40
max_len_a = 40

train['Question_'] = train['Question_'].apply(lambda s: pad_sequences([s], max_len_q)[0])
train['Sentence_'] = train['Sentence_'].apply(lambda s: pad_sequences([s], max_len_a)[0])
dev['Question_'] = dev['Question_'].apply(lambda s: pad_sequences([s], max_len_q)[0])
dev['Sentence_'] = dev['Sentence_'].apply(lambda s: pad_sequences([s], max_len_a)[0])
test['Question_'] = test['Question_'].apply(lambda s: pad_sequences([s], max_len_q)[0])
test['Sentence_'] = test['Sentence_'].apply(lambda s: pad_sequences([s], max_len_a)[0])
train[0:5]

Using TensorFlow backend.


Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0,"[how, are, glacier, caves, formed, ?]","[a, partly, submerged, glacier, cave, on, peri...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0,"[how, are, glacier, caves, formed, ?]","[the, ice, facade, is, approximately, 60, m, h...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0,"[how, are, glacier, caves, formed, ?]","[ice, formations, in, the, titlis, glacier, cave]","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1,"[how, are, glacier, caves, formed, ?]","[a, glacier, cave, is, a, cave, formed, within...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0,"[how, are, glacier, caves, formed, ?]","[glacier, caves, are, often, called, ice, cave...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
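
`pad_sequences` is called with its default settings here, which pad and truncate at the front of the sequence. Since the longest training answer has 320 tokens and `max_len_a` is 40, long answers keep only their last 40 ids. A minimal NumPy sketch of that behaviour, with the hypothetical helper name `pad_pre`:

```python
import numpy as np

def pad_pre(seq, maxlen, value=0):
    # Keep only the last `maxlen` ids (truncate from the front) ...
    trunc = list(seq)[-maxlen:]
    # ... and left-pad with `value` up to `maxlen`.
    out = np.full(maxlen, value, dtype="int32")
    if trunc:
        out[maxlen - len(trunc):] = trunc
    return out

short = pad_pre([7, 8, 9], 5)
long_ = pad_pre([1, 2, 3, 4, 5, 6, 7], 5)
print(short, long_)
```

If the beginning of an answer is considered more informative than its end, passing `truncating='post'` to `pad_sequences` would keep the first tokens instead.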


Prepare the embedding matrix for model initialization

In [11]:
def emb_matrix(dictionary, model):
    embedding_matrix = np.zeros((len(dictionary), 50))
    for word in dictionary:
        if word in model:
            embedding_matrix[dictionary[word]] = model[word]
    return embedding_matrix
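
As a small illustration of what `emb_matrix` produces, the sketch below runs the same logic on a toy dictionary. The helper name `emb_matrix_sketch`, the `dim` parameter and the toy vectors are assumptions made for the example. Rows without a pretrained vector, here `PAD` and `UNK`, stay at zero until training updates them.

```python
import numpy as np

def emb_matrix_sketch(dictionary, vectors, dim):
    # One row per dictionary entry, initialised to zero.
    matrix = np.zeros((len(dictionary), dim))
    for word, idx in dictionary.items():
        if word in vectors:
            matrix[idx] = vectors[word]
    return matrix

toy_dictionary = {"PAD": 0, "UNK": 1, "glacier": 2, "cave": 3}
toy_vectors = {"glacier": np.array([0.5, -0.5]),
               "cave": np.array([1.0, 2.0])}

m = emb_matrix_sketch(toy_dictionary, toy_vectors, dim=2)
print(m)
```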

This is the simplest version of the model: the two sentence embeddings (each created via convolution + pooling) are concatenated and classified with an MLP.

In [12]:
import numpy as np
np.random.seed(42)

from keras.models import Model, Sequential
from keras.layers import (Input,
                          Embedding,
                          Convolution1D,
                          Dropout,
                          SpatialDropout1D,
                          GlobalMaxPooling1D,
                          GlobalAveragePooling1D,
                          concatenate,
                          Dense)

from keras.optimizers import Adam


que = Input(shape=(max_len_q,))
ans = Input(shape=(max_len_a,))



que_model = Sequential()
que_model.add(Embedding(len(dictionary), 50 ,input_length=max_len_q, weights=[emb_matrix(dictionary, w2v)], trainable=True))
que_model.add(Convolution1D(100, 5, activation='tanh'))
que_model.add(GlobalAveragePooling1D())


ans_model = Sequential()
ans_model.add(Embedding(len(dictionary), 50,input_length=max_len_a, weights=[emb_matrix(dictionary, w2v)], trainable=True))
ans_model.add(Convolution1D(100, 5, activation='tanh'))
ans_model.add(GlobalAveragePooling1D())

que_emb = que_model(que)
ans_emb = ans_model(ans)

join = concatenate([que_emb, ans_emb])

classify = Sequential()
classify.add(Dense(100, activation='tanh', input_dim=200))
classify.add(Dense(1, activation='sigmoid'))
out = classify(join)

model = Model(inputs=[que, ans], outputs=[out])
model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_1 (InputLayer)             (None, 40)            0                                            
____________________________________________________________________________________________________
input_2 (InputLayer)             (None, 40)            0                                            
____________________________________________________________________________________________________
sequential_1 (Sequential)        (None, 100)           1908200     input_1[0][0]                    
____________________________________________________________________________________________________
sequential_2 (Sequential)        (None, 100)           1908200     input_2[0][0]                    
___________________________________________________________________________________________
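
The parameter count reported for each sentence branch can be checked by hand. The arithmetic below assumes the Keras defaults for `Convolution1D` (a bias term and `'valid'` padding), which the cell does not override.

```python
vocab_size, emb_dim = 37662, 50      # len(dictionary), embedding size
n_filters, kernel_width = 100, 5     # Convolution1D(100, 5)
max_len = 40                         # max_len_q / max_len_a

embedding_params = vocab_size * emb_dim
# One kernel_width x emb_dim kernel per filter, plus one bias per filter.
conv_params = kernel_width * emb_dim * n_filters + n_filters
branch_params = embedding_params + conv_params

# With 'valid' padding the convolution yields this many positions,
# which global pooling then reduces to a single 100-dimensional vector.
conv_steps = max_len - kernel_width + 1

print(branch_params, conv_steps)  # 1908200 36
```

Almost all of the parameters sit in the embedding table, which is trained here because `trainable=True`.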

In [13]:
model.compile(loss='binary_crossentropy',
              optimizer=Adam(0.0001),
              metrics=['accuracy'])

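In this setting accuracy is not an informative metric, because the task is evaluated as ranking (MAP) rather than as classification. For this reason we create a custom Callback that computes MAP on the dev set after every epoch and uses it for early stopping and model saving.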

In [14]:
from keras.callbacks import Callback
from metrics import map_score_filtered
from sklearn.metrics import roc_auc_score

class EpochEval(Callback):

    def __init__(self, validation_data, evaluate,
                 patience=np.Inf, save_model=False, score_c=None):
        # Call the parent initialiser of EpochEval (i.e. Callback.__init__).
        super(EpochEval, self).__init__()

        (self.qids, self.aids), self.X, self.y = validation_data
        self.evaluate = evaluate
        self.best = -np.Inf
        self.patience = patience
        self.wait = 0
        self.waited = False
        self.save_model = save_model


    def on_epoch_end(self, epoch, logs={}):
        print
        prediction = self.model.predict(self.X)
        val = self.evaluate(self.qids, self.y, prediction)
        print("\t{0} = {1:.4f}".format(self.evaluate.__name__, val))
        if val*0.995 > self.best:
            self.model.save('qa.h5')
            print ('\tBest {0}: {1:.4f}'.format(self.evaluate.__name__, val))
            self.best = val
            self.wait = 0
            self.waited = False
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.model.stop_training = True
        print
        
def data(dataset):
    return (dataset['QuestionID'],dataset['SentenceID']), [np.vstack(dataset['Question_'].tolist()), np.vstack(dataset['Sentence_'].tolist())], np.vstack(dataset['Label'].tolist())
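
The `metrics` module is not shown in this notebook, so the following is only a sketch of what a MAP function with this call signature might compute. It assumes that the "filtered" variant skips questions that have no correct candidate, which is one plausible reading of the name and not a confirmed description of `map_score_filtered`. All function names below are hypothetical.

```python
from collections import defaultdict

def average_precision(labels, scores):
    # Rank candidates by descending score.
    ranked = [lab for _, lab in sorted(zip(scores, labels), key=lambda p: -p[0])]
    hits, total = 0, 0.0
    for rank, lab in enumerate(ranked, start=1):
        if lab == 1:
            hits += 1
            total += hits / float(rank)
    return total / hits if hits else 0.0

def mean_average_precision(qids, labels, scores, skip_unanswerable=True):
    # Group (label, score) pairs by question id.
    groups = defaultdict(list)
    for qid, lab, score in zip(qids, labels, scores):
        groups[qid].append((int(lab), float(score)))
    aps = []
    for pairs in groups.values():
        labs = [lab for lab, _ in pairs]
        if skip_unanswerable and sum(labs) == 0:
            continue  # assumption: no correct answer, so the question is skipped
        aps.append(average_precision(labs, [s for _, s in pairs]))
    return sum(aps) / len(aps) if aps else 0.0

qids = ["a", "a", "a", "b", "b"]
labels = [0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.1, 0.5, 0.4]
print(mean_average_precision(qids, labels, scores))         # 0.5
print(mean_average_precision(qids, labels, scores, False))  # 0.25
```

In the toy data question `b` has no correct answer, so counting it lowers the score from 0.5 to 0.25.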

In [15]:
model.fit([np.vstack(train['Question_'].tolist()), np.vstack(train['Sentence_'].tolist())],
          np.vstack(train['Label'].tolist()),
          batch_size=100,
          epochs=100000,
          shuffle=True,
          callbacks=[EpochEval(data(dev), map_score_filtered, patience=5)])

Epoch 1/100000
	map_score_filtered = 0.4057
	Best map_score_filtered: 0.4057

Epoch 2/100000
	map_score_filtered = 0.4952
	Best map_score_filtered: 0.4952

Epoch 3/100000
	map_score_filtered = 0.6167
	Best map_score_filtered: 0.6167

Epoch 4/100000
	map_score_filtered = 0.6316
	Best map_score_filtered: 0.6316

Epoch 5/100000
	map_score_filtered = 0.6309

Epoch 6/100000
	map_score_filtered = 0.6405
	Best map_score_filtered: 0.6405

Epoch 7/100000
	map_score_filtered = 0.6323

Epoch 8/100000
	map_score_filtered = 0.6260

Epoch 9/100000
	map_score_filtered = 0.6288

Epoch 10/100000
	map_score_filtered = 0.6390

Epoch 11/100000
	map_score_filtered = 0.6343



<keras.callbacks.History at 0x10fc3df10>
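
The stopping rule in `EpochEval` can be replayed on the dev scores logged above. A new best is accepted only when `val * 0.995` exceeds the previous best, which is roughly a 0.5% relative improvement, and training stops after `patience` epochs without one. The function name `replay_early_stopping` is hypothetical.

```python
def replay_early_stopping(scores, patience, tolerance=0.995):
    best, best_epoch, wait = float("-inf"), None, 0
    for epoch, val in enumerate(scores, start=1):
        if val * tolerance > best:
            # Clear improvement: this is where the model would be saved.
            best, best_epoch, wait = val, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch, best
    return len(scores), best_epoch, best

dev_map = [0.4057, 0.4952, 0.6167, 0.6316, 0.6309, 0.6405,
           0.6323, 0.6260, 0.6288, 0.6390, 0.6343]
stopped, best_epoch, best = replay_early_stopping(dev_map, patience=5)
print(stopped, best_epoch, best)  # 11 6 0.6405
```

This matches the log: the last "Best" line appears at epoch 6 and training ends after epoch 11.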

In [16]:
(qid,_ ), X, lab = data(test)

In [17]:
from keras.models import load_model
model = load_model('qa.h5')
pred = model.predict(X)

In [18]:
map_score_filtered(qid, lab, pred)

0.5855936209513423

In [19]:
from metrics import map_score

map_score(qid, lab, pred)

0.22401279785598976

In [20]:
test['pred'] = pd.Series(y for y in pred)

In [21]:
test[0:5]

Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred
0,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-0,African immigration to the United States refer...,0,"[how, african, americans, were, immigrated, to...","[african, immigration, to, the, united, states...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0164832]
1,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-1,The term African in the scope of this article ...,0,"[how, african, americans, were, immigrated, to...","[the, term, african, in, the, scope, of, this,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0142648]
2,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-2,From the Immigration and Nationality Act of 19...,0,"[how, african, americans, were, immigrated, to...","[from, the, immigration, and, nationality, act...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[25042, 36456, 5487, 10238, 17364, 6105, 19738...",[0.0487929]
3,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-3,African immigrants in the United States come f...,0,"[how, african, americans, were, immigrated, to...","[african, immigrants, in, the, united, states,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0122906]
4,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-4,"They include people from different national, l...",0,"[how, african, americans, were, immigrated, to...","[they, include, people, from, different, natio...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0113492]


In [22]:
(qid,_ ), X, lab = data(train)
pred = model.predict(X)
print(map_score_filtered(qid, lab, pred))
print(map_score(qid, lab, pred))
train['pred'] = pd.Series(y for y in pred)
train

0.646524449072
0.266305772126


Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0,"[how, are, glacier, caves, formed, ?]","[a, partly, submerged, glacier, cave, on, peri...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0429119]
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0,"[how, are, glacier, caves, formed, ?]","[the, ice, facade, is, approximately, 60, m, h...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0469051]
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0,"[how, are, glacier, caves, formed, ?]","[ice, formations, in, the, titlis, glacier, cave]","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0288006]
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1,"[how, are, glacier, caves, formed, ?]","[a, glacier, cave, is, a, cave, formed, within...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0868418]
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0,"[how, are, glacier, caves, formed, ?]","[glacier, caves, are, often, called, ice, cave...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0235198]
5,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-0,"In physics , circular motion is a movement of ...",0,"[how, are, the, directions, of, the, velocity,...","[in, physics, ,, circular, motion, is, a, move...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0295807]
6,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-1,"It can be uniform, with constant angular rate ...",0,"[how, are, the, directions, of, the, velocity,...","[it, can, be, uniform, ,, with, constant, angu...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 142...",[0.00999854]
7,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-2,The rotation around a fixed axis of a three-di...,0,"[how, are, the, directions, of, the, velocity,...","[the, rotation, around, a, fixed, axis, of, a,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0103329]
8,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-3,The equations of motion describe the movement ...,0,"[how, are, the, directions, of, the, velocity,...","[the, equations, of, motion, describe, the, mo...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.00984223]
9,Q2,How are the directions of the velocity and for...,D2,Circular motion,D2-4,Examples of circular motion include: an artifi...,0,"[how, are, the, directions, of, the, velocity,...","[examples, of, circular, motion, include, :, a...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[29915, 19738, 31708, 22471, 36456, 14267, 102...",[0.0498128]


The second version of the network adds one extra feature: the number of words shared by the question and the answer, stopwords excluded. This feature is very informative; on its own it provides a MAP of about 49.

In [24]:
from nltk.corpus import stopwords
stop = set(stopwords.words('english'))
def count_feat(que, ans):
    return len((set(que)&set(ans))-stop)

train['count'] = pd.Series(count_feat(que, ans) for que, ans in zip(train['Question_tok'], train['Sentence_tok']))
test['count'] = pd.Series(count_feat(que, ans) for que, ans in zip(test['Question_tok'], test['Sentence_tok']))
dev['count'] = pd.Series(count_feat(que, ans) for que, ans in zip(dev['Question_tok'], dev['Sentence_tok']))
train[0:5]

Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred,count
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0,"[how, are, glacier, caves, formed, ?]","[a, partly, submerged, glacier, cave, on, peri...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0429119],1
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0,"[how, are, glacier, caves, formed, ?]","[the, ice, facade, is, approximately, 60, m, h...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0469051],0
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0,"[how, are, glacier, caves, formed, ?]","[ice, formations, in, the, titlis, glacier, cave]","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0288006],1
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1,"[how, are, glacier, caves, formed, ?]","[a, glacier, cave, is, a, cave, formed, within...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0868418],2
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0,"[how, are, glacier, caves, formed, ?]","[glacier, caves, are, often, called, ice, cave...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0235198],2
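
A worked example of the count feature on the last row above. The stopword set `toy_stop` is a small hypothetical subset used so that the snippet runs without the NLTK corpus, and the answer is a shortened version of the real sentence.

```python
# Hypothetical stand-in for stopwords.words('english').
toy_stop = {"how", "are", "a", "is", "the", "but"}

def count_feat(que, ans, stop):
    # Distinct shared tokens, with stopwords removed.
    return len((set(que) & set(ans)) - stop)

que = ["how", "are", "glacier", "caves", "formed", "?"]
ans = ["glacier", "caves", "are", "often", "called", "ice", "caves"]

shared = set(que) & set(ans)
count = count_feat(que, ans, toy_stop)
print(sorted(shared), count)
```

The shared token `are` is a stopword, so only `glacier` and `caves` are counted. Because sets are used, a repeated word such as `caves` counts once.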


In [25]:
import numpy as np
np.random.seed(42)

from keras.models import Model, Sequential
from keras.layers import (Input,
                          Embedding,
                          Convolution1D,
                          Dropout,
                          SpatialDropout1D,
                          GlobalMaxPooling1D,
                          GlobalAveragePooling1D,
                          concatenate,
                          Dense)

from keras.optimizers import Adam


que = Input(shape=(max_len_q,))
ans = Input(shape=(max_len_a,))
cnt = Input(shape=(1,))


que_model = Sequential()
que_model.add(Embedding(len(dictionary), 50 ,input_length=max_len_q, weights=[emb_matrix(dictionary, w2v)], trainable=True))
que_model.add(Convolution1D(100, 5, activation='tanh', kernel_initializer='lecun_uniform'))
que_model.add(GlobalMaxPooling1D())


ans_model = Sequential()
ans_model.add(Embedding(len(dictionary), 50 ,input_length=max_len_a, weights=[emb_matrix(dictionary, w2v)], trainable=True))
ans_model.add(Convolution1D(100, 5, activation='tanh', kernel_initializer='lecun_uniform'))
ans_model.add(GlobalMaxPooling1D())

que_emb = que_model(que)
ans_emb = ans_model(ans)

join = concatenate([que_emb, ans_emb, cnt])

classify = Sequential()
classify.add(Dense(100, activation='tanh', input_dim=201))
classify.add(Dense(1, activation='sigmoid'))
out = classify(join)

model = Model(inputs=[que, ans, cnt], outputs=[out])
model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_3 (InputLayer)             (None, 40)            0                                            
____________________________________________________________________________________________________
input_4 (InputLayer)             (None, 40)            0                                            
____________________________________________________________________________________________________
sequential_7 (Sequential)        (None, 100)           1908200     input_3[0][0]                    
____________________________________________________________________________________________________
sequential_8 (Sequential)        (None, 100)           1908200     input_4[0][0]                    
___________________________________________________________________________________________

In [26]:
model.compile(loss='binary_crossentropy',
              optimizer=Adam(0.0001),
              metrics=['accuracy'])

In [27]:
def data(dataset):
    return (dataset['QuestionID'],dataset['SentenceID']), [np.vstack(dataset['Question_'].tolist()), np.vstack(dataset['Sentence_'].tolist()), dataset['count'].as_matrix()], np.vstack(dataset['Label'].tolist())

model.fit([np.vstack(train['Question_'].tolist()), np.vstack(train['Sentence_'].tolist()), train['count'].as_matrix()],
          np.vstack(train['Label'].tolist()),
          batch_size=100,
          epochs=100000,
          shuffle=True,
          callbacks=[EpochEval(data(dev), map_score_filtered, patience=5)])



Epoch 1/100000
	map_score_filtered = 0.5344
	Best map_score_filtered: 0.5344

Epoch 2/100000
	map_score_filtered = 0.6093
	Best map_score_filtered: 0.6093

Epoch 3/100000
	map_score_filtered = 0.6913
	Best map_score_filtered: 0.6913

Epoch 4/100000
	map_score_filtered = 0.6951
	Best map_score_filtered: 0.6951

Epoch 5/100000
	map_score_filtered = 0.6934

Epoch 6/100000
	map_score_filtered = 0.6847

Epoch 7/100000
	map_score_filtered = 0.6927

Epoch 8/100000
	map_score_filtered = 0.6967

Epoch 9/100000
	map_score_filtered = 0.6950



<keras.callbacks.History at 0x193fd2a90>

In [28]:
del model
model = load_model('qa.h5')
(qid,_ ), X, lab = data(test)
pred = model.predict(X)
print(map_score_filtered(qid, lab, pred))
print(map_score(qid, lab, pred))
test['pred_cnt'] = pd.Series(y for y in pred)
test[0:5]

0.656012328878
0.250950748031


Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred,count,pred_cnt
0,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-0,African immigration to the United States refer...,0,"[how, african, americans, were, immigrated, to...","[african, immigration, to, the, united, states...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0164832],1,[0.0165003]
1,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-1,The term African in the scope of this article ...,0,"[how, african, americans, were, immigrated, to...","[the, term, african, in, the, scope, of, this,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0142648],1,[0.0118129]
2,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-2,From the Immigration and Nationality Act of 19...,0,"[how, african, americans, were, immigrated, to...","[from, the, immigration, and, nationality, act...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[25042, 36456, 5487, 10238, 17364, 6105, 19738...",[0.0487929],1,[0.0352618]
3,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-3,African immigrants in the United States come f...,0,"[how, african, americans, were, immigrated, to...","[african, immigrants, in, the, united, states,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0122906],1,[0.0212921]
4,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-4,"They include people from different national, l...",0,"[how, african, americans, were, immigrated, to...","[they, include, people, from, different, natio...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0113492],0,[0.0176515]


In [29]:
def overlap(que, ans):
    return (set(que)&set(ans))

train['overlap'] = pd.Series(overlap(que, ans) for que, ans in zip(train['Question_tok'], train['Sentence_tok']))
test['overlap'] = pd.Series(overlap(que, ans) for que, ans in zip(test['Question_tok'], test['Sentence_tok']))
dev['overlap'] = pd.Series(overlap(que, ans) for que, ans in zip(dev['Question_tok'], dev['Sentence_tok']))
train[0:5]

Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred,count,overlap
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0,"[how, are, glacier, caves, formed, ?]","[a, partly, submerged, glacier, cave, on, peri...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0429119],1,{glacier}
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0,"[how, are, glacier, caves, formed, ?]","[the, ice, facade, is, approximately, 60, m, h...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0469051],0,{}
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0,"[how, are, glacier, caves, formed, ?]","[ice, formations, in, the, titlis, glacier, cave]","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0288006],1,{glacier}
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1,"[how, are, glacier, caves, formed, ?]","[a, glacier, cave, is, a, cave, formed, within...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0868418],2,"{glacier, formed}"
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0,"[how, are, glacier, caves, formed, ?]","[glacier, caves, are, often, called, ice, cave...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0235198],2,"{glacier, caves, are}"


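Now we use word overlap as a binary per-token feature at the embedding level: each token is marked 2 if it occurs in both the question and the answer and 1 otherwise, which leaves 0 free for padding.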

In [30]:
def overlap_feats(st, overlapping):
    return [1 if word not in overlapping else 2 for word in st]

train['Question_ov'] = pd.Series(overlap_feats(que, ov) for que, ov in zip(train['Question_tok'], train['overlap']))
train['Sentence_ov'] = pd.Series(overlap_feats(que, ov) for que, ov in zip(train['Sentence_tok'], train['overlap']))
dev['Question_ov'] = pd.Series(overlap_feats(que, ov) for que, ov in zip(dev['Question_tok'], dev['overlap']))
dev['Sentence_ov'] = pd.Series(overlap_feats(que, ov) for que, ov in zip(dev['Sentence_tok'], dev['overlap']))
test['Question_ov'] = pd.Series(overlap_feats(que, ov) for que, ov in zip(test['Question_tok'], test['overlap']))
test['Sentence_ov'] = pd.Series(overlap_feats(que, ov) for que, ov in zip(test['Sentence_tok'], test['overlap']))
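
As a quick check of what these features look like, here is a minimal sketch on one question-answer pair from the training set. It assumes the overlap is a plain set intersection of the two token lists, and `pad_left` is a hypothetical stand-in that mimics the default behaviour of Keras `pad_sequences` (left padding with 0); neither is taken from the notebook's own helper code.

```python
def overlap_feats(st, overlapping):
    # 2 marks a token shared by question and answer, 1 marks any other token
    return [2 if word in overlapping else 1 for word in st]

def pad_left(seq, max_len, value=0):
    # Hypothetical stand-in for the pad_sequences defaults:
    # keep the last max_len items and pad on the left with `value`.
    seq = list(seq)[-max_len:]
    return [value] * (max_len - len(seq)) + seq

question = ['how', 'are', 'glacier', 'caves', 'formed', '?']
sentence = ['ice', 'formations', 'in', 'the', 'titlis', 'glacier', 'cave']
overlap = set(question) & set(sentence)   # assumption: plain set intersection

q_ov = pad_left(overlap_feats(question, overlap), 10)
s_ov = pad_left(overlap_feats(sentence, overlap), 10)
print(q_ov)  # [0, 0, 0, 0, 1, 1, 2, 1, 1, 1]
print(s_ov)  # [0, 0, 0, 1, 1, 1, 1, 1, 2, 1]
```

The three possible values (0, 1, 2) are why the overlap embedding in the model below is declared with an input dimension of 3.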


In [31]:
train['Question_ov'] = train['Question_ov'].map(lambda s: pad_sequences([s], max_len_q)[0])
train['Sentence_ov'] = train['Sentence_ov'].map(lambda s: pad_sequences([s], max_len_a)[0])
dev['Question_ov'] = dev['Question_ov'].map(lambda s: pad_sequences([s], max_len_q)[0])
dev['Sentence_ov'] = dev['Sentence_ov'].map(lambda s: pad_sequences([s], max_len_a)[0])
test['Question_ov'] = test['Question_ov'].map(lambda s: pad_sequences([s], max_len_q)[0])
test['Sentence_ov'] = test['Sentence_ov'].map(lambda s: pad_sequences([s], max_len_a)[0])
train[0:5]

Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred,count,overlap,Question_ov,Sentence_ov
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0,"[how, are, glacier, caves, formed, ?]","[a, partly, submerged, glacier, cave, on, peri...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0429119],1,{glacier},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0,"[how, are, glacier, caves, formed, ?]","[the, ice, facade, is, approximately, 60, m, h...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0469051],0,{},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0,"[how, are, glacier, caves, formed, ?]","[ice, formations, in, the, titlis, glacier, cave]","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0288006],1,{glacier},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1,"[how, are, glacier, caves, formed, ?]","[a, glacier, cave, is, a, cave, formed, within...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0868418],2,"{glacier, formed}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0,"[how, are, glacier, caves, formed, ?]","[glacier, caves, are, often, called, ice, cave...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0235198],2,"{glacier, caves, are}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ..."


In [33]:
import numpy as np
np.random.seed(42)

from keras.models import Model, Sequential
from keras.layers import (Input,
                          Embedding,
                          Convolution1D,
                          Dropout,
                          SpatialDropout1D,
                          GlobalMaxPooling1D,
                          GlobalAveragePooling1D,
                          concatenate,
                          Dense)

from keras.optimizers import Adam


que = Input(shape=(max_len_q,))
ans = Input(shape=(max_len_a,))
que_ov = Input(shape=(max_len_q,))
ans_ov = Input(shape=(max_len_a,))
cnt = Input(shape=(1,))

que_ov_emb = Embedding(3, 5,input_length=max_len_q)(que_ov)
que_word_emb = Embedding(len(dictionary), 50 , weights=[emb_matrix(dictionary, w2v)],input_length=max_len_q, trainable=True)(que)

que_emb = concatenate([que_ov_emb, que_word_emb])

ans_ov_emb = Embedding(3, 5,input_length=max_len_a)(ans_ov)
ans_word_emb = Embedding(len(dictionary), 50 , weights=[emb_matrix(dictionary, w2v)],input_length=max_len_a, trainable=True)(ans)

ans_emb = concatenate([ans_ov_emb, ans_word_emb])

que_model = Sequential()
#que_model.add(Embedding(len(dictionary), 50 , weights=[emb_matrix(dictionary, w2v)],input_length=max_len_q, trainable=True))
que_model.add(Convolution1D(100, 5, activation='tanh', kernel_initializer='lecun_uniform', input_shape=(max_len_q, 55)))
que_model.add(GlobalMaxPooling1D())


ans_model = Sequential()
#ans_model.add(Embedding(len(dictionary), 50 , weights=[emb_matrix(dictionary, w2v)],input_length=max_len_a, trainable=True))
ans_model.add(Convolution1D(100, 5, activation='tanh', kernel_initializer='lecun_uniform', input_shape=(max_len_a, 55)))
ans_model.add(GlobalMaxPooling1D())

que_emb = que_model(que_emb)
ans_emb = ans_model(ans_emb)

join = concatenate([que_emb, ans_emb, cnt])

classify = Sequential()
classify.add(Dense(100, activation='tanh', input_dim=201))
classify.add(Dense(1, activation='sigmoid'))
out = classify(join)

model = Model(inputs=[que, ans,que_ov, ans_ov, cnt], outputs=[out])
model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_13 (InputLayer)            (None, 40)            0                                            
____________________________________________________________________________________________________
input_11 (InputLayer)            (None, 40)            0                                            
____________________________________________________________________________________________________
input_14 (InputLayer)            (None, 40)            0                                            
____________________________________________________________________________________________________
input_12 (InputLayer)            (None, 40)            0                                            
___________________________________________________________________________________________
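
To make the shapes in this model concrete, the following NumPy-only sketch traces one question-answer pair through an encoder of the same shape. It is not the Keras implementation: the weights are random stand-ins, the helper names `encode` and `random_encoder` are hypothetical, and only the dimensions (40 tokens, 50 + 5 embedding channels, 100 filters of width 5) are taken from the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)
max_len, emb_dim, width, n_filters = 40, 50 + 5, 5, 100

def encode(emb, kernel, bias):
    """'valid' 1D convolution over time, tanh, then global max pooling."""
    steps = emb.shape[0] - width + 1                 # 40 - 5 + 1 = 36 windows
    windows = np.stack([emb[i:i + width].ravel() for i in range(steps)])
    feature_maps = np.tanh(windows @ kernel.reshape(-1, n_filters) + bias)
    return feature_maps.max(axis=0)                  # one value per filter

def random_encoder():
    # random stand-ins for trained weights (assumption, for shape checking only)
    return rng.normal(size=(width, emb_dim, n_filters)) * 0.1, np.zeros(n_filters)

que_vec = encode(rng.normal(size=(max_len, emb_dim)), *random_encoder())
ans_vec = encode(rng.normal(size=(max_len, emb_dim)), *random_encoder())
count = np.array([2.0])                              # overlap count feature

join = np.concatenate([que_vec, ans_vec, count])
print(que_vec.shape, ans_vec.shape, join.shape)      # (100,) (100,) (201,)
```

The 55 channels come from concatenating the 50-dimensional word embedding with the 5-dimensional overlap embedding, and the classifier's `input_dim=201` is 100 + 100 + 1.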

In [34]:
model.compile(loss='binary_crossentropy',
              optimizer=Adam(0.0001),
              metrics=['accuracy'])

In [35]:
def data(dataset):
    # Returns (ids, model inputs, labels) for one dataframe split.
    ids = (dataset['QuestionID'], dataset['SentenceID'])
    X = [np.vstack(dataset['Question_'].tolist()),
         np.vstack(dataset['Sentence_'].tolist()),
         np.vstack(dataset['Question_ov'].tolist()),
         np.vstack(dataset['Sentence_ov'].tolist()),
         dataset['count'].values]  # .values replaces the deprecated .as_matrix()
    y = np.vstack(dataset['Label'].tolist())
    return ids, X, y

_, X_train, y_train = data(train)
model.fit(X_train, y_train,
          batch_size=100,
          epochs=100000,
          shuffle=True,
          callbacks=[EpochEval(data(dev), map_score_filtered, patience=5)])

Epoch 1/100000
	map_score_filtered = 0.5484
	Best map_score_filtered: 0.5484

Epoch 2/100000
	map_score_filtered = 0.6311
	Best map_score_filtered: 0.6311

Epoch 3/100000
	map_score_filtered = 0.6912
	Best map_score_filtered: 0.6912

Epoch 4/100000
	map_score_filtered = 0.7076
	Best map_score_filtered: 0.7076

Epoch 5/100000
	map_score_filtered = 0.7115
	Best map_score_filtered: 0.7115

Epoch 6/100000
	map_score_filtered = 0.7148

Epoch 7/100000
	map_score_filtered = 0.7180
	Best map_score_filtered: 0.7180

Epoch 8/100000
	map_score_filtered = 0.7256
	Best map_score_filtered: 0.7256

Epoch 9/100000
	map_score_filtered = 0.7208

Epoch 10/100000
	map_score_filtered = 0.7156

Epoch 11/100000
	map_score_filtered = 0.7089

Epoch 12/100000
	map_score_filtered = 0.7046

Epoch 13/100000
	map_score_filtered = 0.6958



<keras.callbacks.History at 0x1a5e5bcd0>

In [36]:
del model
model = load_model('qa.h5')
(qid,_ ), X, lab = data(test)
pred = model.predict(X)
print(map_score_filtered(qid, lab, pred))
print(map_score(qid, lab, pred))
test['pred_ov'] = pd.Series(y for y in pred)
test[0:5]


0.696658273069
0.266499434619


Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred,count,pred_cnt,overlap,Question_ov,Sentence_ov,pred_ov
0,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-0,African immigration to the United States refer...,0,"[how, african, americans, were, immigrated, to...","[african, immigration, to, the, united, states...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0164832],1,[0.0165003],"{to, the, were, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0063144]
1,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-1,The term African in the scope of this article ...,0,"[how, african, americans, were, immigrated, to...","[the, term, african, in, the, scope, of, this,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0142648],1,[0.0118129],"{to, the, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.00171064]
2,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-2,From the Immigration and Nationality Act of 19...,0,"[how, african, americans, were, immigrated, to...","[from, the, immigration, and, nationality, act...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[25042, 36456, 5487, 10238, 17364, 6105, 19738...",[0.0487929],1,[0.0352618],"{to, the, immigrated}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, ...",[0.0164284]
3,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-3,African immigrants in the United States come f...,0,"[how, african, americans, were, immigrated, to...","[african, immigrants, in, the, united, states,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0122906],1,[0.0212921],"{the, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0022005]
4,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-4,"They include people from different national, l...",0,"[how, african, americans, were, immigrated, to...","[they, include, people, from, different, natio...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0113492],0,[0.0176515],{},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.00173804]


In [37]:
(qid,_ ), X, lab = data(train)
pred = model.predict(X)
print(map_score_filtered(qid, lab, pred))
print(map_score(qid, lab, pred))
train['pred_ov'] = pd.Series(y for y in pred)
train[0:5]

0.769095329131
0.31679316344


Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred,count,overlap,Question_ov,Sentence_ov,pred_ov
0,Q1,how are glacier caves formed?,D1,Glacier cave,D1-0,A partly submerged glacier cave on Perito More...,0,"[how, are, glacier, caves, formed, ?]","[a, partly, submerged, glacier, cave, on, peri...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0429119],1,{glacier},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0446032]
1,Q1,how are glacier caves formed?,D1,Glacier cave,D1-1,The ice facade is approximately 60 m high,0,"[how, are, glacier, caves, formed, ?]","[the, ice, facade, is, approximately, 60, m, h...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0469051],0,{},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0522293]
2,Q1,how are glacier caves formed?,D1,Glacier cave,D1-2,Ice formations in the Titlis glacier cave,0,"[how, are, glacier, caves, formed, ?]","[ice, formations, in, the, titlis, glacier, cave]","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0288006],1,{glacier},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0337728]
3,Q1,how are glacier caves formed?,D1,Glacier cave,D1-3,A glacier cave is a cave formed within the ice...,1,"[how, are, glacier, caves, formed, ?]","[a, glacier, cave, is, a, cave, formed, within...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0868418],2,"{glacier, formed}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.326419]
4,Q1,how are glacier caves formed?,D1,Glacier cave,D1-4,"Glacier caves are often called ice caves , but...",0,"[how, are, glacier, caves, formed, ?]","[glacier, caves, are, often, called, ice, cave...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0235198],2,"{glacier, caves, are}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0800012]


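In this version of the network a semantic matching layer is implemented: the question encoding is passed through a bias-free dense layer, and its dot product with the answer encoding is added to the joint representation as a similarity score.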

In [39]:
import numpy as np
np.random.seed(42)

from keras.models import Model, Sequential
from keras.layers import (Input,
                          Embedding,
                          Convolution1D,
                          Dropout,
                          SpatialDropout1D,
                          GlobalMaxPooling1D,
                          GlobalAveragePooling1D,
                          concatenate,
                          dot,
                          Dense)

from keras.optimizers import Adam


que = Input(shape=(max_len_q,))
ans = Input(shape=(max_len_a,))
que_ov = Input(shape=(max_len_q,))
ans_ov = Input(shape=(max_len_a,))
cnt = Input(shape=(1,))

que_ov_emb = Embedding(3, 5,input_length=max_len_q)(que_ov)
que_word_emb = Embedding(len(dictionary), 50 , weights=[emb_matrix(dictionary, w2v)],input_length=max_len_q, trainable=True)(que)

que_emb = concatenate([que_ov_emb, que_word_emb])

ans_ov_emb = Embedding(3, 5,input_length=max_len_a)(ans_ov)
ans_word_emb = Embedding(len(dictionary), 50 , weights=[emb_matrix(dictionary, w2v)],input_length=max_len_a, trainable=True)(ans)

ans_emb = concatenate([ans_ov_emb, ans_word_emb])

que_model = Sequential()
#que_model.add(Embedding(len(dictionary), 50 , weights=[emb_matrix(dictionary, w2v)],input_length=max_len_q, trainable=True))
que_model.add(Convolution1D(100, 5, activation='tanh', kernel_initializer='lecun_uniform', input_shape=(max_len_q, 55)))
que_model.add(GlobalMaxPooling1D())


ans_model = Sequential()
#ans_model.add(Embedding(len(dictionary), 50 , weights=[emb_matrix(dictionary, w2v)],input_length=max_len_a, trainable=True))
ans_model.add(Convolution1D(100, 5, activation='tanh', kernel_initializer='lecun_uniform', input_shape=(max_len_a, 55)))
ans_model.add(GlobalMaxPooling1D())

que_emb = que_model(que_emb)
ans_emb = ans_model(ans_emb)

_que_emb = Dense(100, use_bias=False)(que_emb)
sim = dot([_que_emb, ans_emb], axes=1)

join = concatenate([que_emb, sim, ans_emb, cnt])

classify = Sequential()
classify.add(Dense(100, activation='tanh', input_dim=202))
classify.add(Dense(1, activation='sigmoid'))
out = classify(join)

model = Model(inputs=[que, ans,que_ov, ans_ov, cnt], outputs=[out])
model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_23 (InputLayer)            (None, 40)            0                                            
____________________________________________________________________________________________________
input_21 (InputLayer)            (None, 40)            0                                            
____________________________________________________________________________________________________
embedding_17 (Embedding)         (None, 40, 5)         15          input_23[0][0]                   
____________________________________________________________________________________________________
embedding_18 (Embedding)         (None, 40, 50)        1883100     input_21[0][0]                   
___________________________________________________________________________________________
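
The similarity score computed by the `Dense(100, use_bias=False)` layer followed by `dot` is a bilinear form between the two sentence encodings. A minimal NumPy sketch of that computation for a single pair follows; the vectors and the matrix `M` are random stand-ins (assumptions, not trained weights), and only the dimensions come from the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)
que_vec = rng.normal(size=100)         # stand-in for the pooled question encoding
ans_vec = rng.normal(size=100)         # stand-in for the pooled answer encoding
M = rng.normal(size=(100, 100))        # weights of the bias-free Dense(100) layer
count = np.array([2.0])                # overlap count feature

projected = que_vec @ M                # Dense(100, use_bias=False)(que_emb)
sim = np.array([projected @ ans_vec])  # dot([_que_emb, ans_emb], axes=1)

join = np.concatenate([que_vec, sim, ans_vec, count])
print(join.shape)                      # (202,)
```

The extra scalar is why the classifier's `input_dim` grows from 201 to 202 in this version.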

In [40]:
model.compile(loss='binary_crossentropy',
              optimizer=Adam(0.0001),
              metrics=['accuracy'])

In [41]:
def data(dataset):
    # Returns (ids, model inputs, labels) for one dataframe split.
    ids = (dataset['QuestionID'], dataset['SentenceID'])
    X = [np.vstack(dataset['Question_'].tolist()),
         np.vstack(dataset['Sentence_'].tolist()),
         np.vstack(dataset['Question_ov'].tolist()),
         np.vstack(dataset['Sentence_ov'].tolist()),
         dataset['count'].values]  # .values replaces the deprecated .as_matrix()
    y = np.vstack(dataset['Label'].tolist())
    return ids, X, y

_, X_train, y_train = data(train)
model.fit(X_train, y_train,
          batch_size=100,
          epochs=100000,
          shuffle=True,
          callbacks=[EpochEval(data(dev), map_score_filtered, patience=5)])

Epoch 1/100000
	map_score_filtered = 0.5075
	Best map_score_filtered: 0.5075

Epoch 2/100000
	map_score_filtered = 0.6442
	Best map_score_filtered: 0.6442

Epoch 3/100000
	map_score_filtered = 0.7026
	Best map_score_filtered: 0.7026

Epoch 4/100000
	map_score_filtered = 0.7166
	Best map_score_filtered: 0.7166

Epoch 5/100000
	map_score_filtered = 0.7140

Epoch 6/100000
	map_score_filtered = 0.7168

Epoch 7/100000
	map_score_filtered = 0.7160

Epoch 8/100000
	map_score_filtered = 0.7154

Epoch 9/100000
	map_score_filtered = 0.6934



<keras.callbacks.History at 0x1b4b56d50>

In [42]:
del model
model = load_model('qa.h5')
(qid,_ ), X, lab = data(test)
pred = model.predict(X)
print(map_score_filtered(qid, lab, pred))
print(map_score(qid, lab, pred))
test['pred_alexi'] = pd.Series(y for y in pred)
test[0:5]

0.674579589505
0.258053462017


Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred,count,pred_cnt,overlap,Question_ov,Sentence_ov,pred_ov,pred_alexi
0,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-0,African immigration to the United States refer...,0,"[how, african, americans, were, immigrated, to...","[african, immigration, to, the, united, states...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0164832],1,[0.0165003],"{to, the, were, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0063144],[0.0175954]
1,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-1,The term African in the scope of this article ...,0,"[how, african, americans, were, immigrated, to...","[the, term, african, in, the, scope, of, this,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0142648],1,[0.0118129],"{to, the, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.00171064],[0.00922345]
2,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-2,From the Immigration and Nationality Act of 19...,0,"[how, african, americans, were, immigrated, to...","[from, the, immigration, and, nationality, act...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[25042, 36456, 5487, 10238, 17364, 6105, 19738...",[0.0487929],1,[0.0352618],"{to, the, immigrated}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, ...",[0.0164284],[0.0271548]
3,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-3,African immigrants in the United States come f...,0,"[how, african, americans, were, immigrated, to...","[african, immigrants, in, the, united, states,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0122906],1,[0.0212921],"{the, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0022005],[0.0114225]
4,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-4,"They include people from different national, l...",0,"[how, african, americans, were, immigrated, to...","[they, include, people, from, different, natio...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0113492],0,[0.0176515],{},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.00173804],[0.00947171]


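Weight sharing between sentence encoders: a single encoder (embedding, convolution and max pooling) is applied to both the question and the answer. This variant does not use the overlap embeddings.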

In [43]:
import numpy as np
np.random.seed(42)

from keras.models import Model, Sequential
from keras.layers import (Input,
                          Embedding,
                          Convolution1D,
                          Dropout,
                          SpatialDropout1D,
                          GlobalMaxPooling1D,
                          GlobalAveragePooling1D,
                          concatenate,
                          Dense)

from keras.optimizers import Adam


que = Input(shape=(max_len_q,))
ans = Input(shape=(max_len_a,))
cnt = Input(shape=(1,))


snt_model = Sequential()
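# input_length is set to max_len_q; reusing the encoder for answers relies on max_len_q == max_len_a (both are 40 here)
snt_model.add(Embedding(len(dictionary), 50 ,input_length=max_len_q, weights=[emb_matrix(dictionary, w2v)], trainable=True))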
snt_model.add(Convolution1D(100, 5, activation='tanh', kernel_initializer='lecun_uniform'))
snt_model.add(GlobalMaxPooling1D())


que_emb = snt_model(que)
ans_emb = snt_model(ans)

join = concatenate([que_emb, ans_emb, cnt])

classify = Sequential()
classify.add(Dense(100, activation='tanh', input_dim=201))
classify.add(Dense(1, activation='sigmoid'))
out = classify(join)

model = Model(inputs=[que, ans, cnt], outputs=[out])
model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_26 (InputLayer)            (None, 40)            0                                            
____________________________________________________________________________________________________
input_27 (InputLayer)            (None, 40)            0                                            
____________________________________________________________________________________________________
sequential_30 (Sequential)       (None, 100)           1908200     input_26[0][0]                   
                                                                   input_27[0][0]                   
____________________________________________________________________________________________________
input_28 (InputLayer)            (None, 1)             0                                   
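
The 1908200 parameters reported for the shared `Sequential` encoder can be reproduced by hand. The sketch below takes the 1883100-parameter figure from the word embedding row of the earlier summary and derives the rest; the doubled count for two independent encoders is a hypothetical comparison, not a number printed by the notebook.

```python
emb_dim, width, n_filters = 50, 5, 100

emb_params = 1883100                      # word embedding, as reported by model.summary()
vocab_size = emb_params // emb_dim        # implied len(dictionary)
conv_params = width * emb_dim * n_filters + n_filters   # kernel weights + biases

shared_encoder = emb_params + conv_params
two_encoders = 2 * shared_encoder         # hypothetical: separate question/answer copies

print(vocab_size, conv_params, shared_encoder, two_encoders)
# 37662 25100 1908200 3816400
```

Because the same `snt_model` object is called on both inputs, its weights are counted once and receive gradients from both the question and the answer branch.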

In [44]:
model.compile(loss='binary_crossentropy',
              optimizer=Adam(0.0001),
              metrics=['accuracy'])

In [45]:
def data(dataset):
    # Returns (ids, model inputs, labels) for one dataframe split.
    ids = (dataset['QuestionID'], dataset['SentenceID'])
    X = [np.vstack(dataset['Question_'].tolist()),
         np.vstack(dataset['Sentence_'].tolist()),
         dataset['count'].values]  # .values replaces the deprecated .as_matrix()
    y = np.vstack(dataset['Label'].tolist())
    return ids, X, y

_, X_train, y_train = data(train)
model.fit(X_train, y_train,
          batch_size=100,
          epochs=100000,
          shuffle=True,
          callbacks=[EpochEval(data(dev), map_score_filtered, patience=5)])

Epoch 1/100000
	map_score_filtered = 0.4289
	Best map_score_filtered: 0.4289

Epoch 2/100000
	map_score_filtered = 0.5653
	Best map_score_filtered: 0.5653

Epoch 3/100000
	map_score_filtered = 0.6411
	Best map_score_filtered: 0.6411

Epoch 4/100000
	map_score_filtered = 0.6738
	Best map_score_filtered: 0.6738

Epoch 5/100000
	map_score_filtered = 0.6814
	Best map_score_filtered: 0.6814

Epoch 6/100000
	map_score_filtered = 0.6833

Epoch 7/100000
	map_score_filtered = 0.6918
	Best map_score_filtered: 0.6918

Epoch 8/100000
	map_score_filtered = 0.6978
	Best map_score_filtered: 0.6978

Epoch 9/100000
	map_score_filtered = 0.6814

Epoch 10/100000
	map_score_filtered = 0.6847

Epoch 11/100000
	map_score_filtered = 0.6824

Epoch 12/100000
	map_score_filtered = 0.6910

Epoch 13/100000
	map_score_filtered = 0.6964



<keras.callbacks.History at 0x1aea238d0>

In [46]:
del model
model = load_model('qa.h5')
(qid,_ ), X, lab = data(test)
pred = model.predict(X)
print(map_score_filtered(qid, lab, pred))
print(map_score(qid, lab, pred))
test['pred_sms'] = pd.Series(y for y in pred)
test[0:5]

0.681573031228
0.260728730994


Unnamed: 0,QuestionID,Question,DocumentID,DocumentTitle,SentenceID,Sentence,Label,Question_tok,Sentence_tok,Question_,Sentence_,pred,count,pred_cnt,overlap,Question_ov,Sentence_ov,pred_ov,pred_alexi,pred_sms
0,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-0,African immigration to the United States refer...,0,"[how, african, americans, were, immigrated, to...","[african, immigration, to, the, united, states...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0164832],1,[0.0165003],"{to, the, were, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0063144],[0.0175954],[0.00757051]
1,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-1,The term African in the scope of this article ...,0,"[how, african, americans, were, immigrated, to...","[the, term, african, in, the, scope, of, this,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0142648],1,[0.0118129],"{to, the, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.00171064],[0.00922345],[0.00612904]
2,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-2,From the Immigration and Nationality Act of 19...,0,"[how, african, americans, were, immigrated, to...","[from, the, immigration, and, nationality, act...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[25042, 36456, 5487, 10238, 17364, 6105, 19738...",[0.0487929],1,[0.0352618],"{to, the, immigrated}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, ...",[0.0164284],[0.0271548],[0.0254255]
3,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-3,African immigrants in the United States come f...,0,"[how, african, americans, were, immigrated, to...","[african, immigrants, in, the, united, states,...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0122906],1,[0.0212921],"{the, african}","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0022005],[0.0114225],[0.00975198]
4,Q0,HOW AFRICAN AMERICANS WERE IMMIGRATED TO THE US,D0,African immigration to the United States,D0-4,"They include people from different national, l...",0,"[how, african, americans, were, immigrated, to...","[they, include, people, from, different, natio...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.0113492],0,[0.0176515],{},"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",[0.00173804],[0.00947171],[0.0071074]
