This project uses the MaLSTM model (Siamese networks + LSTM with Manhattan distance) to detect semantic similarity between question pairs. The training dataset is a subset of the original Quora Question Pairs dataset (~363K pairs used).
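To make the "Manhattan distance" part concrete, here is a minimal sketch of the similarity score that MaLSTM applies to the two final LSTM hidden states: exp(-L1 distance), which maps identical vectors to 1 and distant vectors toward 0. The function name `manhattan_similarity` and the sample vectors are hypothetical and only for illustration; the actual code in this repository may compute the score differently (for example, as a layer inside the model).

```python
import math


def manhattan_similarity(h_left, h_right):
    """Return exp(-L1 distance) between two equal-length vectors."""
    if len(h_left) != len(h_right):
        raise ValueError("vectors must have the same length")
    # Manhattan (L1) distance: sum of absolute element-wise differences.
    l1_distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    # exp(-d) lies in (0, 1]; it equals 1.0 only when the vectors match.
    return math.exp(-l1_distance)


# Hypothetical final hidden states for two questions (made-up values).
h_q1 = [0.2, -0.1, 0.4]
h_q2 = [0.1, -0.1, 0.5]
print(manhattan_similarity(h_q1, h_q2))
```

Because the score is bounded between 0 and 1, it can be compared directly against the 0/1 duplicate label of a question pair.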
- Paper, Articles
test.csv is too big, so I extracted only the top 20 questions into a file called test-20.csv, which is used for prediction.
You should put all data files to
How to Run
$ python3 train.py
Prediction uses the test-20.csv file mentioned above.
$ python3 predict.py
I tried various parameters, such as the number of hidden states in the LSTM cell, the activation function of the LSTM cell, and the number of epochs. I trained on two NVIDIA Tesla P40 GPUs, with 10% of the data held out as the validation set (batch size = 1024*2). As a result, the model reached about 82.29% validation accuracy after 50 epochs, which took about 10 minutes.
Epoch 50/50
363861/363861 [==============================] - 12s 33us/step - loss: 0.1172 - acc: 0.8486 - val_loss: 0.1315 - val_acc: 0.8229
Training time finished.
50 epochs in 601.24
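As a rough sanity check on the figures reported above, the following sketch derives the number of batches per epoch and the average time per epoch from the logged values. It assumes that 1024*2 is the total batch size used per training step and that 601.24 is the total training time in seconds; the variable names are illustrative and not taken from train.py.

```python
import math

# Values copied from the training log and the description above.
train_samples = 363861
batch_size = 1024 * 2      # assumed to be the total batch size per step
epochs = 50
total_seconds = 601.24     # assumed to be seconds

# The last batch of an epoch is partial, so round up.
batches_per_epoch = math.ceil(train_samples / batch_size)
seconds_per_epoch = total_seconds / epochs

print(batches_per_epoch, round(seconds_per_epoch, 2))
```

This gives 178 batches per epoch and about 12.02 seconds per epoch, which agrees with the 12s shown for the final epoch in the log.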