This is a PyTorch implementation of a Neural Tensor Network (NTN) with an Attention Model.
To test the model, I use a dataset of 50,000 movie reviews taken from IMDb. It is divided into 'train' and 'test' sets, each containing 25,000 movie reviews with labels (positive, negative). You can access the dataset with this link.
This project depends heavily on an "SVO extractor". Thanks to the wonderful work of "peter3125", whose code extracts (subject, verb, object) triples very easily, I could build the pipeline on top of his open source. Anyone trying this project should visit his repository to download the resources.
Follow the example below.
Check this link to download the "SVO extractor", then create a folder named "extractor" and put the downloaded resources there.
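The actual extraction logic lives in peter3125's repository; purely as an illustration of what a triple looks like (this is not his implementation), here is a naive spaCy-based sketch that pulls one (subject, verb, object) per sentence:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def naive_svo(sentence):
    """Very naive SVO pull: first nominal subject, its head verb, and that
    verb's first direct object. The real extractor handles far more cases
    (passives, conjunctions, clausal objects, ...)."""
    doc = nlp(sentence)
    for token in doc:
        if token.dep_ == "nsubj" and token.head.pos_ == "VERB":
            verb = token.head
            objs = [c for c in verb.children if c.dep_ == "dobj"]
            if objs:
                return (token.text, verb.text, objs[0].text)
    return None

print(naive_svo("The actors delivered a brilliant performance."))
# ('actors', 'delivered', 'performance')
```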
I use the "gensim" library to generate Word2vec embeddings. To generate the embeddings, follow this sample:
python word_embeder.py --train_path source/train.csv --dict_path word2vec --size 200 --window 5 --min_count 3
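For reference, a minimal sketch of what the embedding step presumably boils down to (the column name `review` and the exact preprocessing in this repo are assumptions; note that gensim 3.x uses the `size` keyword matching the `--size` flag above, while gensim 4+ renamed it to `vector_size`):

```python
import pandas as pd
from gensim.models import Word2Vec

# Load the training reviews; the column name 'review' is an assumption.
df = pd.read_csv("source/train.csv")
sentences = [review.lower().split() for review in df["review"]]

# Train Word2vec with the same hyperparameters as the command above.
# gensim 4+: vector_size=200; gensim 3.x: size=200.
model = Word2Vec(sentences, vector_size=200, window=5, min_count=3)
model.save("word2vec/1")  # the actual script may choose the file name itself
```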
There are a lot of options to check:
- train_path : Path to the training data file
- valid_path : Path to the validation data file
- dict_path : Path to the Word2vec model used for the embedding layer
- save_path : Path where the training results are saved
- max_sent_len : Maximum number of sentences to analyze per document (sentences beyond this limit are dropped during training)
- max_svo_len : Maximum number of words to analyze per SVO component (subject/verb/object) in each sentence (words beyond this limit are dropped during training)
- tensor_dim : Tensor dimension of the NTN layer (see the sketch after the training command below)
- atten_size : Attention size of the model
- hidden_size : Hidden size of the GRU
- n_layers : Number of GRU layers
- n_epochs : Number of epochs to train
- dropout_p : Dropout probability
- lr : Learning rate
- early_stop : Early-stopping condition; set it to -1 to disable
- batch_size : Batch size for training
python train.py --train_path source/train.csv --valid_path source/test.csv --dict_path word2vec/1 --hidden_size 256 --atten_size 128 --batch_size 16
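For context, here is a minimal sketch of the bilinear tensor product at the heart of an NTN layer, following Socher et al. (2013); the layer in this repo may differ in details such as how the attention and GRU outputs are wired in:

```python
import torch
import torch.nn as nn

class NTNLayer(nn.Module):
    """Scores a pair of vectors e1, e2 as f(e1^T W[1:k] e2 + V[e1; e2] + b),
    the bilinear tensor form of Socher et al. (2013)."""
    def __init__(self, input_dim, tensor_dim):
        super().__init__()
        # One (input_dim x input_dim) bilinear slice per output dimension.
        self.W = nn.Parameter(torch.randn(tensor_dim, input_dim, input_dim) * 0.01)
        self.V = nn.Linear(2 * input_dim, tensor_dim)  # linear term + bias b

    def forward(self, e1, e2):
        # e1, e2: (batch, input_dim)
        # Bilinear term: e1^T W_k e2 for every slice k -> (batch, tensor_dim)
        bilinear = torch.einsum("bi,kij,bj->bk", e1, self.W, e2)
        linear = self.V(torch.cat([e1, e2], dim=-1))
        return torch.tanh(bilinear + linear)
```

Note that `W` alone holds `input_dim * input_dim * tensor_dim` parameters, which is why the parameter count grows so quickly with these settings (see the discussion below the results table).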
Results with different hyperparameter settings:
word2vec dimension | hidden size | atten size | tensor size | best epoch | lr | train loss | valid loss | valid accuracy |
---|---|---|---|---|---|---|---|---|
200 | 64 | 64 | 64 | 1 | 0.0001 | 0.0363 | 0.0341 | 0.7230 |
200 | 128 | 128 | 128 | 1 | 0.0001 | 0.0361 | 0.0339 | 0.7246 |
200 | 256 | 256 | 256 | 1 | 0.0001 | 0.0363 | 0.0344 | 0.7192 |
- Neural Tensor Network
- Stock Prediction with Deep Learning
- Text Classification with CNN
We got worse results with the Neural Tensor Network (NTN) than comparable works report for models like HAN and BERT. One reason is easy to spot: the pipeline depends heavily on the SVO extractor, and if the extractor fails to catch the meaningful SVO (subject/verb/object) triples in the sentences, no downstream model in the pipeline can make up for it. Another likely reason is insufficient data to train the model. The NTN has a large number of parameters because of its bilinear structure (the two-sided batch multiplication in the sketch above: with hidden size 256 and tensor size 256, the tensor W alone holds 256 × 256 × 256 ≈ 16.8M parameters), and its SVO input format discards every sentence from which no triple can be extracted, shrinking the effective dataset. Together, these make it hard to get good results.
If you read Korean, you can see my detailed review here.