This is a PyTorch implementation of a Neural Tensor Network (NTN) with an Attention Model.
To test the model, I use a dataset of 50,000 movie reviews taken from IMDb. It is divided into 'train' and 'test' sets, each containing 25,000 movie reviews with labels (positive, negative). You can access the dataset with this link.
This project depends heavily on an "SVO extractor". Thanks to the wonderful work of "peter3125", whose code extracts (subject, verb, object) triples very easily, I could build the pipeline on top of his open source. Anyone trying this project should visit his repository to download the resources.
Follow the example below.
Check this link to download the "SVO extractor", then create a folder named "extractor" and put the downloaded resources there.
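The actual extraction logic lives in peter3125's repository; purely as an illustration of what a triple looks like (this is not his implementation), here is a naive spaCy-based sketch that pulls one (subject, verb, object) per sentence:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def naive_svo(sentence):
    """Very naive SVO pull: first nominal subject, its head verb, and that
    verb's first direct object. The real extractor handles far more cases
    (passives, conjunctions, clausal objects, ...)."""
    doc = nlp(sentence)
    for token in doc:
        if token.dep_ == "nsubj" and token.head.pos_ == "VERB":
            verb = token.head
            objs = [c for c in verb.children if c.dep_ == "dobj"]
            if objs:
                return (token.text, verb.text, objs[0].text)
    return None

print(naive_svo("The actors delivered a brilliant performance."))
# ('actors', 'delivered', 'performance')
```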
I use the "gensim" library to generate Word2vec embeddings. To generate the embeddings, follow this sample:
python word_embeder.py --train_path source/train.csv --dict_path word2vec --size 200 --window 5 --min_count 3
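For reference, a minimal sketch of what the embedding step presumably boils down to (the column name `review` and the exact preprocessing in this repo are assumptions; note that gensim 3.x uses the `size` keyword matching the `--size` flag above, while gensim 4+ renamed it to `vector_size`):

```python
import pandas as pd
from gensim.models import Word2Vec

# Load the training reviews; the column name 'review' is an assumption.
df = pd.read_csv("source/train.csv")
sentences = [review.lower().split() for review in df["review"]]

# Train Word2vec with the same hyperparameters as the command above.
# gensim 4+: vector_size=200; gensim 3.x: size=200.
model = Word2Vec(sentences, vector_size=200, window=5, min_count=3)
model.save("word2vec/1")  # the actual script may choose the file name itself
```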
There are a lot of options to check:
- train_path : Path to the training data file
- valid_path : Path to the validation data file
- dict_path : Path to the Word2vec model used for the embedding layer
- save_path : Path where the training results are saved
- max_sent_len : Maximum number of sentences to analyze per document (sentences beyond this limit are dropped during training)
- max_svo_len : Maximum number of words to analyze per SVO component (subject/verb/object) in each sentence (words beyond this limit are dropped during training)
- tensor_dim : Tensor dimension of the NTN layer (see the sketch after the training command below)
- atten_size : Attention size of the model
- hidden_size : Hidden size of the GRU
- n_layers : Number of GRU layers
- n_epochs : Number of epochs to train
- dropout_p : Dropout probability
- lr : Learning rate
- early_stop : Early-stopping condition; set it to -1 to disable
- batch_size : Batch size for training
python train.py --train_path source/train.csv --valid_path source/test.csv --dict_path word2vec/1 --hidden_size 256 --atten_size 128 --batch_size 16
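For context, here is a minimal sketch of the bilinear tensor product at the heart of an NTN layer, following Socher et al. (2013); the layer in this repo may differ in details such as how the attention and GRU outputs are wired in:

```python
import torch
import torch.nn as nn

class NTNLayer(nn.Module):
    """Scores a pair of vectors e1, e2 as f(e1^T W[1:k] e2 + V[e1; e2] + b),
    the bilinear tensor form of Socher et al. (2013)."""
    def __init__(self, input_dim, tensor_dim):
        super().__init__()
        # One (input_dim x input_dim) bilinear slice per output dimension.
        self.W = nn.Parameter(torch.randn(tensor_dim, input_dim, input_dim) * 0.01)
        self.V = nn.Linear(2 * input_dim, tensor_dim)  # linear term + bias b

    def forward(self, e1, e2):
        # e1, e2: (batch, input_dim)
        # Bilinear term: e1^T W_k e2 for every slice k -> (batch, tensor_dim)
        bilinear = torch.einsum("bi,kij,bj->bk", e1, self.W, e2)
        linear = self.V(torch.cat([e1, e2], dim=-1))
        return torch.tanh(bilinear + linear)
```

Note that `W` alone holds `input_dim * input_dim * tensor_dim` parameters, which is why the parameter count grows so quickly with these settings (see the discussion below the results table).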
Results with different hyperparameter settings:
word2vec dimension | hidden size | atten size | tensor size | best epoch | lr | train loss | valid loss | valid accuracy |
---|---|---|---|---|---|---|---|---|
200 | 64 | 64 | 64 | 1 | 0.0001 | 0.0363 | 0.0341 | 0.7230 |
200 | 128 | 128 | 128 | 1 | 0.0001 | 0.0361 | 0.0339 | 0.7246 |
200 | 256 | 256 | 256 | 1 | 0.0001 | 0.0363 | 0.0344 | 0.7192 |
- Neural Tensor Network
- Stock Prediction with Deep Learning
- Text Classification with CNN
We got worse results with the Neural Tensor Network (NTN) than comparable works report for models like HAN and BERT. One reason is easy to spot: the pipeline depends heavily on the SVO extractor, and if the extractor fails to catch the meaningful SVO (subject/verb/object) triples in the sentences, no downstream model in the pipeline can make up for it. Another likely reason is insufficient data to train the model. The NTN has a large number of parameters because of its bilinear structure (the two-sided batch multiplication in the sketch above: with hidden size 256 and tensor size 256, the tensor W alone holds 256 × 256 × 256 ≈ 16.8M parameters), and its SVO input format discards every sentence from which no triple can be extracted, shrinking the effective dataset. Together, these make it hard to get good results.
If you read Korean, you can see my detailed review here.