Deep learning implementation for NLP with NNabla

Tiny implementation of deep learning models for NLP with Sony's NNabla.

Test environment

  • Python 3.6.7
  • NNabla v1.0.9

New functions (not included in NNabla v0.9.7)

Parametric functions

  • simple_rnn
  • lstm
  • highway
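
As an illustration, a simple Elman-style RNN step can be written on top of NNabla's built-in ops roughly as follows (a minimal sketch, not necessarily this repository's exact implementation; the function and parameter names here are assumptions):

import nnabla.functions as F
import nnabla.parametric_functions as PF

def simple_rnn_step(x, h, state_size, name='simple_rnn'):
    # x: (batch, input_dim), h: (batch, state_size)
    # One Elman RNN step: h' = tanh(affine([x; h]))
    return F.tanh(PF.affine(F.concatenate(x, h, axis=1), state_size, name=name))

The lstm and highway parametric functions can be built the same way, composing affine layers with gating nonlinearities.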

Functions

  • time_distributed
  • time_distributed_softmax_cross_entropy
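
Conceptually, time_distributed applies a per-step function to every time slice of a (batch, time, ...) variable and stacks the results back along the time axis. A minimal sketch of that idea (assumed behavior, not necessarily the exact implementation):

import nnabla.functions as F

def time_distributed(func):
    # Run `func` independently on each time step (axis 1) and restack.
    def wrapped(x, *args, **kwargs):
        outputs = [func(x_t, *args, **kwargs) for x_t in F.split(x, axis=1)]
        return F.stack(*outputs, axis=1)
    return wrapped

time_distributed_softmax_cross_entropy would then amount to wrapping F.softmax_cross_entropy this way.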

Models

Language models

Usage

To start training the model:

cd language-models
python char-cnn-lstm.py

If cuDNN is available:

python char-cnn-lstm.py -c cudnn
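
For reference, the -c option presumably selects the compute backend via NNabla's extension context, along these lines (a sketch of the standard NNabla idiom; the script's actual flag handling may differ):

import nnabla as nn
from nnabla.ext_utils import get_extension_context

# 'cudnn' for GPU execution; pass 'cpu' to stay on the CPU.
ctx = get_extension_context('cudnn', device_id='0')
nn.set_default_context(ctx)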

After training, you can retrieve words similar to a query word:

In [3]: get_top_k('looooook', k=5)
Out[3]: ['look', 'looks', 'looked', 'loose', 'looking']

In [4]: get_top_k('while', k=5)
Out[4]: ['chile', 'whole', 'meanwhile', 'child', 'wholesale']

In [5]: get_top_k('richard', k=5)
Out[5]: ['richer', 'rich', 'michael', 'richter', 'richfield']

These results are similar to those reported in the paper "Character-Aware Neural Language Models".
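
A get_top_k helper like this can be implemented as a nearest-neighbor search in the learned word-vector space. A minimal sketch, assuming hypothetical helpers: encode_word runs the trained character-level encoder, word_vectors holds one row per vocabulary word, and id2word maps indices back to words:

import numpy as np

def get_top_k(query, k=5):
    # Embed the query with the character-level encoder (which works even
    # for out-of-vocabulary strings such as 'looooook'), then rank all
    # vocabulary words by cosine similarity.
    q = encode_word(query)  # hypothetical
    sims = word_vectors @ q / (np.linalg.norm(word_vectors, axis=1)
                               * np.linalg.norm(q) + 1e-8)
    return [id2word[i] for i in np.argsort(-sims)[:k]]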

A pre-trained model is available (here).

Seq2Seq models

Usage

To start training the model:

cd seq2seq
./download.sh
ipython
run attention.py

You can use the pre-trained attention model instead: start the script as above, interrupt training with Ctrl+C, then download and load the released parameters:

cd seq2seq
./download.sh
ipython
run attention.py
Ctrl+C (interrupt)

!wget https://github.com/satopirka/nlp-nnabla/releases/download/v0.0.1-alpha/attention_en2ja.h5
nn.load_parameters('attention_en2ja.h5')

Then you can translate English sentences into Japanese with the model, as below:

nn.load_parameters('attention_en2ja.h5')

In [00]: translate("i was unable to look her in the face .")
Out[00]: '彼女の顔をまともに見ることが出来なかった。'

In [00]: translate("how far is it to the station ?")
Out[00]: '駅までどのくらいありますか。'
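
Internally, a translate helper of this kind typically encodes the tokenized English input and then decodes greedily, one Japanese token at a time, attending over the encoder states until an end-of-sentence symbol is produced. A rough sketch with hypothetical helpers (encode, decode_step, and the vocabulary tables), not the script's actual interface:

def translate(sentence, max_len=50):
    # Encode the source sentence into states the decoder can attend over.
    enc_states, state = encode([en_word2id[w] for w in sentence.split()])
    out, prev = [], BOS_ID
    for _ in range(max_len):
        # One greedy decoder step: attend, predict the most likely next token.
        prev, state = decode_step(prev, state, enc_states)
        if prev == EOS_ID:
            break
        out.append(prev)
    return ''.join(ja_id2word[i] for i in out)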

Future work

  • Skip-gram model
  • Continuous-BoW model
  • Encoder-decoder + local attention
  • Peephole LSTM
  • GRU
  • etc.