Neural Programmer Interpreter

This is an attempt to implement the Neural Programmer-Interpreter (NPI) in Keras. Unfortunately, I wasn't able to reproduce the paper's results, but the code may still be useful to anyone who wants to implement this type of network in Keras. I found the LSTM seq2seq example helpful as a reference.

Usage

Generate data

python -m npi.generate_inputs addition --min 1 --max 20 --examples-per 32 | python -m npi.generate_data --task addition > train.json
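
For orientation, the command above asks for 32 examples at each input length from 1 to 20 digits. The following is a minimal sketch of what such input generation could look like. It is not the code in `npi.generate_inputs`: the function name `make_addition_inputs`, the seed, and the choice to give both operands the same digit count are assumptions made for illustration.

```python
import random


def make_addition_inputs(min_digits=1, max_digits=20, examples_per=32, seed=0):
    """Return a list of (input0, input1) pairs of non-negative integers.

    Hypothetical helper: for each n in [min_digits, max_digits], both
    operands are drawn with exactly n digits.
    """
    rng = random.Random(seed)
    pairs = []
    for n in range(min_digits, max_digits + 1):
        # Smallest and largest n-digit numbers (0 is allowed when n == 1).
        low = 0 if n == 1 else 10 ** (n - 1)
        high = 10 ** n - 1
        for _ in range(examples_per):
            pairs.append((rng.randint(low, high), rng.randint(low, high)))
    return pairs


pairs = make_addition_inputs()
print(len(pairs))  # 20 lengths * 32 examples = 640
```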

Train model

python -m npi.train --train train.json --encoder addition_encoder.h5 --npi-core npi_core.h5

Inference

python -m npi.inference --encoder addition_encoder.h5 --npi npi_core.h5 --out inference.json addition --input0 579 --input1 1221
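
As a sanity check on the inference output, the final scratch pad for the command above should hold 579 + 1221 = 1800. The sketch below walks the columns from right to left the way grade-school addition does, which is the procedure the paper's addition task is built on. It is a plain Python illustration, not this repository's environment; `add_on_scratchpad` and the row layout are hypothetical.

```python
def add_on_scratchpad(input0, input1):
    """Add two non-negative integers column by column, right to left.

    Hypothetical illustration: the scratch pad is modelled as four rows
    of digits (two inputs, carry, output), least significant digit first.
    """
    a = [int(d) for d in reversed(str(input0))]
    b = [int(d) for d in reversed(str(input1))]
    width = max(len(a), len(b)) + 1  # one extra column for a final carry
    a += [0] * (width - len(a))
    b += [0] * (width - len(b))
    carry = [0] * (width + 1)
    out = [0] * width

    for col in range(width):
        total = a[col] + b[col] + carry[col]
        out[col] = total % 10         # write the output digit for this column
        carry[col + 1] = total // 10  # carry into the next, more significant column

    # Reverse back to most-significant-first; int() drops leading zeros.
    return int("".join(str(d) for d in reversed(out)))


print(add_on_scratchpad(579, 1221))  # 1800
```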

Animate the inference output
```
python -m npi.animate inference.json
```
You can also animate a training example, since the training data uses the same format as the inference output.
```
python -m npi.animate train.json --idx 1
```

Notes

The authors report that their model learns addition of numbers up to 3000 digits long with 100% accuracy after training on examples of 1-20 digits. I wasn't able to get all validation accuracies to 100% when training on examples of 1-20 digits. An excerpt from late in a 20-epoch training run:

Epoch 18/20
548/548 [==============================] - 381s 695ms/sample - loss: 0.0095 - stop_layer_loss: 9.4365e-06 - program_key_embedding_layer_loss: 2.5972e-05 - arguments_layer_loss: 0.0084 - stop_layer_weighted_acc: 1.0000 - program_key_embedding_layer_weighted_acc: 1.0000 - arguments_layer_weighted_acc: 0.9971 - val_loss: 0.0068 - val_stop_layer_loss: 2.5042e-06 - val_program_key_embedding_layer_loss: 1.6411e-05 - val_arguments_layer_loss: 0.0068 - val_stop_layer_weighted_acc: 1.0000 - val_program_key_embedding_layer_weighted_acc: 1.0000 - val_arguments_layer_weighted_acc: 0.9974
Epoch 19/20
548/548 [==============================] - 381s 695ms/sample - loss: 0.0045 - stop_layer_loss: 3.2207e-06 - program_key_embedding_layer_loss: 1.2200e-05 - arguments_layer_loss: 0.0034 - stop_layer_weighted_acc: 1.0000 - program_key_embedding_layer_weighted_acc: 1.0000 - arguments_layer_weighted_acc: 0.9990 - val_loss: 9.0034e-04 - val_stop_layer_loss: 1.4168e-06 - val_program_key_embedding_layer_loss: 7.5872e-06 - val_arguments_layer_loss: 8.8430e-04 - val_stop_layer_weighted_acc: 1.0000 - val_program_key_embedding_layer_weighted_acc: [truncated] - val_arguments_layer_weighted_acc: 0.9787

I just used a Dense layer with a softmax activation for the program embedding layer.
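
To make this note concrete: a Dense layer with a softmax activation turns the core's hidden state into a probability distribution over programs, so picking the next program becomes ordinary classification. The sketch below spells out that computation in plain Python rather than Keras. The sizes, weights, and the name `dense_softmax` are made up for illustration and do not come from this repository.

```python
import math


def dense_softmax(x, weights, bias):
    """Compute softmax(W x + b) for one input vector.

    `weights` has one row per output class (here: one row per program).
    """
    logits = [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 3-dimensional hidden state and 4 candidate programs.
hidden = [0.5, -1.0, 2.0]
W = [[0.1, 0.2, 0.3],
     [0.0, -0.5, 0.4],
     [1.0, 0.0, -0.2],
     [-0.3, 0.3, 0.1]]
b = [0.0, 0.1, -0.1, 0.0]

probs = dense_softmax(hidden, W, b)
# The predicted program is the index with the highest probability.
next_program = max(range(len(probs)), key=probs.__getitem__)
print(probs, next_program)
```
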
Using the adaptive sampling from the paper didn't make a difference for me.
Training on the addition dataset (32 examples for each input length from 1 to 20) takes a few hours.
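
For readers unfamiliar with the adaptive sampling mentioned above: as I understand the paper, training examples are fetched more often for programs the model currently predicts poorly, with the sampling frequency set by a softmax over per-program average error and a temperature. The following is a rough sketch of that idea under those assumptions, not the code used here. `sampling_weights`, the error values, and the temperature are hypothetical.

```python
import math
import random


def sampling_weights(avg_error, temperature=1.0):
    """Softmax over per-program average error, returned as {program: probability}."""
    m = max(avg_error.values())
    exps = {name: math.exp((err - m) / temperature)
            for name, err in avg_error.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


# Hypothetical running error rates per program (made-up numbers).
avg_error = {"ADD": 0.02, "CARRY": 0.30, "LSHIFT": 0.05}
weights = sampling_weights(avg_error, temperature=0.1)

# Draw the programs whose examples would fill the next mini-batch.
rng = random.Random(0)
names = list(weights)
batch_programs = rng.choices(names, weights=[weights[n] for n in names], k=8)
print(weights)
print(batch_programs)
```

A lower temperature concentrates sampling on the worst-performing program, while a high temperature approaches uniform sampling.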
