forked from amasky/ram

Recurrent Attention Model with Chainer

RaffEdwardBAH/ram

Recurrent Attention Model

Recurrent Attention Model implemented with Chainer, based on the following paper:
arXiv:1406.6247: Recurrent Models of Visual Attention [Volodymyr Mnih+ 2014]

Features

  • RAM model definition file (Chainer)
  • script for training the model on MNIST
  • script to run the model on MNIST

Not yet implemented

  • hyper-parameters that reproduce the best accuracy reported in the paper
  • multi-scale glimpse
  • models to solve "Translated MNIST" task
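For orientation, the multi-scale glimpse listed above could be sketched roughly as follows. This is not code from this repository. It is a minimal NumPy sketch that assumes the scheme described in the paper: several square patches centered on one location, each twice as wide as the previous one, all reduced to the same resolution. The function name `extract_glimpse`, its parameters, the use of pixel coordinates for the location, and average pooling as the resizing method are all assumptions made for illustration.

```python
import numpy as np


def extract_glimpse(image, center, size=8, n_scales=3):
    """Hypothetical multi-scale glimpse: returns an array (n_scales, size, size).

    image  : 2-D array of shape (H, W)
    center : (row, col) in pixel coordinates (an assumption of this sketch)
    size   : width of the smallest patch; must be even in this sketch
    """
    patches = []
    for k in range(n_scales):
        factor = 2 ** k            # each scale doubles the patch width
        width = size * factor
        half = width // 2
        # Zero-pad so that patches reaching past the border stay well defined.
        padded = np.pad(image, half, mode="constant")
        r, c = center[0] + half, center[1] + half  # shift into padded coordinates
        patch = padded[r - half:r + half, c - half:c + half]
        # Average-pool factor x factor blocks to get back to (size, size).
        pooled = patch.reshape(size, factor, size, factor).mean(axis=(1, 3))
        patches.append(pooled)
    return np.stack(patches)


# Usage on a dummy 28x28 image (the MNIST image size).
glimpse = extract_glimpse(np.ones((28, 28)), center=(14, 14))
print(glimpse.shape)
```

Average pooling keeps the sketch dependency-free. An actual implementation might resize the patches differently.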

Examples

[Figures: Ex.1, Ex.2, Ex.3]

Training the model without LSTM takes a day on a CPU and reaches 96% accuracy.
[Figure: loss and accuracy]

Training the model with LSTM takes ??? on a CPU
(still searching for the hyper-parameters that reproduce the best accuracy in the paper...)

Dependencies

Python (2 or 3), Chainer, scikit-learn, PIL, tqdm

Usage

python train.py   

To use a GPU, add the option "-g deviceID".
To use LSTM units in the core RNN layer, add the option "--lstm"
(better performance, but training takes somewhat longer).

python train.py -g 0 --lstm  
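
As a rough illustration of how the two options above might be handled, here is a minimal argparse sketch. It is not taken from train.py. The long option name `--gpu`, the default of -1 meaning "CPU", and the helper names `build_parser` and `describe` are assumptions made for this example.

```python
import argparse


def build_parser():
    # Hypothetical parser; the real train.py may define its options differently.
    parser = argparse.ArgumentParser(description="Train RAM on MNIST (sketch)")
    # -g deviceID: GPU device to use. The -1 default (CPU) is an assumption.
    parser.add_argument("-g", "--gpu", type=int, default=-1,
                        help="GPU device ID (negative value means CPU)")
    # --lstm: use LSTM units in the core RNN layer.
    parser.add_argument("--lstm", action="store_true",
                        help="use LSTM units in the core RNN layer")
    return parser


def describe(args):
    # Summarize the parsed options as a short human-readable string.
    device = "CPU" if args.gpu < 0 else "GPU {}".format(args.gpu)
    core = "LSTM" if args.lstm else "plain RNN"
    return "device={}, core={}".format(device, core)


# Equivalent to running: python train.py -g 0 --lstm
args = build_parser().parse_args(["-g", "0", "--lstm"])
print(describe(args))
```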

After training, you can get predictions from the trained model.

python predict.py -m ram_wolstm.chainermodel  
