Skip to content
Code for the paper "Long Short-Term Memory with Dynamic Skip Connections".
Branch: master
Clone or download
Pull request Compare This branch is even with lecholin:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.

Long Short-Term Memory with Dynamic Skip Connections

TensorFlow implementation of Long Short-Term Memory with Dynamic Skip Connections (AAAI 2019).


The code is partially referred to


  • Python 2.7 or higher
  • Numpy
  • TensorFlow 1.0


Language Modeling Task

To evaluate the model on the Penn Treebank language model corpus, you can run the following command under the directory of ./LanguageModeling.

$ python --rnn_size 650 --rnn_layers 2 --lamda 1 --n_actions 5

The hyper-parameters in this model are:

Parameter Meaning Default
rnn_size size of LSTM internal state 650
rnn_layers number of layers in the LSTM 2
dropout dropout rate 0.5
lamda lambda value of RL 1
learning_rate initial learning rate 1.0
lr_threshold learning rate stops decreasing after threshold 0.005
param_init initialize parameter range of weights 0.05
n_actions number of actions to take 5

You can change these values by passing the arguments explicitly.

Number Prediction Task

Given a sequence of L positive integers $x_{T-1}$ , and the label $y = x_{x_{T-1}}​$, the ideal network should learn to ignore the remaining useless numbers and learn how to skip from the input data. Here are two examples to illustrate the task:


To investigate this task of number prediction, you should produce datasets first under the directory of ./NumberPrediction by running:

$ python

Then, you can reproduce the results on number prediction dataset (with sequence length = 10+1) by executing:

$ python


If you find this code useful, please cite us in your work:

  title={Long Short-Term Memory with Dynamic Skip Connections},
  author={Gui, Tao and Zhang, Qi and Zhao, Lujun and Lin, Yaosong and Peng, Minlong and Gong, Jingjing and Huang, Xuanjing},
  journal={arXiv preprint arXiv:1811.03873},
You can’t perform that action at this time.