GitHub - rtaubes/lstm-1: How to define a prediction interval

How to define a prediction interval for LSTM network for time series.

This is the description of the software used in "How to define a prediction interval based on the training set for an LSTM network for time series".

One possible application of long short-term memory (LSTM) neural networks is forecasting. Many articles and books describe how to implement and use an LSTM to predict, for example, airline passenger counts or stock market trends. Once a value has been forecast, an important question remains:
What is the possible range of the predicted value? In other words, if the predicted value is X, the actual value is expected to lie in the range [X-x0, X+x1], where
X - predicted value,
x0 and x1 - distances from X to the lower and upper bounds of the prediction
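
As a minimal sketch of the idea, one way to obtain x0 and x1 is to take empirical quantiles of the residuals (actual minus predicted) on the training set. This is an assumption made for illustration and not necessarily the method used in this repository; the function name `prediction_interval`, the 95% coverage level, and the synthetic data are all hypothetical.

```python
import numpy as np

def prediction_interval(actual, predicted, coverage=0.95):
    """Return (x0, x1) such that roughly `coverage` of the actual values
    fall inside [predicted - x0, predicted + x1]."""
    residuals = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    alpha = (1.0 - coverage) / 2.0
    lower_q, upper_q = np.quantile(residuals, [alpha, 1.0 - alpha])
    # residual = actual - predicted, so actual lies in
    # [predicted + lower_q, predicted + upper_q]; hence x0 = -lower_q.
    return -lower_q, upper_q

# Synthetic stand-in for (actual, predicted) pairs from a trained model.
rng = np.random.default_rng(0)
actual = rng.normal(size=1000)
predicted = actual + rng.normal(scale=0.1, size=1000)

x0, x1 = prediction_interval(actual, predicted)
inside = (actual >= predicted - x0) & (actual <= predicted + x1)
coverage_on_train = inside.mean()
```

For a biased model one of the two values can come out negative; the interval [X-x0, X+x1] is still well defined in that case.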

The LSTM model is based on TensorFlow. The data comes from Bitcoin Historical Data on Kaggle.

The data generator in the 'src' folder can generate batches from CSV files as well as simple sequences, such as a sine wave, for debugging. The data source is used as follows:

create a data generator

  import data_generator
  gen = data_generator.BatchGenerator(...)

train a model by iterating over the generator

  for idx, x_batches, y_batches in gen:
      ...  # run one training step on this batch

if required, rewind the data generator to the beginning with the reset() method. For example:

  gen.reset()
  for idx, x_batches, y_batches in gen:
      ...  # process the batches again from the start
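
To make the iterate-then-reset pattern concrete, here is a small self-contained sketch of a resettable generator over a sine wave. `SineBatchGenerator` and its parameters are hypothetical stand-ins; the real `BatchGenerator` in 'src' may be organized differently.

```python
import numpy as np

class SineBatchGenerator:
    """Hypothetical stand-in for BatchGenerator: yields (idx, x_batch, y_batch)."""

    def __init__(self, n_points=200, window=10, batch_size=4):
        self.series = np.sin(np.linspace(0.0, 8.0 * np.pi, n_points))
        self.window = window
        self.batch_size = batch_size
        self.reset()

    def reset(self):
        """Rewind to the first batch."""
        self._pos = 0
        self._idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        n_samples = len(self.series) - self.window  # number of (window, target) pairs
        if self._pos >= n_samples:
            raise StopIteration
        stop = min(self._pos + self.batch_size, n_samples)
        starts = range(self._pos, stop)
        # Each input is `window` consecutive values; the target is the next value.
        x = np.stack([self.series[s:s + self.window] for s in starts])
        y = np.array([self.series[s + self.window] for s in starts])
        idx = self._idx
        self._pos, self._idx = stop, self._idx + 1
        return idx, x, y

gen = SineBatchGenerator()
first_pass = [idx for idx, x, y in gen]
exhausted = [idx for idx, x, y in gen]   # empty: the generator has run out
gen.reset()
second_pass = [idx for idx, x, y in gen]
```

Without the call to `reset()`, a second loop over an exhausted generator produces no batches, which is why the rewind step is needed before another pass.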

The evaluate() method of Model relies on reset(). It:

- temporarily sets the batch size to 1
- resets the generator using reset()
- makes predictions using the generator's data
- returns the actual and predicted values

The results of evaluate() can be used to assess the quality of a trained model.
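
The four steps listed above can be sketched roughly as follows. This is not the notebook's code: `TinyGenerator`, the free function `evaluate`, and the `predict_fn` callable are hypothetical names, and the "model" is a naive one that repeats its input.

```python
import numpy as np

class TinyGenerator:
    """Hypothetical generator: x is the current value, y is the next value."""

    def __init__(self, series, batch_size):
        self.series = np.asarray(series, dtype=float)
        self.batch_size = batch_size
        self.reset()

    def reset(self):
        self._pos = 0

    def __iter__(self):
        idx = 0
        last = len(self.series) - 1
        while self._pos < last:
            stop = min(self._pos + self.batch_size, last)
            x = self.series[self._pos:stop]
            y = self.series[self._pos + 1:stop + 1]
            self._pos = stop
            yield idx, x, y
            idx += 1

def evaluate(gen, predict_fn):
    """Return (actual, predicted) arrays, one value per time step."""
    saved_batch_size = gen.batch_size
    gen.batch_size = 1                 # temporarily predict one sample at a time
    try:
        gen.reset()                    # start from the beginning of the data
        actual, predicted = [], []
        for _, x, y in gen:
            predicted.append(float(predict_fn(x)[0]))
            actual.append(float(y[0]))
    finally:
        gen.batch_size = saved_batch_size   # restore even if prediction fails
    return np.array(actual), np.array(predicted)

gen = TinyGenerator([1.0, 2.0, 3.0, 4.0], batch_size=2)
actual, predicted = evaluate(gen, predict_fn=lambda x: x)  # naive "repeat last value"
mean_abs_error = float(np.mean(np.abs(actual - predicted)))
```

The `try`/`finally` is a design choice in this sketch so that the original batch size is restored even when a prediction raises an exception.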

Once this pass has finished, a future value can be predicted from the last batch of length 1.
When a new input value arrives, it can be added to the generator, which is then ready to produce a new batch of data for the model.
Alternatively, a predicted value can be added to the generator and used to predict the next value. This is useful when more than one value has to be predicted.
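
The second option, feeding predictions back in as inputs, can be illustrated with a short sketch. It is not part of the repository: `forecast`, `linear_model`, and the window length are hypothetical, and a plain `deque` stands in for the generator.

```python
from collections import deque

def forecast(history, predict_fn, steps, window=3):
    """Predict `steps` future values, feeding each prediction back as input."""
    buf = deque(history[-window:], maxlen=window)
    predictions = []
    for _ in range(steps):
        next_value = predict_fn(list(buf))
        predictions.append(next_value)
        buf.append(next_value)  # the oldest value drops out automatically
    return predictions

def linear_model(window_values):
    """Toy model: extrapolate linearly from the last two values."""
    return window_values[-1] + (window_values[-1] - window_values[-2])

preds = forecast([1.0, 2.0, 3.0], linear_model, steps=3)
# preds == [4.0, 5.0, 6.0]
```

Because each step consumes earlier predictions, errors can accumulate as the horizon grows.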

Note that prediction is currently used only to estimate a prediction interval. Forecasting future values is outside the current scope and will be implemented later.

The LSTM model saves a checkpoint every 10 steps (this value can be changed) and at the end of each training run. This makes interactive training possible:

define in the settings how many epochs are used for training
create a model instance as model = Model(SETS)
call the create() method, which removes all previous checkpoints for this model
call the train() method of the model
evaluate the quality of the model
if the model needs more training, call the train() method again
repeat the last two steps as many times as needed

Another possibility is to call the from_latests_point() method after a new model has been created. In that case, training resumes from the latest checkpoint rather than starting from scratch.
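
The interactive workflow and the resume-from-latest option can be mimicked with a toy stand-in that needs no TensorFlow. Everything here is hypothetical: `ToyModel` is not the notebook's Model, its "weights" are just a step counter, JSON files stand in for real checkpoints, and `resume_from_latest` only plays the role of from_latests_point().

```python
import json
import shutil
import tempfile
from pathlib import Path

class ToyModel:
    """Hypothetical stand-in for Model; the only state is a step counter."""

    def __init__(self, settings):
        self.ckpt_dir = Path(settings["ckpt_dir"])
        self.steps_per_call = settings["steps_per_call"]
        self.save_every = settings.get("save_every", 10)
        self.step = 0

    def create(self):
        """Start from scratch: remove all earlier checkpoints."""
        shutil.rmtree(self.ckpt_dir, ignore_errors=True)
        self.ckpt_dir.mkdir(parents=True)
        self.step = 0

    def _save(self):
        path = self.ckpt_dir / f"ckpt-{self.step}.json"
        path.write_text(json.dumps({"step": self.step}))

    def train(self):
        for _ in range(self.steps_per_call):
            self.step += 1  # a real model would run one optimization step here
            if self.step % self.save_every == 0:
                self._save()
        self._save()  # always checkpoint at the end of a training run

    def resume_from_latest(self):
        """Load the checkpoint with the highest step number."""
        latest = max(self.ckpt_dir.glob("ckpt-*.json"),
                     key=lambda p: int(p.stem.split("-")[1]))
        self.step = json.loads(latest.read_text())["step"]

SETS = {"ckpt_dir": Path(tempfile.mkdtemp()) / "toy", "steps_per_call": 25}
model = ToyModel(SETS)
model.create()
model.train()
model.train()                 # continue training in the same session

resumed = ToyModel(SETS)      # e.g. after restarting the notebook
resumed.resume_from_latest()  # picks up at the last saved step
saved_steps = sorted(int(p.stem.split("-")[1])
                     for p in SETS["ckpt_dir"].glob("ckpt-*.json"))
```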

The third method starts training or evaluation from a specific saved point. Use the from_custom_point() method, whose argument is the path to a checkpoint. The path must not include a file extension. For example, if a checkpoint consists of the 3 files 'ckpt-1.data', 'ckpt-1.index', 'ckpt-1.data-...', use 'ckpt-1' as the path to the checkpoint.
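
If only the checkpoint file names are at hand, the extension-free path can be derived mechanically. The helper below is a hypothetical convenience, not part of the repository, and the file names in the example are illustrative.

```python
import re

def checkpoint_prefix(filename):
    """Strip a trailing '.index' or '.data...' suffix from a checkpoint file name."""
    return re.sub(r"\.(index|data(-[^/]*)?)$", "", filename)

names = ["ckpt-1.index", "ckpt-1.data", "ckpt-1.data-00000-of-00001"]
prefixes = {checkpoint_prefix(name) for name in names}
# prefixes == {"ckpt-1"}
```

The resulting prefix is the kind of value that would be passed to from_custom_point().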

There are two sets of settings in the notebook. One is used for debugging the algorithm and its settings on a laptop without a GPU.
The other is used on a computer with a GPU. This algorithm was trained on a Google computer with a GPU.

TODO:

implement the predict() method of Model
move the Model class outside of the notebook
implement an algorithm for retraining when a received value falls outside the calculated confidence interval
