
Time series prediction #862

Closed
ddovod opened this issue Jan 30, 2017 · 10 comments

ddovod commented Jan 30, 2017

Hello guys.
I'm trying to reproduce this simple Keras tutorial (only the "sinwave" part).
So I have the following model:

    const size_t rho = 50;
    RNN<MeanSquaredError<>> model(rho);

    // Two stacked LSTM layers with dropout, as in the Keras tutorial.
    model.Add<LSTM<>>(1, 50, rho);    // 1 input dimension, 50 hidden units.
    model.Add<Dropout<>>(0.2);
    model.Add<LSTM<>>(50, 100, rho);  // 50 inputs, 100 hidden units.
    model.Add<Dropout<>>(0.2);
    model.Add<Linear<>>(100, 1);      // Single output value.

    RMSprop<decltype(model)> opt(model);

The main issue is in the training phase. I would like to predict the next value after feeding in some number (50) of previous values. The model.Train(...) function takes two matrices, the inputs and the expected outputs, and it throws "error: Mat::rows(): indices out of bounds or incorrectly used" unless both matrices have shape (rho, sequences_count). In the Keras tutorial they have only one output value per input sequence.
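For concreteness, a sketch of the call in question (trainData/trainLabels are placeholder names, and I'm assuming the Train overload that takes the optimizer):

    // Throws "Mat::rows(): indices out of bounds or incorrectly used"
    // unless both matrices have shape (rho, sequences_count).
    arma::mat trainData(rho, sequencesCount);
    arma::mat trainLabels(rho, sequencesCount);
    // ... fill with windows of the sine wave ...
    model.Train(trainData, trainLabels, opt);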
So the main question is: how can I reproduce the Keras example with mlpack? I can PR this as an example, a test, or something similar if you help me a bit.
Thank you a lot!

zoq (Member) commented Jan 31, 2017

Hello,

As you already pointed out, the model from the Keras tutorial uses a single response for the complete sequence, but the model as implemented right now expects one output for each input. In your case, that's something like:

data: [x0, x1, x2, x3, x4, x5, x6, x7, x8]
input seq [x0, x1, x2]
response [x1, x2, x3]

You could also use:

input seq [x0, x1, x2]
response [x3, x3, x3]

So the data should look like:

    arma::mat data = arma::zeros<arma::mat>(rho, sequences);
    arma::mat labels = arma::zeros<arma::mat>(rho, sequences);
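
As a concrete illustration (the function name is hypothetical), a minimal sketch of packing a sine wave into these matrices with the shifted-by-one response scheme above:

    #include <armadillo>
    #include <cmath>

    // Fill (rho x sequences) input and label matrices from a sine wave;
    // the label at step t is the input at step t + 1 (the "next value").
    void MakeSineData(const size_t rho, const size_t sequences,
                      arma::mat& data, arma::mat& labels)
    {
      data.set_size(rho, sequences);
      labels.set_size(rho, sequences);
      for (size_t s = 0; s < sequences; ++s)
      {
        for (size_t t = 0; t < rho; ++t)
        {
          data(t, s) = std::sin(0.05 * (s + t));
          labels(t, s) = std::sin(0.05 * (s + t + 1));
        }
      }
    }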

On the other hand, it's fairly easy to replicate that behavior; in fact, it worked before the ann revamp. I'll open an issue; maybe someone would like to implement the feature, and if not, I'll do it.

I hope this was helpful.

ddovod (Author) commented Jan 31, 2017

Hi @zoq. Thank you for the response!
So, if I use x0, x1, x2 as input and x1, x2, x3 as output, will it train to predict the next element in the sequence at every learning step? Does the model preserve state (the internal state of the LSTM layers) between the values passed within a sequence?
And can I use something like incremental prediction? I mean, I want to feed values to the model one by one and get one output per input, without the model resetting its internal state.

zoq (Member) commented Jan 31, 2017

  1. Yes, that's right.
  2. I'm not sure what you mean here.
  3. By "one by one" do you mean one input from the sequence? Predict returns a matrix that holds the prediction for each step. But since the RNN class doesn't implement a Forward function, you can't just call the model with some element from the input sequence and get the intermediate output. You could use the Sequential layer to do that, but in this case you have to call the Forward, Backward and Gradient functions yourself. Do you think it would be useful to provide such an interface for the RNN class?
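
For reference, a hedged sketch of the batch path that does exist today (testData is a hypothetical matrix with the same (rho x sequences) layout as the training data):

    // Predict() runs complete sequences in one call; there is no
    // incremental entry point that preserves state between calls.
    arma::mat predictions;
    model.Predict(testData, predictions);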

ddovod (Author) commented Jan 31, 2017

Saying "state" I mean this, so I can make something like online training/predictions with different sequence lengths.
One of the common LSTM feature is the ability to predict time series and other sequential data, and I think it could be damn useful to have some regression-like api for lstm (and other rnns) to solve forecasting tasks. What do you think about it?

zoq (Member) commented Feb 1, 2017

The implemented LSTM layer can handle data points of variable length, but since the cell states are reset at each sequence, it is stateless per the Keras definition. In a stateful model you have to specify the batch size in the LSTM layer, which isn't implemented at the moment.

I agree, a simple regression interface would be great; especially for time series forecasting, a stateful model would be super helpful. I can open an issue for that if you like.

ddovod (Author) commented Feb 1, 2017

Sure, thank you very much!
By the way, I could not have imagined it would be so difficult to find a working LSTM implementation with a good API in C++ (I'm not experienced enough in deep learning to build it from scratch). I've tried caffe (a very poor and absolutely non-idiomatic LSTM that works badly), dynet (a very low-level API and cryptic examples), mxnet (how is it even supposed to be used from C++?), nnetcpp (just doesn't work), rnnlib (a big piece of spaghetti code, with all due respect to Alex Graves and his work), and keras2cpp (a good idea, but it doesn't support recurrent layers). mlpack seems to be a very nice and clean ML library, and I hope I will be able to use it for my experiments.
Thank you again!

zoq (Member) commented Feb 2, 2017

Sorry for the slow response. I agree, it's sometimes difficult to use all these libraries. If you tell us more about what you'd like to achieve, maybe we can push forward in that direction. And it's just an idea, but I think improving the recurrent neural network infrastructure could be an interesting GSoC project. Let me know what you think.

ddovod (Author) commented Feb 5, 2017

My main task right now is to build and train a model that predicts a multivariate time series "step by step": I have a continuously running process which generates data, and I want to predict some parameters of this process several steps forward. A stateless LSTM implementation can be used for this kind of task, but I don't want to pass 50 (or so) previous data slices to predict the next value at each step; that's why I need a stateful LSTM, so I can pass only the new data and read the network output at each new time step.
About common NN/RNN improvements: I think GPU support would be cool, so that models which currently train for a week on a (near top-of-the-line) CPU could do the same job in several hours.
https://github.com/jcjohnson/cnn-benchmarks

zoq (Member) commented Feb 7, 2017

Thanks for these valuable notes; I will definitely take a closer look at how we could design a regression API on top of the recurrent network class. Regarding GPU acceleration, you can link against NVIDIA NVBLAS, a GPU-accelerated implementation of BLAS that can accelerate most BLAS Level-3 routines. However, I agree it would be nice to have a fully GPU-accelerated interface which also includes the convolution operator; I guess that will come in the near future.
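
A hedged sketch of one way to wire that up on Linux (library paths are illustrative and depend on the local CUDA/BLAS installation):

    # nvblas.conf: tell NVBLAS which CPU BLAS to fall back on for
    # routines it does not accelerate.
    NVBLAS_CPU_BLAS_LIB /usr/lib/libopenblas.so

    # Preload NVBLAS so it intercepts the Level-3 BLAS calls that
    # Armadillo (and therefore mlpack) issues:
    NVBLAS_CONFIG_FILE=nvblas.conf LD_PRELOAD=libnvblas.so ./mlpack_program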

rcurtin (Member) commented Apr 24, 2017

Closing for inactivity.
