
Time series prediction #862

Closed
ddovod opened this issue Jan 30, 2017 · 10 comments

ddovod commented Jan 30, 2017

Hello guys.
I'm trying to reproduce this simple Keras tutorial (only the "sinwave" part).
So I have the following model:

    const size_t rho = 50;
    RNN<MeanSquaredError<>> model(rho);

    // Two stacked LSTM layers with dropout, as in the Keras tutorial.
    model.Add<LSTM<>>(1, 50, rho);    // 1 input dimension, 50 hidden units.
    model.Add<Dropout<>>(0.2);
    model.Add<LSTM<>>(50, 100, rho);  // 50 inputs, 100 hidden units.
    model.Add<Dropout<>>(0.2);
    model.Add<Linear<>>(100, 1);      // Single output value.

    RMSprop<decltype(model)> opt(model);

The main issue is in the training phase. I would like to predict the next value after feeding in some number (50) of previous values. The model.Train(...) function takes two matrices, the inputs and the expected outputs, and it throws "error: Mat::rows(): indices out of bounds or incorrectly used" unless both matrices have shape (rho, sequences_count). In the Keras tutorial they have only one output value per input sequence.
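For concreteness, a sketch of the call in question (trainData/trainLabels are placeholder names, and I'm assuming the Train overload that takes the optimizer):

    // Throws "Mat::rows(): indices out of bounds or incorrectly used"
    // unless both matrices have shape (rho, sequences_count).
    arma::mat trainData(rho, sequencesCount);
    arma::mat trainLabels(rho, sequencesCount);
    // ... fill with windows of the sine wave ...
    model.Train(trainData, trainLabels, opt);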
So the main question is: how can I reproduce the Keras example with mlpack? I can PR this as an example, a test, or something similar if you help me a bit.
Thank you a lot!

zoq (Member) commented Jan 31, 2017

Hello,

As you already pointed out, the model from the Keras tutorial uses a single response for the complete sequence, but the model as implemented right now expects one output for each input. In your case, that's something like:

data: [x0, x1, x2, x3, x4, x5, x6, x7, x8]
input seq [x0, x1, x2]
response [x1, x2, x3]

You could also use:

input seq [x0, x1, x2]
response [x3, x3, x3]

So the data should look like:

    arma::mat data = arma::zeros<arma::mat>(rho, sequences);
    arma::mat labels = arma::zeros<arma::mat>(rho, sequences);
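
As a concrete illustration (the function name is hypothetical), a minimal sketch of packing a sine wave into these matrices with the shifted-by-one response scheme above:

    #include <armadillo>
    #include <cmath>

    // Fill (rho x sequences) input and label matrices from a sine wave;
    // the label at step t is the input at step t + 1 (the "next value").
    void MakeSineData(const size_t rho, const size_t sequences,
                      arma::mat& data, arma::mat& labels)
    {
      data.set_size(rho, sequences);
      labels.set_size(rho, sequences);
      for (size_t s = 0; s < sequences; ++s)
      {
        for (size_t t = 0; t < rho; ++t)
        {
          data(t, s) = std::sin(0.05 * (s + t));
          labels(t, s) = std::sin(0.05 * (s + t + 1));
        }
      }
    }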

On the other hand, it's fairly easy to replicate that behavior; in fact, it worked before the ann revamp. I'll open an issue; maybe someone would like to implement the feature, and if not, I'll do it.

I hope this was helpful.

ddovod (Author) commented Jan 31, 2017

Hi @zoq. Thank you for the response!
So, if I use x0, x1, x2 as input and x1, x2, x3 as output, will it train to predict the next element in the sequence at every learning step? Does the model preserve state (the internal state of the LSTM layers) between the values passed within a sequence?
And can I use something like incremental prediction? I mean, I want to feed values to the model one by one and get one output per input, without the model resetting its internal state.

zoq (Member) commented Jan 31, 2017

  1. Yes, that's right.
  2. I'm not sure what you mean here.
  3. By "one by one" do you mean one input from the sequence? Predict returns a matrix that holds the prediction for each step. But since the RNN class doesn't implement a Forward function, you can't just call the model with some element from the input sequence and get the intermediate output. You could use the Sequential layer to do that, but in this case you have to call the Forward, Backward and Gradient functions yourself. Do you think it would be useful to provide such an interface for the RNN class?
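
For reference, a hedged sketch of the batch path that does exist today (testData is a hypothetical matrix with the same (rho x sequences) layout as the training data):

    // Predict() runs complete sequences in one call; there is no
    // incremental entry point that preserves state between calls.
    arma::mat predictions;
    model.Predict(testData, predictions);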

ddovod (Author) commented Jan 31, 2017

Saying "state" I mean this, so I can make something like online training/predictions with different sequence lengths.
One of the common LSTM feature is the ability to predict time series and other sequential data, and I think it could be damn useful to have some regression-like api for lstm (and other rnns) to solve forecasting tasks. What do you think about it?

zoq (Member) commented Feb 1, 2017

The implemented LSTM layer can handle data points of variable length, but since the cell states are reset at each sequence, it is stateless per the Keras definition. In a stateful model you have to specify the batch size in the LSTM layer, which isn't implemented at the moment.

I agree, a simple regression interface would be great; especially for time series forecasting, a stateful model would be super helpful. I can open an issue for that if you like.

ddovod (Author) commented Feb 1, 2017

Sure, thank you very much!
By the way, I could not have imagined it would be so difficult to find a working LSTM implementation with a good API in C++ (I'm not experienced enough in deep learning to build it from scratch). I've tried caffe (a very poor and absolutely non-idiomatic LSTM that works badly), dynet (a very low-level API and cryptic examples), mxnet (how is it even supposed to be used from C++?), nnetcpp (just doesn't work), rnnlib (a big piece of spaghetti code, with all due respect to Alex Graves and his work), and keras2cpp (a good idea, but it doesn't support recurrent layers). mlpack seems to be a very nice and clean ML library, and I hope I will be able to use it for my experiments.
Thank you again!

zoq (Member) commented Feb 2, 2017

Sorry for the slow response. I agree, it's sometimes difficult to use all these libraries. If you tell us more about what you'd like to achieve, maybe we can push forward in that direction. And it's just an idea, but I think improving the recurrent neural network infrastructure could be an interesting GSoC project. Let me know what you think.

ddovod (Author) commented Feb 5, 2017

My main task right now is to build and train a model that predicts a multivariate time series "step by step": I have a continuously running process which generates data, and I want to predict some parameters of this process several steps forward. A stateless LSTM implementation can be used for this kind of task, but I don't want to pass 50 (or so) previous data slices to predict the next value at each step; that's why I need a stateful LSTM, so I can pass only the new data and read the network output at each new time step.
About common NN/RNN improvements: I think GPU support would be cool, so that models which currently train for a week on a (near top-of-the-line) CPU could do the same job in several hours.
https://github.com/jcjohnson/cnn-benchmarks

zoq (Member) commented Feb 7, 2017

Thanks for these valuable notes; I will definitely take a closer look at how we could design a regression API on top of the recurrent network class. Regarding GPU acceleration, you can link against NVIDIA NVBLAS, a GPU-accelerated implementation of BLAS that can accelerate most BLAS Level-3 routines. However, I agree it would be nice to have a fully GPU-accelerated interface which also includes the convolution operator; I guess that will come in the near future.
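
A hedged sketch of one way to wire that up on Linux (library paths are illustrative and depend on the local CUDA/BLAS installation):

    # nvblas.conf: tell NVBLAS which CPU BLAS to fall back on for
    # routines it does not accelerate.
    NVBLAS_CPU_BLAS_LIB /usr/lib/libopenblas.so

    # Preload NVBLAS so it intercepts the Level-3 BLAS calls that
    # Armadillo (and therefore mlpack) issues:
    NVBLAS_CONFIG_FILE=nvblas.conf LD_PRELOAD=libnvblas.so ./mlpack_program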

rcurtin (Member) commented Apr 24, 2017

Closing for inactivity.
