
Multi-dimensional Sequential Inputs / unflattened float32 arrays #37

Closed
LRonHubs opened this issue Nov 28, 2016 · 9 comments

Comments

@LRonHubs

LRonHubs commented Nov 28, 2016

This library appears to be focused on the Model structure, which is okay, but most of my trained networks are in Sequential form. Specifically LSTM Sequentials.

LSTM Sequentials generally expect a three-dimensional input (batchSize, X, Y). However, this repo only accepts a single flattened input for Sequential, named "input". Has anyone found a good way to convert a Sequential to a Model, or a way around this?

I would like to just do prediction in javascript. That means the model shouldn't even need to be compiled, so maybe I can just hack something together myself.

@LRonHubs
Author

LRonHubs commented Nov 28, 2016

  • Add an embedding layer
  • Re-train your networks

If your embedding layer is too big, your network will be too slow, even for just predicting. So you'll probably want to test the latency on an untrained network before you train anything, to make sure everything runs smoothly.
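One quick way to do that latency check is to time repeated calls to predict on the untrained model. A minimal sketch (the `time_predict` helper and the `dummy_predict` stand-in are illustrative; in practice you would pass the untrained model's real predict callable):

```python
import time

def time_predict(predict, input_data, runs=10):
    """Return the average seconds per prediction over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        predict(input_data)
    return (time.perf_counter() - start) / runs

# Stand-in for an untrained model's predict call (illustration only).
dummy_predict = lambda x: [0.0] * 4
avg_latency = time_predict(dummy_predict, [0.0] * 100)
```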

@unsalted

@LRonHubs

Sorry to bug you, but I just caught this, and you're describing precisely the issue I spent a few hours dealing with last night. Which example are you working from for the embedding? Do you have one you'd mind sharing?

@LRonHubs
Author

LRonHubs commented Nov 29, 2016

Sure. I'm actually analyzing brainwaves recorded from an EEG, but it's probably easier to think of what I'm doing as analyzing price data from a stock.

Right now I have one stock's price that I'm analyzing. Obviously the stock's price changes with each trade. I have a one-dimensional input that varies over time:
[125, 200, 113, 93, 16, ...]

... where each value represents a stock price.

Previously I had hard-coded a way to convert this into a form suitable for the LSTM networks to receive (batch, X, Y). I really don't have a strong artificial-neural-network background; I like Keras because you can get pretty far without really knowing what you're doing. So I had no idea what an embedding layer was or how to use it.

It turns out an embedding layer's main job here is to convert a one-dimensional time signal into LSTM-acceptable input in the most effective way possible. Although I've yet to actually test this, I'm expecting the embedding layer to give better results than my custom hard-coded conversion, because the embedding layer uses trained weights while my hard-coded way was basically guesswork. It's a little annoying that the embedding layer made my predictions a lot slower, but this is so much easier than the alternatives that it's worth it.
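For anyone else who was fuzzy on this like I was: conceptually, an embedding layer is a trained lookup table. A rough numpy sketch (the sizes and names here are illustrative, not from the thread): each integer in the input sequence selects one row of a weight matrix learned during training, turning a 1-D sequence of ints into a 2-D (timesteps, features) array an LSTM can consume.

```python
import numpy as np

# Illustrative sizes: a 256-entry vocabulary mapped to 4-dim vectors.
vocab_size, embed_dim = 256, 4
W = np.random.rand(vocab_size, embed_dim).astype(np.float32)  # learned in training

seq = np.array([125, 200, 113])   # 1-D integer input sequence
embedded = W[seq]                 # shape (3, embed_dim): (timesteps, features)
```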

The first thing you'll have to do is make sure your input has the same length each time; pad it with 0s if the length of the relevant information isn't consistent. After that, you can hard-code your embedding layer with a length of constEmbedLength. You'll also need to figure out the maximum possible value your input can take. Fortunately for me, I'm only recording 8-bit numbers, so my values range between 0 and 255.
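The padding step can be sketched like this (`pad_input` is an illustrative helper, not part of Keras or keras-js): truncate or right-pad each sequence with zeros to the fixed length.

```python
# Substitute your real fixed length here.
constEmbedLength = 8

def pad_input(values, length=constEmbedLength):
    """Truncate or zero-pad a sequence to a fixed length."""
    padded = list(values[:length])       # truncate if too long
    padded += [0] * (length - len(padded))  # right-pad with zeros if too short
    return padded

pad_input([125, 200, 113])  # -> [125, 200, 113, 0, 0, 0, 0, 0]
```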

from keras.models import Sequential
from keras.layers import Embedding
from keras.layers.recurrent import LSTM

model = Sequential()
model.add(Embedding(256, LSTMneuronval1, input_length=constEmbedLength))
model.add(LSTM(input_dim=LSTMneuronval1, output_dim=LSTMoutput1))
# compile and train in Python

https://keras.io/layers/embeddings/ for more information

Once I trained my networks, I used encoder.py and model.to_json() to get the model JSON, weights buffer, and metadata files, as described in the readme. I moved these into a new folder called "models" in the root directory of keras-js.
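The export step looks roughly like this. This is a sketch, so a stub class stands in for the trained Keras model from the snippet above, and the encoder.py invocation in the final comment is how I understand the readme describes it (treat the exact paths and the modelID as illustrative):

```python
import json

class FakeModel:
    """Stand-in for the trained Keras model (illustration only)."""
    def to_json(self):
        return json.dumps({"class_name": "Sequential", "config": []})
    def save_weights(self, path):
        open(path, "wb").close()  # the real call writes HDF5 weights

model = FakeModel()  # in practice: the trained Sequential from above
modelID = "1480370035_0"

with open(modelID + ".json", "w") as f:
    f.write(model.to_json())           # architecture -> <modelID>.json
model.save_weights(modelID + ".hdf5")  # weights, to be fed to encoder.py
# then, from the keras-js checkout: python encoder.py 1480370035_0.hdf5
```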

Next, you can get rid of most of the junk in demos/src/index.js. One thing not included in the readme is how to build the demos folder: if you make changes to demos/src/index.js, you'll need to run npm run build:demos for the changes to show up on the webpage.

Here's my demos/src/index.js:

/* global KerasJS */
const modelID = '1480370035_0'
const MODEL_FILEPATHS_PROD = {
  model: '/models/' + modelID + '.json',
  weights: '/models/' + modelID + '.buf',
  metadata: '/models/' + modelID + '_metadata.json'
}
const MODEL_CONFIG = {
  filepaths: MODEL_FILEPATHS_PROD
}
const model = new KerasJS.Model(Object.assign({ gpu: false }, MODEL_CONFIG))

model.ready()
  .then(() => {
    alert('ready')
    // constEmbedLength must match the input_length used in the Python model.
    // An all-zero array is fine for testing.
    const array = new Float32Array(constEmbedLength)
    const inputData = {
      input: array
    }
    model.predict(inputData)
      .then(outputData => {
        alert('prediction successful')
        console.log(outputData)
      })
      .catch(err => {
        console.log(err)
        alert('error predicting; see console')
      })
  })
  .catch(err => {
    console.log(err)
    alert('error compiling; see console')
  })

After npm run build:demos completes without errors, run npm run server and go to localhost:3000 in Google Chrome. You should get an alert window when your model compiles and again when it completes a prediction. If you get an error, or if you want to see the output, right-click, choose "Inspect Element", then click "Console" (you have to use Chrome for this part). outputData should be printed if the prediction was successful. Again, I recommend doing this with an untrained network to test the latency. I also have yet to set gpu to true, but maybe that will speed things up.

To make the network faster, decrease either constEmbedLength or LSTMneuronval1 or LSTMoutput1.

@stekaiser
Contributor

I haven't thoroughly tested this myself, but I don't think you need an embedding layer in your case. Just supply your first LSTM layer with input of shape (batchSize, samplePoints, EEGchannels) and train the model in Keras.

In Keras.js you would then predict on one input, e.g. take the (1, samplePoints, EEGchannels) array, flatten it, and run the prediction.
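The flatten step above can be sketched in numpy (assuming C, i.e. row-major, order, which matches how you would fill a flat Float32Array; the sizes are illustrative):

```python
import numpy as np

samplePoints, EEGchannels = 4, 3   # illustrative sizes
sample = np.arange(samplePoints * EEGchannels, dtype=np.float32).reshape(
    1, samplePoints, EEGchannels)  # shape (1, samplePoints, EEGchannels)

flat = sample.ravel()  # the 1-D array handed to Keras.js as "input"
# index mapping: flat[t * EEGchannels + c] == sample[0, t, c]
```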

@LRonHubs
Author

So to test this I'm basically passing in a new Float32Array(samplePoints * EEGchannels) to networks without an embedding layer, and I'm getting the error:

Error: Specified shape incompatible with data.

@unsalted

Yeah, I ran into the same issue ^^^. I'll have to try to implement embedding when I'm no longer on deadline. Thanks so much for writing up your whole process! It will certainly help when I try to reconfigure my models. I'm definitely in a similar boat when it comes to Keras, but I feel like I'm picking it up. Good luck!

@stekaiser
Contributor

stekaiser commented Nov 30, 2016

I have tested this again and it works for me. It is documented here:

https://github.com/stekaiser/keras-js-lstm-demo/blob/master/LSTM%20test.ipynb
https://jsfiddle.net/stekaiser/p0624q65/15/ (make sure to open the Developer tools to see the output)

I hope that the example comes close to what you are trying to do.

The Keras.js implementation yields the same prediction as in the notebook. Please note that only the first input is predicted. For predicting the whole batch you'd have to look at this #27 (comment).

@LRonHubs
Author

LRonHubs commented Dec 14, 2016

Wow, thank you so much. It took me a while to work through, but I never would have had the conviction that this actually worked without your code.

For anyone working from my example, you can implement LSTM by basically making this change in your Python set-up:

model.add(LSTM(input_dim=LSTMneuronval1, output_dim=LSTMoutput1))

to

model.add(LSTM(batch_input_shape=(None, Xtimestep, LSTMneuronval1), output_dim=LSTMoutput1))

... where Xtimestep is the number of timesteps you'd like to evaluate per sample. You might be able to get away with setting this to None, but I doubt it, since that's probably just what input_dim is in the old line.

It's a simple change, but it's hard to grasp without looking at @stekaiser's example. Pay attention to the model.summary() call to get more info on your layers.
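A sanity-check sketch using the names from this thread: with batch_input_shape=(None, Xtimestep, LSTMneuronval1), the flat Float32Array you pass to Keras.js must hold exactly Xtimestep * LSTMneuronval1 values (the example values below are illustrative).

```python
Xtimestep = 10        # timesteps per sample
LSTMneuronval1 = 8    # features per timestep

def flat_input_length(timesteps, features):
    """Length of the flattened single-sample input expected by Keras.js."""
    return timesteps * features

flat_input_length(Xtimestep, LSTMneuronval1)  # -> 80
```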

@aguerra7002

Hello. I was hoping to do something similar but wasn't sure how to go about it. My model takes a 3D input of size n x m x 3 (an image file with r, g, and b channels) and passes it through a Conv2D layer whose output is 3D with dimensions n x m x 1. I have been trying to figure out how to implement this without an embedding layer, since ideally the input would be of unknown size. I looked at the mnist_cnn example, but it was hard to follow how the input was handled. From what I gathered, the 28x28 data was flattened to a 1-D, size-784 array, but I don't know the best way to do this with my model. I would appreciate any insight into potential solutions. Let me know if you need more details.
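The unknown-size part aside, the flattening itself generalizes directly from the mnist_cnn example's 28x28 -> 784: a numpy sketch for an n x m x 3 image, assuming the same C-order raveling (sizes here are illustrative).

```python
import numpy as np

n, m = 4, 5                                   # illustrative image size
img = np.zeros((n, m, 3), dtype=np.float32)   # r, g, b channels
img[2, 3, 1] = 1.0                            # mark one green pixel

flat = img.ravel()                            # length n * m * 3
# index mapping: flat[(i * m + j) * 3 + c] == img[i, j, c]
```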
