
Numerical differentiation and Improvements to RNN #13

Merged
merged 1 commit into openworm:master on Apr 22, 2016

Conversation

BenjiJack (Collaborator)

Hi all, here is some additional work on the Kato "pipeline" and recurrent neural networks.

Kato pipeline

  • Implement numerical differentiation (Chartrand, et al.), as cited in Kato, using the off-the-shelf script by Chartrand available here. Numerical differentiation is available as an Analyzer method: tvd
  • Demonstrate the use of numerical differentiation in a Jupyter notebook (KatoPipeline.ipynb). The notebook compares the results of PCA on Kato's provided derivative data vs. our computed derivative data

Note: I am quite unfamiliar with the technique developed by Chartrand and used in this off-the-shelf script; nonetheless, I am trying to use it here to see whether we can reproduce Kato's analysis. A rough sketch of the underlying idea follows below.
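For intuition, here is a minimal sketch of the idea behind Chartrand's method, not the off-the-shelf script used in this PR: choose the derivative u that minimizes a data-fidelity term plus a total-variation penalty, 0.5 * ||A u - (f - f[0])||^2 + alpha * ||D u||_1, where A integrates and D differences. The function name, parameters, and the lagged-diffusivity iteration below are illustrative assumptions:

```python
import numpy as np

def tv_derivative(f, dx, alpha=0.1, n_iter=50, eps=1e-8):
    """Toy total-variation-regularized differentiation in the spirit of
    Chartrand (2011): find u minimizing
        0.5 * ||A u - (f - f[0])||^2 + alpha * ||D u||_1,
    where A is a cumulative-sum integration matrix and D takes forward
    differences. Solved by lagged-diffusivity fixed-point iterations on
    small dense matrices; illustration only, not the production script.
    """
    n = len(f)
    A = np.tril(np.ones((n, n))) * dx        # integration (left Riemann sums)
    D = np.diff(np.eye(n), axis=0) / dx      # forward differences, shape (n-1, n)
    f0 = f - f[0]                            # integrate up from a zero offset
    u = np.gradient(f, dx)                   # naive derivative as initial guess
    ATA, ATf = A.T @ A, A.T @ f0
    for _ in range(n_iter):
        w = 1.0 / np.sqrt((D @ u) ** 2 + eps)   # reweighting of the TV term
        L = D.T @ (w[:, None] * D)              # weighted second-difference operator
        u = np.linalg.solve(alpha * L + ATA, ATf)
    return u

# Example: the derivative of a noisy |t - 0.5| should jump from -1 to +1.
t = np.linspace(0.0, 1.0, 200)
f = np.abs(t - 0.5) + 0.01 * np.random.randn(t.size)
u = tv_derivative(f, dx=t[1] - t[0])
```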

In this image from the Jupyter notebook, you see Kato's raw data, Kato's derivative data, and our PCA on Kato's derivative data; followed by Kato's raw data, our computation of the derivative, and our computation of PCA on our derivative data.
[image: six-panel comparison from KatoPipeline.ipynb, as described above]

  • Implement Analyzer methods to quickly generate 'Kato-style' PCA plots and timeseries plots (used to generate the plots above; a sketch of the PCA step follows below)
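The 'Kato-style' plots boil down to running PCA on the matrix of derivative traces and looking at the leading temporal components. A minimal sketch of that step with scikit-learn (the array name and shape are assumptions; in the repo this is wrapped in Analyzer methods):

```python
import numpy as np
from sklearn.decomposition import PCA

# deriv: hypothetical array of shape (timepoints, neurons) holding either
# Kato's provided derivatives or the tvd-computed ones.
deriv = np.random.randn(3000, 100)        # stand-in data for the sketch

pca = PCA(n_components=3)
components = pca.fit_transform(deriv)     # (timepoints, 3) temporal PCs

# Plotting components[:, 0] vs. components[:, 1] vs. components[:, 2]
# gives the state-space trajectory; the explained variance indicates how
# low-dimensional the dynamics are.
print(pca.explained_variance_ratio_)
```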

RNNs

  • Based on feedback from @jrieke, attempt to implement GMMActivation and GMM loss functions on the RNN. However, the code is not currently working correctly and returns only NaNs when attempting to fit the model.

Much of the code is taken directly from @jrieke's implementation here - thank you for allowing me to use your GMM functions as I try to make this work.

I tried various configurations of the model: with and without a stateful LSTM layer, feeding in one data point at a time vs. longer chunks, changing the number of mixture components, changing the learning rate, and using different optimizers, but to no avail; I am still getting NaNs. As I am new to RNNs, and in particular don't understand all of the GMM code, I may be making an egregious error in how I'm using the model, or it might be a more subtle problem.

A demo of how I am using the model can be found at laboratory/RNNDemo.ipynb
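For context on where the NaNs can come from, here is a generic mixture-density negative log-likelihood in plain NumPy (this is not @jrieke's GMM code; names and shapes are illustrative). If a predicted variance collapses toward zero or the mixture probability underflows, the log goes to -inf and every gradient afterward becomes NaN, which is why normalization and small learning rates matter:

```python
import numpy as np

def gmm_nll(y, pi, mu, sigma):
    """Negative log-likelihood of targets y (batch,) under a 1-D Gaussian
    mixture with weights pi, means mu, and stds sigma, each (batch, k)."""
    y = y[:, None]
    # Component densities; sigma near zero makes these overflow or underflow.
    dens = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    mix = np.sum(pi * dens, axis=1)
    # Without the epsilon, log(0) = -inf and training yields NaNs,
    # the failure mode described above.
    return -np.mean(np.log(mix + 1e-10))
```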

Next steps
My goal is to get the RNN working and then add a third row to the image above, comparing data that we generate from the RNN, differentiate, and then perform PCA on. In this way, we would reproduce Kato's analysis using Kato's own data and also run a similar analysis on data that we generate. Future analyses and simulations could take the same approach: analyze the Kato data with some tool we build, generate similar data with a simulation, and then use the same analytic pipeline to compare our generated data to Kato's.

- Implements numerical differentiation (Chartrand, et al.)
as cited in Kato, using off-the-shelf script by Chartrand.
Numerical differentiation is available as an Analyzer method 'tvd'

- Demonstrates the use of numerical differentiation in a Jupyter
notebook (KatoPipeline.ipynb). The notebook compares the results
of PCA on Kato's provided derivative data vs computed derivative data

- Implements Analyzer methods to quickly generate 'Kato-style'
PCA plots and timeseries plots

- RNN: based on feedback from @jrieke, attempts to implement
GMMActivation and GMM Loss functions on RNN. However, the code
is not currently working correctly, and is returning only NaNs
when attempting to fit the model.
@theideasmith merged commit e24f361 into openworm:master on Apr 22, 2016
@jrieke commented on Apr 23, 2016

@BenjiJack About the NaNs: in which range is your data? You should normalize it to roughly [-1, 1] (an even smaller range might help if you still get NaNs). If you have already taken care of this, try again to play around with the learning rate (and the other parameters). The GMM layer makes the whole thing very prone to numeric errors, so you sometimes have to reduce the learning rate by a few orders of magnitude to make the whole thing work (I know it's annoying). See also the discussion in keras-team/keras#1608 (the GMM layer there is based on the same code as mine).

Also, I've been refactoring a bit of my code into a separate project during the last few days, which should be completely agnostic to the type of data you use for the network: timeseries-rnn (https://github.com/jrieke/timeseries-rnn). It's still experimental and poorly documented, but maybe you can give it a try to see if there's a mistake in your code (it's written as Python scripts, but you can easily pull the code into a Jupyter notebook or run the scripts via the %run script.py magic from a notebook).
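A minimal sketch of the normalization suggested above, rescaling each neuron's trace to [-1, 1] before training (the array layout is an assumption):

```python
import numpy as np

def rescale(x, lo=-1.0, hi=1.0):
    """Rescale each column (neuron) of x, shape (time, neurons), to [lo, hi]."""
    xmin, xmax = x.min(axis=0), x.max(axis=0)
    return lo + (hi - lo) * (x - xmin) / (xmax - xmin + 1e-12)
```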

@BenjiJack (Collaborator, Author)

Thank you @jrieke. It sounds like I may not have normalized the data properly. I did try to vary the learning rate without success; I will try the normalization and see what happens.

The model did sometimes converge using random, normally distributed data in the (0, 1) range (although it still did so unreliably), so maybe it is indeed the normalization that's the problem.

Your new codebase looks exciting and should be very helpful as well. Thank you for sharing it.

Traveling the next few days; will come back soon once I have a chance to dig in further.
