Replicating Generating Sequences by Alex Graves Handwriting Section #1608
I've done something similar in #1061. Does that help?
Thanks! This is incredibly helpful. I implemented a layer similar to yours; however, I'm now getting NaNs in the training loss after a couple of iterations. I noticed that you had the same issue. Would you mind sharing what kind of numerical problems you ran into and how you solved them?
I can't remember exactly what the source of the problem was. I tried different things to avoid those numerical problems, e.g. another optimizer, gradient clipping, and batch normalization. Now it is pretty stable. Let me know if you can get it to work!
I ran into the same issue; a few notes that might be helpful. I eventually fixed it by choosing a much smaller learning rate (with RMSprop; it's still quite prone to errors, though, so don't expect too much). Gradient clipping had no effect for me (it only made things worse), and I didn't try batch normalization. I also found that the parameter …
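For anyone trying the same recipe, here is a minimal sketch of combining a small learning rate with gradient clipping in Keras. The values are purely illustrative, and `model` and `mdn_loss` are assumed to be defined elsewhere:

```python
from keras.optimizers import RMSprop

# Illustrative settings only: a conservative learning rate plus gradient
# clipping by global norm; both need tuning for your own data.
optimizer = RMSprop(lr=1e-4, clipnorm=1.0)
model.compile(loss=mdn_loss, optimizer=optimizer)
```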
I'm also getting NaNs after a few hundred iterations. This TensorFlow implementation, http://blog.otoro.net/2015/11/24/mixture-density-networks-with-tensorflow/, didn't seem to have the same NaN issue. In trying to debug the objective function in @rpinsler's implementation, I can't see what is causing the NaN, unless one of the sigmas strays too close to zero and causes a divide by zero, or perhaps the "sum" in …
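One common guard against that particular failure mode (not necessarily what either implementation does) is to keep sigma strictly positive and bounded away from zero before it enters the density, for example:

```python
from keras import backend as K

SIGMA_FLOOR = 1e-6  # illustrative lower bound, not a value from this thread

def positive_sigma(sigma_raw):
    # Exponentiating keeps sigma positive (as in Graves' parameterisation);
    # the small floor keeps a near-zero sigma from producing a divide by zero
    # or an infinite log-density.
    return K.exp(sigma_raw) + SIGMA_FLOOR
```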
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
I'm trying to replicate Alex Graves' paper: http://arxiv.org/pdf/1308.0850v5.pdf
For the handwriting-generation part, I'm having trouble defining the objective function as a function of y_true and y_pred. In the paper, y_true takes the form of a 3-tuple, and y_pred takes the form (e, {w_i, mu_i, sigma_i, rho_i}), where w_i, mu_i, sigma_i, and rho_i parameterize the Gaussian mixture and e is the probability of the pen being lifted.
First off, y_true and y_pred have different dimensions; is that allowed?
Secondly, the different elements of y_pred must be treated individually in the custom loss function. For instance, say there are two Gaussians in the mixture, so that the dimension of y_pred is 9. The loss function then uses all of these individual components and does something different with each of them, as shown on page 20 of the reference paper above:
# y_pred has a leading batch axis inside a Keras loss, hence the [:, i] slicing
e = y_pred[:, 0]       # end-of-stroke (pen lift) probability
w1 = y_pred[:, 1]      # mixture weight, component 1
mu1 = y_pred[:, 2]     # mean, component 1
sigma1 = y_pred[:, 3]  # standard deviation, component 1
rho1 = y_pred[:, 4]    # correlation, component 1
w2 = y_pred[:, 5]      # mixture weight, component 2
mu2 = y_pred[:, 6]     # mean, component 2
sigma2 = y_pred[:, 7]  # standard deviation, component 2
rho2 = y_pred[:, 8]    # correlation, component 2
Is splitting up the components of y_pred a permitted operation in the custom loss function? I've written an implementation of the function but seem to be getting NaNs for the loss. I'm not sure whether I am doing something wrong or whether such operations are simply not allowed in Keras or Theano.
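For what it's worth, slicing y_pred inside a custom loss is fine: Keras only requires the loss to return a tensor built from y_true and y_pred using backend operations. Below is a rough, hypothetical sketch of such a loss for the two-component case above, simplified to a 1-D mixture (so the rho correlation terms from the paper are dropped) and written against the Keras backend API. The function name, the epsilon value, and the assumed y_true layout (offset, end-of-stroke flag) are placeholders, not anything from this thread:

```python
import numpy as np
from keras import backend as K

def mdn_loss_sketch(y_true, y_pred):
    # Assumed layouts (placeholders, matching the listing above):
    #   y_true[:, 0] = target offset, y_true[:, 1] = end-of-stroke flag
    #   y_pred[:, 0] = e logit, then (w, mu, sigma, rho) per component
    eps = 1e-8
    x, eos = y_true[:, 0], y_true[:, 1]

    # End-of-stroke probability, squashed into (0, 1).
    e = K.sigmoid(y_pred[:, 0])

    # Mixture weights: softmax over the two weight logits.
    w = K.softmax(K.concatenate([y_pred[:, 1:2], y_pred[:, 5:6]], axis=1))

    # Positive standard deviations, floored to avoid division by zero.
    mu1, sigma1 = y_pred[:, 2], K.exp(y_pred[:, 3]) + eps
    mu2, sigma2 = y_pred[:, 6], K.exp(y_pred[:, 7]) + eps

    def log_gauss(t, mu, sigma):
        # log N(t | mu, sigma) for a 1-D Gaussian.
        return (-K.log(sigma) - 0.5 * np.log(2 * np.pi)
                - 0.5 * K.square((t - mu) / sigma))

    # Combine the two components in log-space (log-sum-exp) for stability.
    log_c1 = K.log(w[:, 0] + eps) + log_gauss(x, mu1, sigma1)
    log_c2 = K.log(w[:, 1] + eps) + log_gauss(x, mu2, sigma2)
    m = K.maximum(log_c1, log_c2)
    log_mixture = m + K.log(K.exp(log_c1 - m) + K.exp(log_c2 - m))

    # Bernoulli log-likelihood for the end-of-stroke bit.
    log_eos = eos * K.log(e + eps) + (1 - eos) * K.log(1 - e + eps)

    return -K.mean(log_mixture + log_eos)
```

The numerical points that matter most here are that the component densities are combined in log-space via log-sum-exp and that the sigmas are floored; skipping either is a common way to end up with the NaNs discussed above.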