NaN panics #4

Open
qingyuanxingsi opened this issue Jan 3, 2016 · 3 comments

@qingyuanxingsi

During training, I find that `*sVal` (https://github.com/fumin/ntm/blob/master/addressing.go#L194) can easily become NaN. Can this be handled properly instead of panicking?

@fumin
Owner

fumin commented Jan 3, 2016

Having NaNs suggests that we might have a vanishing or exploding gradient problem here.
In this case, what I found helpful was changing the random seed in https://github.com/fumin/ntm/blob/master/poem/train/main.go#L63 .
I also found it helpful to use the pure-Go implementation https://github.com/fumin/ntm/releases/tag/pure-go instead of the BLAS-based one in the master branch.
Although BLAS is faster, it seems to be more numerically unstable than pure Go with for loops.
I asked a numerical-computing expert whether this is expected, but have not reached a satisfying conclusion yet.
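
One general reason two implementations of the same arithmetic can disagree is that floating-point addition is not associative, and an optimized library may accumulate sums in a different order than a plain loop. The sketch below only illustrates that effect; it is not taken from the ntm code, it does not show that this is the cause of the NaNs here, and `sumForward` and `sumPairwise` are hypothetical names.

```go
package main

import "fmt"

// sumForward adds the values left to right, like a plain for loop.
func sumForward(xs []float64) float64 {
	var s float64
	for _, x := range xs {
		s += x
	}
	return s
}

// sumPairwise adds the values by recursively splitting the slice in half.
// It stands in for "some other accumulation order"; it is an assumption,
// not a description of what any particular BLAS library does.
func sumPairwise(xs []float64) float64 {
	switch len(xs) {
	case 0:
		return 0
	case 1:
		return xs[0]
	}
	mid := len(xs) / 2
	return sumPairwise(xs[:mid]) + sumPairwise(xs[mid:])
}

func main() {
	// At magnitude 1e17 neighbouring float64 values are 16 apart,
	// so adding 1 to 1e17 is lost to rounding.
	xs := []float64{1e17, 1, -1e17, 1}
	fmt.Println(sumForward(xs), sumPairwise(xs)) // prints: 1 0
}
```

The same four numbers sum to 1 in one order and 0 in the other, so small differences between a loop-based and a library-based build are not surprising by themselves.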

@qingyuanxingsi
Author

Why does changing the random seed help in this scenario?

@fumin
Owner

fumin commented Jan 3, 2016

As I understand it, the vanishing/exploding gradient problem depends on the particular training dynamics, as in most non-linear systems, so changing the random seed may or may not help.
As for what the seed affects: first, the order in which we present the data depends on it, see https://github.com/fumin/ntm/blob/master/poem/poem.go#L127 .
Second, the initialization of the network also depends on the seed, see https://github.com/fumin/ntm/blob/master/poem/train/main.go#L77 .
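
To make the two seed-dependent pieces concrete, here is a minimal sketch using Go's standard `math/rand`. It is not the ntm code: the function `run`, the sizes, and the 0.1 initialization scale are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// run draws both a data presentation order and initial weights from a
// single generator seeded with seed. The name and the 0.1 scale are
// hypothetical and only serve the illustration.
func run(seed int64, nExamples, nWeights int) ([]int, []float64) {
	rng := rand.New(rand.NewSource(seed))

	// Order in which training examples would be presented.
	order := rng.Perm(nExamples)

	// Small random initial weights.
	weights := make([]float64, nWeights)
	for i := range weights {
		weights[i] = rng.NormFloat64() * 0.1
	}
	return order, weights
}

func main() {
	for _, seed := range []int64{1, 2} {
		order, weights := run(seed, 5, 3)
		fmt.Println("seed", seed, "order", order, "weights", weights)
	}
}
```

The same seed always reproduces the same order and weights, while a different seed starts training from a different point and walks through the data differently, which is why a run that diverges under one seed may behave under another.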
