"Tensor had NaN values" when encountering improbable label values #12

Open
obarnstedt opened this issue Apr 3, 2021 · 1 comment

Comments

@obarnstedt

Hi and thanks for all the great work!
I wanted to point out a potential problem we encountered while training DGP. With our dataset, the first 50k iterations of "DGP on labeled frames only" ran successfully, but the subsequent "Running DGP" step failed with:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Found Inf or NaN global norm. : Tensor had NaN values
[[{{node VerifyFinite/CheckNumerics}}]]

This occurs in line 818 of fitdgp.py:
[loss_eval, _] = sess.run([loss, train_op], feed_dict)
After some debugging, I traced the error to labeled frames in which some labels had accidentally been placed far outside the normal range. DLC deletes markers set at x=0, y=0, but these had ended up at x=1, y=4, whereas valid labels in our data have x and y above 200. After removing these improbable labels, training continued normally.
It's good that this gave us a chance to clean our training dataset, but it would be better if DGP could ignore such labels and emit a precise warning to alert the user. Otherwise, it is quite hard for the user to work out where the actual problem is.
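To make the suggestion concrete, here is a minimal sketch of such a check. It is not DGP code: the function name `mask_improbable_labels`, the `(n_frames, n_markers, 2)` array layout, and the `max_dist` threshold are all assumptions for illustration. It flags any label lying further than `max_dist` pixels from that marker's median position, warns with the frame and marker index, and sets the label to NaN. Whether DGP accepts NaN as "missing" is something I have not checked, so the encoding may need adapting.

```python
import warnings

import numpy as np


def mask_improbable_labels(labels, max_dist=150.0):
    """Hypothetical helper: flag labels far from their marker's median position.

    labels: float array, shape (n_frames, n_markers, 2), x/y in pixels,
            with NaN for missing markers (assumed layout, not DGP's own).
    max_dist: assumed, dataset-specific threshold in pixels.
    Returns (cleaned copy of labels, list of flagged (frame, marker) pairs).
    """
    labels = np.asarray(labels, dtype=float)
    # Per-marker median position across frames, ignoring missing (NaN) labels.
    median = np.nanmedian(labels, axis=0, keepdims=True)
    # Euclidean distance of every label from its marker's median position.
    dist = np.linalg.norm(labels - median, axis=-1)
    # Missing labels give NaN distances, which compare False and are not flagged.
    with np.errstate(invalid="ignore"):
        bad = dist > max_dist
    cleaned = labels.copy()
    cleaned[bad] = np.nan  # mark both x and y of each flagged label as missing
    flagged = [(int(f), int(m)) for f, m in np.argwhere(bad)]
    for f, m in flagged:
        x, y = labels[f, m]
        warnings.warn(
            f"Ignoring improbable label: frame {f}, marker {m} (x={x:g}, y={y:g})"
        )
    return cleaned, flagged
```

A fixed pixel distance is a crude criterion and would need tuning per dataset (a marker that legitimately moves across the whole frame would need a larger `max_dist`), but even this would have pointed us straight at the x=1, y=4 labels instead of a NaN error deep inside training.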
Thanks,
Oliver

@waq1129
Collaborator

waq1129 commented May 14, 2021

Thanks for sharing this finding! It is very helpful for improving the package.
