Tutorial 1 training still gets NaN loss #8

Closed
xiaogangzhu opened this issue Aug 8, 2023 · 4 comments
Comments

@xiaogangzhu

Hi,
For the short Tutorial 1 I still get training loss == NaN.
[Screenshot from 2023-08-08 13-45-50]

@xiaogangzhu
Author

And Tutorial 3: Semi-parametric extensions to TARNet also gets a NaN loss.

Why is the loss NaN? Is it a TensorFlow version problem or a GPU problem?

@kochbj
Owner

kochbj commented Aug 10, 2023

Thanks for letting me know. I'll look into it as soon as I get the chance!

Best,
Bernie

@kochbj
Owner

kochbj commented Aug 16, 2023

Hi Xiaogang,

I just tried the short Tutorial 1 again and got no issues, using either the CPU or the T4 GPU on Colab. In general, using a GPU is probably slower than using the CPU for these small networks anyway. Are you not running it on Colab?

Best,
Bernie
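
(A minimal sketch of one way to rule out the GPU as the cause, by hiding it from TensorFlow so the tutorial runs on CPU only; this assumes the call runs before any other TensorFlow op or model is created:)

```python
import tensorflow as tf

# Hide all GPUs so the tutorial runs on CPU only; this must happen
# before any tensors, models, or datasets are created.
tf.config.set_visible_devices([], "GPU")

# Confirm that no GPU is visible to TensorFlow.
print("Visible GPUs:", tf.config.get_visible_devices("GPU"))
```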

@xiaogangzhu
Author

I tried different versions of TensorFlow and found that tensorflow>2.10 produces the error. I am using tensorflow==2.10.0 and it is fine now!
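
(For anyone hitting the same NaN losses, a minimal sketch of the workaround described above, pinning TensorFlow to 2.10.0 and verifying the version before running the tutorial; the install command assumes pip, and on Colab a runtime restart after installing:)

```python
# Pin TensorFlow to the version reported to work:
#   pip install tensorflow==2.10.0
# (in a Colab/Jupyter cell, prefix with "!" and restart the runtime afterwards)

import tensorflow as tf

# Verify that the pinned version is the one actually loaded; TF > 2.10
# was reported above to produce NaN training losses in these tutorials.
print("TensorFlow version:", tf.__version__)
assert tf.__version__.startswith("2.10."), "Expected TensorFlow 2.10.x"
```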

kochbj closed this as completed Aug 23, 2023