Saving and restoring checkpoint hangs in Jupyter Notebook. #263
Comments
I suspect it may be due to a bad interaction between different dependencies. Could you try stripping out the extra code unrelated to checkpointing, without removing any imports, and see if that works? If it still hangs, start removing imports one at a time; the issue should resolve at some point (at worst, once you're only importing Orbax). Also, a few nits:
Thank you for the suggestions! It turns out that the following lines in my notebook were causing the issue:

    import matplotlib.pyplot as plt
    import matplotlib
    matplotlib.use('QtAgg')
    %matplotlib qt

Is there a way to solve this issue other than not importing?
I'm afraid your guess is as good as mine, since I'm not familiar with this particular backend, and there appears to be no problem with matplotlib without overriding the backend. Is it possible to skip that part?
Yep, matplotlib by itself is not the issue. The qt backend is more of a convenience, not at all necessary. I will disable it for now. Thanks for the help!
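
For anyone hitting a similar hang, the import-by-import elimination suggested earlier in the thread can be roughly automated. The sketch below is not part of Orbax or of the original report: `first_hanging_import` and its parameters are hypothetical names, and it assumes the hang also reproduces in a plain Python subprocess. That may not hold for hangs that depend on the Jupyter event loop, and IPython magics such as `%matplotlib qt` cannot be tested this way because they are not valid Python.

```python
import subprocess
import sys


def _hangs(script, timeout):
    """Run `script` in a fresh interpreter; True if it exceeds `timeout` seconds."""
    try:
        subprocess.run(
            [sys.executable, "-c", script],
            capture_output=True,
            timeout=timeout,
            check=False,  # a script that merely errors counts as "not hanging"
        )
    except subprocess.TimeoutExpired:
        return True
    return False


def first_hanging_import(import_lines, body, timeout=30.0):
    """Return the first import line whose addition makes `body` time out.

    `import_lines` are candidate lines taken from the notebook, and `body` is a
    short script that performs the operation under suspicion (for example one
    checkpoint save and restore). Returns None if no prefix of the imports
    causes a timeout.
    """
    if _hangs(body, timeout):
        raise RuntimeError("body hangs even without any of the candidate imports")
    for count in range(1, len(import_lines) + 1):
        # Prepend a growing prefix of the imports, so interactions between
        # earlier and later imports are still exercised.
        script = "\n".join([*import_lines[:count], body])
        if _hangs(script, timeout):
            return import_lines[count - 1]
    return None
```

Each candidate runs in its own interpreter, so a hung attempt is killed at the timeout and cannot block the remaining checks. The timeout has to be comfortably longer than a normal run of `body`, otherwise a slow import is misreported as a hang.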
I have a training loop as shown below. I am running Python 3.10.10 and the latest versions of JAX (0.4.7), Flax (0.6.7), and Orbax (0.1.6). I am having some issues with the `restore` and `save` commands, which cause the code to hang in Jupyter Notebook. When I call the `train_model` function, the code block freezes at either `restore` or `save`, but resumes if I run another code block. I think it could potentially have something to do with `asyncio`, but I am not totally sure. I had recently switched over from `flax.checkpoints`, where this wasn't an issue. Any help on this would be appreciated!