Fails restoring weights #41508
Comments
I should add that I checked that the checkpoints were correctly saved. If I do use
and then use
it works. It's really when trying to keep training those previous policies using the method described above that it fails.
I feel like it might be due to me using
The trick of passing the checkpoint via
I was able to use
Options 2. and 3. have similar behavior, different from 1. A side problem is that tune.run is absent from the documentation, so I first thought it was being deprecated. I finally found the info I needed in the function's implementation in the repo, but it wasn't straightforward at all. Questions still remain:
What happened + What you expected to happen
The code in examples/restore_1_of_n_agents_from_checkpoint.py seems not to be working (at least in my case).
The weights are not recovered but re-initialized: instead of seeing the same policy reward means (in Wandb) as before, I get reinitialized values.
Maybe the example is not up to date, or maybe I am doing something wrong here. I am using tune.Tuner().fit() and not tune.run() as in the example, but I'm not sure why that would make it fail...
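For context, the pattern the example is meant to implement — restoring previously saved weights for one policy at algorithm-init time while the other policies stay freshly initialized — can be sketched framework-free. All class and hook names below (`Policy`, `Algorithm`, the init callback) are simplified stand-ins for illustration, not the actual RLlib API:

```python
# Framework-free sketch of the "restore 1 of n agents" pattern:
# load saved weights for one named policy and inject them via an
# init-time callback, leaving other policies freshly initialized.
# All names here are hypothetical stand-ins, not RLlib classes.

class Policy:
    """Stand-in for a trainable policy holding a weight dict."""
    def __init__(self, weights):
        self.weights = dict(weights)

class Algorithm:
    """Stand-in for an algorithm managing several named policies."""
    def __init__(self, policies, on_init=None):
        self.policies = policies
        if on_init:
            on_init(self)  # mimics an on-algorithm-init hook

    def set_weights(self, weights_by_policy):
        # Overwrite weights only for the policy IDs present in the dict.
        for pid, w in weights_by_policy.items():
            self.policies[pid].weights = dict(w)

# "Checkpointed" weights for policy_0 from a previous run.
saved_checkpoint = {"policy_0": {"w": 0.75}}

def restore_callback(algo):
    # Restore only the checkpointed policy; policy_1 keeps fresh weights.
    algo.set_weights(saved_checkpoint)

algo = Algorithm(
    policies={"policy_0": Policy({"w": 0.0}),
              "policy_1": Policy({"w": 0.0})},
    on_init=restore_callback,
)
print(algo.policies["policy_0"].weights["w"])  # restored: 0.75
print(algo.policies["policy_1"].weights["w"])  # fresh: 0.0
```

The symptom reported above would correspond to the callback never firing (or firing before initialization), so every policy keeps its fresh weights.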
Versions / Dependencies
Python 3.10
Ray 2.8
Reproduction script
Issue Severity
High: It blocks me from completing my task.