[Question] Plotting Continued Models on the Same Line #364
Comments
For that you will need to combine the csv files and then put them in a single folder.
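A minimal sketch of that CSV merge, assuming each file follows the Stable-Baselines3 `Monitor` layout (one `#`-prefixed JSON metadata line, a column header such as `r,l,t`, then one row per episode) and that `t` counts seconds since the start of each job. The function name `merge_monitor_csvs` and the output file name are hypothetical; this is not code from the zoo.

```python
from pathlib import Path


def merge_monitor_csvs(paths, out_path):
    """Concatenate Monitor-style CSV logs from consecutive runs into one file.

    Assumes line 0 is '#'-prefixed JSON metadata and line 1 holds the column
    names, including a 't' column with the time elapsed in that run.
    """
    header_meta = None
    columns = None
    rows = []
    t_offset = 0.0  # elapsed time accumulated over the earlier runs
    for path in paths:
        lines = Path(path).read_text().splitlines()
        if header_meta is None:
            # Keep the metadata and column names of the first run only.
            header_meta = lines[0]
            columns = lines[1].split(",")
        t_idx = columns.index("t")
        last_t = t_offset
        for line in lines[2:]:
            if not line.strip():
                continue
            fields = line.split(",")
            # Shift the elapsed time so it keeps increasing across runs.
            last_t = float(fields[t_idx]) + t_offset
            fields[t_idx] = f"{last_t:.6f}"
            rows.append(",".join(fields))
        t_offset = last_t
    merged = [header_meta, ",".join(columns), *rows]
    Path(out_path).write_text("\n".join(merged) + "\n")
    return len(rows)
```

Pass the paths in training order, for example the monitor file of the original run followed by the one from the continued run, and write the result into a new folder that contains no other monitor files. Keeping a name that ends in `monitor.csv` is an assumption about how the plotting code finds log files.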
Thank you for the reply! Will these methods work with
For evaluation, you will need to merge the
What did you try so far?
I tried looking through the plotting code where the npz files were used. I never used npz files before, but based on what I looked up, it appears Speaking of which, I took your suggestion on combining the csv files and made a Google Colab notebook to perform this task. I tested it by merging three 100000-timestep runs of CartPole-v1 using A2C and PPO. I then used a variant of
Could you share the link? It might be useful for others.
Yes, it's a dictionary of NumPy arrays.
Yes. I will admit this is a work in progress as I still need to implement npy merging. This was also made for the project I am working on, which uses a fork of rl-baselines3-zoo. Here is the Google Colab link. For evaluations.npz, is there a way to get all the keys?
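A short sketch of one way to inspect and merge such archives, assuming every file stores the same array names and that the arrays can be concatenated along their first axis. `np.load` returns an `NpzFile` whose `files` attribute lists the stored names; the helper name `merge_npz` is hypothetical.

```python
import numpy as np


def merge_npz(paths, out_path):
    """Concatenate same-named arrays from several .npz files along axis 0."""
    merged = {}
    for path in paths:
        with np.load(path) as data:
            # `data.files` is a plain list of the array names in the archive.
            for key in data.files:
                merged.setdefault(key, []).append(data[key])
    merged = {key: np.concatenate(chunks, axis=0) for key, chunks in merged.items()}
    np.savez(out_path, **merged)
    return merged
```

Whether a timestep array needs an offset before concatenation depends on whether the resumed run restarted its step counter, so it is worth printing the arrays from each file before merging them.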
Thank you. I tried this and was curious why it would not print the keys, but I found a Stack Overflow post that stated
Following up on this thread, I have a question regarding saving checkpoints in rl-baselines3-zoo. Say I want to train an agent with Asterix-v4 for 40 million timesteps. Within the six-hour limit, training can run through 10 million timesteps. When it concludes, three model zip files stand out to me: What are the differences between these model files? If I wanted to resume training for another 10 million timesteps (with the aim being to eventually reach 40 million timesteps), which of these would be the best to use? I wanted to ask because when I resumed training with
One is a checkpoint, another is saved at the end of training, and the last one is the best model according to the evaluation callback.
Usually the one saved at the end of training is the best to resume from, but that's not always true (for instance, if there is a performance drop just before the end of training).
That's probably because you are using schedules, and schedules are reset when resuming training.
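To illustrate the schedule reset, here is a minimal sketch that assumes a linear schedule driven by the fraction of training remaining, and that this fraction is measured within each job. The helper names, the initial value `2.5e-4`, and the way progress is computed are illustrative assumptions, not the zoo's code.

```python
def linear_schedule(initial_value):
    """Return a schedule that decays linearly from initial_value to 0."""
    def schedule(progress_remaining):
        # progress_remaining goes from 1.0 (start) to 0.0 (end).
        return progress_remaining * initial_value
    return schedule


def lr_at(step, segment_start, segment_length, schedule):
    """Scheduled value at `step` when progress is measured within one segment."""
    progress_remaining = 1.0 - (step - segment_start) / segment_length
    return schedule(progress_remaining)


schedule = linear_schedule(2.5e-4)

# One uninterrupted 40M-step run, evaluated at step 10M.
single_run = lr_at(10_000_000, 0, 40_000_000, schedule)

# A second 10M-step job resumed at step 10M: the schedule starts over.
resumed_run = lr_at(10_000_000, 10_000_000, 10_000_000, schedule)

print(single_run, resumed_run)
```

Under these assumptions the resumed job begins at the full initial value instead of the partially decayed one, which is one way a discontinuity can appear at the point where training was resumed.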
I see. When you refer to schedules being reset when resuming training, is this whether I see the seed is also set to a constant. Does that factor into this as well? For context, the algorithms I am running are based on PPO.
Thank you for your advice concerning schedules. I modified some of the experiment manager code when running a trained agent, and it seemed to do the trick. Below is what the graphs currently look like with plotting across different checkpoints. I have two more questions: when the PPO algorithm runs, does it have an offline component? How is it affected by checkpoints? I wanted to ask because when I ran MsPacman-v4, one of the PPO-based attention algorithms (DSH_SH) experienced a significant drop in score at around 7.5 million timesteps, and I was curious if this occurred because of the seed.
❓ Question
Hello, I have a question regarding plotting in rl-baselines3-zoo. I work on a cluster that limits runs to at most six hours, so I thought it would be a good idea to use checkpoints to save my runs. After I ran out of time, I scheduled a new job and continued from rl_model_9000000_steps.zip for another million steps, and this ran as expected.
However, two things occurred. First, the continued run's contents went into a different directory from the original run (DemonAttack-v4_2 instead of DemonAttack-v4_1). Second, when I tried to plot it with plot_train.py, it treated these directories as different runs.
How can I combine these two runs into one? My hope is to make the second run extend the first run as intended with checkpoints. Below is the plot made by plot_train.py.
Also attached are the contents for both DemonAttack-v4 directories.
Thank you in advance.