Description
Please fill out the form below.
System Information
ml.c5.2xlarge
Describe the problem
I was trying to run the rl_roboschool_ray example notebook.
However, the s3://<your_s3_bucket>/<training_job_name>/output folder was not created for the hopper and humanoid cases, so neither the intermediate training videos nor the final model.tar.gz were saved.
I tried the reacher example, and its output folder was created fine.
For the hopper and humanoid cases, the output folder was never created, even after I tried a couple of times.
The hopper and humanoid training jobs also never finish; they keep running until the time reaches train_max_run.
I searched the issues and found a similar one, "Ray RLLib examples not saving model output #581".
But I don't quite understand the response there: "According to user script in the example checkpoints should be saved to 'opt/ml/output/intermediate' folder and moved to s3://<your_s3_bucket>/<training_job_name>/output/intermediate location during training. You can modify the user script to save checkpoints to /opt/ml/model directory at the end of the training instead."
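As far as I can tell, the suggestion in that response would amount to something like the sketch below, run at the end of the training script. This is only my reading of it, not code from the example: the function names are hypothetical, the source path is taken from the "Result logdir" line in my log, and I am assuming checkpoints are directories whose names start with "checkpoint", which may not match the Ray version in the container.

```python
import os
import shutil

# Assumption: taken from the "Result logdir" line in the training log.
INTERMEDIATE_DIR = "/opt/ml/output/intermediate/training"
# Directory named in the quoted response as the place to save the final model.
MODEL_DIR = "/opt/ml/model"


def find_latest_checkpoint(root):
    """Return the most recently modified 'checkpoint*' directory under root, or None.

    Assumption: checkpoints are directories whose names start with 'checkpoint'.
    """
    candidates = []
    for dirpath, dirnames, _ in os.walk(root):
        for name in dirnames:
            if name.startswith("checkpoint"):
                candidates.append(os.path.join(dirpath, name))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def export_latest_checkpoint(src_root=INTERMEDIATE_DIR, dst_root=MODEL_DIR):
    """Copy the newest checkpoint directory into dst_root and return the new path."""
    latest = find_latest_checkpoint(src_root)
    if latest is None:
        return None  # nothing to export
    dst = os.path.join(dst_root, os.path.basename(latest))
    if os.path.exists(dst):
        shutil.rmtree(dst)  # replace a stale copy from an earlier call
    shutil.copytree(latest, dst)
    return dst
```

Calling export_latest_checkpoint() once after training finishes would be the intended use, but that only helps if the job actually reaches the end of the script, which is not what I see for hopper and humanoid.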
The CloudWatch log looks fine, but there is no "output" folder in S3.
It seems the sync between /opt/ml/output and s3://<your_s3_bucket>/<training_job_name>/output does not always work.
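For anyone who wants to check the same thing, a minimal sketch of listing what is under the job's output prefix could look like this. The helper name is hypothetical; it expects a boto3 S3 client to be passed in and uses the client's list_objects_v2 paginator. The bucket and prefix are placeholders, as in the paths above.

```python
def list_output_keys(s3_client, bucket, prefix):
    """Return every object key under the given prefix (empty list if there are none)."""
    keys = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix carry no "Contents" entry at all.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


# Usage (requires boto3 and AWS credentials; placeholders left as-is):
#   import boto3
#   keys = list_output_keys(boto3.client("s3"),
#                           "<your_s3_bucket>",
#                           "<training_job_name>/output/")
#   print(keys or "no output objects found")
```

An empty result for the hopper and humanoid jobs, next to a non-empty one for reacher, would match what I see in the S3 console.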
21:11:12 == Status ==
21:11:12 Using FIFO scheduling algorithm.
21:11:12 Resources requested: 8/8 CPUs, 0/0 GPUs
21:11:12 Result logdir: /opt/ml/output/intermediate/training
21:11:12 RUNNING trials: - PPO_RoboschoolHumanoid-v1_0: RUNNING [pid=124], 694 s, 4 iter, 1280398 ts, -83.2 rew
21:13:51 == sgd epochs ==
21:13:52 0 {'cur_lr': 9.999999747378752e-05, 'total_loss': -0.00026426092, 'policy_loss': -0.00033865558, 'vf_loss': 0.0, 'vf_explained_var': -1.0, 'kl': 0.00014878887, 'entropy': 23.64479}
21:13:53 1 {'cur_lr': 9.999999747378752e-05, 'total_loss': -0.0022162762, 'policy_loss': -0.0030561804, 'vf_loss': 0.0, 'vf_explained_var': -1.0, 'kl': 0.0016798142, 'entropy': 23.630758}
21:13:54 2 {'cur_lr': 9.999999747378752e-05, 'total_loss': -0.0035353287, 'policy_loss': -0.0055731665, 'vf_loss': 0.0, 'vf_explained_var': -1.0, 'kl': 0.004075686, 'entropy': 23.615211}
21:13:55 3 {'cur_lr': 9.999999747378752e-05, 'total_loss': -0.004284208, 'policy_loss': -0.0067875045, 'vf_loss': 0.0, 'vf_explained_var': -1.0, 'kl': 0.005006588, 'entropy': 23.599758}
21:13:56 4 {'cur_lr': 9.999999747378752e-05, 'total_loss': -0.00491518, 'policy_loss': -0.0074260524, 'vf_loss': 0.0, 'vf_explained_var': -1.0, 'kl': 0.0050217416, 'entropy': 23.58474}
........
Result for PPO_RoboschoolHumanoid-v1_0: date: 2019-02-05_21-14-01 done: false episode_len_mean: 19.204132823698043 episode_reward_max: -39.52636344946514 episode_reward_mean