[rllib] restore from checkpoint using train.py #3204

Closed

rnunziata opened this issue Nov 2, 2018 · 2 comments

@rnunziata

I am trying to run train.py against a prior checkpoint, but it does not seem to pick the checkpoint up, nor does it complain about the restore parameter that was passed. See the mean reward below. Does train.py not use this parameter? I could not find any examples showing its use, even though it is in the argument list.

pong-impala:
    env: Pong-ram-v4 
    run: IMPALA
    checkpoint_freq: 20    
    config:
        sample_batch_size: 50
        train_batch_size: 500
        num_workers: 7   

========================================================================================= 
>>   python python/ray/rllib/train.py  --config-file=my.yaml 
========================================================================================= 

== Status ==
Using FIFO scheduling algorithm.
Resources requested: 8/8 CPUs, 1/1 GPUs
Result logdir: /home/rjn/ray_results/pong-impala
RUNNING trials:
 - IMPALA_Pong-ram-v4_0:	RUNNING [pid=18500], 3487 s, 343 iter, 11877000 ts, -16.8 rew

Result for IMPALA_Pong-ram-v4_0:
  date: 2018-11-02_12-59-44
  done: false
  episode_len_mean: 2317.63
  episode_reward_max: -9.0
  episode_reward_mean: -16.81
  episode_reward_min: -21.0
  episodes_this_iter: 15
  episodes_total: 7434
  experiment_id: 382b7dff22c94d4f9168bc03ba30e821
  hostname: rjn-Oryx-Pro
  info:
    learner:
      cur_lr: 0.0005000000237487257
      entropy: 824.3236083984375
      grad_gnorm: 40.000003814697266
      policy_loss: 16.117586135864258
      var_gnorm: 30.87358856201172
      vf_explained_var: 0.34759581089019775
      vf_loss: 35.219303131103516
    learner_queue:
      size_count: 23822
      size_mean: 0.0
      size_quantiles:
      - 0.0
      - 0.0
      - 0.0
      - 0.0
      - 0.0
      size_std: 0.0
    num_steps_replayed: 0
    num_steps_sampled: 11911200
    num_steps_trained: 11911000
    num_weight_syncs: 238224
    sample_throughput: 3360.381
    timing_breakdown:
      enqueue_time_ms: 0.026
      learner_dequeue_time_ms: 125.537
      learner_grad_time_ms: 16.766
      learner_load_time_ms: .nan
      learner_load_wait_time_ms: .nan
      put_weights_time_ms: 7.462
      sample_processing_time_ms: 17.836
      sample_time_ms: 17.855
      train_time_ms: 17.855
    train_throughput: 2800.317
  iterations_since_restore: 344
  node_ip: 192.168.1.100
  num_metric_batches_dropped: 0
  pid: 18500
  policy_reward_mean: {}
  time_since_restore: 3497.607877254486
  time_this_iter_s: 10.153211116790771
  time_total_s: 3497.607877254486
  timestamp: 1541177984
  timesteps_since_restore: 11911200
  timesteps_this_iter: 34200
  timesteps_total: 11911200
  training_iteration: 344
 
 
========================================================================================= 
  >>    python python/ray/rllib/train.py  --restore=~/ray_results/pong-impala/IMPALA_Pong-ram-v4_0_2018-11-02_12-01-21_zpUuC/checkpoint_340y58PU_   --config-file=my.yaml  
==========================================================================================
  
  


== Status ==
Using FIFO scheduling algorithm.



Result for IMPALA_Pong-ram-v4_0:
  date: 2018-11-02_13-11-54
  done: false
  episode_len_mean: 1176.6
  episode_reward_max: -19.0
  episode_reward_mean: -20.6
  episode_reward_min: -21.0
  episodes_this_iter: 20
  episodes_total: 20
  experiment_id: 3a9271cd0da94a3cac878fdd551b1a63
  hostname: rjn-Oryx-Pro
  info:
    learner:
      cur_lr: 0.0005000000237487257
      entropy: 836.6295166015625
      grad_gnorm: 40.000003814697266
      policy_loss: 181.09475708007812
      var_gnorm: 22.668785095214844
      vf_explained_var: 0.2981424331665039
      vf_loss: 42.735008239746094
    learner_queue:
      size_count: 52
      size_mean: 0.0
      size_quantiles:
      - 0.0
      - 0.0
      - 0.0
      - 0.0
      - 0.0
      size_std: 0.0
    num_steps_replayed: 0
    num_steps_sampled: 26100
    num_steps_trained: 26000
    num_weight_syncs: 522
    sample_throughput: 3736.855
    timing_breakdown:
      enqueue_time_ms: 0.022
      learner_dequeue_time_ms: 109.484
      learner_grad_time_ms: 14.186
      learner_load_time_ms: .nan
      learner_load_wait_time_ms: .nan
      put_weights_time_ms: 6.008
      sample_processing_time_ms: 17.375
      sample_time_ms: 17.394
      train_time_ms: 17.394
    train_throughput: 5749.008
  iterations_since_restore: 1
  node_ip: 192.168.1.100
  num_metric_batches_dropped: 0
  pid: 21621
  policy_reward_mean: {}
  time_since_restore: 10.150846004486084
  time_this_iter_s: 10.150846004486084
  time_total_s: 10.150846004486084
  timestamp: 1541178714
  timesteps_since_restore: 26100
  timesteps_this_iter: 26100
  timesteps_total: 26100
  training_iteration: 1

@ericl ericl added this to Needs triage in RLlib via automation Nov 2, 2018
richardliaw pushed a commit that referenced this issue Nov 5, 2018
## What do these changes do?

Clean up the checkpointing to handle the new checkpoint dirs. Add a test for rollout.py

## Related issue number

#3206
#3204
@richardliaw
Contributor

richardliaw commented Nov 6, 2018

I think the issue here is that if you specify a config file, command-line arguments such as --restore are not applied. You'll have to include restore as a parameter in your config file.

This is something discussed in #2986, but I guess we never put in a warning...

Try it out and let me know if you run into anything.
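
For reference, a minimal sketch of what that might look like, assuming restore is accepted as a top-level key of the experiment spec in the YAML file (the checkpoint path below is simply the one from the command above):

pong-impala:
    env: Pong-ram-v4
    run: IMPALA
    checkpoint_freq: 20
    # assumption: restore sits alongside env/run in the experiment spec, not under config
    restore: ~/ray_results/pong-impala/IMPALA_Pong-ram-v4_0_2018-11-02_12-01-21_zpUuC/checkpoint_340y58PU_
    config:
        sample_batch_size: 50
        train_batch_size: 500
        num_workers: 7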

@richardliaw richardliaw changed the title restore from checkpoint using train.py [rllib] restore from checkpoint using train.py Nov 6, 2018
@rnunziata
Author

rnunziata commented Nov 6, 2018

Sorry, it says that in train.py; I missed it...

RLlib automation moved this from Needs triage to Done Nov 6, 2018