
Is save checkpoint not yet supported for ppo ray trainer? #256

Open
mickel-liu opened this issue Mar 27, 2024 · 5 comments

@mickel-liu

When I set save_steps to anything other than -1, the program raises an exception:

```
self.actor.model, os.path.join(args.ckpt_path, "_actor"), tag, args.max_ckpt_num, args.max_ckpt_mem
AttributeError: 'Namespace' object has no attribute 'ckpt_path'
```

```python
if global_step % args.save_steps == 0:
    tag = f"global_step{global_step}"
    # Both calls read args.ckpt_path / args.max_ckpt_num / args.max_ckpt_mem,
    # none of which are defined by train_ppo_ray.py's argument parser.
    self.strategy.save_ckpt(
        self.actor.model, os.path.join(args.ckpt_path, "_actor"), tag, args.max_ckpt_num, args.max_ckpt_mem
    )
    self.strategy.save_ckpt(
        self.critic, os.path.join(args.ckpt_path, "_critic"), tag, args.max_ckpt_num, args.max_ckpt_mem
    )
```

These three args (ckpt_path, max_ckpt_num, max_ckpt_mem) are indeed not defined in train_ppo_ray.py, and I don't see args.save_path being used anywhere either.

I did see this issue was mentioned in #133, wondering if there's any update.
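
For anyone hitting the same AttributeError, a minimal sketch of the argument definitions that would satisfy the save_ckpt call above. The argument names come from the traceback, but the defaults and help text here are assumptions, not the project's:

```python
# Hypothetical additions to the parser in train_ppo_ray.py. The argument
# names match what ppo_trainer.py reads; the defaults are guesses.
parser.add_argument("--ckpt_path", type=str, default="./ckpt/ppo_ray",
                    help="directory where the _actor/_critic checkpoints are written")
parser.add_argument("--max_ckpt_num", type=int, default=3,
                    help="keep at most this many checkpoints; older ones are rotated out")
parser.add_argument("--max_ckpt_mem", type=int, default=1000,
                    help="disk budget for retained checkpoints (units depend on the strategy)")
```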

@hijkzzz
Collaborator

hijkzzz commented Mar 27, 2024

Yes, we haven't fully developed and tested this feature yet. Contributions are welcome!

@mickel-liu
Author

I'm happy to look into it, but how have you been saving models so far?

@suehyunpark

Hi @mickel-liu, have you figured this out? I have no choice but to use train_ppo_ray.py for PPO instead of train_ppo.py, because it doesn't OOM during model loading in my configuration. I am looking into ways to save checkpoints during/after training, and was hoping you had delved into this feature as well.

@mickelliu
Contributor

> Hi @mickel-liu, have you figured this out? I have no choice but to use train_ppo_ray.py for PPO instead of train_ppo.py, because it doesn't OOM during model loading in my configuration. I am looking into ways to save checkpoints during/after training, and was hoping you had delved into this feature as well.

Hi, I did look into the code and found that the checkpoint-saving feature is not yet implemented. But saving checkpoints wasn't actually what I was looking for: I wanted the actual model weights, not the intermediate training states that "checkpoint" refers to in this repo. So I changed the code on my fork, and it now saves model checkpoints after a pre-set number of iterations. Here's the code in my fork: https://github.com/mickelliu/OpenRLHF/blob/a7f21aa26ac027fcf30ca1c588e01cf07c67cb6f/openrlhf/trainer/ppo_trainer.py#L428-L442

Regardless of whether the ckpt feature gets officially implemented, train_ppo_ray.py will save a model checkpoint at the end of training.
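
For context, a rough sketch of the kind of periodic full-model save the fork adds. This is not the fork's exact code; self.strategy.save_model and its argument order are assumptions based on how the trainer already saves the final model at the end of training, so check the linked fork (or your OpenRLHF version) for the real call:

```python
# Inside the PPO training loop: save full model weights every
# args.save_steps iterations, rather than a resumable DeepSpeed
# training state. save_model's signature is assumed here.
if args.save_steps > 0 and global_step % args.save_steps == 0:
    save_dir = os.path.join(args.save_path, f"iter_{global_step}")
    self.strategy.save_model(self.actor, self.tokenizer, save_dir)
```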

@suehyunpark


Thanks for the quick reply and for sharing your code! I'm glad to know that saving the trained model is that simple. Although the checkpointing feature would be a great addition, this fix solves my issue.
