Retraining the saved model #2

anushmanukyan · 2018-09-04T13:35:04Z

I'm trying to retrain a saved model, but it behaves very strangely:

- it does not seem to start from the behaviour that was saved
- it repeats only one type of action once retraining starts

I guess this is a PyTorch issue, but maybe you have managed to retrain successfully and know how it should be done?

Saving:

def save_model_for_training(self, episode, filepath):
    # Bundle the episode counter, network weights and optimizer state
    # into a single checkpoint so training can be resumed later.
    checkpoint = {
        'episode': episode,
        'state_dict': self.net.state_dict(),
        'optimizer': self.optimizer.state_dict()
    }
    torch.save(checkpoint, filepath)

self.save_model_for_training(episode, filepath=self.model_path + 'model.pt')

Loading saved model:

checkpoint = torch.load(self.model_path + 'model.pt')
self.start_episode = checkpoint['episode']
self.net.load_state_dict(checkpoint['state_dict'])
self.optimizer.load_state_dict(checkpoint['optimizer'])

Thanks a lot in advance


TianhongDai · 2018-09-04T14:29:59Z

@anushmanukyan Could you tell me which environment and algorithm you are training with?

anushmanukyan · 2018-09-05T07:57:35Z

@TianhongDai I am using PPO.

TianhongDai · 2018-09-05T08:49:49Z

@anushmanukyan I guess you are loading only the model weights. If you check this line: https://github.com/TianhongDai/reinforcement-learning-algorithms/blob/master/07-proximal-policy-optimization/ppo_agent.py#L113 you will see that when I test the network I also load the running mean filter object. During training I use the running mean filter to normalize the input, so if you want to retrain your model you should also load the "trained" running mean filter. Otherwise you will get different results.
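To illustrate the point about the running mean filter, here is a minimal sketch, not the repository's implementation. The `RunningMeanFilter` class, its `state_dict`/`load_state_dict` methods, and the `'obs_filter'` key are hypothetical names chosen for this example, and `pickle` stands in for the checkpoint file. It shows that a restored filter normalizes an observation exactly as the trained one does, while a fresh filter feeds the network a very different input.

```python
import math
import pickle


class RunningMeanFilter:
    """Hypothetical scalar running mean/std normalizer (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        # Population std; fall back to 1.0 until at least two samples are seen.
        std = math.sqrt(self.m2 / self.count) if self.count > 1 else 1.0
        return (x - self.mean) / (std + 1e-8)

    def state_dict(self):
        return {'count': self.count, 'mean': self.mean, 'm2': self.m2}

    def load_state_dict(self, state):
        self.count = state['count']
        self.mean = state['mean']
        self.m2 = state['m2']


# "Training": the filter accumulates statistics of the observations it sees.
trained = RunningMeanFilter()
for obs in [10.0, 12.0, 14.0, 16.0]:
    trained.update(obs)

# Save the filter state together with the rest of the checkpoint
# (the 'obs_filter' key is an assumption made for this sketch).
checkpoint_bytes = pickle.dumps({'episode': 4, 'obs_filter': trained.state_dict()})

# "Retraining": restore the filter before any observation reaches the network.
checkpoint = pickle.loads(checkpoint_bytes)
restored = RunningMeanFilter()
restored.load_state_dict(checkpoint['obs_filter'])

fresh = RunningMeanFilter()  # what you effectively get if the filter is not restored

obs = 15.0
restored_input = restored.normalize(obs)  # identical to trained.normalize(obs)
fresh_input = fresh.normalize(obs)        # far from what the network was trained on
```

In a PyTorch checkpoint like the one in the question, the same idea would amount to adding the filter's statistics as one more entry next to `'state_dict'` and `'optimizer'`, and restoring it when the checkpoint is loaded.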

anushmanukyan · 2018-09-12T12:12:21Z

I added the running mean filter and retraining seems to work better now.
However, I have another question: how does demo.py work? I cannot figure out how testing works: I save the best model, but when I test it, it gets a different reward than it had when it was saved. How is that possible? Also, if I run the same model several times, I get different performance each time.

Thank you so much for your help.

TianhongDai · 2018-09-12T12:21:21Z

@anushmanukyan Hi, I think demo.py should work fine. You can download my pre-trained model from: https://drive.google.com/drive/u/2/folders/1cZjjCA5WHs-Lfw63ntzeUjMo_wZoIgXw Then just run python demo.py. It will still get the same high scores as it got during training. You can check https://github.com/TianhongDai/reinforcement-learning-algorithms/blob/master/07-proximal-policy-optimization/ppo_agent.py#L111 to see how I test the network.
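On the question of why the same saved model can score differently from run to run: one common cause, offered here as an assumption and not as a diagnosis of demo.py, is that actions are sampled from the policy distribution (or the environment is unseeded), so each episode follows a different trajectory. The toy sketch below uses a made-up three-action policy and a hypothetical reward rule to show that seeding the sampler or acting greedily makes evaluation repeatable, and that averaging over several seeds gives a steadier estimate than a single episode.

```python
import random


def sample_action(probs, rng):
    """Draw an action index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def greedy_action(probs):
    """Pick the most probable action; no randomness involved."""
    return max(range(len(probs)), key=lambda i: probs[i])


def rollout(policy_probs, steps, rng=None):
    """Toy episode: reward 1.0 whenever action 0 is chosen (hypothetical environment).

    With an rng the policy samples its actions; without one it acts greedily.
    """
    total = 0.0
    for _ in range(steps):
        if rng is not None:
            action = sample_action(policy_probs, rng)
        else:
            action = greedy_action(policy_probs)
        total += 1.0 if action == 0 else 0.0
    return total


probs = [0.6, 0.3, 0.1]  # made-up policy output for illustration

# Sampling with the same seed reproduces the same return.
seeded_a = rollout(probs, steps=50, rng=random.Random(0))
seeded_b = rollout(probs, steps=50, rng=random.Random(0))

# Greedy evaluation is deterministic by construction.
greedy_a = rollout(probs, steps=50)
greedy_b = rollout(probs, steps=50)

# Averaging over several seeds smooths out episode-to-episode variation.
returns = [rollout(probs, steps=50, rng=random.Random(seed)) for seed in range(10)]
mean_return = sum(returns) / len(returns)
```

When comparing a saved "best" model against its training-time score, it is therefore usually more informative to compare averages over several evaluation episodes than single runs.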

TianhongDai closed this as completed Jun 6, 2021
