
[Enhancement] Support for fallbacks when loading cloudpickle items fail #172

Closed
Miffyli opened this issue Oct 1, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@Miffyli
Collaborator

Miffyli commented Oct 1, 2020

Related issue #171

As seen in #171, cloudpickle/pickle can easily fail when transferring models between Python versions (and in some other edge cases). There is currently no way to work around this, because loading a model tries to un-pickle every pickled object and does not catch errors.

Two suggestions to deal with this, both of which would be nice to have:

  1. Provide a custom_objects dictionary (similar to what we had in SB2) that overrides any objects read from the file. This could even be used to prevent loading an item from the file at all, in which case all JSON-ed items can be used as before.
  2. Update the set_parameters function to load only the parameters from the file, not the data objects (which include the pickled items). This was included in SB2.
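The first suggestion could work roughly as sketched below. This is a minimal illustration, not SB3's actual loading code: the function name and the dict-of-pickled-blobs layout are hypothetical, but the core idea matches the proposal: an entry present in custom_objects is never un-pickled at all, so a corrupted or version-incompatible blob can be bypassed.

```python
import pickle

def load_with_fallbacks(raw_items, custom_objects=None):
    """Deserialize a dict of pickled byte strings, skipping unpickling
    entirely for any key the caller overrides via custom_objects."""
    custom_objects = custom_objects or {}
    result = {}
    for name, blob in raw_items.items():
        if name in custom_objects:
            # The caller's value wins; the stored blob is never touched,
            # so it cannot raise an unpickling error.
            result[name] = custom_objects[name]
            continue
        try:
            result[name] = pickle.loads(blob)
        except Exception as exc:
            raise RuntimeError(
                f"Could not unpickle '{name}' and no fallback was given"
            ) from exc
    return result

# Usage: override a broken schedule entry instead of unpickling it.
items = {"gamma": pickle.dumps(0.99), "lr_schedule": b"\x80corrupted"}
loaded = load_with_fallbacks(
    items, custom_objects={"lr_schedule": lambda _: 3e-4}
)
print(loaded["gamma"])  # 0.99
```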

@ManifoldFR If you had other suggestions please feel free to throw them here :)

@Miffyli Miffyli added the enhancement New feature or request label Oct 1, 2020
@ManifoldFR
Contributor

ManifoldFR commented Oct 1, 2020

From what I saw, the problem I encountered is with the learning-rate and clip-ratio schedulers (for PPO), in the particular case where a constant scheduler is created using constant_fn (I don't know about other cases), because pickling/unpickling functions is brittle, especially when they are locals of a higher-order function. I think we can look at how PyTorch supports saving state dicts for schedulers. From the source I see they use classes and the __dict__ attribute to save/load state dicts.
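The PyTorch pattern mentioned above can be sketched like this. The ConstantSchedule class is hypothetical (not SB3's or PyTorch's actual API); the point is that only plain attribute data is serialized, so no closure ever needs to be pickled:

```python
class ConstantSchedule:
    """Hypothetical callable schedule that serializes via plain
    attributes instead of a pickled closure, mirroring how PyTorch's
    _LRScheduler saves and loads self.__dict__."""

    def __init__(self, value: float):
        self.value = value

    def __call__(self, progress_remaining: float) -> float:
        # Constant schedule: same value regardless of progress.
        return self.value

    def state_dict(self) -> dict:
        # Only plain data ends up here, so it survives
        # serialization across Python versions.
        return dict(self.__dict__)

    def load_state_dict(self, state: dict) -> None:
        self.__dict__.update(state)

# Round-trip: state from one instance restores another.
sched = ConstantSchedule(3e-4)
state = sched.state_dict()
restored = ConstantSchedule(0.0)
restored.load_state_dict(state)
print(restored(1.0))  # 0.0003
```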

In general I think the ideas for other fallbacks are good because it allows the user more flexibility.

@ManifoldFR
Contributor

ManifoldFR commented Oct 3, 2020

I started a branch for schedulers. I don't think we should subclass PyTorch's _LRScheduler directly to get the learning rate, because those are designed to step with the optimizer.
I think we can take inspiration from their design though:

  • use a class with state_dict/load_state_dict methods
  • update the scheduler parameters (namely timesteps) using their step() method, and update the param groups from there when applicable (i.e. not for the clip ratio)

The second point would require slightly modifying OnPolicyAlgorithm and OffPolicyAlgorithm. We'll still make the class Callable so as to not modify other classes.
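A minimal sketch of such a steppable, Callable schedule, assuming hypothetical names (this is not the branch's actual code): the algorithm would call step() once per update, and the mutable counter lives in state_dict so it can be saved and restored without pickling any function:

```python
class LinearSchedule:
    """Hypothetical schedule that interpolates linearly from `start`
    to `end`. It stays a plain Callable, so the rest of the code base
    needs no changes; only the training loop gains a step() call."""

    def __init__(self, start: float, end: float, total_steps: int):
        self.start = start
        self.end = end
        self.total_steps = total_steps
        self.num_steps = 0  # advanced by step(), saved in state_dict

    def step(self) -> None:
        self.num_steps += 1

    def __call__(self) -> float:
        frac = min(self.num_steps / self.total_steps, 1.0)
        return self.start + frac * (self.end - self.start)

    def state_dict(self) -> dict:
        return dict(self.__dict__)

    def load_state_dict(self, state: dict) -> None:
        self.__dict__.update(state)

# Usage: two steps out of four puts us halfway along the ramp.
lr = LinearSchedule(1e-3, 0.0, total_steps=4)
for _ in range(2):
    lr.step()
print(lr())  # 0.0005
```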

@Miffyli
Collaborator Author

Miffyli commented Oct 4, 2020

I wouldn't focus only on schedules, though, as other objects are also stored as pickled objects (with no easy alternative). I am not sure having non-pickled versions of all objects is possible, and it would not be easy at the very least. I'd focus on the fallback tools to help reconstruct models if loading fails (part of the reason the saved model currently contains some description of each pickled object). However, if you can find something that achieves less brittle serialization, I am all ears!

@aliamiri1380

aliamiri1380 commented Feb 12, 2021

this can create a file from the model, but it's too large:

import dill  # dill can serialize objects (e.g. local functions) that pickle cannot

with open('nnn.pkl', 'wb') as f:
    dill.dump(model, f)

then you can save and load the dumped model without any error:

import dill
from stable_baselines3 import PPO

with open('nnn.pkl', 'rb') as f:
    model = dill.load(f)
model.save('nnn')
model = PPO.load('nnn')

this sometimes gives a CUDA error; if you see that, modify serialization.py as described here:
https://stackoverflow.com/questions/56369030/runtimeerror-attempting-to-deserialize-object-on-a-cuda-device

@araffin
Member

araffin commented Mar 6, 2021

As a follow-up, I added support for custom objects here: #336
and included it in the RL Zoo: DLR-RM/rl-baselines3-zoo#69
(so models trained with Python 3.6/3.7 can be loaded with Python 3.8+, but not retrained yet)
