Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to Sacred 0.8.x #109

Closed
AdamGleave opened this issue Oct 15, 2019 · 7 comments · Fixed by #194
Closed

Upgrade to Sacred 0.8.x #109

AdamGleave opened this issue Oct 15, 2019 · 7 comments · Fixed by #194

Comments

@AdamGleave
Copy link
Member

Our code currently breaks with anything >0.7.4 because they made config objects non-picklable. We can workaround this via a config attribute to disable their ReadOnly*, or by using a helper method.

See IDSIA/sacred#499 and IDSIA/sacred#508 for some discussion

@AdamGleave
Copy link
Member Author

Keep running into problems with parallelism with Sacred 0.7.4: IDSIA/sacred#503

Fixed in 0.8.x, so slightly higher priority to upgrade (although I can work around it for now)

@shwang
Copy link
Member

shwang commented May 31, 2020

@AdamGleave: It seems to me that our serialization errors in imitation might not actually be coming from implicitly pickling Configs (as I think you might have mentioned in one of the threads related to this Sacred issue). I didn't realize this until today, but the latest version of Sacred (v0.8) under regular Sacred settings, never passes dict into argument-capturing functions any more. It always passes in ReadOnlyDict.

At least in imitation it looks like the pickle error comes from me trying to forward a ReadOnlyDict (which I used to think is just a dict) into Ray subprocesses.

@shwang
Copy link
Member

shwang commented May 31, 2020

IIRC in the Adversarial Policies repo you actually pass around pickled configs. For imitation it might be sufficient to just convert every ReadOnlyDict into a dict. (This would be messy though, as we'd have to recursively "unfreeze" the data structure)

@AdamGleave
Copy link
Member Author

AdamGleave commented May 31, 2020

@AdamGleave: It seems to me that our serialization errors in imitation might not actually be coming from implicitly pickling Configs (as I think you might have mentioned in one of the threads related to this Sacred issue). I didn't realize this until today, but the latest version of Sacred (v0.8) under regular Sacred settings, never passes dict into argument-capturing functions any more. It always passes in ReadOnlyDict.

At least in imitation it looks like the pickle error comes from me trying to forward a ReadOnlyDict (which I used to think is just a dict) into Ray subprocesses.

I'm a bit confused what you mean by implicitly pickling configs? My current understanding was the problem is that the arguments Sacred passes in aren't pickleable, and we want to pickle them when sending to other processes (e.g. for Ray). Apologies if I gave a misleading impression. My view is Sacred should not be changing the types of config arguments under the user's feet in this way, which is why I opened the issue, not that it's an insurmountable problem.

I think in adversarial policies I do explicitly cast ReadOnlyDict to dict, but as you say it's messy, but yes a viable workaround. Think would maybe need to do the same thing with other frozen types they use (e.g. lists?) Sacred also has a config option I believe that disables this whole read-only thing.

@shwang
Copy link
Member

shwang commented May 31, 2020

Thanks for clarifying. My previous (and incorrect) understanding for why imitation.scripts.parallel was failing with new Sacred was that somehow Ray was pickling Sacred configs.

I think I thought this because I didn't realize that Sacred was converting captured dict/list arguments into ReadOnlyContainer types.

@AdamGleave
Copy link
Member Author

AdamGleave commented May 31, 2020 via email

@shwang
Copy link
Member

shwang commented May 31, 2020

Presumably the reason they want to use ReadOnlyContainer is because they don't want the configs or containers inside the config to be updated?

Seems like a better way to do captured arguments in ReadOnly mode is to pass recursively "unfrozen" shallow copies of ReadOnlyDict/List into the user functions.

@shwang shwang linked a pull request Jun 2, 2020 that will close this issue
@qxcv qxcv mentioned this issue Oct 2, 2020
AdamGleave added a commit that referenced this issue Aug 31, 2021
* Upgrade SB3 and make logger.py non-global; much breakage elsewhere

* Make BC use local logger; if folder None default to temporary directory

* Lint

* Make adversarial trainer and DAgger local logger

* Make DAgger and BC pickleable again

* Remove old workarounds for SB3 issue #109

* 4 TB directories expected (we reduced to 2 only because SB TB disabled previously)

* Boost test coverage

* Exempt coverage of intentionally dead test code

* Rename _HierarchicalLogger in docstring

* Add missing functions in logger, add some test cases, minor tidying

* Add missing functions in logger, add some test cases, minor tidying

* Pass str into functions expecting it

* Bugfix in path_to_str; add test

Co-authored-by: Erik Jenner <erik.jenner99@gmail.com>
ejnnr added a commit that referenced this issue Sep 2, 2021
* Generalize RewardNet base class

The previous RewardNet contained some things that are specific to
AIRL or at least not necessary for some other applications
(in particular the train/test split).

* Fix typo in docstring

Co-authored-by: Steven H. Wang <wang.steven.h@gmail.com>

* Remove pass statement in abstractmethod

This is consistent with the rest of the codebase and avoids
codecov complaints.

* Reduce AIRL-specific reward net code

Almost all of the code in reward_nets is now completely general
and can be used by other algorithms in the future.
Only AIRLRewardNet is specific to AIRL (and it's simply a thin wrapper).

* Add test for RewardNet.device without parameters

* Remove pass statement to avoid codecov issues

* Add first preference comparisons prototype

* Fix line lengths

* Add test and fix lots of bugs

* Fix logger error in test

* Fix docstring formatting issue

* Improve doc string

Co-authored-by: Adam Gleave <adam@gleave.me>

* Improve type signatures

Co-authored-by: Adam Gleave <adam@gleave.me>

* Fix missing imports and create type aliases

* Fix imports to adhere to Google style

Not for imitation.data.types for now, will decide on that later.

* Fix another import

* Sample inside SyntheticGatherer

Instead of returning analytic probabilities, we now sample
discrete decisions (0 or 1).

* Add discount factor to SyntheticGatherer

* Improve doc string

Co-authored-by: Adam Gleave <adam@gleave.me>

* Simplify code slightly

* Slight refactoring

* Fix docstrings

* Add assert statements

* Add discounting in reward model training

I think this makes sense given that we now support
discounting for the synthetic gatherer.

* Fix dtype

* Add another assert

* Split up train() and sample()

* Add TODO comment

* Decrease warning threshold and improve docstring

* Add logging and training script (still WIP)

* Apply suggestions from code review

Co-authored-by: Adam Gleave <adam@gleave.me>

* Fix many small issues from code review

* Save final models

* Expose more arguments via CLI

* Evaluate trained agent at the end

* Rename "steps" to "iterations"

I think this makes it clearer that his is "one level higher"
than the agent training steps.

* Speed up smoke test

* Move Fragmenter into preference_comparisons

Seems unlikely to be used elsewhere and doesn't
make much sense to have inside imitation.data.

* Change some default hparams

These aren't tuned but at least they quickly solve Cartpole.

* Train reward model before agent

Training the agent on the initially random reward model
in the first iterations doesn't make sense.

This also required sampling trajectories if none have been
gathered yet.

* Increase test coverage

* Upgrade SB3 and make logger.py non-global; much breakage elsewhere

* Make BC use local logger; if folder None default to temporary directory

* Lint

* Make adversarial trainer and DAgger local logger

* Make DAgger and BC pickleable again

* Remove old workarounds for SB3 issue #109

* 4 TB directories expected (we reduced to 2 only because SB TB disabled previously)

* Boost test coverage

* Exempt coverage of intentionally dead test code

* Rename _HierarchicalLogger in docstring

* Switch to temperature param and improve coverage

* Add more tests

* Switch to local logger

* Configure logger correctly in script

* Remove duplicate log message

* Replace hasattr checks

This was very brittle if someone calls the logger e.g. self._logger
in other implementations. Now the logger is set in the base class.
This necessitated introducing ABCs for Fragmenter etc., which I
originally wanted to avoid, but I think it starts to make sense.
I also did this for PreferenceGatherer though it currently doesn't
use the logger because I strongly assume that it or at least
other implementations will at some point.

* Remove unnecessary warn method

* Delete unused import

* Simplify tests

* Ignore unused dummy function for coverage

Co-authored-by: Steven H. Wang <wang.steven.h@gmail.com>
Co-authored-by: Adam Gleave <adam@gleave.me>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants