-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade to Sacred 0.8.x #109
Comments
Keep running into problems with parallelism with Sacred 0.7.4: IDSIA/sacred#503 Fixed in 0.8.x, so slightly higher priority to upgrade (although I can work around it for now) |
@AdamGleave: It seems to me that our serialization errors in At least in |
IIRC in the Adversarial Policies repo you actually pass around pickled configs. For |
I'm a bit confused what you mean by implicitly pickling configs? My current understanding was the problem is that the arguments Sacred passes in aren't pickleable, and we want to pickle them when sending to other processes (e.g. for Ray). Apologies if I gave a misleading impression. My view is Sacred should not be changing the types of config arguments under the user's feet in this way, which is why I opened the issue, not that it's an insurmountable problem. I think in adversarial policies I do explicitly cast |
Thanks for clarifying. My previous (and incorrect) understanding for why I think I thought this because I didn't realize that Sacred was converting captured dict/list arguments into ReadOnlyContainer types. |
Ah. I think if Ray ended up pickling entire Sacred config objects you'd run
into other issues, e.g. FileStorageObserver's are not pickleable. Which is
why we're quite careful what is in scope in parallel code to avoid
accidentally pickling too much.
…On Sun, May 31, 2020, 15:59 Steven H. Wang ***@***.***> wrote:
Thanks for clarifying. My previous (and incorrect) understanding for why
imitation.scripts.parallel was failing with new Sacred was that somehow
Ray was pickling Sacred configs.
I think I thought this because I didn't realize that Sacred was converting
captured dict/list arguments into ReadOnlyContainer types.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#109 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AALZ3I3JIVYFFOBDQ32NU7LRULONJANCNFSM4JAWKXQA>
.
|
Presumably the reason they want to use ReadOnlyContainer is because they don't want the configs or containers inside the config to be updated? Seems like a better way to do captured arguments in ReadOnly mode is to pass recursively "unfrozen" shallow copies of ReadOnlyDict/List into the user functions. |
* Upgrade SB3 and make logger.py non-global; much breakage elsewhere * Make BC use local logger; if folder None default to temporary directory * Lint * Make adversarial trainer and DAgger local logger * Make DAgger and BC pickleable again * Remove old workarounds for SB3 issue #109 * 4 TB directories expected (we reduced to 2 only because SB TB disabled previously) * Boost test coverage * Exempt coverage of intentionally dead test code * Rename _HierarchicalLogger in docstring * Add missing functions in logger, add some test cases, minor tidying * Add missing functions in logger, add some test cases, minor tidying * Pass str into functions expecting it * Bugfix in path_to_str; add test Co-authored-by: Erik Jenner <erik.jenner99@gmail.com>
* Generalize RewardNet base class The previous RewardNet contained some things that are specific to AIRL or at least not necessary for some other applications (in particular the train/test split). * Fix typo in docstring Co-authored-by: Steven H. Wang <wang.steven.h@gmail.com> * Remove pass statement in abstractmethod This is consistent with the rest of the codebase and avoids codecov complaints. * Reduce AIRL-specific reward net code Almost all of the code in reward_nets is now completely general and can be used by other algorithms in the future. Only AIRLRewardNet is specific to AIRL (and it's simply a thin wrapper). * Add test for RewardNet.device without parameters * Remove pass statement to avoid codecov issues * Add first preference comparisons prototype * Fix line lengths * Add test and fix lots of bugs * Fix logger error in test * Fix docstring formatting issue * Improve doc string Co-authored-by: Adam Gleave <adam@gleave.me> * Improve type signatures Co-authored-by: Adam Gleave <adam@gleave.me> * Fix missing imports and create type aliases * Fix imports to adhere to Google style Not for imitation.data.types for now, will decide on that later. * Fix another import * Sample inside SyntheticGatherer Instead of returning analytic probabilities, we now sample discrete decisions (0 or 1). * Add discount factor to SyntheticGatherer * Improve doc string Co-authored-by: Adam Gleave <adam@gleave.me> * Simplify code slightly * Slight refactoring * Fix docstrings * Add assert statements * Add discounting in reward model training I think this makes sense given that we now support discounting for the synthetic gatherer. * Fix dtype * Add another assert * Split up train() and sample() * Add TODO comment * Decrease warning threshold and improve docstring * Add logging and training script (still WIP) * Apply suggestions from code review Co-authored-by: Adam Gleave <adam@gleave.me> * Fix many small issues from code review * Save final models * Expose more arguments via CLI * Evaluate trained agent at the end * Rename "steps" to "iterations" I think this makes it clearer that his is "one level higher" than the agent training steps. * Speed up smoke test * Move Fragmenter into preference_comparisons Seems unlikely to be used elsewhere and doesn't make much sense to have inside imitation.data. * Change some default hparams These aren't tuned but at least they quickly solve Cartpole. * Train reward model before agent Training the agent on the initially random reward model in the first iterations doesn't make sense. This also required sampling trajectories if none have been gathered yet. * Increase test coverage * Upgrade SB3 and make logger.py non-global; much breakage elsewhere * Make BC use local logger; if folder None default to temporary directory * Lint * Make adversarial trainer and DAgger local logger * Make DAgger and BC pickleable again * Remove old workarounds for SB3 issue #109 * 4 TB directories expected (we reduced to 2 only because SB TB disabled previously) * Boost test coverage * Exempt coverage of intentionally dead test code * Rename _HierarchicalLogger in docstring * Switch to temperature param and improve coverage * Add more tests * Switch to local logger * Configure logger correctly in script * Remove duplicate log message * Replace hasattr checks This was very brittle if someone calls the logger e.g. self._logger in other implementations. Now the logger is set in the base class. This necessitated introducing ABCs for Fragmenter etc., which I originally wanted to avoid, but I think it starts to make sense. I also did this for PreferenceGatherer though it currently doesn't use the logger because I strongly assume that it or at least other implementations will at some point. * Remove unnecessary warn method * Delete unused import * Simplify tests * Ignore unused dummy function for coverage Co-authored-by: Steven H. Wang <wang.steven.h@gmail.com> Co-authored-by: Adam Gleave <adam@gleave.me>
Our code currently breaks with anything >0.7.4 because they made config objects non-picklable. We can workaround this via a config attribute to disable their
ReadOnly*
, or by using a helper method.See IDSIA/sacred#499 and IDSIA/sacred#508 for some discussion
The text was updated successfully, but these errors were encountered: