Version 0.1.5 #17

Merged
merged 26 commits into from May 20, 2021
@md-enlite md-enlite commented May 20, 2021

Features:

  • Adds documentation for run_context
  • Renames the simulated environment interface step_without_observation to fast_step
  • Adds seeding to environments, models and trainers
  • Initial commit of the Maze Python API
  • Adds an ExportGifWrapper
  • Adds network architecture visualizations to TensorBoard Images
  • Adds incremental min/max stats
  • Adds categorical (support-based) value networks
  • Adds value transformations
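
For readers unfamiliar with the "incremental min/max stats" item, here is a minimal sketch of the general technique: track the running minimum and maximum of the values seen so far, then use them to rescale new values into [0, 1]. The class name `MinMaxStats` and its methods are hypothetical and chosen for illustration only; this is not the implementation added in this PR.

```python
import math
from dataclasses import dataclass


@dataclass
class MinMaxStats:
    """Hypothetical helper: running min/max of observed values."""

    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, value: float) -> None:
        # Incremental update: no history of past values is stored.
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def normalize(self, value: float) -> float:
        # Rescale only once a non-degenerate range has been observed;
        # otherwise return the value unchanged.
        if self.maximum > self.minimum:
            return (value - self.minimum) / (self.maximum - self.minimum)
        return value


stats = MinMaxStats()
for v in (2.0, -1.0, 5.0):
    stats.update(v)

normalized = stats.normalize(2.0)  # (2 - (-1)) / (5 - (-1))
```

Returning the raw value while the range is still degenerate avoids a division by zero before at least two distinct values have been seen.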

EnliteAI Bot added 26 commits May 7, 2021 01:25
(Issue RL-604 - Consider entire wrapper stack in clone_from of SimulatedEnvs)
(Issue RL-604 - Consider entire wrapper stack in clone_from of SimulatedEnvs)
(Issue RL-604 - Consider entire wrapper stack in clone_from of SimulatedEnvs)
(Issue RL-604 - Consider entire wrapper stack in clone_from of SimulatedEnvs)
… passing of Python objects to trainers/runners. Rewrote plain python training example.

(Issue RL-578 - Add pure-Python API layer for more accessible training)
(Issue RL-621 - Stats logging of mcts rollouts in AlphaZero)
…ID in single sub-step envs

(Issue RL-631 - Fix: Allow random policy to sample actions without explicit actor ID in single sub-step envs)
…previous evaluation run

(Issue RL-630 - Fix: Rollout evaluation statistics from previous unfinished episodes carry over)
…o default eval concurrency in local setup

(Issue RL-630 - Fix: Rollout evaluation statistics from previous unfinished episodes carry over)
value normalization with min/max stats
discounted value bootstrapping

(Issue RL-580 - AlphaZero for discounted, infinite horizon Tasks)
(Issue RL-580 - AlphaZero for discounted, infinite horizon Tasks)
(Issue RL-580 - AlphaZero for discounted, infinite horizon Tasks)
adds critic evaluation
adds list of wrappers to exclude
fixes Q value min/max normalization

(Issue RL-580 - AlphaZero for discounted, infinite horizon Tasks)
(Issue RL-580 - AlphaZero for discounted, infinite horizon Tasks)
(Issue RL-580 - AlphaZero for discounted, infinite horizon Tasks)
(Issue RL-580 - AlphaZero for discounted, infinite horizon Tasks)
(Issue RL-635 - step_without_observation -> fast_step interface)
…tions etc.

(Issue RL-637 - Fix and improve trajectory record convenience accessors to actions etc.)
(Issue RL-637 - Fix and improve trajectory record convenience accessors to actions etc.)
(Issue RL-637 - Fix and improve trajectory record convenience accessors to actions etc.)
@md-enlite md-enlite merged commit 953a2e5 into main May 20, 2021