v0.2.1

@PaParaZz1 released this 22 Nov 08:15

API Change

  1. remove torch in all envs (numpy array is the basic data format in env)
  2. remove on_policy field in all the config
  3. change eval_freq from 50 to 1000
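For configs written against an earlier release, the second and third items above can be handled roughly as follows. This is a minimal sketch, not part of the release: `strip_on_policy` is a hypothetical helper, and the nested layout of `old_cfg` (including where `eval_freq` lives) is an assumption made for illustration.

```python
import copy


def strip_on_policy(cfg):
    """Return a copy of cfg with every 'on_policy' key removed (hypothetical helper)."""
    if isinstance(cfg, dict):
        return {k: strip_on_policy(v) for k, v in cfg.items() if k != "on_policy"}
    if isinstance(cfg, list):
        return [strip_on_policy(v) for v in cfg]
    return copy.deepcopy(cfg)


# Assumed config layout, for illustration only.
old_cfg = {
    "policy": {
        "on_policy": False,
        "learn": {"batch_size": 64},
        # Set explicitly to keep the previous behaviour now that the default is 1000.
        "eval": {"evaluator": {"eval_freq": 50}},
    }
}

new_cfg = strip_on_policy(old_cfg)
print(new_cfg)
```

Pinning `eval_freq` explicitly, as in the sketch, is one way to keep the old evaluation cadence; leaving it out picks up the new default of 1000.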

Tutorial and Doc

  1. env tutorial / environment guide

Env (dizoo)

  1. gym-hybrid env (#86)
  2. gym-soccer (HFO) env (#94)
  3. Go-Bigger env baseline (#95)
  4. sac and ppo config for bipedalwalker env (#121)

Algorithm

  1. DQfD Imitation Learning algorithm (#48) (#98)
  2. TD3BC offline RL algorithm (#88)
  3. MBPO model-based RL algorithm (#113)
  4. PADDPG hybrid action space algorithm (#109)
  5. PDQN hybrid action space algorithm (#118)
  6. fix R2D2 bugs and produce benchmark, add naive NGU (#40)
  7. self-play training demo in slime_volley env (#23)
  8. add example of GAIL entry + config for mujoco (#114)

Enhancement

  1. enable arbitrary policy num in serial sample collector
  2. add torch DataParallel for single machine multi-GPU
  3. add registry force_overwrite argument
  4. add naive buffer periodic_thruput_seconds argument
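To illustrate what a `force_overwrite` argument on a registry typically means, here is a toy sketch. `ToyRegistry` and its `register` signature are hypothetical and written only for this example; they are not the library's actual registry class.

```python
class ToyRegistry(dict):
    """Toy name -> object registry (hypothetical, for illustration only)."""

    def register(self, name, obj, force_overwrite=False):
        # Refuse to silently replace an existing entry unless the caller opts in.
        if name in self and not force_overwrite:
            raise KeyError(f"'{name}' is already registered")
        self[name] = obj


registry = ToyRegistry()
registry.register("my_env", "first implementation")

try:
    registry.register("my_env", "second implementation")
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True

# Opting in replaces the earlier entry.
registry.register("my_env", "second implementation", force_overwrite=True)
```

The opt-in flag keeps accidental name clashes loud while still allowing deliberate replacement, for example when re-running a notebook cell that registers the same name twice.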

Fix

  1. target model wrapper hard reset bug
  2. fix learn state_dict target model bug
  3. ppo bugs and update atari ppo offpolicy config (#108)
  4. pyyaml version bug (#99)
  5. small fix on bsuite environment (#117)
  6. discrete cql unittest bug
  7. release workflow bug
  8. base policy model state_dict overlap bug
  9. remove on_policy option in dizoo config and entry
  10. remove torch in env

Test

  1. add pure docker setting test (#103)
  2. add unittest for dataset and evaluator (#107)
  3. add unittest for on-policy algorithm (#92)
  4. add unittest for ppo and td (MARL case) (#89)

Style

  1. gym version == 0.20.0
  2. torch version >= 1.1.0, <= 1.10.0
  3. ale-py == 0.7.0
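The version constraints above can be checked against an installed version string with a small helper. This is a sketch using only the standard library: `parse`, `PINS`, and `in_range` are hypothetical names, and the parser assumes plain numeric versions, optionally followed by a `+` local build tag.

```python
def parse(version):
    """Turn '1.10.0+cu113' into (1, 10, 0); anything after '+' is ignored."""
    return tuple(int(part) for part in version.split("+")[0].split("."))


# (lowest allowed, highest allowed), taken from the list above.
PINS = {
    "gym": ("0.20.0", "0.20.0"),
    "torch": ("1.1.0", "1.10.0"),
    "ale-py": ("0.7.0", "0.7.0"),
}


def in_range(name, version):
    """Return True if version falls inside the pinned range for name."""
    low, high = PINS[name]
    return parse(low) <= parse(version) <= parse(high)


print(in_range("torch", "1.9.1"))
print(in_range("gym", "0.21.0"))
```

Comparing tuples of integers rather than raw strings matters here, because a string comparison would order "1.9.1" after "1.10.0".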

New Repo

Contributors: @PaParaZz1 @puyuan1996 @Will-Nie @YinminZhang @Weiyuhong-1998 @LikeJulia @sailxjx @davide97l @jayyoung0802 @lichuminglcm @yifan123 @RobinC94 @zjowowen