v0.2.1

PaParaZz1 released this 22 Nov 08:15


cf8ad13

API Change

remove torch from all envs (numpy array is the basic data format in envs)
remove the on_policy field from all configs
change eval_freq from 50 to 1000
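For users carrying older configs across this release, the config-related changes above can be sketched as a small migration step. This is only an illustrative sketch over plain nested dicts: the helper names `strip_on_policy` and `pin_eval_freq` are hypothetical and not part of the library, and the `policy.eval.evaluator.eval_freq` path is an assumption about where the field lives in a given config.

```python
import copy

OLD_EVAL_FREQ = 50    # value before this release, per the note above
NEW_EVAL_FREQ = 1000  # value from this release on


def strip_on_policy(cfg):
    """Return a copy of a nested dict/list config with every 'on_policy' key removed."""
    if isinstance(cfg, dict):
        return {k: strip_on_policy(v) for k, v in cfg.items() if k != "on_policy"}
    if isinstance(cfg, list):
        return [strip_on_policy(v) for v in cfg]
    return copy.deepcopy(cfg)


def pin_eval_freq(cfg, value=OLD_EVAL_FREQ):
    """Set eval_freq explicitly unless the config already sets it.

    The nesting policy -> eval -> evaluator is an assumed layout.
    """
    out = copy.deepcopy(cfg)
    evaluator = out.setdefault("policy", {}).setdefault("eval", {}).setdefault("evaluator", {})
    evaluator.setdefault("eval_freq", value)
    return out


# Hypothetical pre-0.2.1 config fragment.
old_cfg = {"policy": {"on_policy": False, "cuda": False, "eval": {"evaluator": {}}}}
new_cfg = pin_eval_freq(strip_on_policy(old_cfg))
print(new_cfg)
```

Pinning `eval_freq` to the old value keeps the previous evaluation cadence; leaving it unset would pick up the new value of 1000 instead.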

Tutorial and Doc

env tutorial / environment guide

Env (dizoo)

gym-hybrid env (#86)
gym-soccer (HFO) env (#94)
Go-Bigger env baseline (#95)
sac and ppo config for bipedalwalker env (#121)

Algorithm

DQfD Imitation Learning algorithm (#48) (#98)
TD3BC offline RL algorithm (#88)
MBPO model-based RL algorithm (#113)
PADDPG hybrid action space algorithm (#109)
PDQN hybrid action space algorithm (#118)
fix R2D2 bugs and produce benchmark, add naive NGU (#40)
self-play training demo in slime_volley env (#23)
add example of GAIL entry + config for mujoco (#114)

Enhancement

enable arbitrary policy num in serial sample collector
add torch DataParallel for single machine multi-GPU
add registry force_overwrite argument
add naive buffer periodic thruput seconds argument
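The idea behind a `force_overwrite` argument on a registry can be illustrated with a minimal sketch. The `SimpleRegistry` class below is hypothetical and is not the library's actual implementation; it only shows the assumed behaviour: registering a name twice is an error unless the caller explicitly asks to overwrite.

```python
class SimpleRegistry(dict):
    """Toy name -> object registry (hypothetical, for illustration only)."""

    def register(self, name, obj=None, force_overwrite=False):
        def _do_register(target):
            # Refuse to silently replace an existing entry.
            if name in self and not force_overwrite:
                raise KeyError(
                    "'{}' is already registered; pass force_overwrite=True to replace it".format(name)
                )
            self[name] = target
            return target

        if obj is not None:
            # Direct call: registry.register("name", SomeClass)
            return _do_register(obj)
        # Decorator use: @registry.register("name")
        return _do_register


ENV_REGISTRY = SimpleRegistry()


@ENV_REGISTRY.register("demo_env")
class DemoEnv:
    pass


class PatchedDemoEnv:
    pass


# Replacing the entry only works when the overwrite is requested explicitly.
ENV_REGISTRY.register("demo_env", PatchedDemoEnv, force_overwrite=True)
```

An explicit flag of this kind is useful when a user module needs to replace a built-in entry, while still catching accidental name collisions by default.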

Fix

target model wrapper hard reset bug
fix learn state_dict target model bug
ppo bugs and update atari ppo offpolicy config (#108)
pyyaml version bug (#99)
small fix on bsuite environment (#117)
discrete cql unittest bug
release workflow bug
base policy model state_dict overlap bug
remove on_policy option in dizoo config and entry
remove torch in env

Test

add pure docker setting test (#103)
add unittest for dataset and evaluator (#107)
add unittest for on-policy algorithm (#92)
add unittest for ppo and td (MARL case) (#89)

Style

gym version == 0.20.0
torch version >= 1.1.0, <= 1.10.0
ale-py == 0.7.0
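The torch range above is easy to get wrong when checked with plain string comparison, since "1.10.0" sorts before "1.9.0" as a string. A minimal sketch of a numeric check follows; `parse_version` and `torch_version_supported` are hypothetical helper names, and the parsing simply ignores local suffixes such as `+cu113`.

```python
import re

TORCH_MIN = (1, 1, 0)
TORCH_MAX = (1, 10, 0)


def parse_version(version):
    """Turn '1.10.0+cu113' into (1, 10, 0); pad short versions with zeros."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad to three components so that '1.1' compares equal to '1.1.0'.
    return parts + (0,) * (3 - len(parts))


def torch_version_supported(version):
    """Check a version string against the pinned range 1.1.0 <= v <= 1.10.0."""
    return TORCH_MIN <= parse_version(version) <= TORCH_MAX


print(torch_version_supported("1.9.0"))
print(torch_version_supported("1.10.0+cu113"))
print(torch_version_supported("1.11.0"))
```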

New Repo

Go-Bigger OpenDILab Multi-Agent Decision Intelligence Environment
GoBigger-Challenge-2021 Basic code and description for GoBigger challenge 2021

Contributors: @PaParaZz1 @puyuan1996 @Will-Nie @YinminZhang @Weiyuhong-1998 @LikeJulia @sailxjx @davide97l @jayyoung0802 @lichuminglcm @yifan123 @RobinC94 @zjowowen
