v0.3.0

PaParaZz1 released this 24 Mar 08:59


a1d8fdf

API Change

add new BaseEnv definition:
- remove info method
- add random_action method
- add observation_space, action_space, reward_space properties
- Env English doc | Env Chinese doc
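The interface change above can be sketched roughly as follows. This is an illustrative stand-in only, not DI-engine's actual `BaseEnv`: the names `SketchEnv`, `CoinFlipEnv`, `Discrete`, and `Box` are hypothetical, and the real class may declare further methods and use different space types.

```python
import random
from abc import ABC, abstractmethod
from typing import Any, NamedTuple


class Discrete(NamedTuple):
    """Hypothetical stand-in for a discrete space with values 0..n-1."""
    n: int

    def sample(self) -> int:
        return random.randrange(self.n)


class Box(NamedTuple):
    """Hypothetical stand-in for a bounded scalar space."""
    low: float
    high: float


class SketchEnv(ABC):
    """Illustrative interface only; not DI-engine's actual BaseEnv."""

    @property
    @abstractmethod
    def observation_space(self) -> Any: ...

    @property
    @abstractmethod
    def action_space(self) -> Any: ...

    @property
    @abstractmethod
    def reward_space(self) -> Any: ...

    @abstractmethod
    def random_action(self) -> Any: ...


class CoinFlipEnv(SketchEnv):
    """Toy env: the space properties replace what an info method would report."""

    @property
    def observation_space(self) -> Discrete:
        return Discrete(2)

    @property
    def action_space(self) -> Discrete:
        return Discrete(2)

    @property
    def reward_space(self) -> Box:
        return Box(0.0, 1.0)

    def random_action(self) -> int:
        # Sample uniformly from the declared action space.
        return self.action_space.sample()


env = CoinFlipEnv()
action = env.random_action()
```

Callers that previously queried an `info` method for shapes and ranges would read the three space properties instead.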
modify the return value of eval method in InteractionSerialEvaluator class from Tuple[bool, float] to Tuple[bool, dict].
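For callers, the change in return type means the second element is now unpacked as a dict instead of a float. A minimal sketch, assuming a hypothetical stub `fake_eval` and a hypothetical key `mean_return` (the keys DI-engine actually returns are not stated in these notes):

```python
from typing import Dict, Tuple


def fake_eval(stop_value: float) -> Tuple[bool, Dict[str, float]]:
    """Hypothetical stand-in for an evaluator's eval method."""
    info = {"mean_return": 195.0}  # hypothetical key and value
    return info["mean_return"] >= stop_value, info


# Before: stop, reward = evaluator.eval(...)   (second element was a float)
# After:  the second element is a dict, so the value is looked up by key.
stop, info = fake_eval(stop_value=200.0)
mean_return = info["mean_return"]
```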
switch the default logger to the rich logger; to disable it, set an environment variable, e.g. export ENABLE_RICH_LOGGING=False.
add train_iter and env_step argument in ding CLI.
- you can use them like ding -m serial -c pendulum_sac_config.py -s 0 --train-iter 1e3
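The command above passes the iteration count in scientific notation. One way such a value could be parsed is sketched below with `argparse`; this is an assumption for illustration, not how the ding CLI is implemented. The helper `sci_int`, the `--env-step` flag spelling, and the default values are all hypothetical.

```python
import argparse


def sci_int(text: str) -> int:
    """Parse '1e3' or '1000' into an int, rejecting fractional values."""
    value = float(text)
    if value != int(value):
        raise argparse.ArgumentTypeError(f"{text!r} is not a whole number")
    return int(value)


parser = argparse.ArgumentParser(prog="sketch")
# Hypothetical defaults: large enough to mean "no limit" in practice.
parser.add_argument("--train-iter", type=sci_int, default=int(1e10))
parser.add_argument("--env-step", type=sci_int, default=int(1e10))

args = parser.parse_args(["--train-iter", "1e3"])
```

Here `args.train_iter` holds the integer 1000, while `args.env_step` keeps its default.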
remove default n_sample/n_episode value in policy default config.

Env (dizoo)

add bitflip HER DQN benchmark (#192) (#193) (#197)
add slime volley league training demo (#229)

Algorithm

Gated Transformer-XL (GTrXL) algorithm (#136)
TD3 + VAE(HyAR) latent action algorithm (#152)
stochastic dueling network (#234)
use log prob instead of prob in ACER (#186)

Feature

support envpool env manager (#228)
add league main and other improvements in new framework (#177) (#214)
add pace controller middleware in new framework (#198)
add auto recover option in new framework (#242)
add k8s parser in new framework (#243)
support async event handler and logger (#213)
add grad norm calculator (#205)
add gym vector env manager (#147)
add train_iter and env_step in serial pipeline (#212)
add rich logger handler (#219) (#223) (#232)
add naive lr_scheduler demo

Refactor

new BaseEnv and DingEnvWrapper (#171) (#231) (#240) Env English doc | Env Chinese doc

Polish

Improve configurations in dizoo and add more algorithm benchmarks: doc example | doc example (Chinese)

MAPPO and MASAC smac config (#209) (#239)
QMIX smac config (#175)
R2D2 atari config (#181)
A2C atari config (#189)
GAIL box2d and mujoco config (#188)
ACER atari config (#180)
SQIL atari config (#230)
TREX atari/mujoco config
IMPALA atari config
MBPO/D4PG mujoco config

Fix

random_collect compatible with episode collector (#190)
remove default n_sample/n_episode value in policy config (#185)
PDQN model bug on gpu device (#220)
TREX algorithm CLI bug (#182)
DQfD JE computation bug and move to AdamW optimizer (#191)
pytest problem for parallel middleware (#211)
mujoco numpy compatibility bug
markupsafe 2.1.0 bug
framework parallel module network emit bug
mpire bug and disable algotest in py3.8
lunarlander env import and env_id bug
icm unittest repeat name bug
buffer thruput close bug

Test

resnet unittest (#199)
SAC/SQN unittest (#207)
CQL/R2D3/GAIL unittest (#201)
NGU td unittest (#210)
model wrapper unittest (#215)
MAQAC model unittest (#226)

Style

add doc docker (#221) (latex support)

Contributors: @PaParaZz1 @sailxjx @puyuan1996 @Will-Nie @Weiyuhong-1998 @davide97l @zjowowen @LuciusMos @kxzxvbk @Hcnaeg @jayyoung0802 @simonat2011 @jiaruonan
