feature(zlx): League training with slime volley env #23

Merged: 12 commits into opendilab:main, Oct 8, 2021

Conversation

@LuciusMos LuciusMos (Collaborator) commented Aug 12, 2021

Related Issues

#17

TODO

  • Add slimevolleygym env in dizoo
  • PPO tuning in SlimeVolley-v0
  • slimevolleygym env visualization
    (ding -m eval -c slime_volley_ppo_config.py -s 8 --load-path=./ckpt_best.pth.tar --replay-path=./video)
  • self-play pipeline and tuning

@PaParaZz1 PaParaZz1 added the env (Questions about RL environment) and serial (Serial training related) labels Aug 12, 2021
@LuciusMos LuciusMos linked an issue Aug 12, 2021 that may be closed by this pull request
@codecov-commenter commented Aug 12, 2021

Codecov Report

Merging #23 (9f64438) into main (c500a2e) will increase coverage by 0.01%.
The diff coverage is n/a.

❗ Current head 9f64438 differs from pull request most recent head 94287f6. Consider uploading reports for the commit 94287f6 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##             main      #23      +/-   ##
==========================================
+ Coverage   89.03%   89.05%   +0.01%     
==========================================
  Files         356      356              
  Lines       25912    25912              
==========================================
+ Hits        23071    23076       +5     
+ Misses       2841     2836       -5     
Flag Coverage Δ
unittests 89.05% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
ding/model/template/vac.py 98.55% <ø> (ø)
ding/envs/env_manager/subprocess_env_manager.py 83.41% <0.00%> (+0.48%) ⬆️
ding/worker/learner/comm/flask_fs_learner.py 91.87% <0.00%> (+1.87%) ⬆️

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c500a2e...94287f6.

zlx-sensetime added 2 commits August 13, 2021 14:59
modify volley env to satisfy ding 1v1 requirements; add naive self-play and league training pipeline (evaluator is not finished, now use a very naive one)
@PaParaZz1 PaParaZz1 added this to In progress in Roadmap0.2.0 Sep 9, 2021
@PaParaZz1 PaParaZz1 added this to the Environment Generalization milestone Sep 14, 2021
@zxzzz0 zxzzz0 commented Sep 17, 2021

@PaParaZz1 Could we fix the unit test failure?

@PaParaZz1 PaParaZz1 changed the title League training with slime volley env WIP: feature(zlx): League training with slime volley env Sep 17, 2021
@PaParaZz1 PaParaZz1 removed this from In progress in Roadmap0.2.0 Sep 30, 2021
@PaParaZz1 PaParaZz1 added this to In progress in Roadmap 0.3.0 Oct 1, 2021
@PaParaZz1 PaParaZz1 self-assigned this Oct 1, 2021
n_episode=128, unroll_len=1, discount_factor=1.0, gae_lambda=1.0, collector=dict(get_train_sample=True, )
),
other=dict(
league=dict(
@zxzzz0 zxzzz0 commented Oct 3, 2021
Could we enable the league metric (TrueSkill) here, as we did in the league demo? It would be useful for monitoring training progress.
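For reference, a minimal sketch of how a TrueSkill-based league metric could be tracked with the trueskill Python package; the player setup and update loop below are illustrative assumptions, not DI-engine's built-in league metric:

# Illustrative TrueSkill tracking sketch (assumption: two-player games with no draws).
import trueskill

ts_env = trueskill.TrueSkill(draw_probability=0.0)
main_player = ts_env.create_rating()   # the learning agent
opponent = ts_env.create_rating()      # a frozen historical player

def update_rating(main_won: bool):
    # rate_1vs1 takes the winner's rating first and returns updated (winner, loser) ratings.
    global main_player, opponent
    if main_won:
        main_player, opponent = ts_env.rate_1vs1(main_player, opponent)
    else:
        opponent, main_player = ts_env.rate_1vs1(opponent, main_player)

# Logging main_player.mu (and sigma) over training gives a progress curve alongside the win rate.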

strong_win_rate=0.7,
mutate_prob=0.0,
),
use_pretrain=False,
@PaParaZz1 Following up on this: could we add another config, e.g. xxx_league_PPO_using_pretrain_config.py, where use_pretrain is set to True rather than False, so that we don't train from scratch? (The motivation is that training from scratch can be slow, and we believe a pretrained model can benefit training.)

In this new config, each time the model is mutated it would be reset to the pretrained model. Assume the pretrained model was obtained offline by running PPO against the built-in bot (dizoo/slime_volley/config/slime_volley_ppo_config.py) for a very short period of time.
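A rough sketch of what such a config could look like, reusing the keys visible in this diff (use_pretrain, strong_win_rate, mutate_prob); the file name, the exact nesting, and the pretrain_checkpoint_path key are hypothetical:

# Hypothetical slime_volley_league_ppo_using_pretrain_config.py (sketch only, not part of this PR).
slime_volley_league_ppo_using_pretrain_config = dict(
    exp_name="slime_volley_league_ppo_using_pretrain",
    env=dict(
        collector_env_num=8,
    ),
    policy=dict(
        other=dict(
            league=dict(
                use_pretrain=True,  # start league players from the offline PPO-vs-bot checkpoint
                pretrain_checkpoint_path="./ckpt_best.pth.tar",  # hypothetical key
                strong_win_rate=0.7,
                mutate_prob=0.05,   # on mutation, reset the player back to the pretrained model
            ),
        ),
    ),
)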

@PaParaZz1 (Member) commented:

Here are the DI-engine PPO (vs bot) results in the SlimeVolley-v0 env (5 random seeds):

[Screenshots: PPO (vs bot) training curves, 5 seeds]

@zxzzz0 zxzzz0 commented Oct 5, 2021

Here are the DI-engine PPO (vs bot) results in the SlimeVolley-v0 env (5 random seeds):

@PaParaZz1 Well done! What a nice curve. It is very similar to other PPO results reported here.

Let's use this model as the pretrained model and start league training with TrueSkill.

@PaParaZz1 (Member) commented:

Here are the DI-engine PPO self-play training results (win rate against the built-in AI) in the SlimeVolley-v0 env (2 random seeds); its performance exceeds that of the original repo:

[Screenshot: self-play win rate vs built-in AI, 2 seeds]

@PaParaZz1 PaParaZz1 changed the title WIP: feature(zlx): League training with slime volley env feature(zlx): League training with slime volley env Oct 8, 2021
@PaParaZz1 (Member) commented:

rule-based bot vs trained agent

slime_volley.mp4

@PaParaZz1 PaParaZz1 mentioned this pull request Oct 8, 2021
@PaParaZz1 PaParaZz1 moved this from In progress to Review in progress in Roadmap 0.3.0 Oct 8, 2021
@PaParaZz1 PaParaZz1 merged commit dbf432c into opendilab:main Oct 8, 2021
Roadmap 0.3.0 automation moved this from Review in progress to Done Oct 8, 2021
slime_volley_league_ppo_config = dict(
exp_name="slime_volley_league_ppo",
env=dict(
collector_env_num=8,
@zxzzz0 zxzzz0 commented Oct 20, 2021

(1) CPU problem
We've tried this and found that only one of our 64 cores reaches 100% utilization; the remaining cores stay at roughly 2% utilization the whole time. This is unchanged even if we increase collector_env_num from 8 to 64. Do you know how to fully utilize all cores? Should we replace SyncSubprocessEnvManager with something else? @PaParaZz1
(2) GPU problem
We have one machine with 2 GPUs, each with 8 GiB of memory. Although we've increased the learner batch_size to a 16x larger number, the memory usage of the first GPU is still only 1.6 GiB and the second GPU's memory usage is always 0 GiB.
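For reference, a minimal sketch of the two knobs described above, following the shape of the config shown in this diff; the policy.learn nesting for batch_size is an assumption:

# Sketch of the settings described above (nesting of batch_size is assumed).
slime_volley_league_ppo_config = dict(
    env=dict(
        collector_env_num=64,  # raised from 8; CPU utilization stayed on a single core
    ),
    policy=dict(
        learn=dict(
            batch_size=1024,   # illustrative 16x-scaled value; GPU memory usage barely changed
        ),
    ),
)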

Member

For the GPU problem, which multi-GPU implementation do you use: torch.nn.DataParallel, torch.nn.parallel.DistributedDataParallel, or some other method? Also, please open another issue to track these two problems.
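As background, a generic plain-PyTorch sketch of the two wrappers named above (illustrative only, not DI-engine's learner integration):

# Generic single-machine multi-GPU wrapping in plain PyTorch (illustrative).
import torch
import torch.nn as nn

model = nn.Linear(12, 6)  # stand-in for the actor-critic network

# Option 1: DataParallel -- one process replicates the model on all visible GPUs and
# splits each input batch along dim 0; easiest to drop in, but usually slower than DDP.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()

# Option 2: DistributedDataParallel -- one process per GPU (launched e.g. via torchrun);
# requires torch.distributed.init_process_group() before wrapping:
# model = nn.parallel.DistributedDataParallel(model.cuda(), device_ids=[local_rank])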

@zxzzz0 zxzzz0 commented Oct 20, 2021

For the GPU problem, we don't use any multi-GPU implementation; we simply run slime_volley_league_ppo_config.py from the master branch. But we want to fully utilize this single machine with multiple GPUs. I've tried adding multi_gpu=True to the learner config and wrapping main with with DistContext():, but it raised errors, as we reported here. It looks like it doesn't support a single machine with multiple GPUs.

Member

OK, I will add torch.nn.DataParallel support in DI-engine before 10.25 for single-machine multi-GPU training. Please keep an eye on the related PR.

Thanks. In the related PR please also update slime_volley_ppo_config.py to use it.

puyuan1996 pushed a commit to puyuan1996/DI-engine that referenced this pull request Dec 14, 2021
feature(zlx): League training with slime volley env (opendilab#23)

* slime volley env in dizoo, first commit

* fix bug in slime volley env

* modify volley env to satisfy ding 1v1 requirements; add naive self-play and league training pipeline(evaluator is not finished, now use a very naive one)

* adopt volley builtin ai as default eval opponent

* polish(nyz): polish slime_volley_env and its test

* feature(nyz): add slime_volley vs bot ppo demo

* feature(nyz): add battle_sample_serial_collector and adapt abnormal check in subprocess env manager

* feature(nyz): add slime volley self-play demo

* style(nyz): add slime_volleyball env gif and split MARL and selfplay label

* feature(nyz): add save replay function in slime volleyball env

Co-authored-by: zlx-sensetime <zhaoliangxuan@sensetime.com>
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
puyuan1996 pushed a commit to puyuan1996/DI-engine that referenced this pull request Apr 18, 2022
SolenoidWGT pushed a commit to SolenoidWGT/DI-engine that referenced this pull request Aug 22, 2023

SolenoidWGT pushed a commit to SolenoidWGT/DI-engine that referenced this pull request Aug 22, 2023

SolenoidWGT added a commit to SolenoidWGT/DI-engine that referenced this pull request Aug 22, 2023
Labels
env (Questions about RL environment), serial (Serial training related)
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Add slimevolleygym into dizoo
4 participants