
Add two features: single-GPU PPO training for large models, and ChatGLM-6B model support #3567

Open
wants to merge 3 commits into base: main

Conversation

yynil
Contributor

@yynil yynil commented Apr 14, 2023

📌 Checklist before creating the PR

  • [x] I have created an issue for this PR for traceability
  • [x] The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • [x] I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234
Implements #3566, #3565

📝 What does this PR do?

  1. Add model support for ChatGLM-6B.
  2. Add a VRAM-friendly trainer for PPO training; a single 3090 Ti with 24 GB of VRAM can complete the PPO training.
  3. Add a PEFT reward model, so all stage-3 models use PEFT.
  4. Add support for preprocessing texts into binary datasets to save data-processing time.
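To illustrate item 4, here is a rough sketch of what pre-tokenizing text into a packed binary file can look like. This is not the PR's actual code: the toy whitespace "tokenizer", the function names, and the flat int32 `.bin` layout are all made up for illustration; a real pipeline would use the model's tokenizer and a proper on-disk index.

```python
import array

def encode(text, vocab):
    # Toy whitespace "tokenizer": map each word to an id, assigning new ids on the fly.
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]

def write_binary_dataset(texts, path):
    # Tokenize once and dump all ids as packed native ints, keeping an index of
    # (offset, length) pairs so samples can be sliced later without re-tokenizing.
    vocab, index, data = {}, [], array.array("i")
    for text in texts:
        ids = encode(text, vocab)
        index.append((len(data), len(ids)))
        data.extend(ids)
    with open(path, "wb") as f:
        data.tofile(f)
    return index, vocab

def read_sample(path, offset, length):
    # Load a single sample back from the packed file without touching the rest.
    ids = array.array("i")
    with open(path, "rb") as f:
        f.seek(offset * ids.itemsize)
        ids.frombytes(f.read(length * ids.itemsize))
    return list(ids)
```

The point of the format is that tokenization cost is paid once at preprocessing time; training then only does cheap seek-and-read operations per sample.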

💥 Checklist before requesting a review

  • [x] I have linked my PR to an issue (instruction)
  • [x] My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • [x] I have performed a self-review of my code
  • [x] I have added thorough tests.
  • [x] I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • [x] 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

@Fazziekey
Contributor

@yynil hello, if you want to support more models, could you add all the model classes under https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/coati/models?
You can create a glm directory at https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/coati/models/glm

@Fazziekey
Contributor

I think this PR changes too many things; you could split it into three PRs.

@yynil
Contributor Author

yynil commented Apr 17, 2023

@yynil hello, if you want to support more models, could you add all the model classes under https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/coati/models? You can create a glm directory at https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/coati/models/glm

I'll create another branch to separate these changes.

@Fazziekey
Contributor

Fazziekey commented Apr 17, 2023

I have added a guide on how to support your own model in PR #3579; please refer to it, @yynil.

@yynil
Contributor Author

yynil commented Apr 25, 2023

Since the ChatGLM team is not willing to release a smaller model to the public for training a reward model, I'm suspending ChatGLM support. My branch will move to BLOOM instead, because BLOOM has a very good small model that makes training a reward model much easier.

@binmakeswell
Member

Since the ChatGLM team is not willing to release a smaller model to the public for training a reward model, I'm suspending ChatGLM support. My branch will move to BLOOM instead, because BLOOM has a very good small model that makes training a reward model much easier.

Thanks, you are welcome to share and update your PR.


3 participants