model parallel #1193

Closed
beautifull4frank opened this issue Mar 14, 2023 · 7 comments

beautifull4frank commented Mar 14, 2023

Hello, I just successfully ran "summarize_rlhf/trlx_gptj_text_summarization.py", but I am not sure whether it is implemented using model parallelism or not. I have to run a large GPT-3-scale model, so I need to split the huge model across several GPUs using model parallelism.

sgugger (Collaborator) commented Mar 14, 2023

Maybe ask on the repo where you picked that example? I have no idea what the script "summarize_rlhf/trlx_gptj_text_summarization.py" is, so I can't really help.

beautifull4frank (Author) commented

Cool, man, I am so sorry, I made a mistake: I used trlx to train a chat BLOOM and put my issue in the wrong place. I successfully trained bloom-560m. But trlx uses Accelerate to train huge chat GPT models, and I am just not sure whether Accelerate can use model parallelism to train a 300 GB chat BLOOMZ.

github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

lifefeel commented Apr 17, 2023

> Cool, man, I am so sorry, I made a mistake: I used trlx to train a chat BLOOM and put my issue in the wrong place. I successfully trained bloom-560m. But trlx uses Accelerate to train huge chat GPT models, and I am just not sure whether Accelerate can use model parallelism to train a 300 GB chat BLOOMZ.

I wondered the same thing when I was using trlx. I found the following in the Accelerate documentation (Handling big models for inference):

> The model parallelism used when your model is split on several GPUs is naive and not optimized, meaning that only one GPU works at a given time and the other sits idle.

It seems that model parallelism is still partially supported.
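
For reference, the behavior the docs describe shows up when loading a checkpoint with `device_map="auto"`. A minimal sketch, assuming `transformers` and `accelerate` are installed and using `bigscience/bloomz-560m` as a small stand-in checkpoint (not the 300 GB model discussed above):

```python
# Minimal sketch: naive model parallelism via Accelerate's device_map.
# Assumes `pip install transformers accelerate` and one or more GPUs;
# "bigscience/bloomz-560m" is just a small stand-in checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")

# device_map="auto" splits the layers across all visible GPUs (and CPU/disk
# if needed). During a forward pass only the GPU holding the current layer
# is active while the others sit idle, which is what the docs call "naive".
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-560m", device_map="auto"
)

inputs = tokenizer("Summarize: the quick brown fox ...", return_tensors="pt")
# Inputs go to the device holding the first layers (usually cuda:0).
inputs = {k: v.to(model.device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```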

sgugger (Collaborator) commented Apr 17, 2023

We have never claimed to support pipeline parallelism (where a schedule splits your batches into micro-batches and makes sure all GPUs work at the same time), only sequential model parallelism (where GPU 1 waits for GPU 0 to finish, and so forth). This is still quite fast if you batch your inputs together.
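
To make the distinction concrete, here is a toy plain-PyTorch sketch of sequential model parallelism (this is an illustration of the idea, not Accelerate's actual implementation); it assumes two visible GPUs:

```python
# Toy illustration of sequential (naive) model parallelism in plain PyTorch.
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    def __init__(self):
        super().__init__()
        # First half of the network lives on GPU 0, second half on GPU 1.
        self.part0 = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()).to("cuda:0")
        self.part1 = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()).to("cuda:1")

    def forward(self, x):
        # GPU 1 sits idle while GPU 0 computes, and vice versa: the activation
        # is handed over sequentially. Pipeline parallelism would instead split
        # the batch into micro-batches so both GPUs stay busy.
        x = self.part0(x.to("cuda:0"))
        return self.part1(x.to("cuda:1"))

model = TwoGPUModel()
# Batching many inputs together amortizes the hand-off cost, which is why
# sequential model parallelism is still quite fast with batched inputs.
out = model(torch.randn(64, 1024))
print(out.shape)  # torch.Size([64, 1024])
```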

lifefeel commented

> We have never claimed to support pipeline parallelism (where a schedule splits your batches into micro-batches and makes sure all GPUs work at the same time), only sequential model parallelism (where GPU 1 waits for GPU 0 to finish, and so forth). This is still quite fast if you batch your inputs together.

Thanks for the detailed explanation 😄

github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
