model parallel #1193

Closed
beautifull4frank opened this issue Mar 14, 2023 · 7 comments

beautifull4frank commented Mar 14, 2023

Hello, I just successfully ran "summarize_rlhf/trlx_gptj_text_summarization.py", but I am not sure whether it is implemented using model parallelism or not. I have to run a large GPT-3-scale model, so I need to split the huge model across several GPUs using model parallelism.

sgugger (Collaborator) commented Mar 14, 2023

Maybe ask on the repo where you picked that example? I have no idea what the script "summarize_rlhf/trlx_gptj_text_summarization.py" is, so I can't really help.

beautifull4frank (Author) commented

Cool, man, I am so sorry, I made a mistake: I used trlx to train a chat BLOOM and put my issue in the wrong place. I successfully trained bloom-560m. But trlx uses Accelerate to train huge chat GPT models, and I am just not sure whether Accelerate can use model parallelism to train a 300 GB chat BLOOMZ.

github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

lifefeel commented Apr 17, 2023

> Cool, man, I am so sorry, I made a mistake: I used trlx to train a chat BLOOM and put my issue in the wrong place. I successfully trained bloom-560m. But trlx uses Accelerate to train huge chat GPT models, and I am just not sure whether Accelerate can use model parallelism to train a 300 GB chat BLOOMZ.

I wondered the same thing when I was using trlx. I found the following in the Accelerate documentation (Handling big models for inference):

> The model parallelism used when your model is split on several GPUs is naive and not optimized, meaning that only one GPU works at a given time and the other sits idle.

It seems that model parallelism is still partially supported.
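
For reference, the behavior the docs describe shows up when loading a checkpoint with `device_map="auto"`. A minimal sketch, assuming `transformers` and `accelerate` are installed and using `bigscience/bloomz-560m` as a small stand-in checkpoint (not the 300 GB model discussed above):

```python
# Minimal sketch: naive model parallelism via Accelerate's device_map.
# Assumes `pip install transformers accelerate` and one or more GPUs;
# "bigscience/bloomz-560m" is just a small stand-in checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")

# device_map="auto" splits the layers across all visible GPUs (and CPU/disk
# if needed). During a forward pass only the GPU holding the current layer
# is active while the others sit idle, which is what the docs call "naive".
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-560m", device_map="auto"
)

inputs = tokenizer("Summarize: the quick brown fox ...", return_tensors="pt")
# Inputs go to the device holding the first layers (usually cuda:0).
inputs = {k: v.to(model.device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```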

sgugger (Collaborator) commented Apr 17, 2023

We have never claimed to support pipeline parallelism (where a schedule splits your batches into micro-batches and makes sure all GPUs work at the same time), only sequential model parallelism (where GPU 1 waits for GPU 0 to finish, and so forth). This is still quite fast if you batch your inputs together.
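
To make the distinction concrete, here is a toy plain-PyTorch sketch of sequential model parallelism (this is an illustration of the idea, not Accelerate's actual implementation); it assumes two visible GPUs:

```python
# Toy illustration of sequential (naive) model parallelism in plain PyTorch.
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    def __init__(self):
        super().__init__()
        # First half of the network lives on GPU 0, second half on GPU 1.
        self.part0 = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()).to("cuda:0")
        self.part1 = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()).to("cuda:1")

    def forward(self, x):
        # GPU 1 sits idle while GPU 0 computes, and vice versa: the activation
        # is handed over sequentially. Pipeline parallelism would instead split
        # the batch into micro-batches so both GPUs stay busy.
        x = self.part0(x.to("cuda:0"))
        return self.part1(x.to("cuda:1"))

model = TwoGPUModel()
# Batching many inputs together amortizes the hand-off cost, which is why
# sequential model parallelism is still quite fast with batched inputs.
out = model(torch.randn(64, 1024))
print(out.shape)  # torch.Size([64, 1024])
```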

lifefeel commented

> We have never claimed to support pipeline parallelism (where a schedule splits your batches into micro-batches and makes sure all GPUs work at the same time), only sequential model parallelism (where GPU 1 waits for GPU 0 to finish, and so forth). This is still quite fast if you batch your inputs together.

Thanks for the detailed explanation 😄

github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
