The performance of model parallelism (MP) is not good #124
Comments
Hi @feifeibear. Thank you so much for your effort. We would appreciate it if you could also share the configurations used to test the same models with DeepSpeed and PatrickStar. We would like to evaluate and improve performance at a similar node scale as well as at larger scales.
The DeepSpeed benchmark script
I have uploaded the logs of DeepSpeed and PatrickStar to Baidu Wangpan... link: https://pan.baidu.com/s/1vEHl0hPuxDb7HjOlpuW-YA?pwd=1mfd
@feifeibear Thank you!
This issue is stale because it has been open for 14 days with no activity.
Thanks for your report; detailed tests with stable code will come soon.
We have updated a lot. This issue was closed due to inactivity. Thanks.
Hello developers.
I found that the performance of the provided MP is not good. I compared it with PatrickStar and DeepSpeed. Can you check this with me? See MR #115.
BTW: I strongly recommend adding TFLOPS as a performance indicator.
Platform: one SuperPod node with 8×A100 GPUs and 1 TB of CPU memory. BS = batch size, pstar = PatrickStar, deeps = DeepSpeed.
Entries indicate throughput (batches per elapsed second). The Xd-Xmp entries use Colossal-AI.
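To make the TFLOPS suggestion above concrete, here is a minimal sketch of how throughput (batches per second) can be converted into an achieved-TFLOPS estimate. It assumes the common approximation of ~6 FLOPs per parameter per token for training (or ~8 with activation checkpointing); the function name and arguments are illustrative, not part of any of the benchmarked frameworks:

```python
def train_tflops_per_gpu(num_params: float,
                         seq_len: int,
                         batch_size: int,
                         batches_per_sec: float,
                         num_gpus: int,
                         activation_checkpointing: bool = False) -> float:
    """Estimate achieved training TFLOPS per GPU from measured throughput.

    Uses the standard approximation: ~6 FLOPs per parameter per token
    (forward + backward), or ~8 when activations are recomputed.
    """
    flops_per_token = (8 if activation_checkpointing else 6) * num_params
    tokens_per_sec = batches_per_sec * batch_size * seq_len
    return flops_per_token * tokens_per_sec / num_gpus / 1e12


# Hypothetical example: a 1B-parameter model, sequence length 1024,
# global batch size 8, 1 batch/s measured on 8 GPUs.
print(train_tflops_per_gpu(1e9, 1024, 8, 1.0, 8))  # → 6.144
```

Reporting this number alongside raw throughput makes runs with different batch sizes, sequence lengths, and parallelism layouts directly comparable.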