fp8_group when using FSDP and tensor parallelism #656
Labels: documentation
Hi,
What is the correct fp8_group when using FSDP and tensor parallelism together? Should it span all GPUs, or only the GPUs within each tensor parallel group?
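To make the two options concrete, here is a minimal sketch of the rank layouts I am asking about. It assumes 8 GPUs, a tensor parallel size of 2, and consecutive ranks within each tensor parallel group. All three are hypothetical choices for illustration, and the helper name is made up. It only enumerates ranks and does not say which option is correct.

```python
def candidate_fp8_groups(world_size, tp_size):
    """Return the rank lists for the two candidate fp8_group choices.

    Hypothetical helper for illustration only. Assumes each tensor parallel
    group holds consecutive ranks, which may not match a given launcher.
    """
    if world_size % tp_size != 0:
        raise ValueError("world_size must be divisible by tp_size")
    # Option 1: a single group spanning every GPU in the job.
    all_ranks = list(range(world_size))
    # Option 2: one group per tensor parallel group.
    tp_groups = [
        list(range(start, start + tp_size))
        for start in range(0, world_size, tp_size)
    ]
    return all_ranks, tp_groups


all_ranks, tp_groups = candidate_fp8_groups(world_size=8, tp_size=2)
print("all GPUs:", all_ranks)
print("tensor parallel groups:", tp_groups)
```

Either rank list could be turned into a process group with `torch.distributed.new_group(ranks=...)`. The question is which of the two should be passed as fp8_group.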
Thanks.