Merged
3 changes: 3 additions & 0 deletions test/distributed/_composable/fsdp/test_fully_shard_memory.py
@@ -117,6 +117,9 @@ def _test_fully_shard_training_memory(
# number is kept much smaller than the actual memory usage, which is on
# the order of 100-200+ MB)
buffer_mb = 16
# The default workspace for hipblaslt is larger than for cublas/cublaslt
# which requires a slight increase to this buffer value.
buffer_mb = 16 if torch.version.cuda else 18
if reshard_after_forward:
# 3x max unsharded block parameters (current all-gather + copy-out
# and next all-gather), non-block parameters, and other
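The backend-dependent buffer in this hunk can be sketched in isolation as below. This is a minimal illustration, not code from the PR: the helper name `select_buffer_mb` and its parameter are hypothetical, and the version string is passed in explicitly so the snippet runs without a GPU build of PyTorch. It relies on `torch.version.cuda` being a version string on CUDA builds and `None` otherwise (for example on ROCm builds), which is the same truthiness check the diff uses.

```python
from typing import Optional


def select_buffer_mb(cuda_version: Optional[str]) -> int:
    """Return the memory slack (in MB) allowed by the test.

    Hypothetical helper mirroring the expression in the diff:
    `16 if torch.version.cuda else 18`.
    """
    # A truthy version string means a CUDA build (cublas/cublaslt workspace).
    # None means a non-CUDA build; the diff attributes the larger value to
    # the larger default hipblaslt workspace on ROCm.
    return 16 if cuda_version else 18


# Example values standing in for torch.version.cuda on each kind of build.
cuda_build_buffer = select_buffer_mb("12.1")
non_cuda_build_buffer = select_buffer_mb(None)
```

In the test itself the argument would be `torch.version.cuda`. Note that this check does not distinguish a ROCm build from a CPU-only build, since both report `None`; the test is only expected to run where a GPU is available.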