Hello,
When I enabled --sequence-parallel and --tp-comm-overlap and started training, I got the following error:
TypeError: UbufP2PCommOverlap(): incompatible function arguments. The following argument types are supported:
1. () -> None
Invoked with: tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], device='cuda:7', dtype=torch.bfloat16), 7, 2, 16, 2, 0, 0, 3, 0, 0, tensor([])
How can I fix this? Thanks.