AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base' #52
Comments
Hello,
I have the same problem. I also installed the environment as instructed — have you solved it?
The error occurs when running TSR_train.py:
File "TSR_train.py", line 7, in <module>
from src.TSR_trainer import TrainerConfig, TrainerForContinuousEdgeLine, TrainerForEdgeLineFinetune
File "D:\AIworkspace\ZITS_inpainting-main\src\TSR_trainer.py", line 14, in <module>
from apex import amp
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\__init__.py", line 27, in <module>
from . import transformer
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\__init__.py", line 4, in <module>
from apex.transformer import pipeline_parallel
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\pipeline_parallel\__init__.py", line 1, in <module>
from apex.transformer.pipeline_parallel.schedules import get_forward_backward_func
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\pipeline_parallel\schedules\__init__.py", line 3, in <module>
from apex.transformer.pipeline_parallel.schedules.fwd_bwd_no_pipelining import (
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\pipeline_parallel\schedules\fwd_bwd_no_pipelining.py", line 10, in <module>
from apex.transformer.pipeline_parallel.schedules.common import Batch
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\pipeline_parallel\schedules\common.py", line 14, in <module>
from apex.transformer.tensor_parallel.layers import (
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\tensor_parallel\__init__.py", line 21, in <module>
from apex.transformer.tensor_parallel.layers import (
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\tensor_parallel\layers.py", line 32, in <module>
from apex.transformer.tensor_parallel.mappings import (
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\tensor_parallel\mappings.py", line 29, in <module>
torch.distributed.reduce_scatter_tensor = torch.distributed._reduce_scatter_base
AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base'
My environment is torch=1.9.0+cu111, cuda=11.1. Could the author explain how to resolve this?
Thanks.
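The failing line in apex/transformer/tensor_parallel/mappings.py unconditionally aliases a private torch helper that the installed torch build does not export, so the AttributeError is raised at import time. One possible local workaround (a sketch, not an official apex fix) is to guard that alias with hasattr. The `dist` object below is a hypothetical stand-in for torch.distributed so the pattern is runnable without torch installed; on a real setup you would apply the guard to the line shown in the traceback:

```python
from types import SimpleNamespace

# Hypothetical stand-in for torch.distributed, used only so the guard
# pattern can be demonstrated without a torch installation.
dist = SimpleNamespace()

# Guarded alias: only create reduce_scatter_tensor when the private
# helper exists, instead of failing with AttributeError at import time.
if hasattr(dist, "_reduce_scatter_base"):
    dist.reduce_scatter_tensor = dist._reduce_scatter_base

# On a build that lacks the helper, the alias is simply not created.
print(hasattr(dist, "reduce_scatter_tensor"))  # prints False here
```

Note that any apex code path that actually calls reduce_scatter_tensor would still fail later with this guard in place; checking out an apex release that matches the installed torch version (or upgrading torch) is the more robust fix.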