NCCL related problems occurred during installation #76

Closed
Sakura-rain opened this issue Sep 29, 2021 · 3 comments

@Sakura-rain

Describe the bug
NCCL errors always occur when compiling and installing fastmoe. I tried downloading other versions of the NCCL archive (each with its own include and lib directories) and importing them, but I still hit the same problem. The error is as follows:

[Screenshot of the error: 微信图片_20210929095232]

Platform

  • Device: [NVIDIA V100]
  • OS: [centos]
  • CUDA version: [10.1]
  • NCCL version: [2.8.3]
  • PyTorch version: [1.8.0]
@laekov
Owner

laekov commented Sep 29, 2021

I suspect your compiler is picking up the wrong nccl.h. There is another nccl.h among PyTorch's C++ headers.
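One way to test this suspicion is to ask the compiler which nccl.h it opens and to read the version that header declares. The following is a minimal sketch, not part of fastmoe: the function names `nccl_header_version` and `resolved_nccl_headers` are hypothetical, and it assumes a GCC-compatible compiler (the `-H` flag) and the `NCCL_MAJOR` / `NCCL_MINOR` / `NCCL_PATCH` macros that NCCL headers define. It only reflects the compiler's default search path plus `CPLUS_INCLUDE_PATH`; the real build may add further include directories, such as PyTorch's.

```shell
# Print the NCCL version declared by a given nccl.h, as MAJOR.MINOR.PATCH.
# Reads the NCCL_MAJOR / NCCL_MINOR / NCCL_PATCH #define lines in the header.
nccl_header_version() {
    awk '
        $1 == "#define" && $2 == "NCCL_MAJOR" { major = $3 }
        $1 == "#define" && $2 == "NCCL_MINOR" { minor = $3 }
        $1 == "#define" && $2 == "NCCL_PATCH" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}

# List the nccl.h files the preprocessor opens for `#include <nccl.h>`.
# -H makes GCC-compatible compilers report every included header on stderr;
# the preprocessed output on stdout is discarded.
resolved_nccl_headers() {
    echo '#include <nccl.h>' \
        | "${CXX:-g++}" -x c++ -E -H - 2>&1 >/dev/null \
        | grep 'nccl\.h'
}
```

Running `resolved_nccl_headers` and then `nccl_header_version` on the path it reports shows whether the header in use is the intended 2.8.3 one or a different copy.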

@Sakura-rain
Author

> I suppose your compiler gets a wrong `nccl.h`. There is another `nccl.h` in PyTorch's cpp header.

export USE_NCCL=/public/home/LDC2/nccl_2.8.3-1+cuda10.1_x86_64:$PATH
I ran the command above to import the external NCCL library, but the errors remained.
I suspect that the environment variable I set is not the one the build expects.
How do I specify an external NCCL installation through environment variables, or how do I change which nccl.h PyTorch picks up?

@laekov
Owner

laekov commented Sep 29, 2021

Well, we do not use the USE_NCCL environment variable to specify the path of the NCCL installation. You can try the following environment settings.

export USE_SYSTEM_NCCL=1
export NCCL_ROOT=/public/home/LDC2/nccl_2.8.3-1+cuda10.1_x86_64
export NCCL_INCLUDE_DIR=$NCCL_ROOT/include
export NCCL_LIB_DIR=$NCCL_ROOT/lib
export CPLUS_INCLUDE_PATH=$NCCL_INCLUDE_DIR:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=$NCCL_LIB_DIR:$LIBRARY_PATH
export LD_LIBRARY_PATH=$NCCL_LIB_DIR:$LD_LIBRARY_PATH
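Before rebuilding, it may help to confirm that the directory used as `NCCL_ROOT` really has the layout these exports assume. This is a minimal sketch under that assumption: `check_nccl_root` is a hypothetical helper, not something fastmoe or NCCL provides, and it only checks for `include/nccl.h` and a `lib/libnccl.so*` file.

```shell
# Check that an NCCL root directory contains
#   <root>/include/nccl.h  and  <root>/lib/libnccl.so*
# Prints a short status line; returns 0 if both are present, 1 otherwise.
check_nccl_root() {
    nccl_root_dir="$1"
    if [ ! -f "$nccl_root_dir/include/nccl.h" ]; then
        echo "missing header: $nccl_root_dir/include/nccl.h"
        return 1
    fi
    # The shared library usually carries a version suffix, so match by prefix.
    for nccl_lib_file in "$nccl_root_dir"/lib/libnccl.so*; do
        if [ -e "$nccl_lib_file" ]; then
            echo "ok: $nccl_root_dir"
            return 0
        fi
    done
    echo "missing library: $nccl_root_dir/lib/libnccl.so*"
    return 1
}
```

For example, `check_nccl_root "$NCCL_ROOT"` after the exports above reports which of the two pieces, if any, is missing from the extracted archive.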
