[import jittor tries to download NCCL even when NCCL is already loaded] #495

Open
baibizhe opened this issue Sep 26, 2023 · 0 comments

Describe the bug

Hello Jittor maintainers. We are trying to install Jittor on a supercomputing cluster. After resolving some linking problems, one issue with NCCL remains: our GPU (compute) nodes have no network access, yet Jittor insists on downloading NCCL even though NCCL has already been loaded. Is there any way to keep import jittor from downloading NCCL?

Full Log

This is the email from the administrator of our computing system:
"
This is an interesting code. I got it to install using the following commands.

  1. Load the required modules.

$ module load gcc
$ module load python/3.10
$ module load opencv
$ module load cuda
$ module load imkl
$ module load nccl

  2. Create a virtual environment and activate it.

$ virtualenv venv
$ source venv/bin/activate
(venv)$

  3. Get the code and install the needed Python packages, then install JNeRF.

(venv)$ git clone https://github.com/Jittor/JNeRF
(venv)$ cd JNeRF
(venv)$
(venv)$ PYTHONPATH= pip install open3d
(venv)$ pip install -r requirements.txt
(venv)$ pip install -e .
(venv)$

  4. This code has issues. Or rather, the jittor package does not appear to be well crafted and doesn't know how to find things that are in non-standard locations. Let's put in some links to help it along.

(venv)$ cd ../venv/bin
(venv)$
(venv)$ ln -s $EBROOTPYTHON/bin/python3.10-config
(venv)$
(venv)$ cd ..
(venv)$ ln -s $EBROOTPYTHON/include
(venv)$

You should now be able to import it. However, it appears that it can only be imported if a GPU is available. If you go onto a compute node that has a GPU you run into another problem, as jittor insists upon downloading and trying to install NCCL, even though the NCCL module is loaded. Given that compute nodes don't have internet access this is going to fail. You'll need to figure out how to convince jittor to not install NCCL.

"

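Before looking for a Jittor-side switch, it may help to confirm on the GPU node that the NCCL from the loaded module is actually visible in the environment. The following is only a minimal sketch of such a check and is not part of Jittor: find_nccl_lib is a hypothetical helper name, and it assumes the nccl module exposes its library directory through LD_LIBRARY_PATH, which may not hold on every cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Jittor): print the first libnccl.so* found
# in a colon-separated directory list. Defaults to LD_LIBRARY_PATH, on the
# assumption that "module load nccl" adds the NCCL library directory to it.
find_nccl_lib() {
    local search_path="${1:-${LD_LIBRARY_PATH:-}}"
    local dir lib
    local IFS=':'                       # split the list on colons only
    for dir in $search_path; do
        [ -d "$dir" ] || continue       # skip empty or missing entries
        for lib in "$dir"/libnccl.so*; do
            if [ -e "$lib" ]; then      # an unmatched glob stays literal
                printf '%s\n' "$lib"
                return 0
            fi
        done
    done
    return 1
}

# Example: run after "module load nccl" and before starting Python.
if lib=$(find_nccl_lib); then
    echo "NCCL visible at: $lib"
else
    echo "no libnccl.so found on LD_LIBRARY_PATH" >&2
fi
```

If the library is found and import jittor still tries to download NCCL, Jittor is probably not searching these directories at all. Some Jittor versions appear to read environment variables such as use_nccl during build setup. That is an assumption this thread does not confirm, so check compile_extern.py in the installed version before relying on it.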