
[Solved] Failed to build tinycudann; Could not build wheels for tinycudann; Could not find filesystem; xxx.so.xx no such file or directory #337

QXmX29 opened this issue Jul 12, 2023 · 9 comments


QXmX29 commented Jul 12, 2023

My problems have been solved!

If you have met problems similar to the ones below, maybe my experience can help you out~

Problem-Cause-Solution

Basic problem: Failed to build tinycudann and Could not build wheels for tinycudann ...

Q1: ... fatal error: filesystem: 没有那个文件或目录(no such file or directory)

  • Check your gcc version using gcc -v or gcc --version. These problems arise from a gcc version that is too old (< 8), since #include <filesystem> needs C++17 support. Here I referred to a CSDN blog about filesystem & gcc and Stack Overflow.
  • Solution: upgrade gcc to version >= 8. e.g. For me, gcc=8.5.0 worked. (A quick check is sketched right after this list.)
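
A quick way to confirm the toolchain can see <filesystem> (my own untested sketch, not from the original post; the temp file path is arbitrary):

# Print the compiler version; the major version should be 8 or higher
gcc --version
# Compile a one-line program that includes <filesystem> under C++17
printf '#include <filesystem>\nint main() { return 0; }\n' > /tmp/fs_check.cpp
g++ -std=c++17 /tmp/fs_check.cpp -o /tmp/fs_check && echo "filesystem header found"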

Q2: After upgrading gcc, I met an error like this: /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus: error while loading shared libraries: libmpfr.so.1: cannot open shared object file: No such file or directory

  • Check by using the command ldd <path/to/cc1plus>, where you should replace <path/to/cc1plus> with the correct path from your error message. e.g. For the example above, it should be ldd /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus.
  • You will probably find a line like libmpfr.so.1 => not found. Check the output carefully and you may find another line of the form xxx.so => <a/full/path>. Copy the correct path and check it if necessary using cd <that/path> and ls (remember to cd back afterwards!).
  • Add that path (or paths) to the environment variable LD_LIBRARY_PATH: export LD_LIBRARY_PATH=<path>${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}. (I work on a cluster; if you don't, just add it/them to your shell's environment variables.) A combined sketch of this check-and-fix flow is shown right after this list.
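
A minimal sketch of the whole check-and-fix flow for Q2 (the cc1plus path is the one from my error message and the mpfr directory is the one I ended up using; replace both with whatever ldd reports on your system):

# 1. List the shared libraries cc1plus needs and show only the missing ones
ldd /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus | grep "not found"
# 2. Look inside a directory that should contain the missing library and confirm it is there
ls /mnt/lustre/share/gcc/mpfr-2.4.2/lib
# 3. Prepend that directory to LD_LIBRARY_PATH so the dynamic loader can find the library
export LD_LIBRARY_PATH=/mnt/lustre/share/gcc/mpfr-2.4.2/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}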

The Original Description (modified a little)

Background

I just wanted to try threestudio, so I followed its directions and installed pytorch before installing the requirements. Its requirements.txt contains "git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch", but I failed at this step.
I had to install tiny-cuda-nn on the cluster. By setting environment variables (export xxx=xxx), I set cuda=11.3 and gcc=7.5.0. These settings used to work, but I don't know why they failed after I reinstalled conda and reset the environments... Or maybe it is because I killed a terminal by closing VS Code (at that time I had successfully built the PyTorch extension for tiny-cuda-nn)?
I had no choice but to delete the folder, clone it again, reinstall conda and reset all the environments... But I just got errors again and again! Could someone help me out? Pleeeease!
Besides, I used srun to run the process:

# tmp.py: install tiny-cuda-nn's torch bindings via pip's internal API
from pip._internal import main
# main(['install', '-r', 'requirements.txt'])
main(['install', 'git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch'])

In the terminal (I hid the partition name):
srun -p xxx --gres=gpu:0 --ntasks-per-node=1 python tmp.py
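
(As the pip warning in the log below also suggests, the same install can be run through python -m pip instead of calling pip's internal API from a script; this is just an untested equivalent of the srun line above:)

# Same install, but via the supported "python -m pip" entry point (partition still hidden)
srun -p xxx --gres=gpu:0 --ntasks-per-node=1 python -m pip install "git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch"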

My settings:

python=3.9.16
cuda=11.3
gcc=7.5.0
torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1
pip=23.1.2
setuptools=67.8.0
wheel=0.38.4

Important Outputs (I think)

  1. /tmp/pip-req-build-f8bpzkp0/dependencies/json/json.hpp:3954:14: fatal error: filesystem: 没有那个文件或目录 (No such file or directory)
  2. error: command '/mnt/petrelfs/share/cuda-11.3/bin/nvcc' failed with exit code 1
  3. note: This error originates from a subprocess, and is likely not a problem with pip.
  4. ERROR: Failed building wheel for tinycudann
  5. Failed to build tinycudann
  6. ERROR: Could not build wheels for tinycudann, which is required to install pyproject.toml-based projects

What I've tried

  1. Upgrade setuptools & wheel: it didn't work.
  2. Change the gcc version: that didn't work either, whether I changed to a higher or lower version I could access. The "filesystem" error disappeared but another error appeared... (If you are curious about that, I'd be glad to retry and show the detailed results.)
  3. Change to cuda=11.8, torch=2.0.1 and so on: failed (I don't remember exactly why).
  4. Rebuild the environments and even reinstall conda: I had at first created several environments using conda and virtualenv, but I deleted them all after hitting "Segmentation fault" when running threestudio's launch.py (that would be another complicated case for me!). I expected to smoothly build a new environment from a clean state, but obviously I failed!!!
  5. Maybe other methods? But I've forgotten.

Below is the whole output (except that of srun). (I think I've hidden all my personal information.)

WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Collecting git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
  Cloning https://github.com/NVlabs/tiny-cuda-nn/ to /tmp/pip-req-build-f8bpzkp0
  Running command git clone --quiet https://github.com/NVlabs/tiny-cuda-nn/ /tmp/pip-req-build-f8bpzkp0
  Resolved https://github.com/NVlabs/tiny-cuda-nn/ to commit <some sort of 123 and abc, I don't know whether it's related to my information or not so I just hided them>
  Running command git submodule update --init --recursive -q
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: tinycudann
  Building wheel for tinycudann (setup.py): started
  Building wheel for tinycudann (setup.py): finished with status 'error'
  error: subprocess-exited-with-error
  
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [47 lines of output]
      Building PyTorch extension for tiny-cuda-nn version 1.7
      Obtained compute capability 80 from PyTorch
      nvcc: NVIDIA (R) Cuda compiler driver
      Copyright (c) 2005-2021 NVIDIA Corporation
      Built on Mon_May__3_19:15:13_PDT_2021
      Cuda compilation tools, release 11.3, V11.3.109
      Build cuda_11.3.r11.3/compiler.29920130_0
      Detected CUDA version 11.3
      Targeting C++ standard 17
      running bdist_wheel
      /mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/utils/cpp_extension.py:411: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
        warnings.warn(msg.format('we could not find ninja.'))
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-39
      creating build/lib.linux-x86_64-cpython-39/tinycudann
      copying tinycudann/__init__.py -> build/lib.linux-x86_64-cpython-39/tinycudann
      copying tinycudann/modules.py -> build/lib.linux-x86_64-cpython-39/tinycudann
      running egg_info
      creating tinycudann.egg-info
      writing tinycudann.egg-info/PKG-INFO
      writing dependency_links to tinycudann.egg-info/dependency_links.txt
      writing top-level names to tinycudann.egg-info/top_level.txt
      writing manifest file 'tinycudann.egg-info/SOURCES.txt'
      reading manifest file 'tinycudann.egg-info/SOURCES.txt'
      writing manifest file 'tinycudann.egg-info/SOURCES.txt'
      copying tinycudann/bindings.cpp -> build/lib.linux-x86_64-cpython-39/tinycudann
      running build_ext
      building 'tinycudann_bindings._80_C' extension
      creating dependencies
      creating dependencies/fmt
      creating dependencies/fmt/src
      creating src
      creating build/temp.linux-x86_64-cpython-39
      creating build/temp.linux-x86_64-cpython-39/tinycudann
      gcc -pthread -B /mnt/petrelfs/xxx/anaconda3/envs/studio/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /mnt/petrelfs/xxx/anaconda3/envs/studio/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include -fPIC -O2 -isystem /mnt/petrelfs/xxx/anaconda3/envs/studio/include -fPIC -I/tmp/pip-req-build-f8bpzkp0/include -I/tmp/pip-req-build-f8bpzkp0/dependencies -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/fmt/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/TH -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/THC -I/mnt/petrelfs/share/cuda-11.3/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include/python3.9 -c ../../dependencies/fmt/src/format.cc -o build/temp.linux-x86_64-cpython-39/../../dependencies/fmt/src/format.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
      gcc -pthread -B /mnt/petrelfs/xxx/anaconda3/envs/studio/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /mnt/petrelfs/xxx/anaconda3/envs/studio/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include -fPIC -O2 -isystem /mnt/petrelfs/xxx/anaconda3/envs/studio/include -fPIC -I/tmp/pip-req-build-f8bpzkp0/include -I/tmp/pip-req-build-f8bpzkp0/dependencies -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/fmt/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/TH -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/THC -I/mnt/petrelfs/share/cuda-11.3/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include/python3.9 -c ../../dependencies/fmt/src/os.cc -o build/temp.linux-x86_64-cpython-39/../../dependencies/fmt/src/os.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
      /mnt/petrelfs/share/cuda-11.3/bin/nvcc -I/tmp/pip-req-build-f8bpzkp0/include -I/tmp/pip-req-build-f8bpzkp0/dependencies -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/fmt/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/TH -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/THC -I/mnt/petrelfs/share/cuda-11.3/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include/python3.9 -c ../../src/common_host.cu -o build/temp.linux-x86_64-cpython-39/../../src/common_host.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
      In file included from /tmp/pip-req-build-f8bpzkp0/include/tiny-cuda-nn/cpp_api.h:32:0,
                       from /tmp/pip-req-build-f8bpzkp0/include/tiny-cuda-nn/common_host.h:33,
                       from ../../src/common_host.cu:31:
      /tmp/pip-req-build-f8bpzkp0/dependencies/json/json.hpp:3954:14: fatal error: filesystem: 没有那个文件或目录 (No such file or directory)
           #include <filesystem>
                    ^~~~~~~~~~~~
      compilation terminated.
      error: command '/mnt/petrelfs/share/cuda-11.3/bin/nvcc' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for tinycudann
  Running setup.py clean for tinycudann
Failed to build tinycudann
ERROR: Could not build wheels for tinycudann, which is required to install pyproject.toml-based projects

QXmX29 commented Jul 12, 2023

Cheers!!! At least now I've temporarily solved the problem.

Background & Solution

  1. I changed to cuda=11.8, gcc=8.5.0 and upgraded torch, torchaudio and torchvision using pip install --upgrade, so: torch=2.0.1, torchaudio=2.0.2, torchvision=0.15.2. (I'm not quite sure whether this is one of the key steps, but I did change the versions, and this may have had some impact on the subsequent steps.)
  2. Then I ran the pip install ... process again. This time there was no "filesystem" error, but I met a new error that was worth my attention:
/mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus: error while loading shared libraries: libmpfr.so.1: cannot open shared object file: No such file or directory
  3. To solve it, I referred to this website and added paths to the environment variable LD_LIBRARY_PATH according to the new errors.
  4. Ran pip install ... again, hit a new but similar error, so I added the new path in the same way and ran once more. This time I successfully installed tiny-cuda-nn!
  5. My environment variable settings:
export EXTRA_LIB_HOME=/mnt/petrelfs/share/gcc/mpc-0.8.1/lib:/mnt/lustre/share/gcc/mpfr-2.4.2/lib:/mnt/lustre/share/gcc/gmp-4.3.2/lib
export CUDA_HOME=/mnt/petrelfs/share/cuda-11.8
export GCC_HOME=/mnt/petrelfs/share/gcc/gcc-8.5.0
export LD_LIBRARY_PATH=${EXTRA_LIB_HOME}:${GCC_HOME}/lib64:${CUDA_HOME}/lib64:${CUDA_HOME}/extras/CUPTI/lib64

Here EXTRA_LIB_HOME is created just to make it easy for me to understand the structure of LD_LIBRARY_PATH.
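
A quick way to verify these settings took effect (just my own sketch): re-run ldd on the cc1plus that failed and make sure nothing is still missing.

# Should print the new value, with the extra lib directories at the front
echo $LD_LIBRARY_PATH
# Should print nothing once every shared library is resolved
ldd /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus | grep "not found"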


Attention
Here CUDA_HOME and GCC_HOME respectively determine the versions of cuda and gcc, while LD_LIBRARY_PATH lists the library search paths.
EXTRA_LIB_HOME is something I created myself; it fills in the missing links to some shared libraries -- I referred to this website.
To decide whether and how to set EXTRA_LIB_HOME, see whether you get an error similar to this:

/mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus: error while loading shared libraries: libmpfr.so.6: cannot open shared object file: No such file or directory

If you do meet a similar problem but cannot figure out what that website says, you can read the steps below.
According to the website I mentioned above, just use the ldd command to list the linked libraries:

ldd /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus

Then you may find some xxx.so => not found (for example, libmpfr.so.1).
As for me, I found two xxx.so entries in the list: one pointed to not found while the other pointed to an exact file path like /mnt/petrelfs/share/gcc/mpc-0.8.1/lib/xxx.so (cd /mnt/petrelfs/share/gcc/mpc-0.8.1/lib/ and ls, and you will find the file xxx.so there).

Forgive me for my looooong comment; for a novice like me it's really sad to read a solution that is simple and effective but still puzzles me... T^T

@ashishd

ashishd commented Jul 12, 2023

Hi There,
Your main issue is that you are trying to install this on a cluster instead of a personal workstation. You have two options.

  1. Create a singularity image (by pulling a docker image of tiny-cuda-nn, if a docker image exists). If you are unfamiliar with developing docker/singularity recipes, this could be a steep learning curve.
  2. Please follow the instructions for your HPC cluster to load/swap modules. That is why you get "shared libraries not found" errors. When you load the correct modules, the corresponding paths will be added to your profile automatically. You have to do this every time you log in to your compute nodes (see the sketch right after this list).
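
For example, if your cluster uses Environment Modules or Lmod, it could look roughly like this (module names are hypothetical and site-specific):

# See which compiler/CUDA modules the cluster provides
module avail gcc cuda
# Load a sufficiently new gcc and a matching CUDA toolkit before running pip
module load gcc/8.5.0 cuda/11.8
# Confirm what is now on PATH
gcc --version && nvcc --version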

Perhaps you might want to create your conda environment and do the installation in interactive mode (using salloc). This will help you identify where your installation problems are.

Hope this helps.
Ash.


QXmX29 commented Jul 13, 2023

Hi There, Your main issue is that you are trying to install this on a cluster instead of a personal workstation. You have two options.

  1. Create a singularity image (by pulling a docker image of tiny-cuda-nn, if a docker image exists). If you are unfamiliar with developing docker/singularity recipes, this could be a steep learning curve.
  2. Please follow the instructions for your HPC cluster to load/swap modules. That is why you get "shared libraries not found" errors. When you load the correct modules, the corresponding paths will be added to your profile automatically. You have to do this every time you log in to your compute nodes.

Perhaps you might want to create your conda environment and do the installation in interactive mode (using salloc). This will help you identify where your installation problems are.

Hope this helps. Ash.

Thank you very much for your thorough answer!!! I've solved the problem and learned a lot from your comment!!! $\hat{0} _{\ \nabla} \ \hat{0}$

QXmX29 changed the title from "Failed to build tinycudann; Could not build wheels for tinycudann; Could not find filesystem" to "[Solved] Failed to build tinycudann; Could not build wheels for tinycudann; Could not find filesystem; xxx.so.xx no such file or directory" on Jul 13, 2023
@yhyang-myron

Thank you very much for sharing! This method successfully solved my problem.

@yhyang-myron

I found that directly using these commands can also work:
export LD_LIBRARY_PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/bin:$PATH


QXmX29 commented Jul 17, 2023

I found that directly using these commands can also work: export LD_LIBRARY_PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} export PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/bin:$PATH

Congratulations! 🎉 For me, I had already added these paths before but still met those terrifying problems, so I had to try other methods to solve them. 😜


QXmX29 commented Jul 17, 2023

I found that directly using these commands can also work: export LD_LIBRARY_PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} export PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/bin:$PATH

Oh, the export command only takes effect temporarily in a single bash session (terminal). Try adding it to the end of your .bashrc so you won't be troubled every time you open a new bash; a sketch follows below.
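
A minimal sketch of what that could look like, reusing the gcc-8.5.0 paths from the comment above (adjust them to your own system):

# Append the exports to ~/.bashrc so every new shell picks them up
cat >> ~/.bashrc << 'EOF'
export LD_LIBRARY_PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/bin:$PATH
EOF
# Reload the current shell so the change takes effect immediately
source ~/.bashrc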

@yhyang-myron

Thank you so much! When I changed to a different version, I encountered the above problem again and applied your method to solve it.

@garrisonz

I encountered the same error about No such file or directory, e.g.

/data/zhangyupeng/w/tiny-cuda-nn/dependencies/json/json.hpp:3954:14: fatal error: filesystem: No such file or directory

My solution:
Upgrade g++ and make sure g++ / gcc / c++ all point to version 9.
I think version 8 or above should be fine according to this project's README.md; a quick check is sketched below.
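
A quick sanity check that all three front ends report the same version (just a sketch; how you switch versions depends on your system, e.g. update-alternatives or your cluster's module system):

# All three should report the same major version (>= 8 per the README)
gcc --version | head -n 1
g++ --version | head -n 1
c++ --version | head -n 1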
