torch version #5
Comments
I have tested it on PyTorch 1.3 + CUDA 10, and it runs successfully. |
I have used PyTorch 1.3.1 with CUDA 10.2. It seems that the PyTorch version is crucial. (See #1) |
@rosinality I installed PyTorch 1.3.1, torchvision 0.4.2, and CUDA 10.1, and got "ImportError: /tmp/torch_extensions/fused/fused.so: undefined symbol: _ZN3c1011CPUTensorIdEv". Your torchvision is 0.4.2, right? |
Could you retry after removing the /tmp/torch_extensions directory? |
Sorry, I don't know how to remove /tmp/torch_extensions, and I am not familiar with PyTorch C++ extensions. Could you explain more? |
I suspect it is trying to use cached binaries even after CUDA updates. |
Now I have updated CUDA to 10.2 and added CUDA to my .bashrc file, but the same error occurred. Do you have any suggestions? Should I reboot the machine? |
I don't think you need to reboot after CUDA updates. Could you post full error logs? |
Traceback (most recent call last): |
What is your gcc version? Mine is 5.4, and I am hesitant to update to gcc 7.3. |
I'm using gcc 5.4. Did you try removing the cached binaries in /tmp/torch_extensions? Then could you show me the output of
ldd /tmp/torch_extensions/fused/fused.so |
It seems there are cases where PyTorch can't resolve the CUDA shared libraries (NVIDIAGameWorks/kaolin#30), but I don't know how to fix that. If you use Anaconda, you could create a new virtual env, install PyTorch 1.3 and cudatoolkit 10.1 in it, and try again. |
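The `ldd` check requested above can be scripted. The following is a minimal sketch, assuming the usual glibc `ldd` output in which an unresolved dependency is printed as `libname => not found`. The helper names `unresolved_libs` and `check_extension`, and the sample output, are illustrative and do not come from the thread.

```python
import subprocess


def unresolved_libs(ldd_output):
    """Return the names of shared libraries that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "\tlibfoo.so.1 => not found"
        if "=>" in line and "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_extension(so_path="/tmp/torch_extensions/fused/fused.so"):
    """Run ldd on the built extension and list its unresolved dependencies."""
    result = subprocess.run(["ldd", so_path], capture_output=True, text=True)
    return unresolved_libs(result.stdout)


# Illustrative ldd output (made up for this example, not taken from the thread).
sample = (
    "\tlinux-vdso.so.1 (0x00007ffd3a5f2000)\n"
    "\tlibcudart.so.10.1 => not found\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a000000)\n"
)
print(unresolved_libs(sample))  # ['libcudart.so.10.1']
```

An empty list only means that every library was located; it does not rule out an undefined symbol inside a stale cached binary, which is the case reported in this thread.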
You are right: after running 'rm -rf /tmp/torch_extensions', the error disappeared. Thank you so much.
So the case where PyTorch couldn't resolve the CUDA shared libraries can be ruled out here.
|
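For readers who prefer not to delete the whole cache directory, the same cleanup can be limited to one extension. This is a sketch under stated assumptions: the two default roots are the paths that appear in this thread (`/tmp/torch_extensions` and a `.cache/torch_extensions` directory under the user's home), `TORCH_EXTENSIONS_DIR` is the environment variable PyTorch reads to override its build root, and the function names are hypothetical.

```python
import os
import shutil
from pathlib import Path


def candidate_roots():
    """Directories where JIT-built extensions may be cached."""
    roots = []
    # TORCH_EXTENSIONS_DIR, when set, overrides PyTorch's default build root.
    env_root = os.environ.get("TORCH_EXTENSIONS_DIR")
    if env_root:
        roots.append(Path(env_root))
    roots.append(Path("/tmp/torch_extensions"))  # path reported in this thread
    roots.append(Path.home() / ".cache" / "torch_extensions")  # also reported here
    return roots


def clear_cached_extension(name="fused", roots=None):
    """Delete every cached build directory called `name`; return what was removed."""
    removed = []
    for root in candidate_roots() if roots is None else roots:
        root = Path(root)
        if not root.is_dir():
            continue
        # Collect matches first so the tree is not modified while it is walked.
        # The search is recursive in case the build directory is nested in a subfolder.
        matches = [p for p in root.rglob(name) if p.is_dir()]
        for path in matches:
            if path.is_dir():  # may already be gone if a parent match was removed
                shutil.rmtree(path, ignore_errors=True)
                removed.append(str(path))
    return removed
```

Calling `clear_cached_extension()` before importing the model code forces the `fused` extension to be rebuilt on the next run, while other cached extensions are left in place.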
I have the same problem, but I was unable to solve it by removing /tmp/torch_extensions. Did you do anything else to solve it? @qingzi02010 |
No. I used the recommended version of torch; once I ran 'rm -rf /tmp/torch_extensions', the "ImportError: /tmp/torch_extensions/fused/fused.so: undefined symbol: _ZN3c1011CPUTensorIdEv" disappeared. |
I am using Python 3.7 from Anaconda. I don't know whether the problem is related to the Python version, but you can try it.
|
Yes, this method is correct. I tried it several times and it fixed the problem. (But I then got another problem...) Thank you for your reply! |
https://www.cnblogs.com/rainsoul/p/12162779.html |
Does anyone know which TensorFlow version to use? Neither tf 1.14 nor 1.15 (see the original stylegan2 repo) is compatible with CUDA 10.2. |
@kevinstan I use tf 1.15 on CUDA 10.2. It seems to run fine. |
something weird happens to me. when I try to train it from screen
I tried removing |
I am facing the same issue. Was this resolved? If so, how? |
@Harsha-Musunuri were you able to resolve this issue? |
@rosinality PyTorch 1.3 is not compatible with CUDA 10.2. Did you install it locally and build PyTorch from source? |
@denabazazian I don't remember the environment well. You can use a recent version of PyTorch. |
@denabazazian try this https://drive.google.com/file/d/1EaYl5IP0gBqjagX9mZfXr88l13eUzKay/view?usp=sharing |
I tried the conda env file to no avail. I'm using cuda 10.1 with pytorch 1.7.1. I failed to downgrade this to 1.3.1. I tried other pytorch versions but ran into other problems which when resolved ended back to this state:

CalledProcessError Traceback (most recent call last)
~/miniconda3/envs/dG/lib/python3.9/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
~/Documents/dG/alias-free-gan-pytorch/stylegan2/op/__init__.py in
~/Documents/dG/alias-free-gan-pytorch/stylegan2/op/fused_act.py in
~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, keep_intermediates)
~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, keep_intermediates)
~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in _write_ninja_file_and_build_library(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda)
~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in _run_ninja_build(build_directory, verbose, error_prefix)
RuntimeError: Error building extension 'fused':
[1/2] /usr/local/cuda-10.1/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/TH -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/mr/miniconda3/envs/dG/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -std=c++14 -c /home/mr/Documents/dG/alias-free-gan-pytorch/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o |
@MHRosenberg It is not a PyTorch version problem but a CUDA build environment problem. Check that you can build CUDA programs at all, or use the Dockerfile at https://github.com/rosinality/alias-free-gan-pytorch/blob/main/Dockerfile. |
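One way to check whether CUDA programs can be built at all, independently of PyTorch, is to compile a trivial kernel with `nvcc`. This is a rough sketch rather than the maintainer's procedure; the function name `can_build_cuda` and the kernel source are made up for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# A trivial kernel: just enough to exercise nvcc and its host compiler.
KERNEL_SOURCE = (
    "__global__ void noop() {}\n"
    "int main() { noop<<<1, 1>>>(); return 0; }\n"
)


def can_build_cuda(nvcc="nvcc"):
    """Return True or False depending on whether nvcc compiles a trivial kernel.

    Returns None when the nvcc executable cannot be found at all.
    """
    nvcc_path = shutil.which(nvcc)
    if nvcc_path is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "noop.cu"
        src.write_text(KERNEL_SOURCE)
        result = subprocess.run(
            [nvcc_path, "-c", str(src), "-o", str(Path(tmp) / "noop.o")],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            # Show the compiler error, e.g. an unsupported host gcc version.
            print(result.stderr)
        return result.returncode == 0


print(can_build_cuda())
```

A specific toolkit can be tested by passing its full path, for example `can_build_cuda("/usr/local/cuda-10.1/bin/nvcc")` for the compiler shown in the log above.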
Hi, I was working on the SAM code and I am getting an import error: ImportError: /root/.cache/torch_extensions/fused/fused.so: cannot open shared object file: No such file or directory |
I am also facing the same issue |
How do we install these versions on Google Colab? It tells me that no such version is available. |
Getting this same error: ...while running 2-year-old code in a
Deleting the @Harsha-Musunuri's GDrive link has now expired, whatever it was linking to. Anyone have any further tips for resolving this? |
Some errors occurred while compiling the code. Can you tell us the torch version and the rest of the software environment, such as CUDA, cuDNN, gcc, ninja, and re2c? Thank you!
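When reporting a build failure like this, it helps to include the same information from your own machine. The sketch below gathers it in one place; the function names are hypothetical, it assumes each tool accepts `--version`, and the PyTorch fields are filled in only when `torch` can be imported.

```python
import shutil
import subprocess


def tool_version(tool, flag="--version"):
    """Return the output of `<tool> --version`, or None if the tool is missing or fails."""
    path = shutil.which(tool)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, flag], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    output = (result.stdout or result.stderr).strip()
    return output or None


def environment_report(tools=("gcc", "nvcc", "ninja", "re2c")):
    """Collect version information for the build tools and, if installed, PyTorch."""
    report = {tool: tool_version(tool) for tool in tools}
    try:
        import torch

        report["torch"] = torch.__version__
        report["torch CUDA"] = torch.version.cuda  # CUDA version torch was built with
        report["cuDNN"] = torch.backends.cudnn.version()
    except ImportError:
        report["torch"] = None
    return report


for name, value in environment_report().items():
    print(f"{name}: {value}")
```

PyTorch also ships its own collector, which can be run with `python -m torch.utils.collect_env`.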