
Compiler interprets fmax in lltm_cuda_kernel.cu __device__ function as std::fmax #14

Closed
mmazeika opened this issue Jun 27, 2018 · 6 comments

@mmazeika commented Jun 27, 2018

I cloned the repository, and the CPU version compiles, but I get the following error when running `python setup.py install` in the `cuda` folder.

```
running install
running bdist_egg
running egg_info
creating lltm_cuda.egg-info
writing dependency_links to lltm_cuda.egg-info/dependency_links.txt
writing lltm_cuda.egg-info/PKG-INFO
writing top-level names to lltm_cuda.egg-info/top_level.txt
writing manifest file 'lltm_cuda.egg-info/SOURCES.txt'
reading manifest file 'lltm_cuda.egg-info/SOURCES.txt'
writing manifest file 'lltm_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'lltm_cuda' extension
creating build
creating build/temp.linux-x86_64-3.5
gcc -pthread -B /home/mantas/anaconda3/envs/pytorch04/compiler_compat -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include/TH -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.0/include -I/home/mantas/anaconda3/envs/pytorch04/include/python3.5m -c lltm_cuda.cpp -o build/temp.linux-x86_64-3.5/lltm_cuda.o -DTORCH_EXTENSION_NAME=lltm_cuda -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda-9.0/bin/nvcc -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include/TH -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.0/include -I/home/mantas/anaconda3/envs/pytorch04/include/python3.5m -c lltm_cuda_kernel.cu -o build/temp.linux-x86_64-3.5/lltm_cuda_kernel.o -DTORCH_EXTENSION_NAME=lltm_cuda --compiler-options '-fPIC' -std=c++11
lltm_cuda_kernel.cu(54): error: calling a host function("std::fmax<double, float> ") from a global function("_NV_ANON_NAMESPACE::lltm_cuda_forward_kernel ") is not allowed

lltm_cuda_kernel.cu(54): error: identifier "std::fmax<double, float> " is undefined in device code

2 errors detected in the compilation of "/tmp/tmpxft_00002819_00000000-6_lltm_cuda_kernel.cpp1.ii".
error: command '/usr/local/cuda-9.0/bin/nvcc' failed with exit status 1
```

I'm using PyTorch 0.4.0 installed via conda a few weeks ago, Python 3.5, CUDA 9.0, cuDNN 7.1.4, and GCC 6.4.0.

@goldsborough (Contributor)
Hmm, it's interesting that I didn't notice this. Could you do me a favor and see if it goes away if you change `fmax` to `::fmax`, so that it uses the global CUDA version?
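
For reference, the change I have in mind looks roughly like this (a sketch, assuming the `elu` helper matches the one in the tutorial; I haven't tested it):

```cuda
// Qualifying the calls with :: makes overload resolution pick the CUDA
// math-library fmax/fmin (which have __device__ overloads) instead of the
// host-only std::fmax<double, float> template that <cmath> provides for
// mixed double/float arguments.
template <typename scalar_t>
__device__ __forceinline__ scalar_t elu(scalar_t z, scalar_t alpha = 1.0) {
  return ::fmax(0.0, z) + ::fmin(0.0, alpha * (exp(z) - 1.0));
}
```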

@mmazeika (Author) commented Jun 27, 2018

I changed `return fmax(0.0, z) + fmin(0.0, alpha * (exp(z) - 1.0));` in the `elu` function to `return ::fmax(0.0, z) + fmin(0.0, alpha * (exp(z) - 1.0));`, but the error message didn't change, and it didn't start complaining about the `fmin` either.

I also tried changing line 54 in the original code from `candidate_cell[index] = elu(gates[gates_row + 2 * state_size + column]);` to `candidate_cell[index] = sigmoid(gates[gates_row + 2 * state_size + column]);` so as to avoid calling `fmax` and `fmin` altogether, and I got pages of errors as a result. Two errors that repeat several times in the printout are

```
error: wrong number of template arguments (5, should be 2) return __and_<__not_<is_same<tuple<_Elements...>
```

and

```
error: mismatched argument pack lengths while expanding ‘std::is_constructible<_Elements, _UElements&&>’ return __and_<is_constructible<_Elements, _UElements&&>...>::value;
```

I've attached the printout in a text file to avoid clutter. `torch.cuda.is_available()` returns True in Python.

error.txt

@mmazeika (Author)

Ah, I see. I was using a different Python environment from the one I normally use, so when I actually run `test = torch.FloatTensor([1]).cuda()`, I get the error

```
Found GPU0 GeForce GTX 770M which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
```

I'll let you know whether installing PyTorch from source fixes the problem.
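
For anyone else who hits this, the check that surfaced it for me looked roughly like the sketch below (`get_device_name` and `get_device_capability` are standard `torch.cuda` calls):

```python
import torch

# is_available() returned True for me even though the prebuilt binaries
# no longer support this GPU's architecture, so it isn't enough on its own.
print(torch.cuda.is_available())            # True
print(torch.cuda.get_device_name(0))        # e.g. "GeForce GTX 770M"
print(torch.cuda.get_device_capability(0))  # e.g. (3, 0)

# Actually allocating a tensor on the GPU is what triggers the
# "PyTorch no longer supports this GPU" message.
test = torch.FloatTensor([1]).cuda()
```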

@goldsborough (Contributor)

Sounds good, let me know.

@mmazeika (Author)

Yep, that did the trick.

@YiwenShaoStephen

Hi, I'm hitting exactly the same issue when trying to compile it, and I've checked that my PyTorch version is up to date (0.4.1) and my CUDA version is 9.1.

@goldsborough mentioned this issue Oct 9, 2018