
Compiler interprets fmax in lltm_cuda_kernel.cu __device__ function as std::fmax #14

Closed
mmazeika opened this issue Jun 27, 2018 · 6 comments

@mmazeika commented Jun 27, 2018

I cloned the repository, and the CPU version compiles, but I get the following error when running `python setup.py install` in the `cuda` folder.

```
running install
running bdist_egg
running egg_info
creating lltm_cuda.egg-info
writing dependency_links to lltm_cuda.egg-info/dependency_links.txt
writing lltm_cuda.egg-info/PKG-INFO
writing top-level names to lltm_cuda.egg-info/top_level.txt
writing manifest file 'lltm_cuda.egg-info/SOURCES.txt'
reading manifest file 'lltm_cuda.egg-info/SOURCES.txt'
writing manifest file 'lltm_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'lltm_cuda' extension
creating build
creating build/temp.linux-x86_64-3.5
gcc -pthread -B /home/mantas/anaconda3/envs/pytorch04/compiler_compat -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include/TH -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.0/include -I/home/mantas/anaconda3/envs/pytorch04/include/python3.5m -c lltm_cuda.cpp -o build/temp.linux-x86_64-3.5/lltm_cuda.o -DTORCH_EXTENSION_NAME=lltm_cuda -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda-9.0/bin/nvcc -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include/TH -I/home/mantas/anaconda3/envs/pytorch04/lib/python3.5/site-packages/torch/lib/include/THC -I/usr/local/cuda-9.0/include -I/home/mantas/anaconda3/envs/pytorch04/include/python3.5m -c lltm_cuda_kernel.cu -o build/temp.linux-x86_64-3.5/lltm_cuda_kernel.o -DTORCH_EXTENSION_NAME=lltm_cuda --compiler-options '-fPIC' -std=c++11
lltm_cuda_kernel.cu(54): error: calling a host function("std::fmax<double, float> ") from a global function("_NV_ANON_NAMESPACE::lltm_cuda_forward_kernel ") is not allowed

lltm_cuda_kernel.cu(54): error: identifier "std::fmax<double, float> " is undefined in device code

2 errors detected in the compilation of "/tmp/tmpxft_00002819_00000000-6_lltm_cuda_kernel.cpp1.ii".
error: command '/usr/local/cuda-9.0/bin/nvcc' failed with exit status 1
```

I'm using PyTorch 0.4.0 installed via conda a few weeks ago, Python 3.5, CUDA 9.0, cuDNN 7.1.4, and GCC 6.4.0.

@goldsborough (Contributor)
Hmm, it's interesting that I didn't notice this. Could you do me a favor and see if it goes away if you change `fmax` to `::fmax`, so that it uses the global CUDA version?
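
For reference, the change I have in mind looks roughly like this (a sketch, assuming the `elu` helper matches the one in the tutorial; I haven't tested it):

```cuda
// Qualifying the calls with :: makes overload resolution pick the CUDA
// math-library fmax/fmin (which have __device__ overloads) instead of the
// host-only std::fmax<double, float> template that <cmath> provides for
// mixed double/float arguments.
template <typename scalar_t>
__device__ __forceinline__ scalar_t elu(scalar_t z, scalar_t alpha = 1.0) {
  return ::fmax(0.0, z) + ::fmin(0.0, alpha * (exp(z) - 1.0));
}
```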

@mmazeika (Author) commented Jun 27, 2018

I changed `return fmax(0.0, z) + fmin(0.0, alpha * (exp(z) - 1.0));` in the `elu` function to `return ::fmax(0.0, z) + fmin(0.0, alpha * (exp(z) - 1.0));`, but the error message didn't change, and it didn't start complaining about the `fmin` either.

I also tried changing line 54 in the original code from `candidate_cell[index] = elu(gates[gates_row + 2 * state_size + column]);` to `candidate_cell[index] = sigmoid(gates[gates_row + 2 * state_size + column]);` so as to avoid calling `fmax` and `fmin` altogether, and I got pages of errors as a result. Two errors that repeat several times in the printout are

```
error: wrong number of template arguments (5, should be 2) return __and_<__not_<is_same<tuple<_Elements...>
```

and

```
error: mismatched argument pack lengths while expanding ‘std::is_constructible<_Elements, _UElements&&>’ return __and_<is_constructible<_Elements, _UElements&&>...>::value;
```

I've attached the printout in a text file to avoid clutter. `torch.cuda.is_available()` returns True in Python.

error.txt

@mmazeika (Author)

Ah, I see. I was using a different Python environment from the one I normally use, so when I actually run `test = torch.FloatTensor([1]).cuda()`, I get the error

```
Found GPU0 GeForce GTX 770M which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
```

I'll let you know whether installing PyTorch from source fixes the problem.
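
For anyone else who hits this, the check that surfaced it for me looked roughly like the sketch below (`get_device_name` and `get_device_capability` are standard `torch.cuda` calls):

```python
import torch

# is_available() returned True for me even though the prebuilt binaries
# no longer support this GPU's architecture, so it isn't enough on its own.
print(torch.cuda.is_available())            # True
print(torch.cuda.get_device_name(0))        # e.g. "GeForce GTX 770M"
print(torch.cuda.get_device_capability(0))  # e.g. (3, 0)

# Actually allocating a tensor on the GPU is what triggers the
# "PyTorch no longer supports this GPU" message.
test = torch.FloatTensor([1]).cuda()
```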

@goldsborough (Contributor)

Sounds good, let me know.

@mmazeika (Author)

Yep, that did the trick.

@YiwenShaoStephen

Hi, I'm hitting exactly the same issue when trying to compile it, and I've checked that my PyTorch version is up to date (0.4.1) and my CUDA version is 9.1.

@goldsborough mentioned this issue Oct 9, 2018