fatal error: cuda_runtime.h: No such file or directory #131

gr8Adakron · 2018-07-05T06:54:47Z

I built and installed NCCL successfully and added all the environment paths. Now I am trying to compile this test program:


#include <nccl.h>

typedef struct {
  double* sendBuff;
  double* recvBuff;
  int size;
  cudaStream_t stream;
} PerThreadData;

int main(int argc, char* argv[])
{
  int nGPUs;
  cudaGetDeviceCount(&nGPUs);
  ncclComm_t* comms = (ncclComm_t*)malloc(sizeof(ncclComm_t)*nGPUs);
  ncclCommInitAll(comms, nGPUs); // initialize communicator
                                // One communicator per process

  PerThreadData* data;

  ... // Allocate data and issue work to each GPU's
      // perDevStream to populate the sendBuffs.

  for(int i=0; i<nGPUs; ++i) {
    cudaSetDevice(i); // Correct device must be set
                      // prior to each collective call.
    ncclAllReduce(data[i].sendBuff, data[i].recvBuff, size,
        ncclDouble, ncclSum, comms[i], data[i].stream);
  }

  ... // Issue work into data[*].stream to consume buffers, etc.
}

and it keeps giving me this error:

$ g++ nccl_temp.cpp

In file included from nccl_temp.cpp:1:0:
/usr/local/include/nccl.h:10:26: fatal error: cuda_runtime.h: No such file or directory
compilation terminated.

When I run locate cuda_runtime.h, it returns this:
/usr/local/cuda-9.0/targets/x86_64-linux/include/cuda_runtime.h

This is my LD_LIBRARY_PATH variable:
LD_LIBRARY_PATH=:./build/lib:/home/afzal/nickel/lib:/usr/local/cuda/lib64:/usr/local/cuda-9.0/targets/x86_64-linux/include/

This is my PATH variable:
PATH=/home/afzal/.virtualenvs/tensorflow_py36/bin:/home/afzal/bin:/home/afzal/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin:/home/afzal/.fzf/bin

Any help? I am trying to install tf-serving, but that also fails with errors about the NCCL libraries, so I thought I would first solve the NCCL issues, hoping that would eventually fix the tf-serving problems as well.

-Thanks.

gr8Adakron · 2018-07-05T08:32:59Z

@kerrmudgeon @tfogal @dholt @jaredcasper I am stuck on this installation process. Can anyone help me out, please?

-thanks in advance

gr8Adakron · 2018-07-05T08:45:53Z

And I am getting these errors in tf-serving installation:


nccl_manager.cc:(.text._ZN10tensorflow11NcclManager18LoopKernelLaunchesEPNS0_10NcclStreamE+0x386): undefined reference to `ncclBcast'

nccl_manager.cc:(.text._ZN10tensorflow11NcclManager15GetCommunicatorEPNS0_10CollectiveE+0x53a): undefined reference to `ncclCommInitAll'

nccl_manager.cc:(.text._ZN10tensorflow11NcclManager15GetCommunicatorEPNS0_10CollectiveE+0xf21): undefined reference to `ncclGetErrorString'

kwen2501 · 2018-07-05T14:13:00Z

Hi, you need to use the CUDA compiler nvcc instead of g++ to compile your CUDA program. See, for example: https://devblogs.nvidia.com/easy-introduction-cuda-c-and-c/

gr8Adakron · 2018-07-05T20:42:24Z

Thanks! Somehow I solved it by running this command:

sudo apt-get install nvidia-cuda-toolkit

As far as I knew, everything was already installed, so I don't know why nvcc was not present.

But I am still getting the tensorflow-serving error. Can you help me out with this?

Everything seems to be in place, yet after building and compiling, the final command still reports all the undefined references, which is weird and is giving me a headache.

undefined reference to `ncclCommInitAll'

Please help me.

sjeaugey · 2018-09-26T18:17:53Z

It would look like -lnccl is missing from the link command.
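
For the standalone test program earlier in the thread, the same idea can be sketched as a small shell helper that assembles the include, library-path, and link flags for g++. This is only an illustration: the function name nccl_gxx_flags is hypothetical, it assumes the CUDA toolkit sits under one root with include/ and lib64/ subdirectories, and it assumes libnccl is already on the default linker path.

```shell
# Hypothetical helper: print g++ flags for building an NCCL program.
# Assumes CUDA headers are in ROOT/include and libraries in ROOT/lib64.
nccl_gxx_flags() {
  root="${1:-/usr/local/cuda}"
  # -I lets the preprocessor find cuda_runtime.h;
  # -lnccl and -lcudart resolve the nccl* and cuda* symbols at link time.
  printf '%s\n' "-I$root/include -L$root/lib64 -lnccl -lcudart"
}

# Usage (not run here):
#   g++ nccl_temp.cpp -o nccl_temp $(nccl_gxx_flags /usr/local/cuda-9.0)
```

Keeping the flags in one place makes it harder to fix the header error while forgetting the link step, or the other way round.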

Still, I think this no longer applies to the current version. Please re-open if needed.

victorhcm · 2019-11-22T14:12:50Z

You can also fix it using CPATH to point where the header files are:

export CPATH=/usr/local/cuda-10.1/targets/x86_64-linux/include:$CPATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-10.1/bin:$PATH
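
CPATH helps here because GCC searches each directory listed in it as if it had been passed with -I, while LD_LIBRARY_PATH plays no part in header lookup. A minimal sketch, assuming a toolkit root laid out like the paths in this thread; find_cuda_include is a hypothetical helper name, not part of CUDA or NCCL:

```shell
# Hypothetical helper: print the directory under ROOT that holds cuda_runtime.h.
find_cuda_include() {
  for dir in "$1/include" "$1/targets/x86_64-linux/include"; do
    if [ -f "$dir/cuda_runtime.h" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  echo "cuda_runtime.h not found under $1" >&2
  return 1
}

# Usage: only extend CPATH when the header is really there.
#   inc=$(find_cuda_include /usr/local/cuda-10.1) && export CPATH="$inc${CPATH:+:$CPATH}"
```

If the helper reports that the header is missing, exporting CPATH for that root will not fix the error, and the toolkit is probably installed somewhere else.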

jefflgaol · 2019-11-25T14:11:00Z

Good solution there, mate!

sisrfeng · 2020-04-05T10:26:58Z

While running CenterNet, I hit a similar problem. I switched to the Docker
image from https://hub.docker.com/r/frt03/centernet, and the problem was solved.
I think it was because my CUDA version was too old (9.0).

DineshRajanT · 2020-06-19T05:25:15Z

> You can also fix it using CPATH to point where the header files are:
>
> export CPATH=/usr/local/cuda-10.1/targets/x86_64-linux/include:$CPATH
> export LD_LIBRARY_PATH=/usr/local/cuda-10.1/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
> export PATH=/usr/local/cuda-10.1/bin:$PATH

But this still didn't solve it for me.

sakex · 2020-09-09T14:05:56Z

> You can also fix it using CPATH to point where the header files are:
>
> export CPATH=/usr/local/cuda-10.1/targets/x86_64-linux/include:$CPATH
> export LD_LIBRARY_PATH=/usr/local/cuda-10.1/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
> export PATH=/usr/local/cuda-10.1/bin:$PATH

Thank you, sir.

I'd like to add that it also works with cuda-11; just change it to:

export CPATH=/usr/local/cuda-11.0/targets/x86_64-linux/include:$CPATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.0/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.0/bin:$PATH

chenQ1114 · 2021-07-29T05:14:58Z

I got the same error: cuda_runtime.h No such file or directory.

gmake -C src build BUILDDIR=/home/mwp141/Tool/nccl-2.8.3-1/build
which: no nvcc in (/usr/local/cuda-9.2/bin)
which: no nvcc in (/usr/local/cuda-9.2/bin)
which: no nvcc in (/usr/local/cuda-9.2/bin)
which: no nvcc in (/usr/local/cuda-9.2/bin)
gmake[1]: Entering directory `/home/mwp141/Tool/nccl-2.8.3-1/src'
Generating nccl.h.in > /home/mwp141/Tool/nccl-2.8.3-1/build/include/nccl.h
Grabbing include/nccl_net.h > /home/mwp141/Tool/nccl-2.8.3-1/build/include/nccl_net.h
Compiling init.cc > /home/mwp141/Tool/nccl-2.8.3-1/build/obj/init.o
In file included from init.cc:7:0:
/home/mwp141/Tool/nccl-2.8.3-1/build/include/nccl.h:10:26: fatal error: cuda_runtime.h: No such file or directory
 #include <cuda_runtime.h>
                          ^
compilation terminated.
gmake[1]: *** [/home/mwp141/Tool/nccl-2.8.3-1/build/obj/init.o] Error 1
gmake[1]: Leaving directory `/home/mwp141/Tool/nccl-2.8.3-1/src'
gmake: *** [src.build] Error 2

export CPATH=/usr/local/cuda-9.2/targets/x86_64-linux/include:$CPATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.2/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-9.2/bin:$PATH

These exports did not solve it for me. Do you know how to fix it? Thanks!

sjeaugey · 2021-07-29T09:11:17Z

There seems to be no nvcc in /usr/local/cuda-9.2/bin. Is CUDA installed in /usr/local/cuda-9.2? If not, did you set CUDA_HOME to /usr/local/cuda-9.2 by mistake? Otherwise you can set CUDA_HOME to a path where CUDA is installed.

Also note that the latest version of NCCL will probably not compile with an old CUDA 9.2. I'd advise upgrading to CUDA 10.2 at least, and preferably 11.4.
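
A quick way to check this before rebuilding is to test whether the directory actually contains an nvcc binary. The sketch below is only an illustration: check_cuda_home is a hypothetical name, and the one NCCL-specific detail it relies on is that the build accepts CUDA_HOME on the make command line.

```shell
# Hypothetical helper: succeed only if DIR looks like a CUDA install with nvcc.
check_cuda_home() {
  if [ -x "$1/bin/nvcc" ]; then
    echo "ok: $1/bin/nvcc"
  else
    echo "no nvcc in $1/bin" >&2
    return 1
  fi
}

# Usage (not run here):
#   check_cuda_home /usr/local/cuda-11.4 && make src.build CUDA_HOME=/usr/local/cuda-11.4
```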

ArchanaShinde1 · 2023-02-17T04:59:17Z

> There seems to be no nvcc in /usr/local/cuda-9.2/bin. Is CUDA installed in /usr/local/cuda-9.2 ? If not, did you set CUDA_HOME to /usr/local/cuda-9.2 by mistake? Otherwise you can set CUDA_HOME to a path where CUDA is installed.
>
> Also note, the latest version of NCCL will probably not compile with an old CUDA 9.2. I'd advise to upgrade to CUDA 10.2 at least, and preferably 11.4.

export CUDA_HOME=/usr/local/cuda-11.4
works for me. Thanks!

amughrabi · 2023-03-31T17:03:56Z

If you are using Anaconda, the following line works like a charm:

conda install -c nvidia cuda-toolkit

mmehedin · 2024-02-01T16:09:21Z

> If you are using Anaconda, the following line works like a charm:
> conda install -c nvidia cuda-toolkit

This is the solution for the cuda_runtime.h error.

sjeaugey closed this as completed Sep 26, 2018

s-kodge mentioned this issue Jun 3, 2022

AttributeError: module 'tutel_custom_kernel' has no attribute 'inject_source' microsoft/tutel#156

Closed

mmehedin mentioned this issue Feb 1, 2024

Not possible to declare pytorch-quantization as a dependency NVIDIA/TensorRT#2199

Closed

turjo-001 mentioned this issue Mar 15, 2024

RuntimeError: Error building extension 'decompress_residuals_cpp' - ninja/Colbert/Torch error. bclavie/RAGatouille#166

Open
