missing cicc? #9

hmaarrfk · 2023-12-30T23:03:27Z

Solution to issue cannot be found in the documentation.

I checked the documentation.

Issue

cicc seems to be in ${PREFIX}/nvvm/bin instead of ${PREFIX}/bin

so is libdevice.10.bc

xref: conda-forge/tensorflow-feedstock#296
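
To confirm where these files actually land in a given environment, a quick search of the prefix can help. This is only a minimal sketch: `find_nvvm_files` is a hypothetical helper name, not something shipped by any CUDA package, and it assumes a POSIX shell with `find` available.

```shell
# Sketch: list cicc and libdevice bitcode files under a prefix.
# find_nvvm_files is a hypothetical helper, not part of any CUDA package.
find_nvvm_files() {
    prefix="$1"
    # Match by file name only; symlinked directories are not descended into.
    find "$prefix" \( -name 'cicc' -o -name 'libdevice*.bc' \) 2>/dev/null | sort
}

# Only search a real environment when one is active.
if [ -n "${CONDA_PREFIX:-}" ]; then
    find_nvvm_files "$CONDA_PREFIX"
fi
```

If the report above holds, the printed paths should sit under an nvvm directory rather than directly under bin.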

Installed packages

linux-64/cuda-nvcc-tools-12.0.76-h59595ed_1.conda
linux-64/cuda-nvcc-tools-12.1.105-hd3aeb46_0.conda
linux-64/cuda-nvcc-tools-12.0.76-h59595ed_0.conda

Environment info


jakirkham · 2024-01-04T02:11:15Z

Thanks for raising Mark! 🙏

nvvm is actually the expected location for these files

In CUDA 12, the nvvm contents match the CUDA Toolkit layout. In CUDA 11, the cudatoolkit package does not match this layout (conda-forge/cudatoolkit-feedstock#96)

Hence libdevice and other bits wind up in the wrong place in the cudatoolkit package. Think we discussed this before in issue (conda-forge/tensorflow-feedstock#296), where cudatoolkit package layout issues had cropped up

With cicc itself, it is typically used by nvcc (not usually external programs)

Have seen one other case where cicc was not found, but after further investigation it was due to some build configuration issues (scopetools/cudadecon#29)

So am wondering if there is a similar issue here. Do you have more context on the issue that came up?

hmaarrfk · 2024-01-04T04:46:18Z

The TensorFlow 2.15 CUDA builds are where it came up

jakirkham · 2024-01-04T04:48:13Z

Ok is there a log or something we could look at?

hmaarrfk · 2024-01-04T04:59:15Z

Not really, since we disable building TF on the CIs. You can see how I modified the build script though.

but I’ll upload something tomorrow

conda-forge/tensorflow-feedstock#366

edit: this comment in particular shows a small portion of the log
conda-forge/tensorflow-feedstock#366 (comment)

jakirkham · 2024-01-04T07:34:22Z

Completely understandable

An uploaded log would work. Happy to look at snippets too

We might consider setting up TensorFlow on the Quansight CI as well to make that a bit easier to manage

jakirkham · 2024-02-09T06:06:55Z

Found this (admittedly old) thread, which mentions cicc may need to be in the search path

jakirkham · 2024-02-09T06:09:15Z

This logic should add NVVM's bin directory to the $PATH

cuda-nvcc-impl-feedstock/recipe/nvcc.profile.patch

Line 12 in 45b1556

    
           +PATH            += $(TOP)/bin:$(TOP)/$(_NVVM_BRANCH_)/bin:$(TOP)/../../bin:$(TOP)/../../$(_NVVM_BRANCH_)/bin:
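
To check what an installed nvcc.profile adds to the search path, the `PATH +=` line can be split into its entries. This is a hedged sketch: `profile_path_entries` is a hypothetical helper, and it assumes the installed file carries the line from the patch above without the leading `+` diff marker.

```shell
# Sketch: print each colon-separated entry from the "PATH +=" line(s)
# of an nvcc.profile. profile_path_entries is a hypothetical helper name.
profile_path_entries() {
    grep -E '^[[:space:]]*PATH[[:space:]]*\+=' "$1" \
        | sed -E 's/^[^=]*=[[:space:]]*//' \
        | tr ':' '\n' \
        | sed '/^[[:space:]]*$/d'
}

# Only inspect a real profile when an environment with one is active.
if [ -n "${CONDA_PREFIX:-}" ] && [ -f "$CONDA_PREFIX/bin/nvcc.profile" ]; then
    profile_path_entries "$CONDA_PREFIX/bin/nvcc.profile"
fi
```

The entries are printed unexpanded, so `$(TOP)` and `$(_NVVM_BRANCH_)` appear literally, as they do in the profile.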

leofang · 2024-05-07T20:55:11Z

"This logic should add NVVM's bin directory to the $PATH"

Is there any action we need in this feedstock?

hmaarrfk · 2024-05-07T21:12:14Z

I'm not sure. Happy to revisit in the future.

I haven't had time to go through the tensorflow builds in a long time.

LourensVeen · 2024-07-02T17:34:13Z

I'm seeing the same issue as in scopetools/cudadecon#29, in a similar situation with old CUDA code using CMake. And I can reproduce it without calling cicc directly:

conda create -n test
conda activate test
conda install cuda-toolkit

touch source.cu
${CONDA_PREFIX}/bin/nvcc -c source.cu        # works

${CONDA_PREFIX}/targets/x86_64-linux/bin/nvcc -c source.cu
<command-line>: fatal error: cuda_runtime.h: No such file or directory

${CONDA_PREFIX}/targets/x86_64-linux/bin/nvcc -c -I${CONDA_PREFIX}/targets/x86_64-linux/include source.cu
sh: 1: cicc: not found

For the failing call, strace tells me:

openat(AT_FDCWD, "${CONDA_PREFIX}/targets/x86_64-linux/bin/nvcc.profile", O_RDONLY) = -1 ENOENT (No such file or directory)

while for the successful one it says:

openat(AT_FDCWD, "${CONDA_PREFIX}/bin/nvcc.profile", O_RDONLY) = 3

This latter file contains the line

CICC_PATH        = $(TOP)/nvvm/bin

which explains why cicc isn't found, I think.

So if nvcc tries to find its configuration in a location relative to itself, perhaps the symlink for nvcc should be accompanied by one for nvcc.profile?
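
The strace output above suggests that nvcc looks for nvcc.profile in the directory of the path it was invoked through, without resolving the symlink first. Assuming that reading is right, a small check like the following can flag an nvcc path that will come up without its profile. `profile_beside` is a hypothetical helper name.

```shell
# Sketch: report whether an nvcc.profile sits in the same directory as the
# given nvcc path. The path is deliberately NOT resolved through symlinks,
# to mirror the lookup seen in the strace output.
# profile_beside is a hypothetical helper name.
profile_beside() {
    nvcc_path="$1"
    if [ -e "$(dirname "$nvcc_path")/nvcc.profile" ]; then
        echo "found"
    else
        echo "missing"
    fi
}

# Only check a real environment when one is active.
if [ -n "${CONDA_PREFIX:-}" ]; then
    profile_beside "$CONDA_PREFIX/bin/nvcc"
    profile_beside "$CONDA_PREFIX/targets/x86_64-linux/bin/nvcc"
fi
```

In the environment described above, the first call would be expected to report the profile as present and the second as absent.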

LourensVeen · 2024-07-03T07:56:49Z

And a little more digging: CMake runs nvcc -v __cmake_determine_cuda, which prints the configuration as created from nvcc.profile and then errors out. This has the line

#$ TOP=${CONDA_PREFIX}/bin/../targets/x86_64-linux

which CMake then uses to locate nvcc at ${CONDA_PREFIX}/targets/x86_64-linux/bin/nvcc, from where it can't find its configuration.

So it's the nvcc.profile itself that points CMake to a version of nvcc that cannot read nvcc.profile 😄.
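
To see the value CMake picks up, the TOP line can be pulled out of the verbose output. A minimal sketch, assuming the `#$ TOP=` line format quoted above; `extract_top` is a hypothetical helper name.

```shell
# Sketch: read nvcc verbose output on stdin and print the value of
# each "#$ TOP=" line. extract_top is a hypothetical helper name.
extract_top() {
    sed -n 's/^#\$ TOP=//p'
}
```

It could be used as `nvcc -v __cmake_determine_cuda 2>&1 | extract_top`. As noted above, that nvcc invocation itself exits with an error, so only the printed configuration is of interest.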

Adding a symlink at ${CONDA_PREFIX}/targets/x86_64-linux/bin/nvcc.profile to ${CONDA_PREFIX}/bin/nvcc.profile fixes the problem.
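
As a temporary local workaround, that symlink could be created along these lines. This is a sketch under the layout described in this thread, not a packaging fix: `link_nvcc_profile` is a hypothetical helper name, and the x86_64-linux target directory is assumed.

```shell
# Sketch: place a symlink to <prefix>/bin/nvcc.profile beside the nvcc
# symlink in <prefix>/targets/x86_64-linux/bin.
# link_nvcc_profile is a hypothetical helper name.
link_nvcc_profile() {
    prefix="$1"
    src="$prefix/bin/nvcc.profile"
    dest_dir="$prefix/targets/x86_64-linux/bin"
    if [ ! -f "$src" ]; then
        echo "nvcc.profile not found at $src" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "target bin directory not found at $dest_dir" >&2
        return 1
    fi
    # Leave an existing file or link alone rather than overwriting it.
    if [ -e "$dest_dir/nvcc.profile" ] || [ -L "$dest_dir/nvcc.profile" ]; then
        echo "already present: $dest_dir/nvcc.profile"
        return 0
    fi
    ln -s "$src" "$dest_dir/nvcc.profile"
}
```

Calling `link_nvcc_profile "$CONDA_PREFIX"` in an activated environment would apply it. The link is absolute, so it would need to be recreated if the environment were moved.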

leofang · 2024-07-03T14:54:25Z

@robertmaynard @adibbley do you have insights into what Lourens brought up above?

LourensVeen · 2024-07-03T15:01:41Z

I've now also added a symlink for the bin/crt directory, to avoid errors linking code that uses the driver API with the stubs. I have more issues still, but the code I'm working on is also messy so they may be unrelated.

robertmaynard · 2024-07-05T18:02:37Z

What CMake version are you using? This sounds like an older version of CMake that didn't properly handle symlinks inside TOP, which has since been fixed.

LourensVeen · 2024-07-08T07:00:04Z

This is a new CMake, but with an old configuration that uses the now-obsolete FindCUDA module.

But my first example reproduces the problem without CMake being involved in any way. Are you saying that users are expected to first resolve the symlink at ${CONDA_PREFIX}/targets/x86_64-linux/bin/nvcc, rather than trying to run it directly as if it were the linked-to executable?

robertmaynard · 2024-07-08T11:32:44Z

"But my first example reproduces the problem without CMake being involved in any way. Are you saying that users are expected to first resolve the symlink at ${CONDA_PREFIX}/targets/x86_64-linux/bin/nvcc, rather than trying to run it directly as if it were the linked-to executable?"

After looking at this more, the issue is entirely due to a bad setup by conda. You are correct that an nvcc.profile needs to be beside the nvcc symlink in ${CONDA_PREFIX}/targets/x86_64-linux/bin/.

In the current form the nvcc at ${CONDA_PREFIX}/targets/x86_64-linux/bin/ is broken and the verbose output from the compiler looks like:

#$ NVCC_PREPEND_FLAGS=" -ccbin=/home/rmaynard/miniconda3/envs/cuda_stub_env/bin/x86_64-conda-linux-gnu-c++"
#$ _NVVM_BRANCH_=nvvm
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=/home/rmaynard/miniconda3/envs/cuda_stub_env/targets/x86_64-linux/bin
#$ _THERE_=/home/rmaynard/miniconda3/envs/cuda_stub_env/targets/x86_64-linux/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_SIZE_=64
#$ "/home/rmaynard/miniconda3/envs/cuda_stub_env/bin"/x86_64-conda-linux-gnu-c++ ....

When I symlink the nvcc.profile as well into targets/x86_64-linux/bin I see proper paths for the crt headers being included and a simple test case properly finds them.

@leofang @adibbley We need to create an nvcc.profile symlink like we do for targets/x86_64-linux/bin/nvcc.

robertmaynard · 2024-07-08T12:18:40Z

@LourensVeen The only reason that ${CONDA_PREFIX}/targets/x86_64-linux/bin/nvcc exists is to support legacy CMake versions, where FindCUDA or FindCUDAToolkit would validate the CUDA Toolkit layout by searching for an nvcc executable under bin. Therefore we have that symlink so that targets/x86_64-linux/ matches the checked layout.

But I also believe that if we are going to offer a symlink to the compiler, it should work, so we don't give footguns to users.

Edit: So at some point expect targets/x86_64-linux/bin/nvcc to go away and the only nvcc compiler to be in <prefix>/bin

LourensVeen · 2024-07-09T07:20:10Z

Okay, that makes sense to me. I'll be updating that CMake config.

You need to symlink bin/crt as well to make nvcc work if you want a temporary solution.
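
To see which of the two pieces mentioned in this thread are still absent beside the target nvcc, a check like the following may help. It is a rough sketch: `missing_nvcc_companions` is a hypothetical helper name, and it assumes that both nvcc.profile and the crt directory live in `<prefix>/bin`, as the comments above describe.

```shell
# Sketch: print the names of companion entries that exist in <prefix>/bin
# but are absent from <prefix>/targets/x86_64-linux/bin.
# missing_nvcc_companions is a hypothetical helper name.
missing_nvcc_companions() {
    prefix="$1"
    target_bin="$prefix/targets/x86_64-linux/bin"
    for name in nvcc.profile crt; do
        if [ -e "$prefix/bin/$name" ] && [ ! -e "$target_bin/$name" ]; then
            echo "$name"
        fi
    done
}
```

Each name printed by `missing_nvcc_companions "$CONDA_PREFIX"` is a candidate for a symlink from `targets/x86_64-linux/bin` back to the entry in `bin`. An empty result means both are already in place.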

hmaarrfk added the bug Something isn't working label Dec 30, 2023

jakirkham mentioned this issue Feb 9, 2024

Rebuild for CUDA 12 w/arch + Windows support conda-forge/dgl-feedstock#31

Merged

hmaarrfk closed this as completed May 7, 2024

LourensVeen mentioned this issue Jul 3, 2024

support cuda 12, rerender build scopetools/cudadecon#29

Merged

leofang reopened this Jul 3, 2024
