cuML fails to import with undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11

## 🐛 Bug

RAPIDS cuML cannot be imported in the `v132` Docker image. The import fails with:

```
ImportError: /opt/conda/lib/python3.10/site-packages/cuml/internals/../../../.././libcublas.so.11: undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11
```

My best guess is that this comes from mixing conda and system CUDA Toolkits (CTKs). This container has three different CTKs installed:

- The system version in `/usr/local/cuda` is CUDA 11.3
  - Provides `/usr/local/cuda/lib64/libcublas.so.11.5.1.109` (and similar for `libcublasLt`)
- The conda environment has `cudatoolkit` 11.7.0
  - Provides `/opt/conda/lib/libcublas.so.11.10.1.25` (and similar for `libcublasLt`)
- The conda environment has `libcublas` / `libcublas-dev` from CUDA 11.8.0
  - Provides `/opt/conda/lib/libcublas.so.11.11.3.6` (and similar for `libcublasLt`)
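
To confirm which cuBLAS builds are present, something like the following sketch could be run inside the container. It only scans directories for fully versioned `libcublas` / `libcublasLt` file names; the helper name `find_cublas_libs` is hypothetical, and the two directories are taken from the list above.

```python
import re
from pathlib import Path

# Matches fully versioned names such as libcublas.so.11.5.1.109 or
# libcublasLt.so.11.11.3.6; bare sonames like libcublas.so.11 are skipped.
_PATTERN = re.compile(r"^(libcublas(?:Lt)?)\.so\.(\d+(?:\.\d+)+)$")


def find_cublas_libs(directories):
    """Return sorted (name, version_tuple, path) for versioned cuBLAS files."""
    found = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue  # ignore directories that do not exist in this image
        for entry in root.iterdir():
            match = _PATTERN.match(entry.name)
            if match:
                version = tuple(int(part) for part in match.group(2).split("."))
                found.append((match.group(1), version, str(entry)))
    return sorted(found)


# Directories assumed from the list above; adjust for other images.
for name, version, path in find_cublas_libs(["/usr/local/cuda/lib64", "/opt/conda/lib"]):
    print(name, ".".join(map(str, version)), path)
```

This only reports what is on disk. It does not show which copy the dynamic loader actually picks at import time.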

The function `cublasLtGetStatusString` was added in CUDA 11.4.2. https://docs.nvidia.com/cuda/archive/11.4.2/cuda-toolkit-release-notes/index.html#cublas-11.4.2

I suspect the import is somehow loading cuBLAS libraries from the system CUDA Toolkit (11.3) when it needs a version that is 11.4.2 or newer, such as the copies in the conda environment.
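
One way to test this suspicion is to check whether a given `libcublasLt` copy exports the symbol at all. The sketch below uses `ctypes` for that; the helper `exports_symbol` is hypothetical, and the exact `libcublasLt` paths are assumptions (the list above only says they are "similar" to the `libcublas` paths). Note that the lookup ignores symbol versioning and also searches the library's dependencies, so it is best pointed at `libcublasLt` directly and run in a fresh interpreter.

```python
import ctypes


def exports_symbol(library_path, symbol_name):
    """Return True if the library loads and symbol_name can be resolved in it."""
    try:
        library = ctypes.CDLL(library_path)
    except OSError:
        return False  # file missing, wrong architecture, or unmet dependencies
    # Attribute access on a CDLL performs a dlsym-style lookup and raises
    # AttributeError when the symbol cannot be found.
    return hasattr(library, symbol_name)


# Assumed paths; adjust to the files actually present in the image.
for candidate in [
    "/usr/local/cuda/lib64/libcublasLt.so.11",
    "/opt/conda/lib/libcublasLt.so.11",
]:
    print(candidate, exports_symbol(candidate, "cublasLtGetStatusString"))
```

A `False` result can also mean the file does not exist or failed to load, so it is worth checking the path separately before drawing conclusions.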

### To Reproduce

```
docker run -it gcr.io/kaggle-gpu-images/python:v132 python -c "import cuml"
```

Image `v122` does not show this problem. I can try to get a more detailed diagnosis and find which image version introduced the breakage, but the images are large, so downloading and testing them takes some time.

**Workaround:** @cdeotte found that running `import cudf` before `import cuml` fixes the problem in `v132`.

### Expected behavior

cuML imports successfully.

### Additional context

A similar issue was raised on this repo before, but with less detail: https://github.com/Kaggle/docker-python/issues/1224#issuecomment-1449209577

I looked at the library dependencies and unresolved symbols in RAPIDS (`cuml` and `raft`) but I don't see any direct references to `cublasLtGetStatusString`, which makes me think it's failing to resolve a shared library needed within the CUDA Toolkit itself. There is a reference to that symbol in `libcublas`, which probably expects to load the symbol from `libcublasLt`. I'm not sure why importing `cudf` first fixes this issue, since `cudf` doesn't use `libcublas`.
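
To see which copies actually end up mapped into the process, and whether `import cudf` changes that, a sketch like the one below could help. It reads `/proc/self/maps` (Linux only) and lists the distinct file paths containing a substring; the helper name `loaded_libraries` is hypothetical.

```python
def loaded_libraries(substring, maps_text=None):
    """Return sorted, de-duplicated mapped file paths that contain substring."""
    if maps_text is None:
        # Linux-specific: one line per memory mapping of the current process.
        with open("/proc/self/maps") as handle:
            maps_text = handle.read()
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6 and substring in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)


# Example usage in a fresh interpreter inside the container:
#   import cudf
#   print(loaded_libraries("cublas"))
```

Comparing the output after `import cudf` with the output after a failed `import cuml` might show whether different `libcublas` / `libcublasLt` copies are being picked up in the two cases.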

I was able to install and import `cuml` with CUDA 11.3.0 using the commands below, so the failure is not reproduced by `cuml` alone on a CUDA Toolkit older than 11.4.2 (when `cublasLtGetStatusString` was introduced). It must have something to do with the three separate CUDA Toolkits in the Kaggle image.

```
nvidia-docker run -it nvidia/cuda:11.3.0-devel-ubi8
# then in the container...
yum install python3.9
pip3.9 install 'numba<0.57' cudf-cu11 cuml-cu11 --extra-index-url=https://pypi.nvidia.com/
python3.9 -c "import cuml"
```

I don't think there's much that can be fixed on the cuML side. A fix would probably require some consolidation among the image's three separate CUDA Toolkits. There is a similar issue reported here, where a user is combining TensorFlow and PyTorch without RAPIDS, but probably with multiple CTKs involved: https://github.com/pytorch/pytorch/issues/88882

Full traceback here:

<details>

```
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[1], line 1
----> 1 import cuml

File /opt/conda/lib/python3.10/site-packages/cuml/__init__.py:17
      1 #
      2 # Copyright (c) 2022-2023, NVIDIA CORPORATION.
      3 #
   (...)
     14 # limitations under the License.
     15 #
---> 17 from cuml.internals.base import Base, UniversalBase
     19 # GPU only packages
     21 import cuml.common.cuda as cuda

File /opt/conda/lib/python3.10/site-packages/cuml/internals/__init__.py:17
      1 #
      2 # Copyright (c) 2019-2023, NVIDIA CORPORATION.
      3 #
   (...)
     14 # limitations under the License.
     15 #
---> 17 from cuml.internals.base_helpers import BaseMetaClass, _tags_class_and_instance
     18 from cuml.internals.api_decorators import (
     19     _deprecate_pos_args,
     20     api_base_fit_transform,
   (...)
     32     exit_internal_api,
     33 )
     34 from cuml.internals.api_context_managers import (
     35     in_internal_api,
     36     set_api_output_dtype,
     37     set_api_output_type,
     38 )

File /opt/conda/lib/python3.10/site-packages/cuml/internals/base_helpers.py:20
     17 from inspect import Parameter, signature
     18 import typing
---> 20 from cuml.internals.api_decorators import (
     21     api_base_return_generic,
     22     api_base_return_array,
     23     api_base_return_sparse_array,
     24     api_base_return_any,
     25     api_return_any,
     26     _deprecate_pos_args,
     27 )
     28 from cuml.internals.array import CumlArray
     29 from cuml.internals.array_sparse import SparseCumlArray

File /opt/conda/lib/python3.10/site-packages/cuml/internals/api_decorators.py:24
     21 import warnings
     23 # TODO: Try to resolve circular import that makes this necessary:
---> 24 from cuml.internals import input_utils as iu
     25 from cuml.internals.api_context_managers import BaseReturnAnyCM
     26 from cuml.internals.api_context_managers import BaseReturnArrayCM

File /opt/conda/lib/python3.10/site-packages/cuml/internals/input_utils.py:19
      1 #
      2 # Copyright (c) 2019-2023, NVIDIA CORPORATION.
      3 #
   (...)
     14 # limitations under the License.
     15 #
     17 from collections import namedtuple
---> 19 from cuml.internals.array import CumlArray
     20 from cuml.internals.array_sparse import SparseCumlArray
     21 from cuml.internals.global_settings import GlobalSettings

File /opt/conda/lib/python3.10/site-packages/cuml/internals/array.py:22
     19 import operator
     20 import pickle
---> 22 from cuml.internals.global_settings import GlobalSettings
     23 from cuml.internals.logger import debug
     24 from cuml.internals.mem_type import MemoryType, MemoryTypeError

File /opt/conda/lib/python3.10/site-packages/cuml/internals/global_settings.py:19
     17 import os
     18 import threading
---> 19 from cuml.internals.available_devices import is_cuda_available
     20 from cuml.internals.device_type import DeviceType
     21 from cuml.internals.logger import warn

File /opt/conda/lib/python3.10/site-packages/cuml/internals/available_devices.py:17
      1 #
      2 # Copyright (c) 2022-2023, NVIDIA CORPORATION.
      3 #
   (...)
     14 # limitations under the License.
     15 #
     16 from cuml.internals.device_support import GPU_ENABLED
---> 17 from cuml.internals.safe_imports import gpu_only_import_from, UnavailableError
     19 try:
     20     from functools import cache  # requires Python >= 3.9

File /opt/conda/lib/python3.10/site-packages/cuml/internals/safe_imports.py:21
     19 import traceback
     20 from cuml.internals.device_support import CPU_ENABLED, GPU_ENABLED
---> 21 from cuml.internals import logger
     24 class UnavailableError(Exception):
     25     """Error thrown if a symbol is unavailable due to an issue importing it"""

ImportError: /opt/conda/lib/python3.10/site-packages/cuml/internals/../../../.././libcublas.so.11: undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11
```

</details>
