make -C docker release_run ERROR: The NVIDIA Driver is present, but CUDA failed to initialize. GPU functionality will not be available. [[ System has unsupported display driver / cuda driver combination (error 803) ]] #189

@Midcc

I built TensorRT-LLM on one host machine by running make -C docker release_build, which completed without errors and produced the tensorrt_llm/release image. However, after copying the image to a second host machine and starting a container from it there, I received the following error:

sudo docker run --rm -it --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --gpus=all --volume /app/xueht/llm/TensorRT-LLM-release-0.5.0:/code/tensorrt_llm --workdir /code/tensorrt_llm --hostname sftech-27-release --name tensorrt_llm-release-appdeploy --tmpfs /tmp:exec tensorrt_llm/release:latest

=============
== PyTorch ==

NVIDIA Release 23.08 (build 66128610)
PyTorch Version 2.1.0a0+29c30b1

Container image Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2014-2023 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006 Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015 Google Inc.
Copyright (c) 2015 Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

ERROR: The NVIDIA Driver is present, but CUDA failed to initialize. GPU functionality will not be available.
[[ System has unsupported display driver / cuda driver combination (error 803) ]]

The driver information is as follows:

nvidia-smi
Mon Oct 30 02:09:16 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.12             Driver Version: 535.104.12   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
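
A minimal sketch for narrowing this down, assuming Python 3 is available both on the second host and inside the container. It is not part of TensorRT-LLM; the helper names `parse_smi_header` and `probe_cuinit` are hypothetical. The first helper pulls the driver and CUDA versions out of the nvidia-smi header line so the two hosts can be compared. The second calls `cuInit` from whatever `libcuda.so.1` is visible in the current environment and returns the raw result code; 803 is `CUDA_ERROR_SYSTEM_DRIVER_MISMATCH` in the CUDA driver API, the same code shown in the banner above.

```python
import ctypes
import re

# CUresult value printed in the container banner (name from the CUDA driver API).
CUDA_ERROR_SYSTEM_DRIVER_MISMATCH = 803


def parse_smi_header(line: str) -> dict:
    """Extract the driver and CUDA versions from the nvidia-smi header line."""
    m = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line)
    if m is None:
        raise ValueError("not an nvidia-smi header line")
    return {"driver": m.group(1), "cuda": m.group(2)}


def probe_cuinit(libname: str = "libcuda.so.1") -> int:
    """Load the driver library visible here and return cuInit's result code.

    Raises OSError if the library cannot be found in this environment.
    """
    lib = ctypes.CDLL(libname)
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    return lib.cuInit(0)  # 0 means success; 803 is the mismatch code above
```

Running `parse_smi_header` on the header line from each host shows whether the two machines report the same driver, and running `probe_cuinit()` once on the host and once inside the container shows whether the 803 result appears only in the container. This only gathers evidence; it does not by itself identify the cause.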
