[FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160 #592

Z3TA · 2023-05-05T00:09:39Z

I'm a beginner so sorry if this is the wrong place.

Branch/Tag/Commit

main?

Docker Image Version

nvcr.io/nvidia/pytorch:22.03-py3

GPU name

NVIDIA GeForce GTX TITAN X

CUDA Driver

Cuda compilation tools, release 11.6, V11.6.112 ?

Reproduced Steps

1. docker run -ti --gpus all nvcr.io/nvidia/pytorch:22.03-py3 bash
2. git clone https://github.com/NVIDIA/FasterTransformer.git
3. cd FasterTransformer && mkdir build && cd build
4. cmake -DSM=80 -DCMAKE_BUILD_TYPE=Release .. && make -j12
5. ./bin/bert_example 32 12 32 12 64 0 0
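
One thing worth checking in step 4: `-DSM=80` targets compute capability 8.0 (A100-class GPUs), whereas the GTX TITAN X (GM200) is a Maxwell card with compute capability 5.2. Whether this mismatch causes the error reported here is not established in this thread, and it is also not established here that FasterTransformer supports 5.2 at all. As a minimal sketch, the hypothetical helper below (not part of FasterTransformer) turns a compute capability into the value for the `-DSM` flag from step 4. The lookup table is illustrative and far from complete, and its keys are example names rather than guaranteed device strings.

```python
# Hypothetical helper (not part of FasterTransformer): derive the -DSM value
# passed to cmake from a GPU's compute capability.

# A few well-known compute capabilities; illustrative keys, not exhaustive.
KNOWN_COMPUTE_CAPABILITY = {
    "NVIDIA GeForce GTX TITAN X": (5, 2),  # GM200, Maxwell
    "Tesla V100": (7, 0),
    "Tesla T4": (7, 5),
    "NVIDIA A100": (8, 0),
}


def sm_value(major: int, minor: int) -> str:
    """Format a (major, minor) compute capability the way -DSM expects it."""
    return f"{major}{minor}"


def cmake_sm_flag(gpu_name: str) -> str:
    """Return the cmake flag for a GPU listed in the table above."""
    major, minor = KNOWN_COMPUTE_CAPABILITY[gpu_name]
    return f"-DSM={sm_value(major, minor)}"


if __name__ == "__main__":
    print(cmake_sm_flag("NVIDIA GeForce GTX TITAN X"))  # -DSM=52
    print(cmake_sm_flag("NVIDIA A100"))                 # -DSM=80
```

Inside the PyTorch container, `torch.cuda.get_device_capability(0)` should return the same `(major, minor)` pair for the active GPU, which avoids relying on a hand-written table.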

root@1423b5c7567a:/workspace/FasterTransformer/build# ./bin/bert_example 32 12 32 12 64 0 0
[INFO] Device: NVIDIA GeForce GTX TITAN X 
Before loading model: free: 11.54 GB, total: 11.93 GB, used:  0.39 GB
[WARNING] gemm_config.in is not found; using default GEMM algo
terminate called after throwing an instance of 'std::runtime_error'
  what():  [FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160 

Aborted (core dumped)


root@1423b5c7567a:/workspace/FasterTransformer/build# /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Thu_Feb_10_18:23:41_PST_2022
Cuda compilation tools, release 11.6, V11.6.112
Build cuda_11.6.r11.6/compiler.30978841_0

zpc is my PC. It usually runs Arch with multi-seat, but I couldn't get the NVIDIA driver installed on Arch (likely because of the latest Linux kernel; I also tried linux-lts), so I booted into Ubuntu. =)

zeta@zpc:~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.6 LTS
Release:	20.04
Codename:	focal

zeta@zpc:~$ lspci -k | grep -A 2 -E "(VGA|3D)"
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
	Subsystem: Gigabyte Technology Co., Ltd 2nd Generation Core Processor Family Integrated Graphics Controller
	Kernel modules: i915
--
01:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX TITAN X] (rev a1)
	Subsystem: NVIDIA Corporation GM200 [GeForce GTX TITAN X]
	Kernel driver in use: nvidia
--
02:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 770] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd GK104 [GeForce GTX 770]
	Kernel driver in use: nvidia

zeta@zpc:~$ nvidia-smi
Fri May  5 01:56:28 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.182.03   Driver Version: 470.182.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 28%   66C    P5    32W / 250W |    310MiB / 12212MiB |     10%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:02:00.0 N/A |                  N/A |
| 17%   32C    P8    N/A /  N/A |     10MiB /  2000MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      7510      G   /usr/lib/xorg/Xorg                 55MiB |
|    0   N/A  N/A     10909      G   /usr/lib/xorg/Xorg                 77MiB |
|    0   N/A  N/A     11077      G   /usr/bin/gnome-shell               21MiB |
|    0   N/A  N/A     93404      G   /usr/lib/firefox/firefox          142MiB |
+-----------------------------------------------------------------------------+

I suspect it has to do with the driver's old CUDA version (11.4, while the container ships CUDA 11.6), so I will try upgrading the driver.
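
The version gap behind that suspicion can be made explicit. The sketch below (hypothetical helper names, standard library only) pulls the driver-side CUDA version out of the `nvidia-smi` header and the toolkit release out of `nvcc --version`, then compares the two. Treat the result as a heuristic only: CUDA 11.x toolkits are documented to run on older 11.x drivers through minor-version compatibility, so a lower driver-side number does not by itself prove the driver is at fault.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int]


def driver_cuda_version(nvidia_smi_output: str) -> Optional[Version]:
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi output."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(m.group(1)), int(m.group(2))) if m else None


def toolkit_cuda_version(nvcc_output: str) -> Optional[Version]:
    """Extract the 'release X.Y' field from `nvcc --version` output."""
    m = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return (int(m.group(1)), int(m.group(2))) if m else None


def driver_older_than_toolkit(nvidia_smi_output: str,
                              nvcc_output: str) -> Optional[bool]:
    """True if the driver reports a lower CUDA version than the toolkit.

    Returns None when either version cannot be parsed.
    """
    driver = driver_cuda_version(nvidia_smi_output)
    toolkit = toolkit_cuda_version(nvcc_output)
    if driver is None or toolkit is None:
        return None
    return driver < toolkit


if __name__ == "__main__":
    smi_470 = "| NVIDIA-SMI 470.182.03   Driver Version: 470.182.03   CUDA Version: 11.4     |"
    smi_530 = "| NVIDIA-SMI 530.41.03    Driver Version: 530.41.03    CUDA Version: 12.1     |"
    nvcc = "Cuda compilation tools, release 11.6, V11.6.112"
    print(driver_older_than_toolkit(smi_470, nvcc))  # True
    print(driver_older_than_toolkit(smi_530, nvcc))  # False
```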

zeta@zpc:~$ sudo ubuntu-drivers autoinstall

The following NEW packages will be installed:
libnvidia-cfg1-530 libnvidia-common-530 libnvidia-compute-530 libnvidia-compute-530:i386 libnvidia-decode-530 libnvidia-decode-530:i386 libnvidia-encode-530
libnvidia-encode-530:i386 libnvidia-extra-530 libnvidia-fbc1-530 libnvidia-fbc1-530:i386 libnvidia-gl-530 libnvidia-gl-530:i386 linux-modules-nvidia-530-5.4.0-148-generic
linux-modules-nvidia-530-generic linux-objects-nvidia-530-5.4.0-148-generic linux-signatures-nvidia-5.4.0-148-generic nvidia-compute-utils-530 nvidia-driver-530
nvidia-kernel-common-530 nvidia-kernel-source-530 nvidia-utils-530 xserver-xorg-video-nvidia-530

After upgrading the NVIDIA driver to version 530:

zeta@zpc:~$ nvidia-smi
Fri May  5 02:21:07 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX TITAN X      Off| 00000000:01:00.0  On |                  N/A |
| 22%   47C    P8               18W / 250W|    226MiB / 12288MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      8057      G   /usr/lib/xorg/Xorg                           29MiB |
|    0   N/A  N/A     10501      G   /usr/lib/xorg/Xorg                           68MiB |
|    0   N/A  N/A     10666      G   /usr/bin/gnome-shell                         11MiB |
|    0   N/A  N/A     11561      G   /usr/lib/firefox/firefox                    101MiB |
+---------------------------------------------------------------------------------------+

zeta@zpc:~$ docker ps -q -l
1423b5c7567a
zeta@zpc:~$ docker start 1423b5c7567a
zeta@zpc:~$ docker attach 1423b5c7567a

root@1423b5c7567a:/workspace/FasterTransformer/build# ./bin/bert_example 32 12 32 12 64 0 0
[INFO] Device: NVIDIA GeForce GTX TITAN X 
Before loading model: free: 11.59 GB, total: 11.92 GB, used:  0.34 GB
[WARNING] gemm_config.in is not found; using default GEMM algo
terminate called after throwing an instance of 'std::runtime_error'
  what():  [FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160 

Aborted (core dumped)

Still the same error after upgrading the NVIDIA driver. Or did I miss something?
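
Since the driver upgrade changed nothing, the driver version alone is apparently not the cause. "operation not supported" is the string the CUDA runtime reports for `cudaErrorNotSupported`, which signals that the current system or device lacks a feature the code requested. One candidate worth ruling out, and this is an assumption rather than something confirmed in this thread, is CUDA's stream-ordered memory pools (`cudaMallocAsync`, introduced in CUDA 11.2), which not every GPU and driver setup supports. The sketch below queries the `cudaDevAttrMemoryPoolsSupported` device attribute through `ctypes`. The function name and the `lib_name` default are hypothetical choices; the function returns `None` when the runtime library cannot be loaded or the query fails.

```python
import ctypes
from typing import Optional

# Value of cudaDevAttrMemoryPoolsSupported in the CUDA runtime's
# cudaDeviceAttr enum (available from CUDA 11.2 onward).
CUDA_DEV_ATTR_MEMORY_POOLS_SUPPORTED = 115


def memory_pools_supported(device: int = 0,
                           lib_name: str = "libcudart.so") -> Optional[bool]:
    """Ask the CUDA runtime whether `device` supports memory pools.

    Returns True/False on success, or None if the library is missing,
    the symbol is unavailable, or the runtime call reports an error.
    """
    try:
        cudart = ctypes.CDLL(lib_name)
        query = cudart.cudaDeviceGetAttribute
    except (OSError, AttributeError):
        return None

    # cudaError_t cudaDeviceGetAttribute(int* value, cudaDeviceAttr attr, int device)
    query.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.c_int, ctypes.c_int]
    query.restype = ctypes.c_int

    value = ctypes.c_int(0)
    status = query(ctypes.byref(value),
                   CUDA_DEV_ATTR_MEMORY_POOLS_SUPPORTED, device)
    if status != 0:  # any non-zero cudaError_t means the query itself failed
        return None
    return bool(value.value)


if __name__ == "__main__":
    print("memory pools supported:", memory_pools_supported(0))
```

A result of `False` would make this hypothesis worth pursuing in the FasterTransformer allocator source; `True` would rule it out for that device.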

nicobasile · 2023-05-05T18:14:03Z

I'm having the same or a similar issue and have tried lots of things. According to the NVIDIA CUDA/Triton docs I should be able to run the versions I've tried, but I always hit the same error as above.

I don't have access to update the NVIDIA drivers, so I'm using FasterTransformer v5.0.
Using v100d-32gb GPUs:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02 Driver Version: 510.85.02 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+

  what():  [FT][ERROR] CUDA runtime error: operation not supported /workspace/build/fastertransformer_backend/build/_deps/repo-ft-src/src/fastertransformer/utils/allocator.h:181

Mhhhaster · 2023-06-16T03:06:09Z

The same issue.
root@3bf52cde87a7:/workspace/FasterTransformer/build# ./bin/swin_example 2 0 0 8 256 2
[FT][INFO] Device GRID T4-8C
terminate called after throwing an instance of 'std::runtime_error'
what(): [FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160

Aborted (core dumped)

Maybe I should upgrade the CUDA version.

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.102.04 Driver Version: 450.102.04 CUDA Version: 11.0 |

pinecho · 2024-04-11T20:22:09Z

I've encountered exactly the same error, and I am using a TITAN X as well. Does anyone have a solution to share? Thanks.

Z3TA added the "bug" (Something isn't working) label May 5, 2023

nicobasile mentioned this issue May 5, 2023: "CUDA: Operation Not Supported" triton-inference-server/fastertransformer_backend#127 (Open)
