[FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160 #592

Open
Z3TA opened this issue May 5, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@Z3TA

Z3TA commented May 5, 2023

I'm a beginner so sorry if this is the wrong place.

Branch/Tag/Commit

main?

Docker Image Version

nvcr.io/nvidia/pytorch:22.03-py3

GPU name

NVIDIA GeForce GTX TITAN X

CUDA Driver

Cuda compilation tools, release 11.6, V11.6.112 ?

Reproduced Steps

1. docker run -ti --gpus all nvcr.io/nvidia/pytorch:22.03-py3 bash
2. git clone https://github.com/NVIDIA/FasterTransformer.git
3. cd FasterTransformer && mkdir build && cd build
4. cmake -DSM=80 -DCMAKE_BUILD_TYPE=Release .. && make -j12
5. ./bin/bert_example 32 12 32 12 64 0 0
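
A note on step 4, offered as a sketch rather than a diagnosis: `-DSM=80` targets compute capability 8.0 (Ampere), whereas the GTX TITAN X (GM200) is a compute capability 5.2 part. Whether this mismatch is related to the error below is not established in this thread, and whether FasterTransformer builds or runs for SM 52 is not confirmed here either. The helper `cc_to_sm` is hypothetical (not part of FasterTransformer); it turns a capability string such as `5.2` into the two-digit form that `-DSM` expects. The `compute_cap` query field of `nvidia-smi` is assumed to be available, which is only the case on reasonably recent drivers.

```shell
# Hypothetical helper: convert a compute capability string ("5.2")
# into the digits used by the -DSM CMake option ("52").
cc_to_sm() {
    printf '%s\n' "$1" | tr -d '.'
}

# Assumption: this driver's nvidia-smi accepts the compute_cap query field.
# If nvidia-smi is missing or rejects the field, nothing is printed.
if command -v nvidia-smi >/dev/null 2>&1; then
    if cc="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null)"; then
        cc="$(printf '%s\n' "$cc" | head -n 1)"
        case "$cc" in
            [0-9]*.[0-9]*)
                echo "First GPU reports compute capability ${cc}; matching flag: -DSM=$(cc_to_sm "$cc")"
                ;;
        esac
    fi
fi
```

Only the first GPU is inspected; on a machine with mixed cards, like the one described further down, each line of the query output would need to be checked separately.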

root@1423b5c7567a:/workspace/FasterTransformer/build# ./bin/bert_example 32 12 32 12 64 0 0
[INFO] Device: NVIDIA GeForce GTX TITAN X 
Before loading model: free: 11.54 GB, total: 11.93 GB, used:  0.39 GB
[WARNING] gemm_config.in is not found; using default GEMM algo
terminate called after throwing an instance of 'std::runtime_error'
  what():  [FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160 

Aborted (core dumped)


root@1423b5c7567a:/workspace/FasterTransformer/build# /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Thu_Feb_10_18:23:41_PST_2022
Cuda compilation tools, release 11.6, V11.6.112
Build cuda_11.6.r11.6/compiler.30978841_0

zpc is my PC. It usually runs Arch with multi-seat, but I couldn't get the NVIDIA driver installed on Arch (likely because of the latest Linux kernel; I also tried linux-lts), so I booted into Ubuntu ... =)

zeta@zpc:~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.6 LTS
Release:	20.04
Codename:	focal

zeta@zpc:~$ lspci -k | grep -A 2 -E "(VGA|3D)"
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
	Subsystem: Gigabyte Technology Co., Ltd 2nd Generation Core Processor Family Integrated Graphics Controller
	Kernel modules: i915
--
01:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX TITAN X] (rev a1)
	Subsystem: NVIDIA Corporation GM200 [GeForce GTX TITAN X]
	Kernel driver in use: nvidia
--
02:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 770] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd GK104 [GeForce GTX 770]
	Kernel driver in use: nvidia

zeta@zpc:~$ nvidia-smi
Fri May  5 01:56:28 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.182.03   Driver Version: 470.182.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 28%   66C    P5    32W / 250W |    310MiB / 12212MiB |     10%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:02:00.0 N/A |                  N/A |
| 17%   32C    P8    N/A /  N/A |     10MiB /  2000MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      7510      G   /usr/lib/xorg/Xorg                 55MiB |
|    0   N/A  N/A     10909      G   /usr/lib/xorg/Xorg                 77MiB |
|    0   N/A  N/A     11077      G   /usr/bin/gnome-shell               21MiB |
|    0   N/A  N/A     93404      G   /usr/lib/firefox/firefox          142MiB |
+-----------------------------------------------------------------------------+

I suspect it has to do with my old CUDA Version: 11.4 ... I will try upgrading ...
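
This suspicion can be checked by comparing two numbers that already appear in the output above: the `CUDA Version` field in the `nvidia-smi` header (the highest CUDA version the installed driver supports) and the `release` field from `nvcc --version` (the toolkit inside the container). A minimal sketch follows; the helper names are hypothetical and the parsing only assumes the output formats shown in this report.

```shell
# Hypothetical helpers: each reads command output on stdin and prints MAJOR.MINOR.
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

toolkit_cuda_version() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeeds when version $1 is lower than or equal to version $2.
version_le() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n | head -n 1)"
    [ "$lowest" = "$1" ]
}

# Run inside the container, where both tools are on PATH.
if command -v nvidia-smi >/dev/null 2>&1 && command -v nvcc >/dev/null 2>&1; then
    drv="$(nvidia-smi | driver_cuda_version)"
    tk="$(nvcc --version | toolkit_cuda_version)"
    if version_le "$tk" "$drv"; then
        echo "toolkit ${tk} is not newer than the driver-supported ${drv}"
    else
        echo "toolkit ${tk} is newer than the driver-supported ${drv}"
    fi
fi
```

For the first run in this report the two values are 11.6 (toolkit) and 11.4 (driver), so the toolkit is indeed newer than what the driver advertises.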

zeta@zpc:~$ sudo ubuntu-drivers autoinstall

The following NEW packages will be installed:
libnvidia-cfg1-530 libnvidia-common-530 libnvidia-compute-530 libnvidia-compute-530:i386 libnvidia-decode-530 libnvidia-decode-530:i386 libnvidia-encode-530
libnvidia-encode-530:i386 libnvidia-extra-530 libnvidia-fbc1-530 libnvidia-fbc1-530:i386 libnvidia-gl-530 libnvidia-gl-530:i386 linux-modules-nvidia-530-5.4.0-148-generic
linux-modules-nvidia-530-generic linux-objects-nvidia-530-5.4.0-148-generic linux-signatures-nvidia-5.4.0-148-generic nvidia-compute-utils-530 nvidia-driver-530
nvidia-kernel-common-530 nvidia-kernel-source-530 nvidia-utils-530 xserver-xorg-video-nvidia-530

After upgrading to version 530 of nvidia drivers:

zeta@zpc:~$ nvidia-smi
Fri May  5 02:21:07 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX TITAN X      Off| 00000000:01:00.0  On |                  N/A |
| 22%   47C    P8               18W / 250W|    226MiB / 12288MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      8057      G   /usr/lib/xorg/Xorg                           29MiB |
|    0   N/A  N/A     10501      G   /usr/lib/xorg/Xorg                           68MiB |
|    0   N/A  N/A     10666      G   /usr/bin/gnome-shell                         11MiB |
|    0   N/A  N/A     11561      G   /usr/lib/firefox/firefox                    101MiB |
+---------------------------------------------------------------------------------------+
zeta@zpc:~$ docker ps -q -l
1423b5c7567a
zeta@zpc:~$ docker start 1423b5c7567a
zeta@zpc:~$ docker attach 1423b5c7567a

root@1423b5c7567a:/workspace/FasterTransformer/build# ./bin/bert_example 32 12 32 12 64 0 0
[INFO] Device: NVIDIA GeForce GTX TITAN X 
Before loading model: free: 11.59 GB, total: 11.92 GB, used:  0.34 GB
[WARNING] gemm_config.in is not found; using default GEMM algo
terminate called after throwing an instance of 'std::runtime_error'
  what():  [FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160 

Aborted (core dumped)

Still the same error after upgrading the NVIDIA driver. Did I miss something?
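
One way to narrow this down, sketched with a hypothetical helper name: the error text already names a source file and line, so the CUDA runtime call that returned `operation not supported` can be read directly from the checkout inside the container. `error_location` splits the `path:line` token out of the message, and the rest prints a few lines of context around it if the file exists.

```shell
# Hypothetical helper: read an [FT][ERROR] message on stdin and print
# "<path> <line>" for the first absolute path:line token it contains.
error_location() {
    sed -n 's|.* \(/[^ :]*\):\([0-9][0-9]*\).*|\1 \2|p' | head -n 1
}

# Message copied from the run above.
msg='[FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160'

read -r file line <<EOF
$(printf '%s\n' "$msg" | error_location)
EOF

# Show five lines on either side of the reported line, if the file is present.
if [ -n "$file" ] && [ -f "$file" ]; then
    start=$((line - 5))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    sed -n "${start},$((line + 5))p" "$file"
fi
```

What that line contains is not shown in this thread. If it turns out to be a stream-ordered allocator call (for example `cudaDeviceGetDefaultMemPool`), the device attribute `cudaDevAttrMemoryPoolsSupported` would be the next thing to query; that is an assumption, not something confirmed here.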

@Z3TA Z3TA added the bug Something isn't working label May 5, 2023
@nicobasile

nicobasile commented May 5, 2023

I'm having the same or a similar issue and have tried lots of things. According to the NVIDIA CUDA/Triton docs, I should be able to run the versions I've tried, but I always hit the same error as above.

I don't have access to update the NVIDIA drivers, so I'm using FasterTransformer v5.0.
Using v100d-32gb's
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02 Driver Version: 510.85.02 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+

  what():  [FT][ERROR] CUDA runtime error: operation not supported /workspace/build/fastertransformer_backend/build/_deps/repo-ft-src/src/fastertransformer/utils/allocator.h:181

@Mhhhaster

The same issue.
root@3bf52cde87a7:/workspace/FasterTransformer/build# ./bin/swin_example 2 0 0 8 256 2
[FT][INFO] Device GRID T4-8C
terminate called after throwing an instance of 'std::runtime_error'
what(): [FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160

Aborted (core dumped)

Maybe I should upgrade the CUDA version.

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.102.04 Driver Version: 450.102.04 CUDA Version: 11.0 |

@pinecho

pinecho commented Apr 11, 2024

I've encountered exactly the same error, and I am using a TITAN X as well. Does anyone have a solution to share? Thanks.
