CUDA "driver shutting down" Error when built with shared libraries #6399

Open
3 tasks done
MimetrikDE opened this issue Sep 29, 2023 · 1 comment
Labels
bug Not a build issue, this is likely a bug.

MimetrikDE commented Sep 29, 2023

Checklist

Describe the issue

When Open3D is linked as a dynamic library, creating an open3d::core::Tensor object with the default constructor and then assigning a new instance with a device allocation to it results in the following error when the application exits:

[Open3D Error] (void __cdecl open3d::core::__OPEN3D_CUDA_CHECK(enum cudaError,const char *,const int)) C:\Users\User\Repositories\Open3D\cpp\open3d\core\CUDAUtils.cpp:289: C:\Users\User\Repositories\Open3D\cpp\open3d\core\CUDAUtils.cpp:114 CUDA runtime error: driver shutting down

This causes the application to hang for several seconds, likely due to a crash on exit.

This issue does not occur when the same code is run with Open3D built and linked as a static library.

Repository with minimal CMake project to reproduce

Steps to reproduce the bug

1. Build Open3D with shared libraries as described below
2. Create an open3d::core::Tensor object using the default constructor
3. Pass the Tensor object to a function by reference
4. Assign a new Tensor instance with an allocation on the device ("CUDA:0") to the passed object
5. Wait for program to exit
6. Application crashes before closing with the stated error message.

#include <iostream>

#include <open3d/Open3D.h>
#include <open3d/core/CUDAUtils.h>

// Replaces the tensor passed by reference with a new tensor allocated on the GPU.
void AssignNew(open3d::core::Tensor& testTensor) {
	testTensor = open3d::core::Tensor::Zeros({ 100,3 }, open3d::core::Dtype::Float32, open3d::core::Device("CUDA:0"));
}

int main() {
	std::cout << "Start" << std::endl;

	// Default-constructed tensor; nothing allocated on the GPU yet.
	open3d::core::Tensor testTensor;
	AssignNew(testTensor);

	// Called while testTensor is still alive.
	open3d::core::cuda::ReleaseCache();

	std::cout << "Finished program" << std::endl;

	return 0;
}

Error message

Start
Finished program
[Open3D Error] (void __cdecl open3d::core::__OPEN3D_CUDA_CHECK(enum cudaError,const char *,const int)) C:\Users\User\Repositories\Open3D\cpp\open3d\core\CUDAUtils.cpp:289: C:\Users\User\Repositories\Open3D\cpp\open3d\core\CUDAUtils.cpp:114 CUDA runtime error: driver shutting down

Expected behavior

For the given example code and CMake project, I expect the application to replace the empty Tensor instance passed into the function with a new Tensor instance holding a device allocation, and then exit cleanly.
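
One ordering detail in the reproducer may be worth ruling out. ReleaseCache() is called while testTensor is still alive, so if the allocator caches freed blocks, the tensor's block would only go back to the cache when main returns, after the cache has already been released. The sketch below is a toy model of that ordering in plain C++. ToyCache, ToyTensor and BlocksLeftForTeardown are hypothetical names that exist in neither Open3D nor CUDA, and whether Open3D's allocator actually behaves this way is an assumption that this report does not confirm.

```cpp
#include <cstddef>

// Toy model only: none of these names exist in Open3D or CUDA.
struct ToyCache {
    std::size_t live = 0;    // blocks currently owned by a ToyTensor
    std::size_t cached = 0;  // blocks returned by owners but not yet freed

    void Acquire() { ++live; }
    void Return() { --live; ++cached; }
    void ReleaseCache() { cached = 0; }  // frees everything cached so far
};

// RAII owner standing in for a tensor that holds a device allocation.
class ToyTensor {
public:
    explicit ToyTensor(ToyCache& cache) : cache_(&cache) { cache_->Acquire(); }
    ~ToyTensor() { cache_->Return(); }
    ToyTensor(const ToyTensor&) = delete;
    ToyTensor& operator=(const ToyTensor&) = delete;

private:
    ToyCache* cache_;
};

// Returns how many blocks are still cached at the point where main would
// return, i.e. how many frees would be left for process teardown.
std::size_t BlocksLeftForTeardown(bool release_after_owner_destroyed) {
    ToyCache cache;
    if (release_after_owner_destroyed) {
        {
            ToyTensor t(cache);
        }  // the block goes back to the cache here
        cache.ReleaseCache();
        return cache.cached;
    }
    {
        ToyTensor t(cache);
        cache.ReleaseCache();  // nothing is cached yet; t still owns its block
    }  // the block returns to the cache after the release
    return cache.cached;
}
```

In this model, BlocksLeftForTeardown(false) mirrors the order used in the reproducer and leaves one block for teardown, while BlocksLeftForTeardown(true) leaves none. If the same holds for the real allocator, wrapping testTensor in an inner scope that ends before the ReleaseCache() call would be a cheap experiment. I have not verified that it changes the behaviour with the shared-library build.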

Open3D, Python and System information

## Open3D Build Flags
- BUILD_CUDA_MODULE : True
- BUILD_SHARED_LIBS : True
- BUILD_WEBRTC : False
- STATIC_WINDOWS_RUNTIME : False
- BUILD_PYTHON_MODULE : False

Built using 
- Visual Studio 17 2022
- CMake 3.25.1
- Windows 10 64-bit x86
- C++17
- Open3D Version: 0.17.0, Commit: 5b6ef4b04b1a4184f12a2c1181ad2b7d2fe45248

System information
- i9 12900H
- RTX 3080 (Laptop)
- 16GB DDR4 RAM

Additional information

None

MimetrikDE added the bug label Sep 29, 2023

elias-Mimetrik commented Apr 12, 2024

The issue does not happen for me on Ubuntu 22.04 with CUDA 12.3 and the latest Open3D main as of April 11, 2024 (v0.18.0).

But I can reproduce the same error on Windows 11 with the following setup:
