This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Destruction of ThreadLocalStore<T> at program exit crashes the program #4219

Closed
mz24cn opened this issue Dec 13, 2016 · 2 comments

mz24cn commented Dec 13, 2016

Environment info

Operating System:
Windows 10
GPU is GTX850M, CUDA version: 8.0

Compiler:
VS 2015

Package used (Python/R/Scala/Julia):
MXNet.cpp

MXNet version:
Or if installed from source:
yes

Error Message:

The program throws an exception in main().

Minimum reproducible example

The following four lines of C++ code make the program crash:
#include <cuda_runtime.h>
#include <curand.h>
#include "mxnet-cpp/MxNetCpp.h"
using namespace mxnet::cpp;  // assumed: omitted in the original snippet, needed for the unqualified names below
int main(int argc, char** argv) {
  Context ctx_dev(DeviceType::kGPU, 0);
  NDArray rand(Shape(512, 28), ctx_dev, false);  // this 'NDArray' is the MXNet.cpp NDArray
  NDArray::SampleGaussian(0, 1, &rand);
  NDArray::WaitAll();
}

What have you tried to solve it?

1. NDArray::SampleGaussian uses cudaMallocPitch() to allocate GPU memory.
2. After the four lines have executed, the program prepares to exit and global static objects start to be destroyed.
3. The CUDA runtime also begins unloading, because the program is exiting.
4. The global static object dmlc::ThreadLocalStore<mxnet::resource::ResourceManagerImpl> is destroyed (thread_local.h, line 57).
5. The registered pointers are deleted (thread_local.h, line 52).
6. Eventually cudaFree() is called to release the memory allocated in step 1 (tensor_gpu-inl.h, line 54).
7. cudaFree() returns cudaErrorCudartUnloading because of step 3.
8. A dmlc::Error is thrown (base.h, line 219).
9. Windows reports that the program has crashed.

The destructor of dmlc::ThreadLocalStore<mxnet::resource::ResourceManagerImpl> is invoked by the OS at exit, and the order in which the MXNet static variables and the CUDA runtime are unloaded cannot be controlled. Perhaps the design should be changed slightly.
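
The ordering problem in the steps above can be modelled without CUDA or MXNet. The following is only a sketch: FakeRuntime and ResourceStore are hypothetical stand-ins invented for illustration, not MXNet or CUDA types. It shows that the same release call succeeds when it runs before the runtime unloads and throws when it runs afterwards.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the CUDA runtime. Once `unloaded` is set,
// Free() fails, mirroring cudaFree() returning cudaErrorCudartUnloading.
struct FakeRuntime {
  bool unloaded = false;
  int live = 0;  // number of outstanding allocations

  int Alloc() { return ++live; }
  bool Free(int /*handle*/) {
    if (unloaded) return false;
    --live;
    return true;
  }
};

// Hypothetical stand-in for the resource store: it owns device
// allocations and frees them when asked to release.
class ResourceStore {
 public:
  explicit ResourceStore(FakeRuntime* rt) : rt_(rt) {}
  void Acquire() { handles_.push_back(rt_->Alloc()); }
  // Throws if the runtime rejects a free, like the error check in step 8.
  void Release() {
    for (int h : handles_) {
      if (!rt_->Free(h)) throw std::runtime_error("runtime is unloading");
    }
    handles_.clear();
  }

 private:
  FakeRuntime* rt_;
  std::vector<int> handles_;
};

// Step 3 before step 6: the runtime unloads first, so the release throws.
bool ReleaseAfterUnloadThrows() {
  FakeRuntime rt;
  ResourceStore store(&rt);
  store.Acquire();
  rt.unloaded = true;
  try {
    store.Release();
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}

// Reverse order: the release runs while the runtime is still loaded.
bool ReleaseBeforeUnloadSucceeds() {
  FakeRuntime rt;
  ResourceStore store(&rt);
  store.Acquire();
  assert(rt.live == 1);
  store.Release();
  rt.unloaded = true;
  return rt.live == 0;
}
```

In the real program neither order is chosen by the application, which is why an explicit release point before exit is needed.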

mz24cn commented Dec 13, 2016

Removing the last line, NDArray::WaitAll();, may avoid the exception, because cudaFree() is then called before CUDA unloads. Replacing the last line with 'std::this_thread::sleep_for(std::chrono::milliseconds(1000));' still triggers the exception.

mz24cn commented Dec 13, 2016

I solved the issue. MXNotifyShutdown() should be called once all computation has finished.
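
One way to make sure the shutdown call is never skipped, for example on an early return, is a small scope guard. This is a sketch under assumptions: ShutdownGuard is a hypothetical helper, not part of MXNet, and FakeNotifyShutdown stands in for MXNotifyShutdown() so that the example compiles without MXNet installed.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not part of MXNet): runs a callable when the
// enclosing scope ends, which for a local in main() is before the
// static destructors run.
template <typename F>
class ShutdownGuard {
 public:
  explicit ShutdownGuard(F f) : f_(std::move(f)) {}
  ~ShutdownGuard() { f_(); }
  ShutdownGuard(const ShutdownGuard&) = delete;
  ShutdownGuard& operator=(const ShutdownGuard&) = delete;

 private:
  F f_;
};

// Stand-in for MXNotifyShutdown(); it only counts how often it is called.
static int g_shutdown_calls = 0;
inline void FakeNotifyShutdown() { ++g_shutdown_calls; }

// Models the body of main(): the guard fires when this function returns.
void RunComputation() {
  ShutdownGuard<void (*)()> guard(&FakeNotifyShutdown);
  // ... create NDArrays, run the computation, call NDArray::WaitAll() ...
}

// Checks that one run of the computation triggers exactly one shutdown call.
void Demo() {
  const int calls_before = g_shutdown_calls;
  RunComputation();
  assert(g_shutdown_calls == calls_before + 1);
}
```

In a real program the guard would wrap a call to MXNotifyShutdown() instead of the stand-in. Whether it should fire before or after local NDArray objects are destroyed is not settled by this thread, so calling MXNotifyShutdown() directly as the last statement of main() remains the simplest form of the fix.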

mz24cn closed this as completed Dec 13, 2016