"cublas status: not initialized" error during multi-threaded GPU inference with the C-API library #5669

QingshuChen · 2017-11-15T07:41:48Z

I am using the CUDA 8.0 build of libpaddle_capi_shared.so downloaded from the website, with a fully connected network. The forward pass fails with the following error:

F1115 15:36:56.346094 17370 hl_cuda_cublas.cc:307] Check failed: stat == CUBLAS_STATUS_SUCCESS (1 vs. 0) [cublas status]: not initialized
*** Check failure stack trace: ***
    @     0x7f8ddce75bcd  google::LogMessage::Fail()
    @     0x7f8ddce7967c  google::LogMessage::SendToLog()
    @     0x7f8ddce756f3  google::LogMessage::Flush()
    @     0x7f8ddce7ab8e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f8ddd265147  hl_matrix_mul()
    @     0x7f8ddce7ab8e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f8ddd265147  hl_matrix_mul()
    @     0x7f8ddd0c3987  paddle::GpuMatrix::mul()
    @     0x7f8ddd0c3987  paddle::GpuMatrix::mul()
    @     0x7f8ddd0c3fe1  paddle::GpuMatrix::mul()
    @     0x7f8ddcfc5ffb  paddle::FullyConnectedLayer::forward()
    @     0x7f8ddd0c3fe1  paddle::GpuMatrix::mul()
    @     0x7f8ddcfc5ffb  paddle::FullyConnectedLayer::forward()
    @     0x7f8ddcedf95d  paddle::NeuralNetwork::forward()
    @     0x7f8ddce6f6b6  paddle_gradient_machine_forward
    @     0x7f8ddcedf95d  paddle::NeuralNetwork::forward()
    @           0xc52da7  recarch::paddle::PaddlePredictor::predict()
    @     0x7f8ddce6f6b6  paddle_gradient_machine_forward
    @           0x7e20f1  rec::predictor::CtrDnnEngineV4::calc_quality_callback()
    @           0xc52da7  recarch::paddle::PaddlePredictor::predict()
    @           0x7998a1  rec::predictor::BaseEngine::handle()
    @           0x79e646  rec::predictor::PredictTask::run()
    @           0x7e20f1  rec::predictor::CtrDnnEngineV4::calc_quality_callback()
    @           0x7a63cd  rec::predictor::WorkerThread::run()
    @           0x7998a1  rec::predictor::BaseEngine::handle()
    @           0x79e646  rec::predictor::PredictTask::run()
    @           0xf4c6aa  thread_proxy
    @           0x7a63cd  rec::predictor::WorkerThread::run()
    @           0xf4c6aa  thread_proxy
    @     0x7f8de08021c3  start_thread
    @     0x7f8de08021c3  start_thread
    @     0x7f8ddc0da12d  __clone
    @     0x7f8ddc0da12d  __clone
    @              (nil)  (unknown)

What causes this, and how can it be fixed?

The failing call in hl_matrix_mul is:
    stat = CUBLAS_GEMM(t_resource.handle,
                       CUBLAS_OP_N,
                       CUBLAS_OP_N,
                       dimN,
                       dimM,
                       dimK,
                       &alpha,
                       B_d,
                       ldb,
                       A_d,
                       lda,
                       &beta,
                       C_d,
                       ldc);
Here t_resource.handle is a null pointer, because t_resource is a thread_local variable that has not been initialized in the calling thread.
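To illustrate why a thread_local handle can be null in a worker thread even after the main thread has set it up, here is a minimal C++ sketch. It does not use Paddle or cuBLAS: `ThreadResource`, `init_current_thread`, `handle_ready`, and `handle_ready_in_new_thread` are hypothetical names made up for this example, and a plain pointer stands in for the cuBLAS handle.

```cpp
#include <thread>

// Hypothetical stand-in for a per-thread GPU resource struct (not Paddle's real type).
struct ThreadResource {
  void* handle = nullptr;  // stands in for a cuBLAS handle
};

// Every thread gets its own copy, default-initialized with a null handle.
thread_local ThreadResource t_resource;

// Dummy object whose address plays the role of a valid handle.
static int g_fake_handle = 0;

// Hypothetical initializer: sets the handle for the calling thread only.
void init_current_thread() { t_resource.handle = &g_fake_handle; }

// Returns true if the calling thread's copy of the handle has been set.
bool handle_ready() { return t_resource.handle != nullptr; }

// Checks the handle from a freshly created thread and reports what it saw.
bool handle_ready_in_new_thread() {
  bool ready = false;
  std::thread worker([&ready] { ready = handle_ready(); });
  worker.join();
  return ready;
}
```

Calling `init_current_thread()` on the main thread makes `handle_ready()` true there, while `handle_ready_in_new_thread()` still returns false, which matches the symptom described above.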


Xreki · 2017-11-16T01:59:14Z

How are you calling paddle_init?

QingshuChen · 2017-11-16T02:36:26Z

@Xreki
char* command[] = {"--use_gpu=True", "--gpu_id=0"};
paddle_init(2, command);

Xreki · 2017-11-16T05:57:30Z

The paddle_init call looks fine. This is probably a multi-threading bug: the GPU resources of threads other than the main thread are not initialized.

hedaoyuan · 2017-11-17T06:03:26Z

We need to add a paddle_init_cuda interface to the inference API.
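The idea of a per-thread initialization call can be sketched as follows. This is only an illustration under stated assumptions, not Paddle's implementation: `init_thread_resources`, `fake_forward`, and `run_workers` are hypothetical names, and a plain pointer stands in for the GPU handle. Each worker initializes its own thread-local state before doing any work.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-thread GPU resource struct (not Paddle's real type).
struct ThreadResource {
  void* handle = nullptr;  // stands in for a cuBLAS handle
};

thread_local ThreadResource t_resource;
static int g_fake_handle = 0;  // its address plays the role of a valid handle

// Hypothetical per-thread initializer; safe to call more than once per thread.
void init_thread_resources() {
  if (t_resource.handle == nullptr) t_resource.handle = &g_fake_handle;
}

// Stands in for a forward pass: succeeds only if this thread's handle is set.
bool fake_forward() { return t_resource.handle != nullptr; }

// Starts n workers and returns how many of them completed fake_forward().
// When init_per_thread is true, each worker initializes its own resources first.
int run_workers(int n, bool init_per_thread) {
  std::atomic<int> ok{0};
  std::vector<std::thread> workers;
  for (int i = 0; i < n; ++i) {
    workers.emplace_back([&ok, init_per_thread] {
      if (init_per_thread) init_thread_resources();
      if (fake_forward()) ++ok;
    });
  }
  for (auto& w : workers) w.join();
  return ok.load();
}
```

In this sketch `run_workers(4, true)` succeeds in all four workers, while `run_workers(4, false)` succeeds in none, because no worker ever sets its own handle.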

Xreki · 2017-11-20T05:35:36Z

@QingshuChen I created PR #5773 to fix it. Please help check it. Thanks!

guoshengCS added the User (used to mark user questions) label Nov 15, 2017

QingshuChen changed the title ~~"cublas status: not initialized" error during inference with the C-API library~~ "cublas status: not initialized" error during multi-threaded GPU inference with the C-API library Nov 16, 2017

Xreki mentioned this issue Nov 20, 2017

Add a c-api interface to initialize the thread environment of Paddle #5773

Merged

Xreki closed this as completed in #5773 Dec 8, 2017

jacquesqiao mentioned this issue Jun 21, 2018

Error "Check failed: stat == CUBLAS_STATUS_SUCCESS (1 vs. 0) [cublas status]: not initialized" when running prediction on GPU #11624

Closed
