Program execution fails in eval mode #522

trundleyrg · 2020-08-13T02:26:21Z

Environment:
Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-96-generic x86_64)
GPU: TITAN RTX
device: 0, CUDA Capability: 75, Driver API Version: 10.1, Runtime API Version: 10.0
device: 0, cuDNN Version: 7.6.

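I launched text detection training. The training phase runs normally, but an error is raised when the evaluation step runs.

The error output is as follows: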
/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py:789: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "tools/train.py", line 121, in <module>
main()
File "tools/train.py", line 98, in main
program.train_eval_rec_run(config, exe, train_info_dict, eval_info_dict)
File "/data/val/PaddleOCR/tools/program.py", line 354, in train_eval_rec_run
metrics = eval_rec_run(exe, config, eval_info_dict, "eval")
File "/data/val/PaddleOCR/tools/eval_utils/eval_rec_utils.py", line 67, in eval_rec_run
return_numpy=False)
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py", line 790, in run
six.reraise(*sys.exc_info())
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/six.py", line 686, in reraise
raise value
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py", line 785, in run
use_program_cache=use_program_cache)
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py", line 838, in _run_impl
use_program_cache=use_program_cache)
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py", line 912, in _run_program
fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:

C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 paddle::operators::CUDNNConvOpKernel::Compute(paddle::framework::ExecutionContext const&) const
3 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::CUDNNConvOpKernel, paddle::operators::CUDNNConvOpKernel, paddle::operators::CUDNNConvOpKernelpaddle::platform::float16 >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
6 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
7 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
8 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator > const&, bool, bool)

Python Call Stacks (More useful to users):

File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/layers/nn.py", line 1403, in conv2d
"data_format": data_format,
File "/data/val/PaddleOCR/ppocr/modeling/backbones/rec_mobilenet_v3.py", line 157, in conv_bn_layer
bias_attr=False)
File "/data/val/PaddleOCR/ppocr/modeling/backbones/rec_mobilenet_v3.py", line 99, in __call__
name='conv1')
File "/data/val/PaddleOCR/ppocr/modeling/architectures/rec_model.py", line 111, in __call__
conv_feas = self.backbone(inputs)
File "/data/val/PaddleOCR/tools/program.py", line 176, in build
dataloader, outputs = model(mode=mode)
File "tools/train.py", line 58, in main
config, eval_program, startup_program, mode='eval')
File "tools/train.py", line 121, in <module>
main()

Error Message Summary:

Error: An error occurred here. There is no accurate error hint for this error yet. We are continuously in the process of increasing hint for this kind of error check. It would be helpful if you could inform us of how this conversion went by opening a github issue. And we will resolve it with high priority.

New issue link: https://github.com/PaddlePaddle/Paddle/issues/new
Recommended issue content: all error stack information
[Hint: CUDNN_STATUS_EXECUTION_FAILED] at (/paddle/paddle/fluid/operators/conv_cudnn_op.cu:286)
[operator < conv2d > error]
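
For anyone triaging a similar trace: the actionable part is the last two lines, which name the cuDNN status and the failing operator. The snippet below is a minimal sketch for pulling those fields out of a saved log. The function name `summarize_paddle_error` and the regular expressions are illustrative assumptions based only on the format of the lines shown above, not a Paddle API.

```python
import re

# Patterns inferred from the two summary lines in the log above (assumption:
# other Paddle versions may format these lines differently).
HINT_RE = re.compile(
    r"\[Hint: (?P<hint>[^\]]+)\] at \((?P<file>[^:()]+):(?P<line>\d+)\)"
)
OP_RE = re.compile(r"\[operator < (?P<op>\w+) > error\]")


def summarize_paddle_error(log_text):
    """Extract the hint, its source location, and the failing operator.

    Any field that cannot be found in the log is returned as None.
    """
    hint = HINT_RE.search(log_text)
    op = OP_RE.search(log_text)
    return {
        "hint": hint.group("hint") if hint else None,
        "source": "{}:{}".format(hint.group("file"), hint.group("line"))
        if hint
        else None,
        "operator": op.group("op") if op else None,
    }


if __name__ == "__main__":
    sample = (
        "[Hint: CUDNN_STATUS_EXECUTION_FAILED] at "
        "(/paddle/paddle/fluid/operators/conv_cudnn_op.cu:286)\n"
        "[operator < conv2d > error]\n"
    )
    print(summarize_paddle_error(sample))
```

Here the summary points at a cuDNN execution failure inside `conv2d`, which is raised by the GPU library rather than by the PaddleOCR Python code.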

dyning · 2020-08-13T05:28:37Z

Are you using the Docker environment we provide, or something else? This looks like a problem with the CUDA or cuDNN installation.
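
One way to test the CUDA/cuDNN hypothesis outside of PaddleOCR is to run Paddle's built-in installation check in the same Python environment. The sketch below assumes a Paddle 1.x install where `paddle.fluid.install_check.run_check()` is available. The wrapper name `check_paddle_install` and its return strings are hypothetical, added here only so the result is easy to read.

```python
import importlib


def check_paddle_install():
    """Run Paddle's own install check and report the outcome as a string."""
    try:
        # Imported lazily so the script still runs where Paddle is absent.
        fluid = importlib.import_module("paddle.fluid")
    except ImportError:
        return "paddle is not installed"
    try:
        # Builds and runs a tiny network on the available device(s).
        fluid.install_check.run_check()
    except Exception as exc:  # e.g. a cuDNN failure surfaced as EnforceNotMet
        return "install check failed: {}".format(exc)
    return "install check passed"


if __name__ == "__main__":
    print(check_paddle_install())
```

If this check also fails on the GPU, the problem most likely lies in the driver, CUDA, or cuDNN setup rather than in the training configuration.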

trundleyrg · 2020-08-13T05:38:05Z

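> Are you using the Docker environment we provide, or something else? This looks like a problem with the CUDA or cuDNN installation.

I set up the environment myself. I'll try Docker, then.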

trundleyrg · 2020-08-14T08:44:28Z

It runs fine in the Docker environment.

trundleyrg closed this as completed Aug 14, 2020

an1018 pushed a commit to an1018/PaddleOCR that referenced this issue Aug 17, 2022

fix inference vis.py py2 encoding (PaddlePaddle#522)

003f265
