Program fails to run in eval mode #522

Closed
trundleyrg opened this issue Aug 13, 2020 · 3 comments

Comments

trundleyrg commented Aug 13, 2020

Environment:
Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-96-generic x86_64)
GPU: TITAN RTX
device: 0, CUDA Capability: 75, Driver API Version: 10.1, Runtime API Version: 10.0
device: 0, cuDNN Version: 7.6.
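
As a rough sanity check on the two device lines above, the following sketch parses them and compares the driver and runtime versions. It relies on the general CUDA rule that the driver must be at least as new as the runtime. The helper names (`parse_version`, `parse_device_info`, `driver_supports_runtime`) are hypothetical and not part of Paddle or PaddleOCR. A passing check only rules out one kind of mismatch; it does not show that CUDA or cuDNN is installed correctly.

```python
import re

# The two device lines reported in this issue, copied verbatim.
DEVICE_LOG = """\
device: 0, CUDA Capability: 75, Driver API Version: 10.1, Runtime API Version: 10.0
device: 0, cuDNN Version: 7.6.
"""


def parse_version(text):
    """Turn '10.1' into (10, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def parse_device_info(log):
    """Extract driver, runtime and cuDNN versions from the device lines."""
    patterns = {
        "driver": r"Driver API Version:\s*(\d+(?:\.\d+)*)",
        "runtime": r"Runtime API Version:\s*(\d+(?:\.\d+)*)",
        "cudnn": r"cuDNN Version:\s*(\d+(?:\.\d+)*)",
    }
    info = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log)
        info[key] = parse_version(match.group(1)) if match else None
    return info


def driver_supports_runtime(info):
    """Return True when the driver version is at least the runtime version."""
    if info["driver"] is None or info["runtime"] is None:
        return None  # not enough information in the log
    return info["driver"] >= info["runtime"]


info = parse_device_info(DEVICE_LOG)
print(info)
print("driver >= runtime:", driver_supports_runtime(info))
```

For the values reported here the driver (10.1) is newer than the runtime (10.0), so this particular check passes and the cause has to be looked for elsewhere in the installation.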

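I ran the text detection training. The training phase itself runs normally, but an error is raised when the evaluation step runs.

The error output is as follows: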
/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py:789: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "tools/train.py", line 121, in <module>
main()
File "tools/train.py", line 98, in main
program.train_eval_rec_run(config, exe, train_info_dict, eval_info_dict)
File "/data/val/PaddleOCR/tools/program.py", line 354, in train_eval_rec_run
metrics = eval_rec_run(exe, config, eval_info_dict, "eval")
File "/data/val/PaddleOCR/tools/eval_utils/eval_rec_utils.py", line 67, in eval_rec_run
return_numpy=False)
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py", line 790, in run
six.reraise(*sys.exc_info())
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/six.py", line 686, in reraise
raise value
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py", line 785, in run
use_program_cache=use_program_cache)
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py", line 838, in _run_impl
use_program_cache=use_program_cache)
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/executor.py", line 912, in _run_program
fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 paddle::operators::CUDNNConvOpKernel::Compute(paddle::framework::ExecutionContext const&) const
3 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::CUDNNConvOpKernel, paddle::operators::CUDNNConvOpKernel, paddle::operators::CUDNNConvOpKernelpaddle::platform::float16 >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
6 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
7 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
8 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator > const&, bool, bool)


Python Call Stacks (More useful to users):

File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/dell/anaconda2/envs/chineseocr/lib/python3.6/site-packages/paddle/fluid/layers/nn.py", line 1403, in conv2d
"data_format": data_format,
File "/data/val/PaddleOCR/ppocr/modeling/backbones/rec_mobilenet_v3.py", line 157, in conv_bn_layer
bias_attr=False)
File "/data/val/PaddleOCR/ppocr/modeling/backbones/rec_mobilenet_v3.py", line 99, in __call__
name='conv1')
File "/data/val/PaddleOCR/ppocr/modeling/architectures/rec_model.py", line 111, in __call__
conv_feas = self.backbone(inputs)
File "/data/val/PaddleOCR/tools/program.py", line 176, in build
dataloader, outputs = model(mode=mode)
File "tools/train.py", line 58, in main
config, eval_program, startup_program, mode='eval')
File "tools/train.py", line 121, in <module>
main()


Error Message Summary:

Error: An error occurred here. There is no accurate error hint for this error yet. We are continuously in the process of increasing hint for this kind of error check. It would be helpful if you could inform us of how this conversion went by opening a github issue. And we will resolve it with high priority.

  • New issue link: https://github.com/PaddlePaddle/Paddle/issues/new
  • Recommended issue content: all error stack information
    [Hint: CUDNN_STATUS_EXECUTION_FAILED] at (/paddle/paddle/fluid/operators/conv_cudnn_op.cu:286)
    [operator < conv2d > error]
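
The useful part of this long report is the last two lines: the cuDNN status, the source location, and the failing operator. The sketch below pulls those fields out of the summary text with regular expressions. The function name `summarize_enforce_error` and the exact patterns are assumptions fitted to the message format shown in this issue, not a Paddle API, and other Paddle versions may format the summary differently.

```python
import re

# Tail of the error summary from this issue, copied verbatim.
ERROR_SUMMARY = """\
[Hint: CUDNN_STATUS_EXECUTION_FAILED] at (/paddle/paddle/fluid/operators/conv_cudnn_op.cu:286)
[operator < conv2d > error]
"""


def summarize_enforce_error(text):
    """Pick out the status hint, source location and operator name."""
    hint = re.search(r"\[Hint:\s*([^\]]+)\]\s*at\s*\(([^():]+):(\d+)\)", text)
    op = re.search(r"\[operator\s*<\s*(\w+)\s*>\s*error\]", text)
    return {
        "status": hint.group(1).strip() if hint else None,
        "source_file": hint.group(2) if hint else None,
        "source_line": int(hint.group(3)) if hint else None,
        "operator": op.group(1) if op else None,
    }


summary = summarize_enforce_error(ERROR_SUMMARY)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Here the extracted fields point at the cuDNN convolution kernel failing at execution time inside `conv2d`, which is the first convolution of the recognition backbone according to the Python call stack.
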
Collaborator

dyning commented Aug 13, 2020

Are you using the Docker environment we provide, or something else? This looks like a problem with the CUDA or cuDNN installation.

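@trundleyrg
Author

> Are you using the Docker environment we provide, or something else? This looks like a problem with the CUDA or cuDNN installation.

I set up the environment myself. I'll try Docker, then.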

@trundleyrg
Author

It runs correctly in the Docker environment.

an1018 pushed a commit to an1018/PaddleOCR that referenced this issue Aug 17, 2022