Recognition model training error #627

Closed
JawerZ opened this issue Aug 27, 2020 · 3 comments
@JawerZ

JawerZ commented Aug 27, 2020

`aistudio@jupyter-363593-683413:~/work/PaddleOCR-develop$ python tools/train.py -c configs/rec/rec_chinese_common_train.yml
2020-08-27 22:01:04,734-INFO: {'Global': {'debug': False, 'algorithm': 'CRNN', 'use_gpu': True, 'epoch_num': 3000, 'log_smooth_window': 20, 'print_batch_step': 10, 'save_model_dir': './output/rec_CRNN', 'save_epoch_step': 3, 'eval_batch_step': 2000, 'train_batch_size_per_card': 128, 'test_batch_size_per_card': 128, 'image_shape': [3, 32, 320], 'max_text_length': 25, 'character_type': 'ch', 'character_dict_path': './ppocr/utils/new.txt', 'loss_type': 'ctc', 'distort': False, 'use_space_char': True, 'reader_yml': './configs/rec/rec_chinese_reader.yml', 'pretrain_weights': None, 'checkpoints': './pretrain_models/rec_r34_vd_none_bilstm_ctc/best_accuracy', 'save_inference_dir': './inference/rec', 'infer_img': None}, 'Architecture': {'function': 'ppocr.modeling.architectures.rec_model,RecModel'}, 'Backbone': {'function': 'ppocr.modeling.backbones.rec_resnet_vd,ResNet', 'layers': 34}, 'Head': {'function': 'ppocr.modeling.heads.rec_ctc_head,CTCPredict', 'encoder_type': 'rnn', 'SeqRNN': {'hidden_size': 256}}, 'Loss': {'function': 'ppocr.modeling.losses.rec_ctc_loss,CTCLoss'}, 'Optimizer': {'function': 'ppocr.optimizer,AdamDecay', 'base_lr': 0.0005, 'beta1': 0.9, 'beta2': 0.999}, 'TrainReader': {'reader_function': 'ppocr.data.rec.dataset_traversal,SimpleReader', 'num_workers': 8, 'img_set_dir': './train_data', 'label_file_path': './train_data/train_rec_label.txt'}, 'EvalReader': {'reader_function': 'ppocr.data.rec.dataset_traversal,SimpleReader', 'img_set_dir': './train_data', 'label_file_path': './train_data/test_rec_label.txt'}, 'TestReader': {'reader_function': 'ppocr.data.rec.dataset_traversal,SimpleReader'}}
2020-08-27 22:01:05,075-INFO: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000000] in Optimizer will not take effect, and it will only be applied to other Parameters!
2020-08-27 22:01:05,886-INFO: places would be ommited when DataLoader is not iterable
W0827 22:01:05.933650 20571 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0
W0827 22:01:05.938043 20571 device_context.cc:260] device: 0, cuDNN Version: 7.3.
2020-08-27 22:01:11,382-INFO: Finish initing model from ./pretrain_models/rec_r34_vd_none_bilstm_ctc/best_accuracy
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "tools/train.py", line 123, in <module>
main()
File "tools/train.py", line 100, in main
program.train_eval_rec_run(config, exe, train_info_dict, eval_info_dict)
File "/home/aistudio/work/PaddleOCR-develop/tools/program.py", line 336, in train_eval_rec_run
return_numpy=False)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1071, in run
six.reraise(*sys.exc_info())
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1066, in run
return_merged=return_merged)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1167, in _run_impl
return_merged=return_merged)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 879, in _run_parallel
tensors = exe.run(fetch_var_names, return_merged)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:

C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 paddle::operators::WarpCTCOp::InferShape(paddle::framework::InferShapeContext*) const
3 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
5 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
6 paddle::framework::details::ComputationOpHandle::RunImpl()
7 paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
8 paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp(paddle::framework::details::OpHandleBase*, std::shared_ptr<paddle::framework::BlockingQueue > const&, unsigned long*)
9 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
10 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
11 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const

Python Call Stacks (More useful to users):

File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2610, in append_op
attrs=kwargs.get("attrs", None))
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/loss.py", line 628, in warpctc
'norm_by_times': norm_by_times,
File "/home/aistudio/work/PaddleOCR-develop/ppocr/modeling/losses/rec_ctc_loss.py", line 34, in __call__
input=predict, label=label, blank=self.char_num, norm_by_times=True)
File "/home/aistudio/work/PaddleOCR-develop/ppocr/modeling/architectures/rec_model.py", line 197, in __call__
loss = self.loss(predicts, labels)
File "/home/aistudio/work/PaddleOCR-develop/tools/program.py", line 170, in build
dataloader, outputs = model(mode=mode)
File "tools/train.py", line 50, in main
config, train_program, startup_program, mode='train')
File "tools/train.py", line 123, in <module>
main()

Error Message Summary:

Error: The value of Attr(blank) should be in interval [0, 37). at (/paddle/paddle/fluid/operators/warpctc_op.cc:52)
[operator < warpctc > error]`
What is causing this error?

@tink2123
Collaborator

'checkpoints': './pretrain_models/rec_r34_vd_none_bilstm_ctc/best_accuracy',

If you are training the Chinese model, please load the Chinese pretrained model: download link
image
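The error summary in the log says `Attr(blank)` must lie in `[0, 37)`, and the traceback shows the loss being called with `blank=self.char_num`. The following is a minimal sketch of that range check, not PaddleOCR's own code. The helper names `count_dict_chars` and `blank_in_range` are hypothetical, and the way the character count is derived from the dictionary file (one character per line, plus an optional space) is an assumption made for illustration.

```python
def count_dict_chars(dict_lines, use_space_char=True):
    """Count entries in a character dictionary given as one character per line.

    Assumption: this mirrors how a character count might be derived from a
    dictionary file; it is not taken from the PaddleOCR source.
    """
    chars = [line.rstrip("\r\n") for line in dict_lines]
    chars = [c for c in chars if c != ""]
    if use_space_char and " " not in chars:
        chars.append(" ")
    return len(chars)


def blank_in_range(blank, num_classes):
    """Return True when the CTC blank index satisfies 0 <= blank < num_classes."""
    return 0 <= blank < num_classes


# Hypothetical dictionaries: 36 alphanumeric characters, then the same set
# extended with two extra characters.
alnum_dict = list("0123456789abcdefghijklmnopqrstuvwxyz")
custom_dict = alnum_dict + ["中", "文"]

NUM_CLASSES_FROM_ERROR = 37  # upper bound reported in the error message

alnum_blank = count_dict_chars(alnum_dict, use_space_char=False)
custom_blank = count_dict_chars(custom_dict, use_space_char=False)

alnum_ok = blank_in_range(alnum_blank, NUM_CLASSES_FROM_ERROR)
custom_ok = blank_in_range(custom_blank, NUM_CLASSES_FROM_ERROR)
```

Under these assumptions, a dictionary with more than 36 entries yields a blank index outside `[0, 37)`, which is consistent with the error when a custom dictionary is combined with weights trained on a smaller character set.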

@JawerZ
Author

JawerZ commented Aug 28, 2020

> 'checkpoints': './pretrain_models/rec_r34_vd_none_bilstm_ctc/best_accuracy',
>
> If you are training the Chinese model, please load the Chinese pretrained model: download link
> image

image
What is the problem here?

@tink2123
Collaborator

tink2123 commented Sep 1, 2020

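Are you loading the pretrained model through `checkpoints`? If so, please load the parameters with `pretrain_weights` instead. Otherwise, if you have modified the dictionary, the number of characters will not match.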
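As a rough illustration of this suggestion, the sketch below moves the path from `checkpoints` to `pretrain_weights` in a config dictionary shaped like the `Global` section printed in the log. The function name `use_pretrain_weights` is hypothetical and is not part of PaddleOCR; in practice the same change would be made by editing the two keys in the YAML config file. The path is copied from the log as-is, so it still needs to point at a model that matches the dictionary in use.

```python
import copy


def use_pretrain_weights(config):
    """Return a copy of config with the checkpoint path moved to pretrain_weights.

    Hypothetical helper for illustration only: it edits the two keys named in
    the thread and leaves every other setting untouched.
    """
    cfg = copy.deepcopy(config)
    global_cfg = cfg["Global"]
    if global_cfg.get("checkpoints"):
        global_cfg["pretrain_weights"] = global_cfg["checkpoints"]
        global_cfg["checkpoints"] = None
    return cfg


# Subset of the Global section shown in the training log.
original = {
    "Global": {
        "character_dict_path": "./ppocr/utils/new.txt",
        "pretrain_weights": None,
        "checkpoints": "./pretrain_models/rec_r34_vd_none_bilstm_ctc/best_accuracy",
    }
}

fixed = use_pretrain_weights(original)
```

The input dictionary is copied rather than modified in place so the original settings remain available for comparison.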
