PaddleOCR installation error #9426

Closed
DazzlingGalaxy opened this issue Mar 14, 2023 · 8 comments

@DazzlingGalaxy

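System: CentOS 7.9.2009 x64, a virtual environment newly created with conda, Python 3.8.16, with the Aliyun mirror as the pip source.

I installed with the following commands: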

pip install paddlepaddle-gpu==2.4.2.post112 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
pip install paddleocr

paddlepaddle installed successfully, but installing paddleocr reported an error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
paddlepaddle-gpu 2.4.2.post112 requires protobuf<=3.20.0,>=3.1.0, but you have protobuf 3.20.3 which is incompatible.

At this point the installed versions are:
paddleocr==2.6.1.3
paddlepaddle-gpu==2.4.2.post112

Then I uninstalled protobuf 3.20.3 and installed 3.20.0, which reported:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
onnx 1.13.1 requires protobuf<4,>=3.20.2, but you have protobuf 3.20.0 which is incompatible.
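
The two conflicts quoted so far cannot both be satisfied by any single protobuf version. A minimal sketch of that check follows. The helper names are hypothetical, only the two version ranges come from the pip output above, and the dotted-integer parsing is a simplification of pip's real version rules.

```python
# Hypothetical helpers; the version ranges are copied from the pip messages above.

def parse(version: str) -> tuple:
    """Turn '3.20.3' into (3, 20, 3) so versions compare numerically.

    Simplification: handles plain dotted integers only, not full PEP 440.
    """
    return tuple(int(part) for part in version.split("."))


def paddle_ok(version: str) -> bool:
    # paddlepaddle-gpu 2.4.2.post112 requires protobuf<=3.20.0,>=3.1.0
    return parse("3.1.0") <= parse(version) <= parse("3.20.0")


def onnx_1_13_1_ok(version: str) -> bool:
    # onnx 1.13.1 requires protobuf<4,>=3.20.2
    return parse("3.20.2") <= parse(version) < parse("4")


# Print, for a few candidate versions, which requirement each one satisfies.
for candidate in ["3.19.6", "3.20.0", "3.20.2", "3.20.3"]:
    print(candidate, "paddle:", paddle_ok(candidate), "onnx 1.13.1:", onnx_1_13_1_ok(candidate))
```

Because the first range ends at 3.20.0 and the second starts at 3.20.2, changing only the protobuf version moves the conflict from one package to the other.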

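I then uninstalled both onnx and protobuf and installed protobuf==3.20.0 (released April 2, 2022) and onnx==1.12.0 (released June 18, 2022; the previous version, 1.11.0, was released February 18, 2022). Installing onnx automatically pulls in protobuf and ends up with a newer protobuf version, so I installed protobuf manually first and onnx second. With both installed, the first test ran OCR successfully (several files were downloaded automatically on the first run), but the second test failed with: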

ERROR in app: Exception on /bankticketocr [POST]
Traceback (most recent call last):
  File "/root/miniconda3/envs/ocr_gpu/lib/python3.8/site-packages/flask/app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
  File "/root/miniconda3/envs/ocr_gpu/lib/python3.8/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/root/miniconda3/envs/ocr_gpu/lib/python3.8/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/root/miniconda3/envs/ocr_gpu/lib/python3.8/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "sfz_ocr_test.py", line 799, in bankticket
    return bank_ticket_ocr(img_file_path=img_file_path, shoukuan_name=shoukuan_name, shoukuan_bank_number=shoukuan_bank_number, shoukuan_bank_name=shoukuan_bank_name)
  File "sfz_ocr_test.py", line 709, in bank_ticket_ocr
    ocr = PaddleOCR(enable_mkldnn=True, use_tensorrt=True, use_angle_cls=False, lang="ch")
  File "/root/miniconda3/envs/ocr_gpu/lib/python3.8/site-packages/paddleocr/paddleocr.py", line 523, in __init__
    super().__init__(params)
  File "/root/miniconda3/envs/ocr_gpu/lib/python3.8/site-packages/paddleocr/tools/infer/predict_system.py", line 46, in __init__
    self.text_detector = predict_det.TextDetector(args)
  File "/root/miniconda3/envs/ocr_gpu/lib/python3.8/site-packages/paddleocr/tools/infer/predict_det.py", line 141, in __init__
    self.predictor, self.input_tensor, self.output_tensors, self.config = utility.create_predictor(
  File "/root/miniconda3/envs/ocr_gpu/lib/python3.8/site-packages/paddleocr/tools/infer/utility.py", line 277, in create_predictor
    predictor = inference.create_predictor(config)
RuntimeError: (NotFound) TensorRT is needed, but TensorRT dynamic library is not found.
  [Hint: dso_handle should not be null.] (at /paddle/paddle/fluid/platform/dynload/tensorrt.cc:47)

To summarize, two questions:
1. Why does installing with the commands below report an error?

pip install paddlepaddle-gpu==2.4.2.post112 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
pip install paddleocr

2. Why, after I manually changed the protobuf and onnx versions, did the first run recognize the text in the image while the second run failed?

@andyjiang1116
Collaborator

Check your TensorRT environment configuration. The error message points to a TensorRT problem. Alternatively, try turning TensorRT off first.

@DazzlingGalaxy
Author

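> Check your TensorRT environment configuration. The error message points to a TensorRT problem. Alternatively, try turning TensorRT off first.

1. What is the cause of the error in question 1? Do you get this error when installing in a fresh environment?
2. This is a newly purchased physical server. Before this I had already installed the GPU version of PaddleSpeech on it, also with pip, in a different conda environment. How do I check the TensorRT environment configuration? I am a beginner.
3. How do I turn TensorRT off?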
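
One rough way to check whether the TensorRT dynamic library is visible to the process is to ask the system loader for it from Python. The sketch below uses only the standard library and assumes that the library in question is TensorRT's core runtime, `libnvinfer`. The function name is hypothetical, and a positive result only shows that the file can be located or loaded, not that its version matches the installed Paddle build.

```python
import ctypes
import ctypes.util


def find_tensorrt():
    """Return a name or path for the TensorRT core library, or None if it is not found."""
    # Assumption: the library Paddle looks for is TensorRT's core runtime, libnvinfer.
    found = ctypes.util.find_library("nvinfer")
    if found is not None:
        return found
    # Fall back to asking the dynamic loader directly (Linux naming).
    # Installs that ship only a versioned file name will not match this.
    try:
        ctypes.CDLL("libnvinfer.so")
        return "libnvinfer.so"
    except OSError:
        return None


print("TensorRT library:", find_tensorrt() or "not found")
```

If this prints "not found" inside the conda environment that runs the service, the RuntimeError in the traceback above is the expected outcome of `use_tensorrt=True`.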

@andyjiang1116
Collaborator

The installation problem may be caused by a mismatch between the Python, Paddle, and CUDA versions. To turn TensorRT off, pass use_tensorrt=False.
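
A service that should keep running on machines without TensorRT could try the TensorRT path first and retry with `use_tensorrt=False` when construction fails. The following is a minimal sketch of that idea, not PaddleOCR's own behavior. `create_with_trt_fallback` is a hypothetical helper, and it relies only on the fact, visible in the traceback above, that the failure surfaces as a `RuntimeError`.

```python
def create_with_trt_fallback(factory, **kwargs):
    """Call factory(use_tensorrt=True, ...) and retry with use_tensorrt=False on RuntimeError.

    `factory` is any callable that accepts a `use_tensorrt` keyword,
    for example the PaddleOCR class.
    """
    try:
        return factory(use_tensorrt=True, **kwargs)
    except RuntimeError as err:
        # The traceback in this thread shows the missing library raising RuntimeError.
        print(f"TensorRT unavailable, retrying without it: {err}")
        return factory(use_tensorrt=False, **kwargs)


# Intended use (left commented out so the sketch runs without paddleocr installed):
# from paddleocr import PaddleOCR
# ocr = create_with_trt_fallback(PaddleOCR, use_angle_cls=False, lang="ch")
```

Passing `use_tensorrt=False` directly is simpler when it is already known that TensorRT is not installed.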

@DazzlingGalaxy
Author

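> The installation problem may be caused by a mismatch between the Python, Paddle, and CUDA versions. To turn TensorRT off, pass use_tensorrt=False.

1. It shouldn't be a version mismatch, should it? My PaddleSpeech works fine, and PaddleOCR's installation instructions also use Python 3.7 and 3.8. So what exactly causes the installation error? Could you try it in a fresh environment?

2. Whether use_tensorrt is True or omitted, I see little difference in recognition speed. So what effect does turning it off have?

3. First, about my server: the CPU is an Intel Xeon Gold 6138 and the GPU is a Tesla P4. The code that previously failed on the server looked like this: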

from paddleocr import PaddleOCR

# Each handler builds its own PaddleOCR instance on every call.
def ocr_sfz(img_path):
    ocr = PaddleOCR(enable_mkldnn=True, use_tensorrt=True, use_angle_cls=False, lang="ch")
    result = ocr.ocr(img_path, cls=True)

def bank_ticket_ocr(img_file_path: str, shoukuan_name: str, shoukuan_bank_number: str, shoukuan_bank_name: str):
    ocr = PaddleOCR(enable_mkldnn=True, use_tensorrt=True, use_angle_cls=False, lang="ch")
    result = ocr.ocr(img_file_path, cls=True)

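When I first wrote this code I did not yet have a server with a discrete GPU, so I used the CPU version of PaddleOCR. Only after testing that it worked did I move it to the current server with the P4 and switch to the GPU version of PaddleOCR. The two functions above correspond to two endpoints. Earlier I said the first recognition succeeded and the second failed. Could that be because the PaddleOCR() object was initialized again on the second call? I ask because I remembered seeing these two links when using PaddleSpeech: PaddlePaddle/PaddleSpeech#2881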
PaddlePaddle/PaddleSpeech#2908

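4. Today I changed the code and tested again, and the error is gone. Compared with the failing code, there are two changes: the PaddleOCR() object is now initialized as a global variable, and enable_mkldnn=True has been removed.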

from paddleocr import PaddleOCR

# Build the PaddleOCR instance once at module level and reuse it in every handler.
ocr = PaddleOCR(use_tensorrt=True, use_angle_cls=False, lang="ch")

def ocr_sfz(img_path):
    result = ocr.ocr(img_path, cls=True)

def bank_ticket_ocr(img_file_path: str, shoukuan_name: str, shoukuan_bank_number: str, shoukuan_bank_name: str):
    result = ocr.ocr(img_file_path, cls=True)

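4.1 Is initializing the PaddleOCR() object as a global variable better than initializing it once inside each function?
4.2 My server now has a P4 and the GPU version of PaddleOCR is installed. Is there still any need to turn on enable_mkldnn?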

@andyjiang1116
Collaborator

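> Initializing the PaddleOCR() object as a global variable is better than initializing it once inside each function

Yes, a global variable is better. enable_mkldnn only has an effect when running on the CPU.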
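
A variation on the global-variable approach is to build the object lazily, on first use, and hand the same instance to every later caller. The sketch below shows one way to do that. `LazySingleton` is a hypothetical helper rather than anything provided by PaddleOCR, and it only guarantees that the factory runs once; it makes no claim about whether a single PaddleOCR instance may be used from several threads at the same time.

```python
import threading


class LazySingleton:
    """Create an expensive object on first use and reuse it afterwards."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._instance = None

    def get(self):
        # Double-checked locking: take the lock only while the instance is still missing.
        if self._instance is None:
            with self._lock:
                if self._instance is None:
                    self._instance = self._factory()
        return self._instance


# Intended use in the service (commented out so the sketch runs without paddleocr):
# def _make_ocr():
#     from paddleocr import PaddleOCR
#     return PaddleOCR(use_angle_cls=False, lang="ch")
#
# ocr_holder = LazySingleton(_make_ocr)
# result = ocr_holder.get().ocr(img_path)
```

Compared with a plain module-level variable, this defers model loading until the first request, at the cost of a slower first call.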

@DazzlingGalaxy
Author

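Update (the following was tested only on Windows):
There is no need to uninstall onnx and protobuf before reinstalling them.
After the paddleocr installation reports the error, just run: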

pip install protobuf==3.20.0
pip install onnx==1.12.0

This overrides the installed versions and leaves protobuf 3.20.0 and onnx 1.12.0 in place.
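
To confirm which versions actually ended up in the environment after the two commands, the installed distributions can be queried from Python with the standard library (`importlib.metadata`, available from Python 3.8). This is a small sketch, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names taken from this thread.
report = installed_versions(["protobuf", "onnx", "paddleocr", "paddlepaddle-gpu"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Run it with the interpreter of the conda environment that serves the OCR requests, since each environment has its own set of installed packages.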

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@Kael-DWT

Kael-DWT commented Aug 3, 2023

How I worked through it:
Error: ImportError: cannot import name '_print_arguments' from 'paddle.distributed.utils' (/home/kael.yang/.local/lib/python3.8/site-packages/paddle/distributed/utils/__init__.py)

pip3 install --upgrade pip
python3 -m pip install paddlepaddle-gpu==2.1.1 -i https://mirror.baidu.com/pypi/simple
pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip install "paddleocr>=2.2"

At this point I tested in ipython: from paddleocr import PPStructure
It reported:
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:

  1. Downgrade the protobuf package to 3.20.x or lower.
  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

The fix was:
pip install protobuf==3.20.0

After that, from paddleocr import PPStructure worked.
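
The error message above also lists a second workaround, setting PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python. A minimal sketch of applying it from inside a script follows. It assumes that the variable has to be set before anything imports protobuf, and, as the message itself warns, pure-Python parsing will be much slower.

```python
import os

# Set the variable before any module that pulls in google.protobuf is imported
# (assumption: protobuf reads it at import time, so setting it later has no effect).
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

# Only import paddleocr after the variable is in place, for example:
# from paddleocr import PPStructure

print(os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"])
```

Downgrading protobuf, as done above, avoids the slowdown and is the first workaround the message suggests.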


It still did not succeed in the end. The final parsing step still failed with:

C++ Traceback (most recent call last):

0 paddle::framework::SignalHandle(char const*, int)
1 paddle::platform::GetCurrentTraceBackString[abi:cxx11]()


Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: *** Aborted at 1691053082 (unix time) try "date -d @1691053082" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x0) received by PID 3901491 (TID 0x7f80863a7740) from PID 0 ***]

Segmentation fault
