Continuous download of ch_PP-OCRv3_det_infer.tar #10197

Closed
pepperav opened this issue Jun 17, 2023 · 14 comments
Labels
inference this is a inference phase issue. needs investigation this issue needs investigation to either narrow down, or clarify status/close triaged this issue has been looked, and triaged.

Comments

@pepperav

pepperav commented Jun 17, 2023

Hi 👋,
I'm building a service that runs on AWS Lambda as a Docker image.

Even though I've set rec_model_dir, det_model_dir and cls_model_dir to local paths where the models already exist,
every run still downloads ch_PP-OCRv3_det_infer.tar.

Code:
`

import sys
from paddleocr import PaddleOCR, draw_ocr

def handler(event, context):
    ocr = PaddleOCR(ocr = PaddleOCR(det=False, det_model_dir='/tmp/whl/det/en/en_PP-OCRv3_det_infer/', rec_model_dir='/tmp/whl/rec/en/en_PP-OCRv3_det_infer/', cls_model_dir='/tmp/whl/cls/en/en_PP-OCRv3_det_infer/', use_angle_cls=False, use_gpu=False, lang='en'))

    return 'Hello from AWS Lambda using Python' + sys.version + '!'

`

Log:
download https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar to /home/sbx_user1051/.paddleocr/whl/det/ch/ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det_infer.tar

How can I avoid that?
Thank you

@Sayaka91

@pepperav
I ran into a similar problem. It seems to be caused by the paddleocr version. You can try downgrading to 2.6.1.2.

@pepperav
Author

Thank you @Sayaka91, I will try.

Actually, with this folder tree:
image

and this line of code:
image

it works without any further downloads.

@BisarAhmed

Is there any solution for this error? It causes the docker compose up command to fail: the download starts, and before it finishes I get this error:
FileNotFoundError: [Errno 2] No such file or directory: '/root/.paddleocr/whl/det/ml/Multilingual_PP-OCRv3_det_infer/Multilingual_PP-OCRv3_det_infer.tar'

@pepperav
Author

pepperav commented Jun 19, 2023

I see Multilingual in the log you posted. In my case I forced English only with lang='en'.

I guess that is why it tries to download other models.

@BisarAhmed

I passed use_onnx=True. It stopped downloading, but then failed to read the .tar files. I am still debugging.

use_onnx=True, det_model_dir="whl/det/en/en_PP-OCRv3_rec_infer.tar", rec_model_dir="whl/rec/en/en_PP-OCRv3_rec_infer.tar", cls_model_dir="whl/cls/en/ch_ppocr_mobile_v2.0_cls_infer.tar"

`INVALID_PROTOBUF : Load model from whl/cls/en/ch_ppocr_mobile_v2.0_cls_infer failed:Protobuf parsing failed.`

@pepperav
Author

I don't know what use_onnx does, but you should point to folders where the tar files have already been extracted!

e.g.:
wrong: det_model_dir="whl/det/en/en_PP-OCRv3_rec_infer.tar"
valid: det_model_dir="whl/det/en/en_PP-OCRv3_rec_infer/"
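
If only the .tar archives are at hand, they can be unpacked ahead of time. The following is a minimal sketch using Python's standard tarfile module. The helper name `extract_model_archive` is hypothetical, and it assumes each archive unpacks into a single top-level folder named after the archive itself (for example `foo_infer.tar` into `foo_infer/`), which may not hold for every model.

```python
import tarfile
from pathlib import Path


def extract_model_archive(tar_path: str, dest_dir: str) -> Path:
    """Extract a model .tar into dest_dir and return the extracted folder.

    Assumption: the archive holds one top-level folder named after the
    archive (foo_infer.tar -> foo_infer/). Hypothetical helper, not part
    of PaddleOCR.
    """
    archive = Path(tar_path)
    target_root = Path(dest_dir)
    target_root.mkdir(parents=True, exist_ok=True)

    # Unpack everything under dest_dir.
    with tarfile.open(archive) as tar:
        tar.extractall(target_root)

    # The directory to pass as det_model_dir / rec_model_dir / cls_model_dir.
    model_dir = target_root / archive.stem
    if not model_dir.is_dir():
        raise FileNotFoundError(f"expected {model_dir} after extraction")
    return model_dir
```

The returned path is a directory, so it can be passed as `det_model_dir` and friends in place of the `.tar` path.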

@khansajeel

@BisarAhmed you should not pass tar files there; use the extracted directories. It should work.
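
A quick way to catch this mistake before constructing the OCR object is to check each path up front. This is a rough sketch only: `check_model_dir` is a hypothetical helper, and the file names in `EXPECTED_FILES` are an assumption about what an extracted Paddle inference model folder contains, so adjust them to whatever your archives actually unpack to.

```python
from pathlib import Path

# Assumed file names inside an extracted inference model folder.
EXPECTED_FILES = ("inference.pdmodel", "inference.pdiparams")


def check_model_dir(path: str) -> list:
    """Return a list of problems found; an empty list means the path looks usable."""
    p = Path(path)
    if p.suffix == ".tar":
        return [f"{path} is a .tar archive; pass the extracted directory instead"]
    if not p.is_dir():
        return [f"{path} is not an existing directory"]
    # Report every expected file that is absent from the folder.
    return [
        f"{path} is missing {name}"
        for name in EXPECTED_FILES
        if not (p / name).is_file()
    ]
```

Calling it on each of the det, rec and cls paths and printing the returned messages makes it obvious which argument still points at an archive.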

@BisarAhmed

@khansajeel @pepperav thanks, that solved the download issue.

@madhavi1102

@Sayaka91 it works fine without following the folder structure you mentioned for the det, rec and cls models; specifying any local path that points to the extracted models is enough.

@oscarjose9423

@pepperav Hey, I am doing this right now. Would you mind sharing your Dockerfile? The solution and folder structure did not work for me. Additionally, I am getting an error:

able to unmarshal input: 'utf-8' codec can't decode byte 0x9c in position 147: invalid start byte

when testing the container locally

@pepperav
Author

Sure!

FROM public.ecr.aws/lambda/python:latest-x86_64

COPY requirements.txt ./

RUN pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
RUN yum check-update; if [ $? -eq 100 ]; then exit 0; else exit $?; fi
RUN yum install -y libgomp mesa-libGL

COPY app.py ${LAMBDA_TASK_ROOT}
COPY whl ${LAMBDA_TASK_ROOT}/whl

CMD [ "app.handler" ]

This is the requirements.txt file:

paddleocr
paddlepaddle==2.4.2 -i https://pypi.tuna.tsinghua.edu.cn/simple

Note: LAMBDA_TASK_ROOT – The path to your Lambda function code.
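
For an `app.py` that matches this Dockerfile, the sketch below resolves the model folders relative to `LAMBDA_TASK_ROOT` and builds the keyword arguments for a single `PaddleOCR(...)` call (the snippet in the opening post nests one call inside another, which may have been unintentional). The sub-paths in `MODEL_SUBDIRS` and the helper names `build_ocr_kwargs` and `get_ocr` are assumptions for illustration; the real folder tree was only shared as an image, so adapt the paths to what is actually copied into `whl`.

```python
import os
from pathlib import Path

# Hypothetical layout under the copied "whl" folder; adjust to your own tree.
MODEL_SUBDIRS = {
    "det_model_dir": "whl/det/en/en_PP-OCRv3_det_infer",
    "rec_model_dir": "whl/rec/en/en_PP-OCRv3_rec_infer",
    "cls_model_dir": "whl/cls/en/ch_ppocr_mobile_v2.0_cls_infer",
}


def build_ocr_kwargs(root=None):
    """Build PaddleOCR keyword arguments pointing at extracted model folders."""
    base = Path(root or os.environ.get("LAMBDA_TASK_ROOT", "."))
    kwargs = {"lang": "en", "use_angle_cls": False, "use_gpu": False}
    for name, sub in MODEL_SUBDIRS.items():
        path = base / sub
        # Fail early instead of letting a download be attempted at runtime.
        if not path.is_dir():
            raise FileNotFoundError(f"{name}: {path} is not an extracted model directory")
        kwargs[name] = str(path)
    return kwargs


_ocr = None  # cached so the models are loaded once per container


def get_ocr():
    """Create the OCR object on first use with a single PaddleOCR(...) call."""
    global _ocr
    if _ocr is None:
        from paddleocr import PaddleOCR  # imported lazily, only inside Lambda
        _ocr = PaddleOCR(**build_ocr_kwargs())
    return _ocr


def handler(event, context):
    get_ocr()
    return "OCR initialised from local models"
```

Raising `FileNotFoundError` when a folder is missing makes a wrong path show up as a clear error in the Lambda logs rather than as an unexpected download.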

Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jan 24, 2024
@jzhang533 jzhang533 added triaged this issue has been looked, and triaged. needs investigation this issue needs investigation to either narrow down, or clarify inference this is a inference phase issue. and removed stale labels Apr 10, 2024
@UserWangZz
Collaborator

This issue has not been updated for a long time. This issue is temporarily closed and can be reopened if necessary.


10 participants