Enhance the OCR recognition accuracy of PPStructure. #11916
Thanks for your contribution! |
In my opinion, this fix is essentially a patch rather than a complete solution, so there may be edge cases that it does not address. UPDATE: That said, in my own use cases, this patch typically achieves much higher OCR recognition accuracy than the current implementation of |
PaddleOCR/ppstructure/predict_system.py Lines 120 to 149 in c82dd64
Hi @RussellLuo. Could you modify the code here to make it detect and recognize the original image directly, so that the other interfaces don't need to be changed? |
I'm not familiar with this piece of code. To my understanding, the core logic is to first detect the layout regions (by using the LayoutPredictor): PaddleOCR/ppstructure/predict_system.py Line 114 in c82dd64
and then recognize the corresponding texts from each layout region (by using the TextSystem): PaddleOCR/ppstructure/predict_system.py Lines 120 to 125 in c82dd64
As shown above, to get the image pixels Based on the analysis, the core idea behind this fix is to:
Therefore, this fix is a hybrid solution that leverages both PaddleOCR and StructureSystem, but I'm not sure whether it is appropriate to place the changes of this hybrid solution into PaddleOCR/ppstructure/predict_system.py. Hope to hear your suggestions. |
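As an illustration of the hybrid idea described above, here is a minimal sketch that assumes OCR is run once on the original image and each recognized text line is then assigned to the layout region containing its center. The helper names (`contains_center`, `group_text_by_region`) and the center-point rule are hypothetical and are not taken from the PR or from PaddleOCR.

```python
from typing import Dict, List, Tuple

# (x1, y1, x2, y2) in full-image pixel coordinates
Box = Tuple[float, float, float, float]


def contains_center(region: Box, text_box: Box) -> bool:
    """Return True if the center of text_box lies inside region."""
    cx = (text_box[0] + text_box[2]) / 2
    cy = (text_box[1] + text_box[3]) / 2
    return region[0] <= cx <= region[2] and region[1] <= cy <= region[3]


def group_text_by_region(
    regions: List[Box], ocr_results: List[Tuple[Box, str]]
) -> Dict[int, List[str]]:
    """Assign each full-image OCR line to the first layout region that contains it.

    Hypothetical helper: the matching rule actually used in the PR may differ.
    """
    grouped: Dict[int, List[str]] = {i: [] for i in range(len(regions))}
    for text_box, text in ocr_results:
        for i, region in enumerate(regions):
            if contains_center(region, text_box):
                grouped[i].append(text)
                break  # each text line goes to at most one region
    return grouped


if __name__ == "__main__":
    regions = [(0, 0, 100, 50), (0, 60, 100, 120)]
    ocr_results = [
        ((10, 10, 90, 30), "Title"),
        ((10, 70, 90, 90), "Body"),
        ((200, 200, 220, 210), "stray"),  # outside every region, so it is skipped
    ]
    print(group_text_by_region(regions, ocr_results))
```

Because the text boxes come from the original image, no per-region cropping is involved in this sketch, and text outside every layout region is simply left unassigned.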
Yes, I think that would be a more appropriate modification, with minimal impact on the overall code structure. I hope you'll give it a try. |
All changes have been centralized in The following line would keep results with low confidence and has been deleted: PaddleOCR/ppstructure/predict_system.py Line 61 in c82dd64
These two lines would cause a coordinate error and have been commented out. It took me a significant amount of time to identify this issue. I don't understand the logic behind this piece of code; perhaps these lines should be deleted? PaddleOCR/ppstructure/predict_system.py Lines 165 to 166 in c82dd64
|
Why do we need to keep low-confidence results? PaddleOCR/ppstructure/predict_system.py Line 61 in c82dd64
The code here is used to calculate the offset from the results of the layout analysis and can be removed. PaddleOCR/ppstructure/predict_system.py Lines 165 to 166 in c82dd64
|
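To make the offset discussion concrete, the following is a rough sketch of why such an offset exists and why it becomes harmful once OCR runs on the original image. It assumes that text boxes detected inside a cropped layout region are relative to the crop's top-left corner; `crop_to_image_coords` is a hypothetical helper, not the code at the referenced lines.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def crop_to_image_coords(points: List[Point], region_origin: Point) -> List[Point]:
    """Shift crop-relative points by the region's top-left corner (hypothetical helper)."""
    ox, oy = region_origin
    return [(x + ox, y + oy) for x, y in points]


if __name__ == "__main__":
    region_origin = (100, 200)  # top-left corner of a layout region in the full image

    # Case 1: OCR ran on the cropped region, so the box is crop-relative
    # and the offset is needed to map it back to the full image.
    crop_box = [(5, 5), (50, 5), (50, 20), (5, 20)]
    print(crop_to_image_coords(crop_box, region_origin))

    # Case 2: OCR ran on the original image, so the box is already in
    # full-image coordinates; applying the offset again shifts it incorrectly.
    full_box = [(105, 205), (150, 205), (150, 220), (105, 220)]
    print(crop_to_image_coords(full_box, region_origin))
```

In the second case the box is moved a second time by the region origin, which is the kind of coordinate error described earlier in the thread.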
This line is from the previous code and will lead to the return of results with low confidence (as the default value
Got it! |
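For readers following the low-confidence point, a minimal sketch of score-based filtering is shown here. The function name `filter_by_score`, the parameter `min_score`, and the value 0.5 are illustrative assumptions and do not reflect the actual parameter or default used in `predict_system.py`.

```python
from typing import List, Tuple

# (recognized text, confidence score in [0, 1])
RecResult = Tuple[str, float]


def filter_by_score(results: List[RecResult], min_score: float = 0.5) -> List[RecResult]:
    """Keep only recognition results whose score is at least min_score.

    Hypothetical helper: the threshold name and default are assumptions.
    """
    return [(text, score) for text, score in results if score >= min_score]


if __name__ == "__main__":
    results = [("Invoice", 0.98), ("l|1", 0.21), ("Total", 0.75)]
    print(filter_by_score(results))
    # A threshold of 0 keeps everything, including low-confidence noise.
    print(filter_by_score(results, min_score=0.0))
```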
Force-pushed from ebf52cd to 474d9b3
All suggested changes have been made and all commits have been squashed into one. |
@RussellLuo Thanks for your contribution, I will take some time to check it again. |
LGTM
@GreatV Thanks for your patient review and excellent suggestions! |
Thanks, let's merge it. |
@RussellLuo Thanks for your contribution! You will receive a beautiful PaddlePaddle gift. Please provide your mailing address by filling out the following questionnaire before October 18th. Looking forward to the future, we will walk further together in the world of open source! |
hi, @RussellLuo
|
This PR is likely to close the following issues: