Custom OCR #1502
Comments
Note that we won't accept contributions that add dependencies with an incompatible license (Surya is licensed under the GPL). This is why we have a plugin system that lets users contribute their own third-party integrations. You can read more in the plugin docs: https://docling-project.github.io/docling/concepts/plugins/
Hi @dolfim-ibm,

I added the following configuration to pyproject.toml, which sits in my project's root directory, named "fastapi_test":

[project.entry-points."docling"]
custom_ocr = "fastapi_test.docling_custom"

I also created a Python file, "docling_custom.py", in my project's root, in which the ocr_engines function is defined as follows:

def ocr_engines():
    return {
        "ocr_engines": [
            CustomOcrModel,
        ]
    }

But when I ran the converter, errors occurred.
I have set the ocr_options of pipeline_options to my custom OCR options, in which the value of kind is "custom_ocr".

Best regards,
@dolfim-ibm @pusapatiakhilraju
[build-system]
requires = ["setuptools >= 65.0.0"]
build-backend = "setuptools.build_meta"
[project]
name = "custom_docling"
version = "0.0.1"
dependencies = [
"docling>=2.30.0",
"openai>=1.65.0",
]
[project.entry-points."docling"]
custom_ocr = "custom_docling.plugins.custom_ocr"

* The entry-point path should not begin with "src"; it is ignored by Python.

def ocr_engines():
    return {
        "ocr_engines": [
            CustomOcrModel,
        ]
    }

class CustomOcrOptions(OcrOptions):
    kind: ClassVar[str] = "custom_ocr"
    ...

Some notes on implementing an OCR plugin.
After everything is done, use
That's all.
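To make the role of kind more concrete, here is a small docling-free sketch of how a kind string on an options class could be used to look up the matching engine class returned by ocr_engines(). This is only an illustration of the general pattern, not docling's actual implementation: every class and function below (OcrOptionsSketch, CustomOcrOptionsSketch, CustomOcrModelSketch, get_options_type, build_registry, create_engine) is a hypothetical stand-in.

```python
from typing import ClassVar, Dict, List, Type


class OcrOptionsSketch:
    """Hypothetical stand-in for an OCR options base class."""
    kind: ClassVar[str] = "base"


class CustomOcrOptionsSketch(OcrOptionsSketch):
    """Options for the custom engine; `kind` is the lookup key."""
    kind: ClassVar[str] = "custom_ocr"


class CustomOcrModelSketch:
    """Hypothetical stand-in for an OCR engine class."""

    def __init__(self, options: OcrOptionsSketch) -> None:
        self.options = options

    @classmethod
    def get_options_type(cls) -> Type[OcrOptionsSketch]:
        # Assumed convention: each engine names the options class it accepts.
        return CustomOcrOptionsSketch


def ocr_engines() -> Dict[str, List[type]]:
    # Same shape as the plugin hook shown in this thread.
    return {"ocr_engines": [CustomOcrModelSketch]}


def build_registry(hook_result: Dict[str, List[type]]) -> Dict[str, type]:
    """Map each engine's options `kind` to the engine class."""
    registry: Dict[str, type] = {}
    for engine_cls in hook_result["ocr_engines"]:
        kind = engine_cls.get_options_type().kind
        if kind in registry:
            raise ValueError(f"Duplicate OCR engine kind: {kind!r}")
        registry[kind] = engine_cls
    return registry


def create_engine(registry: Dict[str, type], options: OcrOptionsSketch):
    """Instantiate the engine whose options `kind` matches `options.kind`."""
    engine_cls = registry.get(options.kind)
    if engine_cls is None:
        raise ValueError(f"No OCR engine registered for kind {options.kind!r}")
    return engine_cls(options)


if __name__ == "__main__":
    registry = build_registry(ocr_engines())
    engine = create_engine(registry, CustomOcrOptionsSketch())
    print(type(engine).__name__)
```

The point of the sketch is that the kind value on the options class and the engine class returned by the hook have to agree; if they do not, the lookup fails before any OCR runs.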
Thank you.
@Bill-XU Thanks for outlining your findings and solution. I will close this issue as resolved.
Question
Can I create my own custom OCR class and pass it in to ocr_options? Is there any example code that can help me get started?
...
Will this work?
Converting
Is this the right way to use the custom OCR? I created a class and use it in pipeline_cls.