Hey @aborroy - first off: thanks for the OCR transformer, it looks nice and lean!
I'm struggling with a migration from a hand-rolled OCR pipeline with Alfresco 5.0 (CE) to your OCR transformer with Alfresco 7.4 (CE). The direct integration as a folder rule would be much simpler. My setup works so far that I can upload the quick.pdf from this repo and the OCR magic (new document version) works as expected. That's great!
Here's my problem: when I upload a real PDF file (426 KB, one page, PDF version 1.4), no new document version is ever created. My guess is that the issue is caused by resource limits. I've experimented with file size and I think it's more related to the execution time. A bigger file (508 KB, one page, PDF version 1.4) sometimes results in a new document version, but not always. I'm pretty sure it's not the file size, as the OCR transformer does not configure maxSourceSizeBytes, which defaults to -1 (no limit) according to the docs.
Here are some screenshots:
I searched for transformer timeouts and configured the following settings on the repository:
but this does not change the situation. Unfortunately, I was not able to figure out where transformOptions.get(TIMEOUT) comes from or how to set it properly.
While digging into this I noticed that a new document version is created whenever the execution time is less than 5 seconds. I didn't find any defaults for the timeout in transformOptions.
Maybe you could give me a hint? :)