Usage question #1

Closed
sstidl opened this issue Apr 1, 2020 · 7 comments

@sstidl

sstidl commented Apr 1, 2020

It's not really an issue, but could you please point me to some documentation that shows how to use a Share rule to convert a PDF to an OCRed PDF?
I only found the REST call to create a rendition, but no really useful way to integrate it.

My scenario is that every file uploaded to Alfresco is automatically OCRed and becomes full-text searchable. I can't figure out how to do this.

Any help is appreciated.

Regards,
Stefan

@aborroy
Owner

aborroy commented Apr 1, 2020

The code in this project can help you to add this feature:
https://github.com/keensoft/alfresco-simple-ocr

@sstidl
Author

sstidl commented Apr 1, 2020

Thank you for your answer!

I think I saw that, but it doesn't use the T-Engine concept from Alfresco 6+.

I integrated alf-tengine-ocr with localTransform.ocr.url, but I don't understand how to use it with your code from https://github.com/keensoft/alfresco-simple-ocr

Will the code from https://github.com/keensoft/alfresco-simple-ocr still work with a Dockerized Alfresco 6.x stack?

Thanks for your time, and stay healthy,
Stefan

@aborroy
Owner

aborroy commented Apr 1, 2020

You can see Alfresco Simple OCR in action with a Dockerized Alfresco using https://github.com/Alfresco/alfresco-docker-installer

The code from https://github.com/keensoft/alfresco-simple-ocr includes a behaviour for the repository. You could port this part to this project in order to get the OCR done every time a new PDF is uploaded.

@krutik-jayswal

@aborroy: Do you mean we need to copy code from this project into https://github.com/keensoft/alfresco-simple-ocr?
Actually, I am struggling with multiple PDF file uploads, so I am trying to build a batch management solution using the API for OCR. Any suggestions for this?

@aborroy
Owner

aborroy commented Jan 13, 2021

This project is the Transformer part. If you want to create an Action to run this Transformer, you need some code from https://github.com/keensoft/alfresco-simple-ocr, although you can also trigger the transformation by using some other configuration.
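
For the batch case asked about above, one possible approach is to call the rendition REST endpoint once per uploaded node. The following is only a minimal sketch, not something taken from this project: it assumes the repository exposes the standard public REST API at `/alfresco/api/-default-/public/alfresco/versions/1`, that Basic authentication is enabled, and that a rendition backed by the OCR transformer has been registered. The rendition id `"ocr"`, the node ids, and the helper names are hypothetical placeholders.

```python
import base64
import json
from urllib.request import Request

# Root of the Alfresco public REST API (v1) on a default tenant.
API_ROOT = "/alfresco/api/-default-/public/alfresco/versions/1"


def build_rendition_request(base_url, node_id, rendition_id, user, password):
    """Build (but do not send) a POST asking the repository to create a rendition.

    rendition_id is an assumption: it must match a rendition definition
    that actually exists in your repository.
    """
    url = f"{base_url.rstrip('/')}{API_ROOT}/nodes/{node_id}/renditions"
    body = json.dumps({"id": rendition_id}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def build_batch(base_url, node_ids, rendition_id, user, password):
    """Build one rendition request per node id, preserving input order."""
    return [
        build_rendition_request(base_url, node_id, rendition_id, user, password)
        for node_id in node_ids
    ]
```

Each request can then be sent with `urllib.request.urlopen(req)`. Rendition creation is asynchronous on the repository side, so a successful response only means the request was accepted, not that the OCRed PDF already exists.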

@dgradecak

@aborroy I guess with the "targetMediaType": "alfresco-metadata-embed" and the "embed-metadata" action you can close this issue

@aborroy
Owner

aborroy commented Sep 21, 2021

Right!

@aborroy aborroy closed this as completed Sep 21, 2021