Issue #944: Un-hide arguments field in Text Extraction action. #945
GitHub Issue: (link) #944
What does this Pull Request do?
Un-hides the command-line arguments configuration option for the standard Text Extraction action.
What's new?
A site builder can now fine-tune Tesseract, including generating hOCR, without having to define a whole new action.
(i.e. Regeneration activity, etc.)? No
How should this be tested?
Go to admin/index and click on Actions
Create a new Advanced Action with Text Extraction as the base.
Follow the help text to add hOCR generation via the arguments field.
Observe that the command-line arguments field is now visible, and that the Mime Type field will save changes made to it.
Go to Contexts under Context UI
Add this action as a reaction to e.g. a page derivative context.
Remove the existing Text Extraction reaction if present.
Create a new Repository Item with Page model.
Add a media of type File and upload a TIFF with text in the image.
Observe that hOCR is now generated in the Extracted Text field.
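For reviewers who want to sanity-check an arguments string outside Drupal before pasting it into the field, here is a minimal Python sketch of how extra arguments could end up on a Tesseract command line. It is an illustration under stated assumptions, not the code this PR or the OCR microservice actually runs: the helper names `build_tesseract_command` and `run_ocr` are hypothetical, and the string `-c tessedit_create_hocr=1` is one known way to ask Tesseract for hOCR, so defer to the action's help text for the exact value it recommends.

```python
import shlex
import subprocess


def build_tesseract_command(image_path, extra_args=""):
    """Build a Tesseract command line that writes its result to stdout.

    `extra_args` stands in for whatever is typed into the action's
    arguments field (hypothetical mapping, for illustration only).
    """
    # Tesseract's CLI shape is: tesseract <image> <outputbase> [options...]
    # Using "stdout" as the output base prints the result instead of
    # writing a file.
    return ["tesseract", image_path, "stdout"] + shlex.split(extra_args)


def run_ocr(image_path, extra_args=""):
    """Run Tesseract and return its output as text (requires tesseract on PATH)."""
    result = subprocess.run(
        build_tesseract_command(image_path, extra_args),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Print the command only, so the sketch runs without Tesseract installed.
    print(build_tesseract_command("page.tif", "-c tessedit_create_hocr=1"))
```

With an empty arguments string the command falls back to plain-text extraction, which matches the behaviour of the stock action before any arguments are added.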
Documentation Status
Additional Notes:
There is forthcoming work that will include a new hOCR Extracted Text Media Use, so I'm leaving its creation out of this ticket.
Interested parties
Tag (@ mention) interested parties or, if unsure, @Islandora/committers