Provenance for OCRProcessing/Processing and Content #35

Closed
mittagessen opened this Issue Feb 7, 2016 · 3 comments

@mittagessen

The current OCRProcessing statement is rather rudimentary: it does not allow an identifier for each ProcessingStep, and features in the recognition results cannot be linked to particular steps. For example, our pipeline frequently combines tesseract's page segmentation with ocropus's recognition, so TextLine elements are sourced from one ProcessingStep and their text content from another.

A particular use case is postprocessing: when tools such as spell checkers add variants to String tags (something we'd also like to see), it may be unclear whether a given variant was produced by the recognition engine itself or by the spell checker.
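To make the request concrete, here is a rough sketch of the kind of linkage meant above, written as a plain Python data model rather than as schema markup. All names (`step_id`, `step_ref`, `Variant`, the step identifiers, the sample text) are hypothetical and only illustrate the idea; they are not part of the current schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ProcessingStep:
    step_id: str   # hypothetical identifier; the current schema has none
    software: str
    role: str


@dataclass
class Variant:
    text: str
    step_ref: str  # step that proposed this variant


@dataclass
class String:
    content: str
    step_ref: str  # step that produced the text content
    variants: List[Variant] = field(default_factory=list)


@dataclass
class TextLine:
    line_id: str
    step_ref: str  # step that produced the line segmentation
    strings: List[String] = field(default_factory=list)


# Three steps of an assumed pipeline, indexed by their identifier.
steps: Dict[str, ProcessingStep] = {
    s.step_id: s
    for s in (
        ProcessingStep("seg", "tesseract", "page segmentation"),
        ProcessingStep("rec", "ocropus", "text recognition"),
        ProcessingStep("spell", "spell checker", "postprocessing"),
    )
}

# One line: geometry from the segmentation step, text from the recognizer,
# variants from either the recognizer or the spell checker.
line = TextLine(
    line_id="line_1",
    step_ref="seg",
    strings=[
        String(
            content="recogniton",
            step_ref="rec",
            variants=[
                Variant("recognition", step_ref="spell"),
                Variant("recognitor", step_ref="rec"),
            ],
        )
    ],
)


def variant_sources(string: String) -> List[Tuple[str, str]]:
    """Return (variant text, producing software) pairs for one String."""
    return [(v.text, steps[v.step_ref].software) for v in string.variants]


print(steps[line.step_ref].software)     # software behind the segmentation
print(variant_sources(line.strings[0]))  # which tool proposed each variant
```

With a reference on each element, the line geometry, the text content, and every variant can each be traced back to the step that produced them.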

@cneud

cneud Feb 24, 2016

Member

Thank you for your feedback/request. The whole processingStepType will be investigated in the light of this, in connection with issues #35, #27, #13.
Any suggestions or particular use cases are welcome.

@cneud

cneud Jun 15, 2016

Member

The changes suggested in #13 should help with this, e.g. by adding identifiers to processingSteps. What remains to be looked into further is how to record which elements have been produced or altered by a particular processingStep - perhaps as a list of ID references?
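One possible reading of the "list of ID references" idea, sketched in Python only to show the shape of the data: each step carries the IDs of the elements it produced or altered, and an inverse index answers which steps touched a given element. The field names (`id`, `software`, `element_refs`) and all identifiers are assumptions made for illustration, not a proposed schema.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical layout: every processing step lists the element IDs it touched.
processing_steps = [
    {"id": "step1", "software": "tesseract", "element_refs": ["line1", "line2"]},
    {"id": "step2", "software": "ocropus", "element_refs": ["str1", "str2"]},
    {"id": "step3", "software": "spell checker", "element_refs": ["str2"]},
]


def steps_by_element(steps: List[dict]) -> Dict[str, List[str]]:
    """Invert the per-step reference lists into element ID -> step IDs."""
    index = defaultdict(list)
    for step in steps:
        for ref in step["element_refs"]:
            index[ref].append(step["id"])
    return dict(index)


index = steps_by_element(processing_steps)
print(index["str2"])  # steps that produced or altered element "str2"
```

Keeping the references on the step keeps the content elements unchanged, at the cost of needing an index like this to look up the history of a single element.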

@cneud cneud referenced this issue Jun 16, 2016

Closed

Processing history #39

6 of 6 tasks complete
@cneud

cneud Jun 16, 2016

Member

Continued in #39.

@cneud cneud closed this Jun 16, 2016

@cneud cneud added 8 published and removed 1 submitted labels Apr 24, 2018
