This PR repays tech debt in the `etloutput` module by removing code paths for ETL Output files produced by v1 workflows on IPA 6.X versions of the platform. Dropping support for legacy versions simplifies the functional code and eliminates the need to proactively enable loading of tables OCR.

v3 workflows on IPA 7.X organize OCR text, tokens, and tables into separate files linked in `etl_output.json`. This makes all ETL Output information discoverable to the point where it can be loaded automatically, and all three are now loaded by default when present.

The boolean flags `etloutput.load(..., text=True, tokens=True, tables=True)` are now entirely optional. They've been left in because they're not very complex, and the ability to disable loading of information you don't need might be useful to improve performance.

Changes: