Fix DataFrame structure inconsistency between single and multi-model OCR runs #189

marwan37 · 2025-04-09T22:59:55Z

This PR fixes an issue where the DataFrame structure produced by run_ocr was inconsistent between single-model and multi-model runs, causing downstream steps to fail.

Changes:

Standardized run_ocr to always return a flat DataFrame
Updated load_ground_truth_texts to work with the new structure
Modified evaluate_models to filter by model name rather than accessing models as dictionary keys
Updated create_ocr_batch_visualization to handle the new flat structure
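As a rough illustration of the flat structure described above, here is a minimal sketch. It assumes pandas as the DataFrame library and uses hypothetical column names (`model_name`, `image_name`, `raw_text`) and hypothetical helpers (`to_flat_frame`, `rows_for_model`); none of these are taken from the PR, and the actual `run_ocr` output may differ.

```python
import pandas as pd

# Hypothetical column layout; the real run_ocr schema is not shown in this PR.
COLUMNS = ["model_name", "image_name", "raw_text"]


def to_flat_frame(results_by_model):
    """Flatten {model_name: [record, ...]} into one row per (model, image).

    The same shape is produced whether one model or several were run,
    so downstream steps never need to branch on the number of models.
    """
    rows = [
        {"model_name": model, **record}
        for model, records in results_by_model.items()
        for record in records
    ]
    return pd.DataFrame(rows, columns=COLUMNS)


def rows_for_model(df, model_name):
    """Select one model's rows by filtering, instead of using dict-key access."""
    return df[df["model_name"] == model_name].reset_index(drop=True)


# Single-model and multi-model runs yield the same columns.
single = to_flat_frame(
    {"model_a": [{"image_name": "img1.png", "raw_text": "hello"}]}
)
multi = to_flat_frame(
    {
        "model_a": [{"image_name": "img1.png", "raw_text": "hello"}],
        "model_b": [{"image_name": "img1.png", "raw_text": "hallo"}],
    }
)
```

With this layout, an evaluation step could call `rows_for_model(multi, "model_b")` rather than `model_results["model_b"]`, and a model that was not run simply yields an empty frame instead of raising a `KeyError`.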


dagshub · 2025-04-09T22:59:58Z

Join the discussion on DagsHub!

marwan37 added 8 commits April 9, 2025 13:39

subsection samples_for_ocr assets for easier testing

102b6c8

unify dataframe structure for single/multi model runs

b729a95

remove redundant required_integrations, can be inferred from stack at runtime

67e39ec

integrate unified dataframe structure for ocr results in loaders

a127a8f

remove .items() access for model_results; it's a df now

0f02228

update dags asset

07ba031

rename to pipeline_dags to avoid broken links

b4415a3

update pipeline dag image in README

ae96523

marwan37 added the enhancement label Apr 9, 2025

marwan37 merged commit 4d7074d into main Apr 10, 2025
3 of 4 checks passed

strickvl deleted the fix/omni-reader-unified-df-struct-for-single-and-multi-model-runs branch April 13, 2025 13:45
