@cau-git cau-git commented Feb 18, 2025

This PR introduces a new class design for:

  • Dataset builders (class interface and implementations to encapsulate all of the create.py logic)
  • Prediction providers (class interface to generalize prediction creation beyond a docling converter, e.g. to use hyperscaler SaaS services)
  • Record data model, superseding the scattered code for building a record (BenchmarkColumns enum, HF dataset features, etc.)
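
To make the split above more concrete, here is a minimal sketch of how the three pieces might relate. Only the names DatasetBuilder, PredictionProvider, DatasetRecord and predict appear in this PR; every field, the iterate method, and the two toy subclasses are assumptions for illustration and are not the actual docling_eval_next code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class DatasetRecord:
    """One benchmark sample. Field names are illustrative assumptions."""
    doc_id: str
    ground_truth: str
    prediction: Optional[str] = None


class PredictionProvider(ABC):
    """Interface for anything that can produce a prediction for a record."""

    @abstractmethod
    def predict(self, record: DatasetRecord) -> str:
        ...


class DatasetBuilder(ABC):
    """Interface for turning a raw source into DatasetRecord objects."""

    @abstractmethod
    def iterate(self) -> Iterator[DatasetRecord]:
        ...


class UppercaseProvider(PredictionProvider):
    """Toy provider standing in for a docling converter or a SaaS call."""

    def predict(self, record: DatasetRecord) -> str:
        return record.ground_truth.upper()


class InMemoryBuilder(DatasetBuilder):
    """Toy builder that yields records from a dict instead of real files."""

    def __init__(self, samples: dict[str, str]) -> None:
        self.samples = samples

    def iterate(self) -> Iterator[DatasetRecord]:
        for doc_id, text in self.samples.items():
            yield DatasetRecord(doc_id=doc_id, ground_truth=text)


# Build the records first, then let the provider fill in predictions.
builder = InMemoryBuilder({"doc-1": "hello", "doc-2": "world"})
provider = UppercaseProvider()
records = list(builder.iterate())
for record in records:
    record.prediction = provider.predict(record)
```

The point of the sketch is that a new dataset or a new prediction backend would only need a new subclass, while the record type stays the single place that defines what a sample contains.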

All modules are kept in a different root directory for now: docling_eval_next.

This PR will stay WIP until the current codebase is fully ported to it and its design is evaluated.

Dataset builder porting

Prediction providers

TODO:

  • Fully decouple PredictionProvider from DatasetBuilder: have DatasetBuilder produce only the ground-truth (GT) columns, then have the prediction providers read the resulting parquet files in a second step and augment them with predictions.
  • Create a PredictionDatasetRecord subclass, which carries prediction-related fields, and remove these from the DatasetRecord class.
  • Add a record field to carry the raw prediction output of any provider (similar to the original for the GT)
  • Provide the full DatasetRecord to the PredictionProvider.predict method instead of only a few selected fields from it.
  • Clean up path handling @cau-git
  • Reinstate visualization, move to prediction provider @cau-git
  • Factor out more common code between the dataset builders @cau-git
  • Adopt bugfixes from stacked PRs @praveenmidde @samiuc
  • Add a new prediction provider class that can read from a directory of prediction files @nikos-livathinos
  • Upgrade evaluator classes to use the DatasetRecordWithPrediction model instead of accessing column data directly (then deprecate BenchmarkColumns) @nikos-livathinos
  • Update CLI to use new API @nikos-livathinos
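
The decoupling described in the first few TODO items could look roughly like the sketch below. It is an assumption-heavy illustration, not the merged implementation: a JSON-lines file stands in for the parquet files so that the example needs only the standard library, and FilePredictionProvider, write_ground_truth, add_predictions and all field names are hypothetical. Only DatasetRecord, PredictionDatasetRecord and predict are names taken from the list above.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class DatasetRecord:
    """Ground-truth-only record (fields are illustrative assumptions)."""
    doc_id: str
    ground_truth: str


@dataclass
class PredictionDatasetRecord(DatasetRecord):
    """GT record plus prediction-related fields."""
    prediction: Optional[str] = None
    raw_prediction: Optional[str] = None  # unprocessed provider output


class FilePredictionProvider:
    """Hypothetical provider that reads predictions from a directory."""

    def __init__(self, prediction_dir: Path) -> None:
        self.prediction_dir = prediction_dir

    def predict(self, record: DatasetRecord) -> PredictionDatasetRecord:
        # The provider receives the full record, not a few selected fields.
        raw = (self.prediction_dir / f"{record.doc_id}.txt").read_text()
        return PredictionDatasetRecord(
            **asdict(record), prediction=raw.strip(), raw_prediction=raw
        )


def write_ground_truth(records: list[DatasetRecord], path: Path) -> None:
    """Step 1: the builder persists GT columns only."""
    with path.open("w") as fh:
        for record in records:
            fh.write(json.dumps(asdict(record)) + "\n")


def add_predictions(
    gt_path: Path, provider: FilePredictionProvider
) -> list[PredictionDatasetRecord]:
    """Step 2: read the GT file back and augment each record."""
    with gt_path.open() as fh:
        return [provider.predict(DatasetRecord(**json.loads(line))) for line in fh]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "doc-1.txt").write_text("predicted text\n")
    gt_file = root / "gt.jsonl"
    write_ground_truth([DatasetRecord("doc-1", "true text")], gt_file)
    augmented = add_predictions(gt_file, FilePredictionProvider(root))
```

Keeping the GT file free of prediction columns means the same ground-truth dataset could be reused with several providers, each producing its own augmented copy.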

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
@cau-git cau-git changed the base branch from main to fix/docling-dpbench February 18, 2025 18:21
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
@cau-git cau-git force-pushed the cau/new-class-design branch from 5c595a6 to d8a8a59 Compare February 18, 2025 18:38
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
@cau-git cau-git changed the base branch from fix/docling-dpbench to main February 19, 2025 13:37
@cau-git cau-git requested review from nikos-livathinos and praveenmidde and removed request for nikos-livathinos February 19, 2025 13:54
@PeterStaar-IBM PeterStaar-IBM changed the title feat: Establish new API encapsulation for dataset creation and prediction providers [WIP] feat: Establish new API encapsulation for dataset creation and prediction providers [DO NOT MERGE, only for ideas] Feb 25, 2025
@praveenmidde

@cau-git Just a thought: since the new design makes it easier to extend the framework to different datasets and models, could we publish docling-eval as a library wheel so that others can install and extend it?

@cau-git cau-git mentioned this pull request Mar 10, 2025
9 tasks
cau-git added 2 commits March 17, 2025 15:40
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
cau-git and others added 21 commits March 31, 2025 15:41
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Move common evaluator code to BaseEvaluator.
Add more unit tests. Introduce pytest dependencies.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Remove the BaseReadingOrderEvaluator. Add unit test.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
… used by the BaseEvaluator.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
…t test.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
@cau-git cau-git requested review from PeterStaar-IBM and nikos-livathinos and removed request for praveenmidde and samiuc April 1, 2025 09:48
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

@nikos-livathinos nikos-livathinos left a comment

Looks good to me!

@cau-git cau-git merged commit a3d99b9 into main Apr 1, 2025
7 of 8 checks passed