ToDo's for making it public #35

@PeterStaar-IBM

Now

  • make a folder converters and move docling inside converters
    • Add the VLM converter
  • make a top-level folder visualisation and move all related code into it
    • constants.py in current docling folder
    • utils.py in current benchmark folder
  • make the dataset columns uniform (add Modality in all of them)
  • make the README "sexy"
  • move all code from converters.utils into benchmark.utils
  • move converters.utils to evaluators.teds
  • rename to create_pdf_docling_converter, create_img_docling_converter and create_smol_docling_converter
  • rename BenchMarkColumns.DOCLING_VERSION to BenchMarkColumns.CONVERTER_VERSION

Later

  • make a BaseConverter class, which is an abstract base class with two methods (convert() and get_version()). The child classes should be
    • PdfDoclingConverter
    • ImgDoclingConverter
    • DoclingTableFormer !!!!!
    • VLMDoclingConverter
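
A minimal sketch of what the BaseConverter item could look like, assuming Python's abc module is used for the abstract class. Only the names BaseConverter, convert and get_version come from the list above; the method signatures, the return types and the EchoConverter subclass are hypothetical and exist only to make the example runnable. It does not call docling.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any


class BaseConverter(ABC):
    """Abstract base: every converter exposes the same two methods."""

    @abstractmethod
    def convert(self, source: Path) -> Any:
        """Convert one input file (signature is an assumption)."""

    @abstractmethod
    def get_version(self) -> str:
        """Return the version string to store with the results."""


class EchoConverter(BaseConverter):
    """Hypothetical toy subclass for illustration; not a real converter."""

    def convert(self, source: Path) -> dict:
        # A real child class would run its conversion pipeline here.
        return {"source": str(source), "converter": type(self).__name__}

    def get_version(self) -> str:
        return "0.0.0-sketch"
```

With this shape, the value written to BenchMarkColumns.CONVERTER_VERSION could come from get_version() regardless of which child class produced the result, and instantiating BaseConverter directly raises a TypeError.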
