
Roadmap

Vinayak Mehta edited this page Jan 2, 2019 · 43 revisions

This wiki page details the development roadmap for Camelot. See HISTORY.md for the release history.

Note: This is only the planned issue list for each milestone. For the actual issue list, see the GitHub milestones.

v0.2.0 ✔️

  • Port to Python 3 #81

v0.3.1 ✔️

  • Make matplotlib an extra and merge #179

v0.3.2 ✔️

  • Public interface to access detected geometries #186

v0.4.0 ✔️

  • Automatic table detection for Stream #102

v0.4.1 ✔️

  • Fix ModuleNotFoundError (chardet) #210

v0.5.0 ✔️

  • Update comparison with Tabula (only for Stream) and pdfplumber #183
  • Add --verbose flag #204
  • Add more plot types to Stream (text edges and table areas) #207
  • Fix text assignment behavior

v0.6.0 ✔️

  • Add support for URLs in read_pdf #91
  • Add option to pass pdfminer parameters to read_pdf #170
  • Add CLI usage examples #143
  • Fix duplicate strings inside cell

v0.7.0

  • Create Debian package #214
  • Search for tables in certain page regions #209
  • Add export to SQLite database #212
  • Remove Ghostscript CLI call or replace it with Pillow

TODO

  • Automatically choose between Lattice and Stream #211
  • Make PDFHandler more efficient #106
  • Page splitting is slow for some PDFs #79
  • Image processing/table detection bugs
  • Are stats actually usable? Add benchmarks/tests #83