
Metrics #12

Open
petermr opened this issue Jan 27, 2017 · 0 comments

petermr commented Jan 27, 2017

This issue will track our metrics. Please contribute your thoughts by replying to this issue and keep the theme restricted to Metrics.

Our intention is to assess the achievement of this project using blinded testing, in which the final evaluation keeps the methods and corpus secret from the developers.

There are several metrics which can be used: for some we can use the standard "recall + precision"; others may use "accuracy"; and yet others a "Likert-like" scale (L).
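As a concrete illustration of the first two metric families, the standard definitions could be sketched as below (a minimal sketch; the function names and counting scheme are my own, not part of the project):

```python
def precision(tp, fp):
    """Fraction of predicted items that are correct: tp / (tp + fp)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Fraction of gold-standard items that were found: tp / (tp + fn)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def accuracy(tp, tn, fp, fn):
    """Fraction of all decisions (positive and negative) that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0
```

A Likert-like scale (L) would instead be an ordinal human judgement (e.g. 1–5) and cannot be reduced to these counts.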

  1. Identification of tables in articles. This is formally out of scope - the software will be presented with the tables.
  2. Classification of table type. We may develop methods for detecting table type, but may also require that the tool be told it.
  3. Identification of sections (title, header, body, footer; optional and in any order). This will only be relevant to tables which humans agree have this structure.
  4. Title. L?
  5. Header structure. Identification of column names, and column trees.
  6. Header content. L? Will include wrapping, bleeding.
  7. Body structure. May include subtables, possibly guessed or possibly template-driven. Metrics on number of cells missed, or with corrupt content.
  8. Footer content. L?
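For item 7, one possible way to count cells missed or with corrupt content is to compare predicted cells against a gold standard as sets. This is a hypothetical sketch; the `(row, column, content)` tuple representation is an assumption, not the project's agreed format:

```python
def evaluate_cells(gold, predicted):
    """Compare a predicted set of table cells against a gold standard.

    Cells are (row, column, content) tuples; a cell counts as correct
    only if its position and content both match exactly, so a corrupt
    cell shows up as one miss (fn) plus one spurious cell (fp).
    """
    tp = len(gold & predicted)   # cells recovered exactly
    fp = len(predicted - gold)   # spurious or corrupt cells
    fn = len(gold - predicted)   # cells missed
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": p, "recall": r, "missed": fn, "spurious": fp}
```

A stricter variant could score content similarity per cell instead of exact match, which would separate "missed" from "corrupt".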