Metadata about detected characters: quality scores + alternatives #16

Open
danvk opened this Issue Jan 6, 2015 · 1 comment

danvk commented Jan 6, 2015

The ocropus-rpred tool outputs text files of predicted text for each image. It would be nice if there were a way for it to output quality scores for each character, as well as alternatives.

For example, this line:
010004 bin

is being transcribed as:
2. 14E St. Lrand Loncourse, n.w. cor.

It's possible that G is the second most-likely candidate for the first letter in Lrand and C for Loncourse. If I were to build some kind of language model as a post-processing step, it would be clear that G and C are the better choices at those positions.
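A minimal sketch of that post-processing idea, assuming each character position comes with a short list of (character, score) candidates and that a plain word list stands in for the language model. The function name `best_word`, the lexicon, and every score here are made up for illustration; none of this is something ocropus-rpred produces today.

```python
import itertools
import math


def best_word(candidates, lexicon):
    """Pick the best reading of one word.

    candidates: one list of (char, score) pairs per character position,
    best candidate first. Returns the highest-scoring combination that
    appears in the lexicon, or the top-1 reading if none does.
    """
    top1 = "".join(pos[0][0] for pos in candidates)
    best, best_score = top1, -math.inf
    for combo in itertools.product(*candidates):
        word = "".join(ch for ch, _ in combo)
        if word in lexicon:
            # Sum of log scores; every score must be greater than 0.
            score = sum(math.log(s) for _, s in combo)
            if score > best_score:
                best, best_score = word, score
    return best


# Hypothetical candidates for the misread word "Lrand".
lrand = [
    [("L", 0.9), ("G", 0.8)],
    [("r", 0.95)],
    [("a", 0.9)],
    [("n", 0.9)],
    [("d", 0.9)],
]
lexicon = {"Grand", "Concourse"}
print(best_word(lrand, lexicon))
```

Enumerating every combination grows exponentially with word length, so a real language model would more likely use something like a beam search over the candidates.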

Some kind of JSON output would be helpful. It might look something like:

[
  {
    "x": 216,
    "char": "L",
    "candidates": [
      {
        "char": "L",
        "score": 0.9
      },
      {
        "char": "G",
        "score": 0.8
      },
      ...
    ]
  },
  ...
]
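For illustration, here is one way a structure like this could be serialized, assuming the recognizer can hand back an x position and a per-character score table for each detected character. The input shape, the function name `to_candidate_json`, and the parameter `k` are all hypothetical; this is a sketch of the proposed format, not of ocropus-rpred internals.

```python
import json


def to_candidate_json(positions, k=2):
    """Serialize per-character candidates in the proposed layout.

    positions: list of (x, {char: score}) pairs, one per detected character.
    k: how many alternatives to keep per position.
    """
    out = []
    for x, dist in positions:
        # Rank characters by score, highest first, and keep the top k.
        ranked = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:k]
        out.append({
            "x": x,
            "char": ranked[0][0],
            "candidates": [{"char": c, "score": s} for c, s in ranked],
        })
    return json.dumps(out, indent=2)


# Hypothetical scores for the first character of "Lrand".
print(to_candidate_json([(216, {"L": 0.9, "G": 0.8, "C": 0.1})]))
```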

@danvk danvk changed the title from More metadata about detected characters to Metadata about detected characters Jan 6, 2015

rainkinz commented Jan 7, 2015

👍

QuLogic pushed a commit to QuLogic/ocropy that referenced this issue Dec 15, 2015

Merge pull request #16 from tianyaqu/master
requests needs 'params' argument when passing parameters in url

@zuphilip zuphilip changed the title from Metadata about detected characters to Metadata about detected characters: quality scores + alternatives Dec 25, 2017
