Negative value as accuracy of table. #44

Open
satheeshkatipomu opened this issue Aug 1, 2019 · 3 comments
@satheeshkatipomu

While testing, I came across a case where table.accuracy is a negative number.

PDF:page-3.pdf
Code:

import camelot

tables = camelot.read_pdf('/Users/skatipomu/Table_Extraction_Camelot/page3.pdf', pages="all")
[table.accuracy for table in tables]

Output:
[99.99999999999997, -20.852716930856104]

I think the cause is in the compute_accuracy method in utils.py. When calculating accuracy, it subtracts each error percentage from 1, which assumes the error lies in the range [0.0, 1.0]. However, the errors passed to this method are percentages in the range [0, 100], and they come from the get_table_index method. Dividing the error by 100 solved the issue for me.

def compute_accuracy(error_weights):
    """Calculates a score based on weights assigned to various
    parameters and their error percentages.

    Parameters
    ----------
    error_weights : list
        Two-dimensional list of the form [[p1, e1], [p2, e2], ...]
        where pn is the weight assigned to list of errors en.
        Sum of pn should be equal to 100.

    Returns
    -------
    score : float

    """
    SCORE_VAL = 100
    try:
        score = 0
        if sum([ew[0] for ew in error_weights]) != SCORE_VAL:
            raise ValueError("Sum of weights should be equal to 100.")
        for ew in error_weights:
            weight = ew[0] / len(ew[1])
            for error_percentage in ew[1]:
                score += weight * (1 - error_percentage)
    except ZeroDivisionError:
        score = 0
    return score

Suggested change: from score += weight * (1 - error_percentage) to score += weight * (1 - error_percentage / 100.0)
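For reference, here is a minimal standalone sketch of the function with that one-line change applied. It assumes, as described above, that every error value arrives as a percentage in the range [0, 100]. The name compute_accuracy_fixed and the sample input are hypothetical and only for illustration. This is not the code in camelot's utils.py.

```python
def compute_accuracy_fixed(error_weights):
    """Sketch of the proposed change (hypothetical name, not camelot's API).

    error_weights is a list of [weight, errors] pairs whose weights sum
    to 100. Assumption: each error is a percentage in [0, 100].
    """
    SCORE_VAL = 100
    # Weights must sum to 100 so that zero error yields a score of 100.
    if sum(ew[0] for ew in error_weights) != SCORE_VAL:
        raise ValueError("Sum of weights should be equal to 100.")
    score = 0.0
    try:
        for weight_total, errors in error_weights:
            # Spread the group's weight evenly over its error values.
            weight = weight_total / len(errors)
            for error_percentage in errors:
                # Convert the percentage to a fraction before subtracting.
                score += weight * (1 - error_percentage / 100.0)
    except ZeroDivisionError:
        # An empty error list gives no basis for a score.
        score = 0
    return score


if __name__ == "__main__":
    # Made-up input: two weight groups with errors given as percentages.
    sample = [[50, [10.0, 30.0]], [50, [0.0]]]
    print(compute_accuracy_fixed(sample))
```

With errors limited to [0, 100], each term weight * (1 - error_percentage / 100.0) is non-negative, so the score cannot drop below zero.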

@anakin87
Contributor

anakin87 commented Aug 1, 2019

atlanhq/camelot#223

@satheeshkatipomu
Author

Closing, as this has already been raised in the atlanhq/camelot repo.

@vinayak-mehta
Member

Opening this as a reference instead.

@vinayak-mehta vinayak-mehta reopened this Aug 2, 2019
@vinayak-mehta vinayak-mehta added this to Backlog in TODO! Jul 9, 2020