import camelot
tables = camelot.read_pdf('/Users/skatipomu/Table_Extraction_Camelot/page3.pdf', pages="all")
[table.accuracy for table in tables]
Output: [99.99999999999997, -20.852716930856104]
I think the cause is in the compute_accuracy method in utils.py: when calculating accuracy, it subtracts the error percentage from 1. That only works if the error is in the range [0.0, 1.0], but the errors passed to this method are percentages in the range [0, 100], which in turn come from the get_table_index method. Dividing the error by 100 solved the issue for me.
def compute_accuracy(error_weights):
    """Calculates a score based on weights assigned to various
    parameters and their error percentages.

    Parameters
    ----------
    error_weights : list
        Two-dimensional list of the form [[p1, e1], [p2, e2], ...]
        where pn is the weight assigned to list of errors en.
        Sum of pn should be equal to 100.

    Returns
    -------
    score : float

    """
    SCORE_VAL = 100
    try:
        score = 0
        if sum([ew[0] for ew in error_weights]) != SCORE_VAL:
            raise ValueError("Sum of weights should be equal to 100.")
        for ew in error_weights:
            weight = ew[0] / len(ew[1])
            for error_percentage in ew[1]:
                score += weight * (1 - error_percentage)  # line in question
    except ZeroDivisionError:
        score = 0
    return score
The change is from
score += weight * (1 - error_percentage)
to
score += weight * (1 - error_percentage / 100.0)
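To make the effect of the scaling concrete, here is a minimal, self-contained sketch. It does not import camelot; compute_accuracy_fixed and unscaled_score are hypothetical names used only for this illustration, and the error values are made up rather than taken from page-3.pdf. It assumes, as described above, that errors arrive as percentages in [0, 100].

```python
def unscaled_score(error_weights):
    """Score using the original formula (errors treated as fractions)."""
    score = 0
    for weight_total, errors in error_weights:
        weight = weight_total / len(errors)
        for error_percentage in errors:
            score += weight * (1 - error_percentage)
    return score


def compute_accuracy_fixed(error_weights):
    """Hypothetical variant that expects error percentages in [0, 100]."""
    SCORE_VAL = 100
    try:
        score = 0
        if sum(ew[0] for ew in error_weights) != SCORE_VAL:
            raise ValueError("Sum of weights should be equal to 100.")
        for ew in error_weights:
            weight = ew[0] / len(ew[1])
            for error_percentage in ew[1]:
                # Convert the percentage to a fraction before subtracting from 1.
                score += weight * (1 - error_percentage / 100.0)
    except ZeroDivisionError:
        score = 0
    return score


# Made-up input: two parameters weighted 50 each, errors given as percentages.
sample = [[50, [0.0, 50.0]], [50, [25.0]]]

print(unscaled_score(sample))          # negative, because 1 - 50.0 and 1 - 25.0 are negative
print(compute_accuracy_fixed(sample))  # stays within [0, 100]
```

With the scaling in place, each term weight * (1 - error / 100) lies between 0 and the weight, so the total cannot drop below 0 or exceed 100 as long as the errors really are in [0, 100].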
While testing, I hit a case where table.accuracy is a negative number. PDF: page-3.pdf