Describe the bug
I was running RobertaForQuestionAnswering on HuggingFace's squad-v2 train set (~86k samples).
The Accuracy metric threw a division-by-zero error at AccuracyForLanguageGeneration._compute_single_pred_single_ref.
To Reproduce
Use the squad-v2 train set from datasets.
Run the samples through pipeline("question-answering", ...) and score the predicted answers with jury's Accuracy metric (a rough sketch is below).
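Roughly, the failing run looks like this. This is only a sketch: the model checkpoint and the Jury(metrics=["accuracy"]) scoring call are my assumptions, so adapt them to your setup.

```python
from datasets import load_dataset
from transformers import pipeline
from jury import Jury

squad = load_dataset("squad_v2", split="train")  # ~86k samples; slice it for a quick check
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")  # checkpoint is an assumption

predictions, references = [], []
for sample in squad:
    out = qa(question=sample["question"], context=sample["context"])
    predictions.append(out["answer"])
    # squad-v2 contains unanswerable questions whose gold answer list is empty
    references.append(sample["answers"]["text"] or [""])

scorer = Jury(metrics=["accuracy"])
scores = scorer(predictions=predictions, references=references)  # ZeroDivisionError raised here
```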
Expected behavior
Run without error.
Exception Traceback (if available)
If applicable, add full traceback to help explain your problem.
ration.py:107, in AccuracyForLanguageGeneration._compute_single_pred_single_ref(self, predictions, references, reduce_fn, **kwargs)
    105                 if token in ref_counts:
    106                     score += min(pred_count, ref_counts[token])  # Intersection count
--> 107             scores.append(score / max(len(pred), len(ref)))
    108         avg_score = sum(scores) / len(scores)
    109         return {"score": avg_score}
ZeroDivisionError: division by zero
Environment Information:
OS: macOS 13.2.1 (22D68)
jury version: 2.2.3
evaluate version: 0.2.2
datasets version: 2.11.0
Thanks, and I appreciate that jury exists. I could patch this myself by cloning the repo and doing an in-depth trace analysis, but I wanted to know if there's a better way to patch this.
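For now I'm considering a local workaround along these lines. It is only a sketch that mirrors the logic shown in the traceback above, not jury's actual code path, and treating the empty-vs-empty case as a score of 1.0 is my own assumption about the intended semantics.

```python
from collections import Counter

def token_accuracy_with_guard(pred_tokens, ref_tokens):
    """Per-pair token accuracy mirroring the traceback snippet, with an empty-list guard."""
    if max(len(pred_tokens), len(ref_tokens)) == 0:
        # Both sides tokenized to nothing (e.g. a lone '$'); assume exact match.
        return 1.0
    ref_counts = Counter(ref_tokens)
    score = sum(
        min(count, ref_counts[token])  # intersection count, as in the original code
        for token, count in Counter(pred_tokens).items()
        if token in ref_counts
    )
    return score / max(len(pred_tokens), len(ref_tokens))
```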
Re:
I found the issue. It happens in AccuracyForLanguageGeneration._tokenize(...), which strips some texts down to nothing, for example when both the prediction and the reference are just a literal '$':
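A minimal repro, assuming the standard Jury entry point (the exact call shape is an assumption on my side):

```python
from jury import Jury

scorer = Jury(metrics=["accuracy"])
# '$' is stripped by _tokenize, so both token lists come back empty and
# max(len(pred), len(ref)) in the snippet above becomes 0.
scores = scorer(predictions=["$"], references=["$"])  # raises ZeroDivisionError
```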