[Feature] Add ROUGE to mmeval #72
961270f to 7f60d59
98ffb2e to 653c45d
mmeval/metrics/bleu.py
Outdated
if isinstance(predictions, str):
    predictions = [predictions]

if isinstance(references, str):
If references is a Sequence[str], should it be wrapped with []?
The Sequence[str] case is handled in lines 105-113 of the code.
mmeval/metrics/rouge.py
Outdated
        f'`tokenizer` supports Callable, str or None, but not `{type(tokenizer)}`'  # noqa: E501
    self.accumulate = accumulate

def add(self, predictions: Union[str, Sequence[str]], references: Union[str, Sequence[str], Sequence[Sequence[str]]]) -> None:  # type: ignore # yapf: disable # noqa: E501
Is it necessary to support the string input?
In the simplest case there may be only one reference sentence and one prediction sentence. Letting users pass two plain strings, instead of wrapping them in several layers of brackets, may be more convenient.
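For illustration, the input handling discussed in this thread could look roughly like the sketch below. The helper name `normalize_inputs` is hypothetical and not part of mmeval, and the treatment of a flat Sequence[str] of references (one reference per prediction) is an assumption, not necessarily what lines 105-113 of the PR do.

```python
from typing import List, Sequence, Tuple, Union


def normalize_inputs(
    predictions: Union[str, Sequence[str]],
    references: Union[str, Sequence[str], Sequence[Sequence[str]]],
) -> Tuple[List[str], List[List[str]]]:
    """Hypothetical helper: coerce inputs to (List[str], List[List[str]])."""
    # A single prediction string becomes a one-element batch.
    if isinstance(predictions, str):
        predictions = [predictions]

    if isinstance(references, str):
        # A single reference string becomes one reference for one prediction.
        references = [[references]]
    else:
        # Assumption: a bare string inside the sequence is the only
        # reference for the prediction at the same position.
        references = [
            [ref] if isinstance(ref, str) else list(ref) for ref in references
        ]

    if len(predictions) != len(references):
        raise ValueError('predictions and references must have the same length')
    return list(predictions), references
```

With this shape, `normalize_inputs('a cat', 'the cat')` and `normalize_inputs(['a cat'], [['the cat']])` give the same result, which is the convenience argued for above.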
mmeval/metrics/rouge.py
Outdated
recall = matches / reference_len
if precision == recall == 0.0:
    return dict(
        precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
Suggested change:
- precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
+ precision=0., recall=0., fmeasure=0.)
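As context for this snippet, a minimal sketch of turning a match count into precision, recall and F-measure, with the zero guards under discussion. The function name `compute_scores` is hypothetical, and the balanced F1 formula is an assumption; the PR may weight precision and recall differently.

```python
from typing import Dict


def compute_scores(matches: int, pred_len: int,
                   reference_len: int) -> Dict[str, float]:
    """Hypothetical helper: precision, recall and F1 from a match count."""
    # Guard against division by zero for empty inputs.
    if pred_len == 0 or reference_len == 0:
        return dict(precision=0., recall=0., fmeasure=0.)

    precision = matches / pred_len
    recall = matches / reference_len
    # With no matches the F1 denominator would be zero as well.
    if precision == recall == 0.0:
        return dict(precision=0., recall=0., fmeasure=0.)

    # Assumption: balanced F1 (harmonic mean of precision and recall).
    fmeasure = 2 * precision * recall / (precision + recall)
    return dict(precision=precision, recall=recall, fmeasure=fmeasure)
```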
mmeval/metrics/rouge.py
Outdated
    Dict[str, float]: Calculate the score of rougeL.
"""
pred_len, reference_len = len(pred), len(reference)
if 0 in (pred_len, reference_len):
Suggested change:
- if 0 in (pred_len, reference_len):
+ if pred_len == 0 or reference_len == 0:
mmeval/metrics/rouge.py
Outdated
pred_len, reference_len = len(pred), len(reference)
if 0 in (pred_len, reference_len):
    return dict(
        precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
Suggested change:
- precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
+ precision=0., recall=0., fmeasure=0.)
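For readers unfamiliar with ROUGE-L, the function under review scores a prediction by its longest common subsequence (LCS) with the reference. The sketch below is one common way to compute it and is not the PR's code; the names `lcs_length` and `rouge_l` are hypothetical, and balanced F1 is assumed.

```python
from typing import Dict, Sequence


def lcs_length(pred: Sequence[str], reference: Sequence[str]) -> int:
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(reference) + 1)
    for p_tok in pred:
        curr = [0]
        for j, r_tok in enumerate(reference, start=1):
            if p_tok == r_tok:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(pred: Sequence[str], reference: Sequence[str]) -> Dict[str, float]:
    """Hypothetical sketch of a ROUGE-L score over token sequences."""
    pred_len, reference_len = len(pred), len(reference)
    if pred_len == 0 or reference_len == 0:
        return dict(precision=0., recall=0., fmeasure=0.)

    lcs = lcs_length(pred, reference)
    if lcs == 0:
        return dict(precision=0., recall=0., fmeasure=0.)

    precision = lcs / pred_len
    recall = lcs / reference_len
    # Assumption: balanced F1.
    fmeasure = 2 * precision * recall / (precision + recall)
    return dict(precision=precision, recall=recall, fmeasure=fmeasure)
```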
mmeval/metrics/rouge.py
Outdated
reference_ngarms = _create_ngrams(reference, n_gram)
pred_len = sum(pred_ngarms.values())
reference_len = sum(reference_ngarms.values())
if 0 in (pred_len, reference_len):
Suggested change:
- if 0 in (pred_len, reference_len):
+ if pred_len == 0 or reference_len == 0:
mmeval/metrics/rouge.py
Outdated
reference_len = sum(reference_ngarms.values())
if 0 in (pred_len, reference_len):
    return dict(
        precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
Suggested change:
- precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
+ precision=0., recall=0.0, fmeasure=0.)
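The n-gram branch above can be sketched as follows. This is an illustration, not the PR's implementation: `create_ngrams` and `rouge_n` are hypothetical stand-ins, the sketch counts n-grams of exactly length `n_gram` (the PR's `_create_ngrams` may behave differently), and balanced F1 is assumed.

```python
from collections import Counter
from typing import Dict, Sequence


def create_ngrams(tokens: Sequence[str], n_gram: int) -> Counter:
    """Count every contiguous n-gram of exactly `n_gram` tokens."""
    return Counter(
        tuple(tokens[i:i + n_gram])
        for i in range(len(tokens) - n_gram + 1))


def rouge_n(pred: Sequence[str], reference: Sequence[str],
            n_gram: int) -> Dict[str, float]:
    """Hypothetical sketch of a ROUGE-N score over token sequences."""
    pred_ngrams = create_ngrams(pred, n_gram)
    reference_ngrams = create_ngrams(reference, n_gram)
    pred_len = sum(pred_ngrams.values())
    reference_len = sum(reference_ngrams.values())
    if pred_len == 0 or reference_len == 0:
        return dict(precision=0., recall=0., fmeasure=0.)

    # Counter intersection keeps the minimum count of each shared n-gram,
    # which gives the clipped number of matches.
    matches = sum((pred_ngrams & reference_ngrams).values())
    if matches == 0:
        return dict(precision=0., recall=0., fmeasure=0.)

    precision = matches / pred_len
    recall = matches / reference_len
    # Assumption: balanced F1.
    fmeasure = 2 * precision * recall / (precision + recall)
    return dict(precision=precision, recall=recall, fmeasure=fmeasure)
```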
tokenizer_fn (Union[Callable, str, None]): A user's own tokenizer function.
    Defaults to None.
Suggested change:
- tokenizer_fn (Union[Callable, str, None]): A user's own tokenizer function.
-     Defaults to None.
+ tokenizer_fn (Callable or str, optional): A user's own tokenizer function.
+     Defaults to None.
+     New in version 0.3.0.
Args:
    token (Sequence[str]): A series of tokens about sentences.
    n_gram (int): The maximum number of words contained in a phrase
        when calculating word fragments. Defaults to 4.
Suggested change:
- when calculating word fragments. Defaults to 4.
+ when calculating word fragments.
mmeval/metrics/rouge.py
Outdated
Args:
    predictions (Sequence[str]): An iterable of predicted sentences.
    references (Sequence[Sequence[str]): An iterable of
        referenced sentences.
Suggested change:
- referenced sentences.
+ referenced sentences.
107cb9f to e635ba6
LGTM!
Suggest rename
This PR can be merged after resolving the above comments.
0e4dc73 to 07e20c6
Motivation
This PR implements the ROUGE metric and provides test files for it. The BLEU metric is also modified.
Modification
While implementing the ROUGE metric, we also fixed the earlier limitation that BLEU could not evaluate Chinese data.
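To illustrate why Chinese data needs special handling: whitespace splitting leaves a Chinese sentence as a single token, so n-gram overlap is almost always zero. The sketch below shows one possible callable tokenizer; the name `mixed_tokenize` and the character-level treatment of CJK text are assumptions for illustration, not the approach taken in this PR.

```python
import re
from typing import List

# Match either a single CJK unified ideograph, or a run of characters
# that are neither CJK ideographs nor whitespace (e.g. a Latin word).
_TOKEN_PATTERN = re.compile(r'[\u4e00-\u9fff]|[^\u4e00-\u9fff\s]+')


def mixed_tokenize(text: str) -> List[str]:
    """Hypothetical tokenizer: per-character for Chinese, per-word otherwise."""
    return _TOKEN_PATTERN.findall(text)
```

A callable like this could be passed wherever a user-supplied tokenizer function is accepted.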
Checklist