
[QUESTION] What would be a reasonable/sound approach when we only have translation with reference? #51

Closed
alvations opened this issue Jan 13, 2022 · 6 comments
Labels
question Further information is requested

Comments

@alvations
Contributor

Sometimes we only have translations with their references and no source. But the default COMET expects something like:

{"src": src, "mt": hyp, "ref": ref}

or, for QE COMET:

{"src": src, "mt": hyp}

Is there a way to let COMET take

{"mt": hyp, "ref": ref}

Would this be a feasible approach?

{"src": ref, "mt": hyp, "ref": ref}
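In code, the workaround of reusing the reference as the source could be a small helper like this (`ref_as_src` is a hypothetical name, not part of the COMET API; the resulting list could then be passed to a COMET model as usual):

```python
def ref_as_src(samples):
    """Fill the missing "src" field with the reference so that the
    samples match the {"src", "mt", "ref"} triplet format COMET expects.
    Hypothetical helper; not part of the COMET library."""
    return [{"src": s["ref"], "mt": s["mt"], "ref": s["ref"]} for s in samples]

data = ref_as_src([
    {"mt": "The fire could be stopped",
     "ref": "They were able to control the fire."}
])
# data[0]["src"] is now "They were able to control the fire."
```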
@alvations added the question label on Jan 13, 2022
@ricardorei
Collaborator

Hi @alvations! We do not have any model that takes only the MT and the reference; all our models receive the source.

The best way to get this would be to retrain the QE model, replacing the "src" field with the reference.
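A minimal sketch of what that data preparation could look like, assuming a QE-style training file with src, mt, and score columns (the column names and the example score are assumptions for illustration; check the COMET training docs for the actual expected format):

```python
import csv
import io

# Hypothetical sketch: build a QE-style training CSV where the "src"
# column is filled with the reference instead of the true source.
rows = [
    {"ref": "They were able to control the fire.",
     "mt": "The fire could be stopped",
     "score": 0.5},  # assumed human quality score for illustration
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["src", "mt", "score"])
writer.writeheader()
for r in rows:
    # The reference stands in for the missing source.
    writer.writerow({"src": r["ref"], "mt": r["mt"], "score": r["score"]})

csv_text = buf.getvalue()
```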

@ricardorei
Collaborator

{"mt": hyp, "ref": ref}
This is a bad idea for a QE model: the embeddings of the hypothesis and the reference will be very close to each other, so the QE model would assign a very high score.

{"src": ref, "mt": hyp, "ref": ref}
This might give you something (because the reference-based model relies much more on the reference than it does on the source). Yet, the score might be a bit biased towards higher values...

@alvations
Contributor Author

Thanks for the explanation! Yes, it makes sense that {"src": ref, "mt": hyp, "ref": ref} would be biased towards higher values. It would be nice to compare it against {"src": src, "mt": hyp, "ref": ref}.

@ogencoglu

I am also interested in the case of only having a translation and a ground truth, without any source.

What is the exact effect of src on the score for "Unbabel/wmt22-comet-da"? Here are 4 experiments:

data = [
    {
        "src": "Dem Feuer konnte Einhalt geboten werden",
        "mt": "The fire could be stopped",
        "ref": "They were able to control the fire."
    }
]

which results in 0.8386

Then I replaced the source with some random Japanese text ("The dog is barking at the mail carrier."), which is not related to mt or ref at all:

data = [
    {
        "src": "犬が郵便配達員に向かって吠えている。",
        "mt": "The fire could be stopped",
        "ref": "They were able to control the fire."
    }
]

which results in 0.8260

Then I just passed empty string:

data = [
    {
        "src": "",
        "mt": "The fire could be stopped",
        "ref": "They were able to control the fire."
    }
]

which results in 0.8229

And finally passed ref:

data = [
    {
        "src": "They were able to control the fire.",
        "mt": "The fire could be stopped",
        "ref": "They were able to control the fire."
    }
]

which results in 0.8423

There does not seem to be a significant difference between these scores. Any comments would be appreciated.

@ogencoglu

The difference in scores is even less significant considering that the following gives a score of 0.4236.

data = [
    {
        "src": "",
        "mt": "",
        "ref": "They were able to control the fire."
    }
]
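Putting the reported numbers side by side makes the point concrete (scores copied from the experiments above):

```python
scores = {
    "true src":      0.8386,
    "random ja src": 0.8260,
    "empty src":     0.8229,
    "ref as src":    0.8423,
    "empty src+mt":  0.4236,
}

# Spread among the four source perturbations that keep a real hypothesis:
src_variants = [v for k, v in scores.items() if k != "empty src+mt"]
spread = max(src_variants) - min(src_variants)     # 0.8423 - 0.8229
# Gap between the worst source perturbation and an empty hypothesis:
drop = min(src_variants) - scores["empty src+mt"]  # 0.8229 - 0.4236

print(round(spread, 4), round(drop, 4))  # 0.0194 0.3993
```

So perturbing the source moves the score by about 0.02, while removing the hypothesis moves it by about 0.40: the model is far more sensitive to the hypothesis than to the source.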

@ricardorei
Collaborator

Hey! Two things to consider for Unbabel/wmt22-comet-da:

  1. It relies much more on the reference than on the source. The source seems to help just a little, to disambiguate a few phenomena, but overall the model gives much more importance to the reference.
  2. Empty strings are out-of-distribution (OOD). It is likely that there are no empty strings in the WMT data used to train the model. The model can sometimes behave strangely on such very low-quality translations because they essentially never occur in training. That is why I believe it is important to pair COMET with a lexical metric like chrF.

From your examples, the range of score changes under your perturbations seems small (this is not desirable), yet if we look at the rankings, they seem correct: the score is lowest for empty src and mt, an empty source and a random source get similar scores, and the highest score comes from using all the data correctly.
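To illustrate why a lexical metric catches the empty-hypothesis case that the neural model scores at 0.4236, here is a much-simplified chrF-style character n-gram F-score (illustrative only; the real metric, e.g. sacrebleu's chrF, differs in details such as averaging and defaults):

```python
from collections import Counter

def char_ngrams(text, n):
    # chrF-style: ignore whitespace when extracting character n-grams.
    text = "".join(text.split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def simple_chrf(hyp, ref, max_n=6, beta=2.0):
    """Very simplified chrF-style score: average character n-gram
    precision and recall for n = 1..max_n, combined into an F-beta
    score (beta=2 weighs recall higher, as chrF does). Use sacrebleu's
    chrF implementation for real evaluation."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        h, r = char_ngrams(hyp, n), char_ngrams(ref, n)
        if sum(h.values()) == 0 or sum(r.values()) == 0:
            continue
        overlap = sum((h & r).values())
        precisions.append(overlap / sum(h.values()))
        recalls.append(overlap / sum(r.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)

# An empty hypothesis gets 0.0 here, unlike the neural metric's 0.4236:
print(simple_chrf("", "They were able to control the fire."))  # 0.0
```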


3 participants