How was Pearson correlation calculated in the experiments? #6

@thangld201


Hi, thanks for the great work!

There are a few details regarding the correlation of LaSE in the paper that I did not quite understand. For each target language, the top-5 source languages were used to evaluate LaSE's correlation with ROUGE-2 in the out-lang scenario.

Let's assume those 5 source languages are lan1, lan2 .. lan5, and the target language is tgt_lan0. I'm assuming the procedure is as follows: generate summaries in tgt_lan0 from each of the 5 source languages to obtain 5 prediction sets pred1, pred2 .. pred5. Pool those prediction sets, compute the LaSE score of each prediction against the reference in its corresponding source language (ref1, ref2 .. ref5), and compute ROUGE-2 against the target-language reference ref0. In total, we have len(pred1) + len(pred2) + .. + len(pred5) scores each for LaSE and ROUGE-2.

After this, we calculate the Pearson correlation between the two 1D arrays formed from these two sets of scores. Is this interpretation correct? If it is, since the score scale may differ across reference-prediction pairs (e.g. a score of 0.5 might be bad for some pairs but good for others), do you think aggregating them this way is suboptimal?
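To make the concern concrete, here is a minimal sketch of the two options I have in mind: pooling all scores into one pair of arrays versus computing the correlation per source language. This is only my reading of the procedure, not necessarily what the paper does. The `scores` dictionary, its keys, and all numbers are made up for illustration, and `pearson` is a hand-written helper rather than code from this repository.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-summary scores keyed by source language (made-up numbers).
scores = {
    "lan1": {"lase": [0.1, 0.2, 0.3], "rouge2": [0.3, 0.2, 0.1]},
    "lan2": {"lase": [0.7, 0.8, 0.9], "rouge2": [0.9, 0.8, 0.7]},
}

# Option A: pool every prediction into two 1D arrays, then correlate once.
pooled_lase = [s for lang in scores.values() for s in lang["lase"]]
pooled_rouge2 = [s for lang in scores.values() for s in lang["rouge2"]]
pooled_r = pearson(pooled_lase, pooled_rouge2)

# Option B: correlate within each source language separately.
per_lang_r = {
    name: pearson(lang["lase"], lang["rouge2"]) for name, lang in scores.items()
}

print("pooled:", pooled_r)
print("per language:", per_lang_r)
```

With these deliberately extreme toy numbers, the pooled correlation comes out strongly positive even though the correlation within each source language is negative, which is the kind of scale effect I am worried about when the score ranges differ between language pairs.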

Could you help clarify this, @Tahmid04?
