Interpreting the results

Adrián edited this page Mar 18, 2019 · 1 revision

In our paper we explored different trivergence combinations. The scores generated by all of them, except for "KL τc" (order="Q,PR" measure="KL" in the XML format), can be interpreted as follows: the lower, the better. Ideally, the best summary is the one with a trivergence of zero; in practice, however, a score of zero can mean that all the summaries used in the evaluation are equal to the original text. The "lower is better" reading is reflected in the fact that the correlation between the trivergence combinations and either Pyramid or Responsiveness is negative: these manual evaluation metrics always give a score of 1 to the best summary and a score of 0 to the worst one.
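As a minimal sketch of this interpretation, the snippet below ranks systems by ascending trivergence and checks that the correlation with a manual metric is negative. The system names and all numbers are made up for illustration only; they are not results from the paper, and `rank_best_first` and `pearson` are hypothetical helpers, not part of the tool.

```python
from math import sqrt

# Made-up scores, for illustration only (not taken from the paper).
# Trivergence: lower is better. Manual metric: 1 is best, 0 is worst.
trivergence = {"system_a": 0.12, "system_b": 0.35, "system_c": 0.58}
manual = {"system_a": 0.9, "system_b": 0.6, "system_c": 0.2}


def rank_best_first(scores):
    """Return system names sorted from lowest (best) to highest (worst) score."""
    return sorted(scores, key=scores.get)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


ranking = rank_best_first(trivergence)
systems = sorted(trivergence)
r = pearson([trivergence[s] for s in systems], [manual[s] for s in systems])

print("Best to worst:", ranking)
print("Correlation with the manual metric is negative:", r < 0)
```

A negative correlation here is the expected outcome, not a sign of a poor metric: the two scales simply run in opposite directions.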

Of all the combinations explored, we consider that "KL τc" (order="Q,PR" measure="KL" in the XML format) should be avoided. Its correlations with either Pyramid or Responsiveness tend to be zero, and in most cases they are not statistically significant. In other words, the correlation sometimes approaches zero from the positive side and sometimes from the negative side, so its scores have no consistent direction to interpret.

Which is the best summary in the example?

The best summaries would be those related to puces_ots, while the worst would be those linked to puces_baseline_last.

Is there a maximum value of Trivergence?

There is no theoretical maximum. The reason is that the trivergence, like a divergence, measures how different the probability distributions are rather than how similar they are. In simple terms, the trivergence is a distance measure.
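To illustrate the idea with an ordinary two-distribution KL divergence (not the trivergence itself, and without any smoothing the tool may apply), the sketch below shows the value growing as one distribution assigns less and less probability to an event the other considers likely. The function `kl_divergence` and the example distributions are assumptions made for this illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute nothing; q_i is assumed to be > 0
    wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]

# As q moves further away from p, the divergence keeps increasing.
epsilons = (1e-1, 1e-3, 1e-6)
values = [kl_divergence(p, [1 - eps, eps]) for eps in epsilons]

for eps, value in zip(epsilons, values):
    print(f"eps={eps:g}  KL={value:.3f}")

# Identical distributions give a divergence of zero.
print("KL(p || p) =", kl_divergence(p, p))
```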

However, when evaluating summaries, it might be possible to calculate a maximum if we take into account that the number of possible words is finite and, consequently, the number of differences is finite as well. This has to be studied in depth to be sure that it is possible.

Which Trivergence combination should I use?

Based on the results obtained in "SummTriver: A new trivergent model to evaluate summaries automatically without human references", if the goal is to get closer to Pyramid results, the best combinations are "JS τc" or "sJS τc" (order="Q,PR" measure="JS" or order="Q,PR" measure="sJS" in the XML format). If the objective is to obtain results similar to Responsiveness, the best combination is "KL τm" (order="QP,QR,PR" measure="KL" in the XML format).
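The recommendations on this page can be summarised as a small lookup table. The sketch below is only a convenience for choosing attribute values: the names `RECOMMENDED`, `AVOID`, `recommended_for` and `as_xml_attributes` are hypothetical and not part of the tool, and only the `order` and `measure` values come from the text above.

```python
# Recommended combinations per manual metric, as stated on this page.
RECOMMENDED = {
    "pyramid": [
        {"name": "JS τc", "order": "Q,PR", "measure": "JS"},
        {"name": "sJS τc", "order": "Q,PR", "measure": "sJS"},
    ],
    "responsiveness": [
        {"name": "KL τm", "order": "QP,QR,PR", "measure": "KL"},
    ],
}

# Combination that this page advises against.
AVOID = {"name": "KL τc", "order": "Q,PR", "measure": "KL"}


def recommended_for(target):
    """Return the recommended combinations for 'pyramid' or 'responsiveness'."""
    return RECOMMENDED[target.lower()]


def as_xml_attributes(combination):
    """Format a combination as the attribute pair used in the XML format."""
    return f'order="{combination["order"]}" measure="{combination["measure"]}"'


for combination in recommended_for("Pyramid"):
    print(combination["name"], "->", as_xml_attributes(combination))
```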