Log probabilities for discriminative models ( issue from StereoSet repo ) #3

Lynx1820 · 2022-05-03T17:50:48Z

Hi all, the StereoSet paper says that you compute the average log probability of attribute terms for BERT/RoBERTa, but here it seems like you're just taking the average of the probabilities. Is this intentional?

ncmeade · 2022-05-03T21:50:38Z

Hi @Lynx1820,

Thank you for raising this issue! Indeed, this does look like an inconsistency between the StereoSet paper and code. As you state, the code does compute the average token probability of the attribute words. Computing the average log token probability should not change the reported results though because log is a monotonically increasing function.

Hope this helps. Happy to answer any additional questions.

Lynx1820 · 2022-05-04T20:59:20Z

I think the two averages can give conflicting rankings, since log is nonlinear. Say you have token probabilities [7.2118e-07, 5.8076e-07] for sentence A and [1.3232e-06, 2.2212e-07] for sentence B. If you take the average of the probabilities, you get a mean of 6.509703e-07 for A and 7.726641e-07 for B. Here, B has the higher value. But taking the average of the logs, you get (ln(7.2118e-07) + ln(5.8076e-07))/2 = -14.25 for A and (ln(1.3232e-06) + ln(2.2212e-07))/2 = -14.43 for B. Here, A has the higher value.
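The arithmetic above can be checked with a short script. This is only a sketch: the helper names `mean_prob` and `mean_log_prob` are made up for illustration and are not taken from the StereoSet code. The average of log probabilities equals the log of the geometric mean, and the geometric mean penalizes a single very small token probability more than the arithmetic mean does, which is why the two rankings can differ.

```python
import math

# Hypothetical helpers for illustration; not the StereoSet implementation.
def mean_prob(token_probs):
    """Arithmetic mean of the token probabilities."""
    return sum(token_probs) / len(token_probs)

def mean_log_prob(token_probs):
    """Mean of the natural-log token probabilities (log of the geometric mean)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

# Token probabilities from the example above.
sentence_a = [7.2118e-07, 5.8076e-07]
sentence_b = [1.3232e-06, 2.2212e-07]

for name, probs in [("A", sentence_a), ("B", sentence_b)]:
    print(name, mean_prob(probs), mean_log_prob(probs))

# The two scoring rules pick different winners for this pair.
print("mean prob prefers B:", mean_prob(sentence_b) > mean_prob(sentence_a))
print("mean log prob prefers A:", mean_log_prob(sentence_a) > mean_log_prob(sentence_b))
```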

ncmeade · 2022-05-05T00:10:16Z

Yes, you're correct -- thank you for pointing that out! Intuitively, I would expect both scoring methods (average log probability and average probability) to produce similar scores in aggregate across StereoSet, since many attribute words are composed of a single token. However, I'll need to re-run the models reported in the StereoSet paper to verify. For practical purposes, both methods are sensible scoring techniques.
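The single-token point can be illustrated with a small sketch. With one token per attribute word, each average reduces to the probability (or its log) of that token, and because log is monotonically increasing the two rules must rank a pair the same way. The helper `rankings_agree` and the single-token probabilities below are hypothetical values chosen for illustration, not StereoSet outputs; the multi-token pair is the one from the earlier comment.

```python
import math

# Hypothetical helpers for illustration; not the StereoSet implementation.
def mean_prob(token_probs):
    return sum(token_probs) / len(token_probs)

def mean_log_prob(token_probs):
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def rankings_agree(probs_a, probs_b):
    """True if both scoring rules prefer the same candidate."""
    by_prob = mean_prob(probs_a) > mean_prob(probs_b)
    by_log_prob = mean_log_prob(probs_a) > mean_log_prob(probs_b)
    return by_prob == by_log_prob

# Single-token attribute words (made-up probabilities): rankings always agree.
single_token_agree = rankings_agree([3.0e-04], [1.5e-04])

# Multi-token pair from the thread: rankings disagree.
multi_token_agree = rankings_agree(
    [7.2118e-07, 5.8076e-07],
    [1.3232e-06, 2.2212e-07],
)

print("single-token agreement:", single_token_agree)
print("multi-token agreement:", multi_token_agree)
```

How often the multi-token case actually flips a StereoSet comparison is an empirical question that this sketch does not answer.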

sivareddyg assigned ncmeade May 3, 2022

ncmeade closed this as completed May 10, 2022

ncmeade mentioned this issue Jun 29, 2022

Log probabilities for discriminative models moinnadeem/StereoSet#7

Closed
