Hi all, the StereoSet paper says that you compute the average log probability of attribute terms for BERT/RoBERTa, but here it seems like you're just taking the average of the probabilities. Is this intentional?
Thank you for raising this issue! Indeed, this does look like an inconsistency between the StereoSet paper and the code. As you state, the code computes the average token probability of the attribute words. Computing the average log token probability should not change the reported results, though, because log is a monotonically increasing function.
Hope this helps. Happy to answer any additional questions.
I think you might get conflicting results when you take the average, since log is nonlinear. Say you have probabilities [7.2118e-07, 5.8076e-07] for sentence A and [1.3232e-06, 2.2212e-07] for sentence B. If you take the average of probabilities, you get mean 6.509703e-07 for A and mean 7.726641e-07 for B. Here, B has the higher value. But taking the average of logs, you get (ln(7.2118e-07) + ln(5.8076e-07))/2 = -14.25 for A and (ln(1.3232e-06) + ln(2.2212e-07))/2 = -14.43 for B. Here, A has the higher value.
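For anyone who wants to check the numbers above, here is a minimal sketch in plain Python. The helper names `mean_prob` and `mean_log_prob` are made up for illustration and are not functions from the StereoSet code; the probabilities are the ones quoted in the comment above.

```python
import math

def mean_prob(probs):
    # Arithmetic mean of the per-token probabilities.
    return sum(probs) / len(probs)

def mean_log_prob(probs):
    # Arithmetic mean of the per-token natural-log probabilities.
    return sum(math.log(p) for p in probs) / len(probs)

# Per-token probabilities from the example above (hypothetical sentences A and B).
sentence_a = [7.2118e-07, 5.8076e-07]
sentence_b = [1.3232e-06, 2.2212e-07]

a_mean, b_mean = mean_prob(sentence_a), mean_prob(sentence_b)
a_log, b_log = mean_log_prob(sentence_a), mean_log_prob(sentence_b)

print(f"mean prob:     A={a_mean:.4e}  B={b_mean:.4e}")
print(f"mean log prob: A={a_log:.4f}  B={b_log:.4f}")
print("preferred by mean prob:    ", "A" if a_mean > b_mean else "B")
print("preferred by mean log prob:", "A" if a_log > b_log else "B")
```

The two scores disagree because the mean log probability is the log of the geometric mean of the token probabilities, not of their arithmetic mean, so a single very low-probability token pulls it down more. For attribute words made of a single token, the two rankings always agree.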
Yes, you're correct -- thank you for pointing that out! Intuitively, I would expect both scoring methods (average log probability and average probability) to produce similar scores in aggregate across StereoSet, particularly since many attribute words are composed of a single token. However, I'll need to re-run the models reported in the StereoSet paper to verify. For practical purposes, both methods are sensible scoring techniques.