
How do you calculate Efficacy Score for multi-token words in your counterfact dataset? #24

Closed
Zce1112zslx opened this issue Sep 13, 2022 · 1 comment

Comments

@Zce1112zslx

The efficacy score is computed from P[o] and P[o*]. The object o can sometimes be tokenized into several sub-words (tokens); in that case, how do you calculate P[o]?

@kmeng01
Owner

kmeng01 commented Sep 19, 2022

Hi! You can see how we do this in the eval code. tl;dr we multiply the probabilities, which is equivalent to summing logprobs: $$p(o) = \prod_{i \in o} p(w_i \mid w_{j < i}) = \exp \left( \log \prod_i p(w_i \mid w_{j < i}) \right) = \exp \left( \sum_i \log p(w_i \mid w_{j < i}) \right).$$

You'll notice that, in practice, we use negative log probabilities: $$p(o) = \exp \left( -\sum_i -\log p(w_i \mid w_{j < i}) \right)$$

Given $p(o), p(o^*)$, we can perform a direct comparison.
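
Below is a minimal sketch of this computation, assuming a Hugging Face causal LM and tokenizer; the function name `object_prob` and the prompt in the usage comment are illustrative, not taken from the repository's eval code.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def object_prob(model, tok, prompt: str, obj: str) -> float:
    """Return p(o): product of sub-word probabilities, computed as exp of summed log-probs."""
    # Tokenize prompt and object separately so we know which positions belong
    # to the object; prepend a space so the object tokenizes as a continuation.
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    obj_ids = tok(" " + obj, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, obj_ids], dim=1)

    with torch.no_grad():
        logits = model(input_ids).logits  # [1, seq_len, vocab]
    log_probs = torch.log_softmax(logits, dim=-1)

    total = 0.0
    offset = prompt_ids.shape[1]
    for i in range(obj_ids.shape[1]):
        # The token at position offset+i is predicted from position offset+i-1.
        tok_id = obj_ids[0, i]
        total += log_probs[0, offset + i - 1, tok_id].item()

    # p(o) = exp( sum_i log p(w_i | w_{j<i}) )
    return math.exp(total)

# Example usage (hypothetical prompt/objects):
# tok = AutoTokenizer.from_pretrained("gpt2")
# model = AutoModelForCausalLM.from_pretrained("gpt2")
# p_o = object_prob(model, tok, "The Eiffel Tower is located in", "Paris")
# p_o_star = object_prob(model, tok, "The Eiffel Tower is located in", "Rome")
```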

kmeng01 closed this as completed Sep 21, 2022