Description
Hi everyone,
I have a question about LLM attribution, shown in the picture. This is the perturbation-based attribution method. My understanding of the basic idea is that it replaces the tokens one position at a time. For example:
I love you
after tokenization:
10 20 30
Then it uses a "0" (you can change the "0" to something else) as the baseline to replace each position and looks at how log_softmax(target_id) changes. The contribution is the baseline's log_softmax(target_id) minus the perturbed input's log_softmax(target_id) (a toy sketch of my understanding follows the three perturbed inputs below):
'0 20 30'
'10 0 30'
'10 20 0'
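Here is the toy sketch of how I understand that step (this is not Captum's actual code; `toy_model`, the token ids, the baseline id 0, and `target_id` are all made-up stand-ins just to illustrate the perturbation loop):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size = 50
embedding = torch.nn.Embedding(vocab_size, 8)
head = torch.nn.Linear(8, vocab_size)

def toy_model(input_ids):
    # stand-in for the real LLM forward pass: mean-pool embeddings, project to vocab logits
    hidden = embedding(input_ids).mean(dim=0)
    return head(hidden)  # shape: (vocab_size,)

def log_prob(input_ids, target_id):
    return F.log_softmax(toy_model(input_ids), dim=-1)[target_id]

input_ids = torch.tensor([10, 20, 30])  # "I love you" after tokenization
baseline_id = 0                         # the replacement / baseline token
target_id = 42                          # the generated token being scored

base_lp = log_prob(input_ids, target_id)
contributions = []
for i in range(len(input_ids)):
    perturbed = input_ids.clone()
    perturbed[i] = baseline_id          # gives [0, 20, 30], [10, 0, 30], [10, 20, 0]
    contributions.append((base_lp - log_prob(perturbed, target_id)).item())

print(contributions)  # one signed score per input token
```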
So my question is: should we use the absolute value to evaluate the importance of tokens?
For example, if the contribution is [-3.5, 3.6, 1], then by absolute value the most important token is token_1 (3.6), the second is token_0 (-3.5), and the third is token_2 (1).
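Here is a tiny snippet showing the two orderings I mean (pure illustration of the ranking question, not Captum code):

```python
contribution = [-3.5, 3.6, 1.0]

by_signed = sorted(range(len(contribution)), key=lambda i: contribution[i], reverse=True)
by_abs = sorted(range(len(contribution)), key=lambda i: abs(contribution[i]), reverse=True)

print(by_signed)  # [1, 2, 0] -> token_1, token_2, token_0
print(by_abs)     # [1, 0, 2] -> token_1, token_0, token_2
```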
Also, in the LLMGradientAttribution method:
the final step (https://github.com/pytorch/captum/blob/master/captum/attr/_core/llm_attr.py#L570) sums the gradients over the last dim. My question is how to evaluate the importance of tokens from these sums: does bigger mean more important?
For example, if the contribution after the sum is [-3.5, 3.6, 1], does that mean the most important token is token_1 (3.6), the second is token_2 (1), and the third is token_0 (-3.5)?
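To make the question concrete, here is a minimal sketch of the ranking I am asking about; the shape of `token_attr` is just an assumption for illustration and is not taken from Captum:

```python
import torch

torch.manual_seed(0)
# hypothetical per-token, per-embedding-dim gradient attributions
token_attr = torch.randn(3, 8)
per_token = token_attr.sum(dim=-1)  # one signed score per token, e.g. something like [-3.5, 3.6, 1]

order_signed = torch.argsort(per_token, descending=True)       # ranking by signed value
order_abs = torch.argsort(per_token.abs(), descending=True)    # ranking by absolute value
print(per_token, order_signed, order_abs)
```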
Thanks