# This notebook uses the greedy decoding probabilities to obtain the stance label using AfT decoding


In [2]:
from stance_detector.aft_decoding import AFTDecoder

In [3]:
aft = AFTDecoder()
df = aft.decode(
    data = "../data/semeval_taskA_targetAtheism_prompt1_instruction1_prompts--output.pkl",
    context_free_data= "../data/semeval_taskA_targetAtheism_prompt1_instruction1_prompts_free--output.pkl",
    output_path = "../data/semeval_taskA_targetAtheism_prompt1_instruction1_prompts--aft-output.pkl"
    )

2025-05-14 17:23:20,255 - stance_detector.aft_decoding - INFO - Loading context data from: ../data/semeval_taskA_targetAtheism_prompt1_instruction1_prompts--output.pkl
2025-05-14 17:23:20,256 - stance_detector.aft_decoding - INFO - Loading context-free data from: ../data/semeval_taskA_targetAtheism_prompt1_instruction1_prompts_free--output.pkl
2025-05-14 17:23:20,263 - stance_detector.aft_decoding - INFO - Loaded 221 rows after concatenation.
2025-05-14 17:23:20,264 - stance_detector.aft_decoding - INFO - Cleaning class_token_log_probs and computing totals...
2025-05-14 17:23:20,275 - stance_detector.aft_decoding - INFO - One of prompt_template or instruction is not specified. Assuming filename to be of the kind *prompt3f_instruction1* and extracting prompt_template (3f) and instruction (1) from filename...
2025-05-14 17:23:20,276 - stance_detector.aft_decoding - INFO - Filtering data for prompt_template: 1 and instruction: 1
2025-05-14 17:23:20,278 - stance_detector.aft_decoding - INF

In [6]:
df.columns

Index(['ID', 'Target', 'Tweet', 'Stance', 'Prompt', 'label', 'class_tokens',
       'generated_token', 'generated_out_tokens_log_prob',
       'generated_overall_log_prob', 'class_token_log_probs',
       'class_token_log_probs_total', 'aft_label', 'aft_class_token_probs'],
      dtype='object')

- `ID` is the datapoint ID

- `Target` is the target towards which stance is being inferred

- `Tweet` is the tweet to be used to infer stance.

- `label` is of the form x--y, where x is an identifier for the prompt_template and y is the identifier for the instruction.

- `Stance` is the ground truth stance label

- `Prompt` is the prompt to the LLM

- `class_tokens` is the set of candidate outputs (options) for which we compute the probability that the LLM produces them given the prompt, i.e., P(y|x).

- **`generated_token` refers to the label obtained by greedy decoding**  

- `generated_out_tokens_log_prob` is the log probability of the tokens making up the output by greedy decoding

- `generated_overall_log_prob` is the log probability of the label obtained by greedy decoding 

- `class_token_log_probs` contains the log probabilities of the tokens making up options of interest to us.

- `class_token_log_probs_total` is the total log probability of each option of interest, obtained by summing the log probabilities of the tokens that make it up.

- **`aft_label` refers to the label obtained by AfT decoding**

- `aft_class_token_probs` are the `q` values corresponding to the class tokens, obtained from Eq. 1 [here](https://arxiv.org/pdf/2102.09690)
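
To make the relationship between these columns concrete, here is a minimal sketch of how per-token log probabilities could be summed into option totals and then turned into `q` values. It is not the `stance_detector` implementation: the helper names `total_log_prob` and `calibrate` are hypothetical, the numbers are toy values, and it assumes the linked equation has the form q = softmax(W p + b) with W = diag(p_cf)^-1 and b = 0, where p_cf are the option probabilities under the context-free prompt.

```python
import numpy as np


def total_log_prob(token_log_probs):
    """Sum per-token log probabilities into one log probability for the option."""
    return float(sum(token_log_probs.values()))


def calibrate(context_log_probs, context_free_log_probs):
    """Rescale option probabilities by their context-free counterparts.

    Assumption: q = softmax(W p + b) with W = diag(p_cf)^-1 and b = 0.
    """
    # Convert total log probabilities to probabilities normalised over the options.
    p = np.exp(np.asarray(context_log_probs, dtype=float))
    p = p / p.sum()
    p_cf = np.exp(np.asarray(context_free_log_probs, dtype=float))
    p_cf = p_cf / p_cf.sum()

    # W p with W = diag(p_cf)^-1 is an element-wise division.
    scores = p / p_cf

    # Numerically stable softmax over the rescaled scores.
    shifted = scores - scores.max()
    return np.exp(shifted) / np.exp(shifted).sum()


# Toy values (hypothetical, not taken from the data above).
options = ["Positive", "Negative", "Neutral"]
context = np.log([0.6, 0.3, 0.1])        # totals with the tweet in the prompt
context_free = np.log([0.8, 0.1, 0.1])   # totals with the context-free prompt

q = calibrate(context, context_free)
greedy_like_label = options[int(np.argmax(context))]  # highest raw probability
calibrated_label = options[int(np.argmax(q))]         # highest q
print(greedy_like_label, calibrated_label)
```

In this toy case the raw probabilities favour "Positive", but that option is also the one the model prefers without any tweet, so the rescaled scores favour "Negative" instead. This is the kind of disagreement between `generated_token` and `aft_label` that can appear in the rows below.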

In [8]:
df.head()

Unnamed: 0,ID,Target,Tweet,Stance,Prompt,label,class_tokens,generated_token,generated_out_tokens_log_prob,generated_overall_log_prob,class_token_log_probs,class_token_log_probs_total,aft_label,aft_class_token_probs
0,10001,Atheism,He who exalts himself shall be humbled; a...,AGAINST,Your response to the question should be either...,1--1,"[Positive, Negative, Neutral, Positive., Negat...",negative,"{'negative': -0.20337235927581787, '</s>': -1....",-0.203387,"[{'Positive': -7.101714611053467, '</s>': -1.4...","[-7.101729512325619, -26.051087379455566, -29....",negative,"[0.03225609830530999, 0.04188946711369155, 0.0..."
1,10002,Atheism,RT @prayerbullets: I remove Nehushtan -previou...,AGAINST,Your response to the question should be either...,1--1,"[Positive, Negative, Neutral, Positive., Negat...",negative,"{'negative': -0.18029488623142242, '</s>': -1....",-0.180311,"[{'Positive': -7.135680198669434, '</s>': -1.5...","[-7.135696053630454, -25.47551918029785, -28.5...",negative,"[0.018233294800946893, 0.04081898693148269, 0...."
2,10003,Atheism,@Brainman365 @heidtjj @BenjaminLives I have so...,AGAINST,Your response to the question should be either...,1--1,"[Positive, Negative, Neutral, Positive., Negat...",positive,"{'positive': -0.7586599588394165, '</s>': -3.1...",-0.758691,"[{'Positive': -4.416901111602783, '</s>': -3.1...","[-4.416932344924135, -26.667831420898438, -27....",Positive.,"[0.008906033427673586, 2.7125137195794932e-05,..."
3,10004,Atheism,#God is utterly powerless without Human interv...,AGAINST,Your response to the question should be either...,1--1,"[Positive, Negative, Neutral, Positive., Negat...",neutral,"{'neutral': -0.6100170016288757, '</s>': -2.44...",-0.610041,"[{'Positive': -5.93135404586792, '</s>': -2.44...","[-5.931378484070592, -26.357500553131104, -29....",Positive.,"[0.08078995420148434, 0.03441531108412011, 0.0..."
4,10005,Atheism,@David_Cameron Miracles of #Multiculturalism...,AGAINST,Your response to the question should be either...,1--1,"[Positive, Negative, Neutral, Positive., Negat...",negative,"{'negative': -0.31088513135910034, '</s>': -2....",-0.310907,"[{'Positive': -6.788811206817627, '</s>': -2.2...","[-6.788833379991047, -25.646247386932373, -28....",Neutral,"[0.01115218160440675, 0.017439696723030893, 0...."
