#Task
Sort the attention heads of the last layer in decreasing order of their operational contribution (use few-shot in-context learning). Follow these steps, in order, to generate an attention head’s operational score:<br>
**Step 1:** Get access to the given dataset. <br>
**Step 2:** Get a small generative LM (like GPT-Neo). <br>
**Step 3:** Evaluate the dataset and collect the performance scores; let’s call them result_original. <br>
**Step 4:** Now, for each attention head in the last layer: <br>
&nbsp;&nbsp;&nbsp;&nbsp;**Step 4.1:** output(attention_head_i) += small noise<br>
&nbsp;&nbsp;&nbsp;&nbsp;**Step 4.2:** Propagate the noisy output of attention_head_i. <br>
&nbsp;&nbsp;&nbsp;&nbsp;**Step 4.3:** Evaluate the dataset. <br>
&nbsp;&nbsp;&nbsp;&nbsp;**Step 4.4:** Collect the performance scores; let’s call them result_atth_i.<br>
**Step 5:** For each result_atth_i, measure its deviation from result_original.<br>
**Step 6:** Sort and print the output.<br>
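
The steps above can be read as one small loop. The following is a minimal sketch of that control flow only, not the notebook's implementation: `evaluate` and `perturbed` are hypothetical callables standing in for the model evaluation, and the deviation used here is a plain mean absolute difference.

```python
from typing import Callable, List, Sequence, Tuple


def rank_heads(
    num_heads: int,
    evaluate: Callable[[], Sequence[float]],
    perturbed: Callable[[int], Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (head_index, deviation) pairs, largest deviation first."""
    baseline = list(evaluate())                       # Step 3: unperturbed scores
    scores = []
    for head in range(num_heads):                     # Step 4: one head at a time
        result = list(perturbed(head))                # Steps 4.1-4.4: noisy evaluation
        deviation = sum(abs(a - b) for a, b in zip(baseline, result)) / len(baseline)
        scores.append((head, deviation))              # Step 5: deviation from baseline
    return sorted(scores, key=lambda pair: pair[1], reverse=True)  # Step 6


# Toy usage with made-up numbers: each "head" shifts every score by a fixed amount.
baseline_scores = [0.5, 0.8]
shifts = {0: 0.01, 1: 0.03, 2: 0.02}
ranking = rank_heads(
    3,
    lambda: baseline_scores,
    lambda head: [score + shifts[head] for score in baseline_scores],
)
for head, deviation in ranking:
    print(f"head {head}: {deviation:.4f}")
```

In the toy run the head with the largest shift is listed first.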

In [69]:
!pip install -q transformers

##Step 1: Get access to the given dataset

In [71]:
import pandas as pd
df = pd.read_csv("/content/IMDB Dataset_100.csv")

##Step 2: Get a small size generative LM (GPT-Neo)

In [72]:
from transformers import GPTNeoForSequenceClassification, GPT2Tokenizer

model_name = "EleutherAI/gpt-neo-125M"
model = GPTNeoForSequenceClassification.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

Some weights of GPTNeoForSequenceClassification were not initialized from the model checkpoint at EleutherAI/gpt-neo-125M and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
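
The warning above says that `score.weight` is freshly initialized, so the classification head has not been trained on sentiment. Since the task also asks for few-shot in-context learning, one possible alternative is to score reviews with a causal LM (for example `GPTNeoForCausalLM`) using a prompt that contains a few labeled demonstrations. The sketch below shows only the prompt construction; the helper name `build_few_shot_prompt`, the prompt template, and the demonstration reviews are all hypothetical and are not part of the original notebook.

```python
def build_few_shot_prompt(examples, query, max_chars=500):
    """Build a sentiment prompt from (review, label) demonstrations and a query review.

    Reviews are cut to max_chars characters to keep the prompt short.
    """
    parts = []
    for text, label in examples:
        parts.append(f"Review: {text[:max_chars]}\nSentiment: {label}\n")
    # The query is left open so the LM's next token can be read as the label.
    parts.append(f"Review: {query[:max_chars]}\nSentiment:")
    return "\n".join(parts)


# Made-up demonstrations for illustration only.
demos = [
    ("A moving, beautifully acted film.", "positive"),
    ("Dull plot and wooden dialogue.", "negative"),
]
prompt = build_few_shot_prompt(demos, "I loved every minute of it.")
print(prompt)
```

With such a prompt, the positive-class score could be taken from the relative next-token scores of the two label words, which avoids relying on an untrained classification head.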


##Step 3: Evaluate and store into `result_original`.

In [73]:
import torch
import numpy as np

result_original = []

model.to('cuda')
model.eval()  # inference mode: disables dropout

with torch.no_grad():  # gradients are not needed for evaluation
    for review in df['review']:
        inputs = tokenizer(review, return_tensors="pt")
        inputs = {key: value.to('cuda') for key, value in inputs.items()}
        outputs = model(**inputs)
        logits = outputs.logits.cpu()
        probabilities = logits.softmax(dim=1)
        positive_probability = probabilities[0, 1].item()  # class index 1 is treated as "positive"
        result_original.append(positive_probability)
print(result_original)

[0.721659243106842, 0.8406364917755127, 0.689924955368042, 0.665015697479248, 0.8650954961776733, 0.9385914206504822, 0.595157265663147, 0.8775373697280884, 0.9504719376564026, 0.5759947896003723, 0.8984001874923706, 0.789953351020813, 0.7452144622802734, 0.721065104007721, 0.6482610106468201, 0.35021474957466125, 0.9162788391113281, 0.686420202255249, 0.8041203022003174, 0.95430988073349, 0.7971043586730957, 0.5591995716094971, 0.8720271587371826, 0.6464569568634033, 0.7098718881607056, 0.8491998314857483, 0.8745180368423462, 0.9682626128196716, 0.891862690448761, 0.9238184690475464, 0.8022550940513611, 0.7947177886962891, 0.918543815612793, 0.9366992115974426, 0.8995645046234131, 0.8472121357917786, 0.9358227849006653, 0.5351129174232483, 0.9151209592819214, 0.8351203799247742, 0.7154499292373657, 0.8338116407394409, 0.8348850607872009, 0.915425717830658, 0.3634653687477112, 0.8799289464950562, 0.9410078525543213, 0.9338600635528564, 0.9617289900779724, 0.7853178381919861, 0.60681277

##Step 4: Add noise to each last-layer attention head and re-evaluate

In [74]:
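result_atth_i = []

model.to('cuda')

out_proj = model.transformer.h[-1].attn.attention.out_proj
original_weight = out_proj.weight.data.clone()  # snapshot, so noise does not accumulate across heads
head_dim = model.config.hidden_size // model.config.num_heads

for attention_head_i in range(model.config.num_heads):
    # Step 4.1: small Gaussian noise on the part of out_proj that reads head i.
    # out_proj.weight has shape (hidden_size, hidden_size); input columns
    # [i * head_dim, (i + 1) * head_dim) receive the output of head i.
    cols = slice(attention_head_i * head_dim, (attention_head_i + 1) * head_dim)
    noise_tensor = 0.01 * torch.randn(model.config.hidden_size, head_dim, device='cuda')
    out_proj.weight.data[:, cols] += noise_tensor  # Step 4.2: the noise propagates through the forward pass
    result_i = []

    with torch.no_grad():
        for review in df['review']:  # Step 4.3: evaluate the dataset
            inputs = tokenizer(review, return_tensors="pt")
            inputs = {key: value.to('cuda') for key, value in inputs.items()}
            outputs = model(**inputs)
            logits = outputs.logits.cpu()
            probabilities = logits.softmax(dim=1)
            positive_probability = probabilities[0, 1].item()
            result_i.append(positive_probability)

    result_atth_i.append(result_i)  # Step 4.4: collect the performance scores for head i
    out_proj.weight.data.copy_(original_weight)  # restore the clean weights before the next head

print(result_atth_i)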

[[0.7212514877319336, 0.8404009938240051, 0.6894004344940186, 0.6646501421928406, 0.8648650646209717, 0.9384351372718811, 0.5948479175567627, 0.8773234486579895, 0.9504034519195557, 0.5754055380821228, 0.8984019160270691, 0.789556622505188, 0.745013952255249, 0.7206598520278931, 0.6481087803840637, 0.349952757358551, 0.9161557555198669, 0.6860204935073853, 0.8038477897644043, 0.9542045593261719, 0.7968939542770386, 0.5586882829666138, 0.8717269897460938, 0.6461482048034668, 0.709387481212616, 0.8490222096443176, 0.874403178691864, 0.9681879878044128, 0.8916956186294556, 0.9236579537391663, 0.8019634485244751, 0.7941597104072571, 0.9183496832847595, 0.9366347193717957, 0.899427056312561, 0.846964955329895, 0.9356762766838074, 0.5350834727287292, 0.9149095416069031, 0.8350349068641663, 0.7150207757949829, 0.8335070013999939, 0.834594190120697, 0.9152635931968689, 0.36336371302604675, 0.8799185156822205, 0.9408536553382874, 0.9336664080619812, 0.9616639614105225, 0.785409152507782, 0.6063
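
To make the link between one head and the output projection concrete, here is a small NumPy sketch with toy sizes (4 heads of width 3, not GPT-Neo's dimensions). It assumes the usual layout in which head outputs are concatenated side by side before the projection; `W` is a random stand-in for `out_proj.weight`, and `perturb_head` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
num_heads, head_dim = 4, 3                  # toy sizes for illustration
hidden = num_heads * head_dim

head_outputs = rng.normal(size=(num_heads, head_dim))  # one output vector per head
merged = head_outputs.reshape(hidden)                  # heads concatenated side by side
W = rng.normal(size=(hidden, hidden))                  # stand-in for out_proj.weight (out, in)
baseline = W @ merged


def perturb_head(i, scale=0.01):
    """Add Gaussian noise to the output of head i only, then project."""
    noisy = head_outputs.copy()
    noise = rng.normal(scale=scale, size=head_dim)
    noisy[i] += noise
    return W @ noisy.reshape(hidden), noise


i = 2
perturbed, noise = perturb_head(i)
cols = slice(i * head_dim, (i + 1) * head_dim)

# The projected output changes only through the columns of W that read head i.
print(np.allclose(perturbed - baseline, W[:, cols] @ noise))
```

Under this layout, noise on head `i` reaches the rest of the network only through columns `i * head_dim` to `(i + 1) * head_dim - 1` of the projection matrix. In PyTorch, the same output-level perturbation could be applied with a forward pre-hook on the projection module instead of editing its weights.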

##Step 5: For each result_atth_i, measure its deviation from result_original

In [75]:
original = np.array(result_original)
# Mean absolute deviation from the baseline, as a percentage of the mean baseline score
deviations = [
    np.mean(np.abs(original - np.array(result_i))) * 100 / np.mean(original)
    for result_i in result_atth_i
]
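
As a quick sanity check of this metric, the following sketch applies the same formula to three made-up scores. The function name `relative_deviation` and the numbers are illustrative assumptions, not values from the dataset.

```python
import numpy as np


def relative_deviation(original, perturbed):
    """Mean absolute deviation, as a percentage of the mean original score."""
    original = np.asarray(original, dtype=float)
    perturbed = np.asarray(perturbed, dtype=float)
    return np.mean(np.abs(original - perturbed)) * 100 / np.mean(original)


orig = [0.50, 0.80, 0.70]   # made-up baseline scores
pert = [0.51, 0.79, 0.70]   # made-up scores after perturbing one head

# Mean |difference| = 0.02 / 3, mean baseline = 2.0 / 3, so the ratio is about 1 percent.
print(relative_deviation(orig, pert))
```

Because absolute differences are averaged, shifts in opposite directions do not cancel out, and dividing by the mean baseline score expresses the result as a percentage.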

##Step 6: Sort and print the output

In [76]:
sorted_attention_heads = np.argsort(deviations)[::-1]

for attention_head_index in sorted_attention_heads:
    print(f"Attention Head {attention_head_index} - Deviation: {deviations[attention_head_index]:.4f}")

Attention Head 5 - Deviation: 0.0587
Attention Head 6 - Deviation: 0.0574
Attention Head 11 - Deviation: 0.0571
Attention Head 2 - Deviation: 0.0509
Attention Head 1 - Deviation: 0.0506
Attention Head 10 - Deviation: 0.0493
Attention Head 4 - Deviation: 0.0481
Attention Head 3 - Deviation: 0.0479
Attention Head 8 - Deviation: 0.0422
Attention Head 9 - Deviation: 0.0417
Attention Head 7 - Deviation: 0.0387
Attention Head 0 - Deviation: 0.0294
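
Because the noise is drawn at random, the ranking above reflects a single sample per head. One possible extension, sketched here under the assumption that the whole procedure is repeated several times, is to rank heads by their mean deviation across runs and to report the spread alongside it. The helper `rank_by_mean_deviation` and the numbers in `toy_runs` are hypothetical.

```python
import statistics


def rank_by_mean_deviation(runs):
    """runs: list of per-run deviation lists, one value per head.

    Returns (head, mean, population_std) tuples, largest mean first.
    """
    num_heads = len(runs[0])
    per_head = [[run[h] for run in runs] for h in range(num_heads)]
    means = [statistics.mean(values) for values in per_head]
    spreads = [statistics.pstdev(values) for values in per_head]
    order = sorted(range(num_heads), key=lambda h: means[h], reverse=True)
    return [(h, means[h], spreads[h]) for h in order]


# Made-up deviations for 3 heads over 3 repeated runs.
toy_runs = [
    [0.03, 0.06, 0.05],
    [0.04, 0.05, 0.06],
    [0.02, 0.07, 0.04],
]
ranking = rank_by_mean_deviation(toy_runs)
for head, mean, spread in ranking:
    print(f"Attention Head {head} - Mean deviation: {mean:.4f} (std {spread:.4f})")
```

If the spread of two heads is comparable to the gap between their means, their relative order may not be meaningful.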
