In [4]:
import os
import torch
from dotenv import load_dotenv
from module import Model

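# Read HF_TOKEN from a local .env file; the Gemma weights are gated on the Hub.
load_dotenv()
token = os.environ.get("HF_TOKEN")

# On Apple Silicon, release cached MPS memory before loading the model.
if torch.backends.mps.is_available():
    torch.mps.empty_cache()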

## Links

- https://huggingface.co/docs/transformers/v4.53.3/en/main_classes/text_generation#transformers.GenerationConfig

- https://huggingface.co/google/gemma-2-2b-it

- https://huggingface.co/google/gemma-2-2b-it/blob/main/config.json

## Gemma 2 2B Config

```json
{
  "architectures": [
    "Gemma2ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "attn_logit_softcapping": 50.0,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": [
    1,
    107
  ],
  "final_logit_softcapping": 30.0,
  "head_dim": 256,
  "hidden_act": "gelu_pytorch_tanh",
  "hidden_activation": "gelu_pytorch_tanh",
  "hidden_size": 2304,
  "initializer_range": 0.02,
  "intermediate_size": 9216,
  "max_position_embeddings": 8192,
  "model_type": "gemma2",
  "num_attention_heads": 8,
  "num_hidden_layers": 26,
  "num_key_value_heads": 4,
  "pad_token_id": 0,
  "query_pre_attn_scalar": 256,
  "rms_norm_eps": 1e-06,
  "rope_theta": 10000.0,
  "sliding_window": 4096,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.42.4",
  "use_cache": true,
  "vocab_size": 256000
}
```
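A few quantities that matter when reading the attention outputs later in this notebook follow directly from the config above. The sketch below derives them in plain Python. The `soft_cap` helper is a hypothetical name, and it assumes the usual Gemma 2 soft-capping form, `cap * tanh(x / cap)`, rather than anything taken from the `Model` wrapper used here.

```python
import math

# Values copied from the config.json shown above.
config = {
    "hidden_size": 2304,
    "head_dim": 256,
    "num_attention_heads": 8,
    "num_key_value_heads": 4,
    "num_hidden_layers": 26,
    "attn_logit_softcapping": 50.0,
    "final_logit_softcapping": 30.0,
}

# Width of the concatenated query heads; note that it differs from hidden_size.
q_width = config["num_attention_heads"] * config["head_dim"]

# Grouped-query attention: how many query heads share each key/value head.
queries_per_kv = config["num_attention_heads"] // config["num_key_value_heads"]

# Number of (layer, head) attention maps available for inspection.
num_attention_maps = config["num_hidden_layers"] * config["num_attention_heads"]


def soft_cap(logit: float, cap: float) -> float:
    """Squash a logit smoothly toward the range [-cap, cap] (assumed form)."""
    return cap * math.tanh(logit / cap)


print(q_width, queries_per_kv, num_attention_maps)
print(soft_cap(1.0, config["attn_logit_softcapping"]))    # small logits pass almost unchanged
print(soft_cap(500.0, config["attn_logit_softcapping"]))  # large logits saturate near the cap
```

With 8 query heads and 4 key/value heads, each key/value head serves 2 query heads, so neighbouring heads in the per-head listings read from the same keys and values.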

In [5]:
name = "google/gemma-2-2b-it"
model = Model(name, token)

Loading checkpoint shards: 100%|██████████| 2/2 [00:06<00:00,  3.20s/it]


In [None]:
message = """
The supermodels name is goblin master.

And finally, bird-watchers everywhere have reported that the nation’s
owls have been behaving very unusually today. Although owls normally hunt
at night and are hardly ever seen in daylight, there have been hundreds of
sightings of these birds flying in every direction since sunrise. Experts are
unable to explain why the owls have suddenly changed their sleeping
pattern.
” The newscaster allowed himself a grin.
“Most mysterious. And now,
over to Jim McGuffin with the weather. Going to be any more showers of
owls tonight, Jim?”
“Well, Ted,
” said the weatherman,
“I don’t know about that, but it’s not
only the owls that have been acting oddly today. Viewers as far apart as Kent,
Y orkshire, and Dundee have been phoning in to tell me that instead of the rain
I promised yesterday, they’ve had a downpour of shooting stars! Perhaps
people have been celebrating Bonfire Night early — it’s not until next week,
folks! But I can promise a wet night tonight.
”
Mr. Dursley sat frozen in his armchair. Shooting stars all over Britain?
Owls flying by daylight? Mysterious people in cloaks all over the place? And
a whisper, a whisper about the Potters . . .
Mrs. Dursley came into the living room carrying two cups of tea. It was no
good. He’d have to say something to her. He cleared his throat nervously.
“Er
— Petunia, dear — you haven’t heard from your sister lately, have you?”
As he had expected, Mrs. Dursley looked shocked and angry. After all, they
normally pretended she didn’t have a sister.
“No,
” she said sharply.
“Why?”
“Funny stuff on the news,
” Mr. Dursley mumbled.
“Owls . . . shooting
stars . . . and there were a lot of funny-looking people in town today . . .
”
“So?” snapped Mrs. Dursley.
“Well, I just thought . . . maybe . . . it was something to do with . . . you
know . . . her crowd.
”
Mrs. Dursley sipped her tea through pursed lips. Mr. Dursley wondered
whether he dared tell her he’d heard the name “Potter.
” He decided he didn’t
dare. Instead he said, as casually as he could,
Dudley’s age now, wouldn’t he?”
“Their son — he’d be about
“I suppose so,
” said Mrs. Dursley stiffly.
“What’s his name again? Howard, isn’t it?”
“Harry. Nasty, common name, if you ask me.
”
“Oh, yes,
” said Mr. Dursley, his heart sinking horribly.
“Y es, I quite agree.
”
He didn’t say another word on the subject as they went upstairs to bed.
While Mrs. Dursley was in the bathroom, Mr. Dursley crept to the bedroom
window and peered down into the front garden. The cat was still there. It was
staring down Privet Drive as though it were waiting for something.
Was he imagining things? Could all this have anything to do with the
Potters? If it did . . . if it got out that they were related to a pair of — well, he
didn’t think he could bear it.
The Dursleys got into bed. Mrs. Dursley fell asleep quickly but Mr. Dursley
lay awake, turning it all over in his mind. His last, comforting thought before
he fell asleep was that even if the Potters were involved, there was no reason
for them to come near him and Mrs. Dursley. The Potters knew very well
what he and Petunia thought about them and their kind. . . . He couldn’t see
how he and Petunia could get mixed up in anything that might be going on —
he yawned and turned over — it couldn’t affect them. . . .
How very wrong he was.
Mr. Dursley might have been drifting into an uneasy sleep, but the cat on
the wall outside was showing no sign of sleepiness. It was sitting as still as a
statue, its eyes fixed unblinkingly on the far corner of Privet Drive. It didn’t
so much as quiver when a car door slammed on the next street, nor when two
owls swooped overhead. In fact, it was nearly midnight before the cat moved
at all.
A man appeared on the corner the cat had been watching, appeared so
suddenly and silently you’d have thought he’d just popped out of the ground.
The cat’s tail twitched and its eyes narrowed.
Nothing like this man had ever been seen on Privet Drive. He was tall, thin,
and very old, judging by the silver of his hair and beard, which were both
long enough to tuck into his belt. He was wearing long robes, a purple cloak
that swept the ground, and high-heeled, buckled boots. His blue eyes were
light, bright, and sparkling behind half-moon spectacles and his nose was very
long and crooked, as though it had been broken at least twice. This man’s
name was Albus Dumbledore.
Albus Dumbledore didn’t seem to realize that he had just arrived in a street
where everything from his name to his boots was unwelcome. He was busy
rummaging in his cloak, looking for something. But he did seem to realize he
was being watched, because he looked up suddenly at the cat, which was still
staring at him from the other end of the street. For some reason, the sight of
the cat seemed to amuse him. He chuckled and muttered,
“I should have
known.
”
He found what he was looking for in his inside pocket. It seemed to be a
silver cigarette lighter. He flicked it open, held it up in the air, and clicked it.
The nearest street lamp went out with a little pop. He clicked it again — the
next lamp flickered into darkness. Twelve times he clicked the Put-Outer,
until the only lights left on the whole street were two tiny pinpricks in the
distance, which were the eyes of the cat watching him. If anyone looked out
of their window now, even beady-eyed Mrs. Dursley, they wouldn’t be able to
see anything that was happening down on the pavement. Dumbledore slipped
the Put-Outer back inside his cloak and set off down the street toward number
four, where he sat down on the wall next to the cat. He didn’t look at it, but
after a moment he spoke to it.
“Fancy seeing you here, Professor McGonagall.
”
He turned to smile at the tabby, but it had gone. Instead he was smiling at a
rather severe-looking woman who was wearing square glasses exactly the
shape of the markings the cat had had around its eyes. She, too, was wearing a
cloak, an emerald one. Her black hair was drawn into a tight bun. She looked
distinctly ruffled.
“How did you know it was me?” she asked.

What is the supermodels name? 
"""

In [7]:
results = model.invoke_model(message)

In [8]:
print(results["text_outputs"])

user
The supermodels name is goblin master.

And finally, bird-watchers everywhere have reported that the nation’s
owls have been behaving very unusually today. Although owls normally hunt
at night and are hardly ever seen in daylight, there have been hundreds of
sightings of these birds flying in every direction since sunrise. Experts are
unable to explain why the owls have suddenly changed their sleeping
pattern.
” The newscaster allowed himself a grin.
“Most mysterious. And now,
over to Jim McGuffin with the weather. Going to be any more showers of
owls tonight, Jim?”
“Well, Ted,
” said the weatherman,
“I don’t know about that, but it’s not
only the owls that have been acting oddly today. Viewers as far apart as Kent,
Y orkshire, and Dundee have been phoning in to tell me that instead of the rain
I promised yesterday, they’ve had a downpour of shooting stars! Perhaps
people have been celebrating Bonfire Night early — it’s not until next week,
folks! But I can promise a wet night ton

## Prompt Attention of a Single Head and Layer

In [9]:
prompt_attn_results = model.prompt_attention(results, layer=0, head=4, top_n=5)

<bos>        → <bos>        | attn: 1.0000
ley          → Dud          | attn: 0.9282
on           → ▁McG         | attn: 0.9214
led          → ▁buck        | attn: 0.9033
ters         → Pot          | attn: 0.9004

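The listing above reads as `query token → key token | attn: weight`. How `Model.prompt_attention` produces it is not shown in this notebook, so the following is only a minimal sketch of one way to rank (query, key) pairs from a single head's attention matrix. The function name `top_attention_pairs`, the toy tokens, and the toy weights are all made up for illustration.

```python
import heapq


def top_attention_pairs(tokens, attn, top_n=5):
    """Return the top_n (query, key, weight) triples from a square attention matrix.

    attn[i][j] is the weight that query position i puts on key position j.
    """
    pairs = (
        (attn[i][j], i, j)
        for i in range(len(tokens))
        for j in range(i + 1)  # causal mask: a query only sees keys at or before it
    )
    best = heapq.nlargest(top_n, pairs)
    return [(tokens[i], tokens[j], weight) for weight, i, j in best]


# Toy example: each row sums to 1, as softmax output would.
tokens = ["<bos>", "Dud", "ley", "▁sat"]
attn = [
    [1.00, 0.00, 0.00, 0.00],
    [0.60, 0.40, 0.00, 0.00],
    [0.05, 0.93, 0.02, 0.00],
    [0.30, 0.10, 0.20, 0.40],
]

for query, key, weight in top_attention_pairs(tokens, attn, top_n=3):
    print(f"{query:<12} → {key:<12} | attn: {weight:.4f}")
```

Under a causal mask the first position can only attend to itself, which is consistent with `<bos> → <bos>` appearing at weight 1.0000 at the top of the listing.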

## Attention of Generated Text at a Single Generation Step

In [10]:
gen_attn_results = model.gen_attention(results, layer=1, head=3, top_n=5)

gen_txt[30]  → <bos>        | attn: 0.1339
gen_txt[30]  → <bos>        | attn: 0.1338
gen_txt[30]  → context[1629] | attn: 0.0546
gen_txt[30]  → <end_of_turn> | attn: 0.0359
gen_txt[30]  → context[1633] | attn: 0.0342

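The wrapper's internals are not shown, but Hugging Face `generate` documents a specific layout when called with `output_attentions=True` and `return_dict_in_generate=True`: one entry per generation step, each holding one tensor per layer of shape `(batch_size, num_heads, generated_length, sequence_length)`. Assuming `Model` keeps that layout, picking out one step, layer, and head could look like the sketch below. The helper name `top_keys_for_step` and the `context[...]` / `gen_txt[...]` labelling rule are assumptions, and the toy data uses nested lists in place of tensors so that the example runs without a model.

```python
def top_keys_for_step(attentions, step, layer, head, prompt_len, top_n=5):
    """Rank key positions by the weight the newest token gives them at one step.

    Assumes the nesting attentions[step][layer][batch][head][query][key].
    """
    # The last query row belongs to the most recent token at this step.
    row = attentions[step][layer][0][head][-1]
    ranked = sorted(enumerate(row), key=lambda kv: float(kv[1]), reverse=True)

    def label(pos):
        # Hypothetical labelling: prompt positions vs. generated positions.
        if pos < prompt_len:
            return f"context[{pos}]"
        return f"gen_txt[{pos - prompt_len}]"

    return [(label(pos), float(weight)) for pos, weight in ranked[:top_n]]


# Toy data: 2 steps, 1 layer, batch of 1, 1 head, prompt of 3 tokens.
toy_attentions = [
    # Step 0 (prefill): one query row per prompt token, 3 key positions.
    [[[[[1.0, 0.0, 0.0], [0.7, 0.3, 0.0], [0.2, 0.5, 0.3]]]]],
    # Step 1: a single query row, 4 key positions (prompt + 1 generated token).
    [[[[[0.1, 0.2, 0.3, 0.4]]]]],
]

for key_label, weight in top_keys_for_step(toy_attentions, 1, 0, 0, prompt_len=3, top_n=2):
    print(f"gen step 1 → {key_label:<12} | attn: {weight:.4f}")
```

Because the function only uses chained indexing and `float(...)`, the same code should also accept the tuples of tensors that `generate` returns, though that has not been checked against this notebook's `results`.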

## Attention of Generated Text at ALL Generation Steps

In [11]:
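# Positional arguments; by analogy with gen_attention above, these are presumably layer=0 and head=0.
all_gen_attn_results = model.gen_attention_all_steps(results, 0, 0)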

Generation step 1 

gen_txt[0]   → <bos>        | attn: 1.0000
gen_txt[0]   → didn         | attn: 0.9971
gen_txt[0]   → <start_of_turn> | attn: 0.9956
gen_txt[0]   → ▁isn         | attn: 0.9951
gen_txt[0]   → ▁didn        | attn: 0.9946


Generation step 2 

gen_txt[1]   → 
            | attn: 0.7451
gen_txt[1]   → model        | attn: 0.0870
gen_txt[1]   → <bos>        | attn: 0.0447
gen_txt[1]   → <bos>        | attn: 0.0446
gen_txt[1]   → context[1617] | attn: 0.0134


Generation step 3 

gen_txt[2]   → context[1617] | attn: 0.5352
gen_txt[2]   → <bos>        | attn: 0.1052
gen_txt[2]   → <bos>        | attn: 0.1044
gen_txt[2]   → context[1618] | attn: 0.0737
gen_txt[2]   → 
            | attn: 0.0446


Generation step 4 

gen_txt[3]   → context[1618] | attn: 0.7422
gen_txt[3]   → <bos>        | attn: 0.0585
gen_txt[3]   → <bos>        | attn: 0.0565
gen_txt[3]   → context[1617] | attn: 0.0467
gen_txt[3]   → context[1619] | attn: 0.0162


Generation step 5 

gen_txt[4]   → context[

## Prompt Attention of ALL Heads and a Single Layer

In [12]:

num_heads = model.model.config.num_attention_heads
all_heads_prompt_attn = model.prompt_attn_all_heads(results, layer=2, num_heads=num_heads)

HEAD 1

<bos>        → <bos>        | attn: 1.0000
four         → ▁number      | attn: 0.9868
▁asleep      → ▁fell        | attn: 0.9771
▁asleep      → ▁fell        | attn: 0.9600
▁is          → What         | attn: 0.9590


HEAD 2

<bos>        → <bos>        | attn: 1.0000
pour         → ▁down        | attn: 0.9966
ma           → rum          | attn: 0.9761
vet          → ▁Pri         | attn: 0.9731
vet          → ▁Pri         | attn: 0.9712


HEAD 3

<bos>        → <bos>        | attn: 1.0000
ned          → ▁yaw         | attn: 0.9634
led          → hee          | attn: 0.9434
cks          → pri          | attn: 0.9199
ed           → ▁flick       | attn: 0.9019


HEAD 4

<bos>        → <bos>        | attn: 1.0000
ped          → ped          | attn: 0.9961
’            → ’            | attn: 0.9951
sed          → sed          | attn: 0.9897
cks          → cks          | attn: 0.9868


HEAD 5

<bos>        → <bos>        | attn: 1.0000
▁into        → ▁tuck        | attn: 0.9902
ed    

## Attention of Generated Text of ALL Heads and a Single Layer

In [13]:
all_heads_gen_attn = model.gen_attn_all_heads(results, layer=3, num_heads=num_heads)

HEAD 1

gen_txt[30]  → <bos>        | attn: 0.2244
gen_txt[30]  → <bos>        | attn: 0.2244
gen_txt[30]  → context[1644] | attn: 0.1346
gen_txt[30]  → context[1642] | attn: 0.0720
gen_txt[30]  → context[1643] | attn: 0.0410


HEAD 2

gen_txt[30]  → context[1629] | attn: 0.1346
gen_txt[30]  → context[1632] | attn: 0.1083
gen_txt[30]  → <bos>        | attn: 0.0690
gen_txt[30]  → <bos>        | attn: 0.0685
gen_txt[30]  → context[1631] | attn: 0.0573


HEAD 3

gen_txt[30]  → <bos>        | attn: 0.2145
gen_txt[30]  → <bos>        | attn: 0.2136
gen_txt[30]  → context[1646] | attn: 0.0935
gen_txt[30]  → 
            | attn: 0.0506
gen_txt[30]  → 
            | attn: 0.0284


HEAD 4

gen_txt[30]  → model        | attn: 0.4216
gen_txt[30]  → <bos>        | attn: 0.0858
gen_txt[30]  → <bos>        | attn: 0.0855
gen_txt[30]  → context[1646] | attn: 0.0791
gen_txt[30]  → 
            | attn: 0.0552


HEAD 5

gen_txt[30]  → <bos>        | attn: 0.3416
gen_txt[30]  → <bos>        | attn: 0.340

In [14]:
pd_results = model.gen_attn_all_heads_all_gen_steps(results, 2, num_heads)

In [16]:
len(pd_results)

8

In [17]:
pd_results[0]


| | gen_step | attention | query_token | key_token |
|---|---|---|---|---|
| 0 | 0 | 1.000000 | gen_txt[0] | `<bos>` |
| 1 | 0 | 0.986816 | gen_txt[0] | ▁number |
| 2 | 0 | 0.977051 | gen_txt[0] | ▁fell |
| 3 | 0 | 0.959961 | gen_txt[0] | ▁fell |
| 4 | 0 | 0.958984 | gen_txt[0] | What |
| ... | ... | ... | ... | ... |
| 1235 | 30 | 0.001690 | gen_txt[30] | context[1643] |
| 1236 | 30 | 0.001671 | gen_txt[30] | context[1642] |
| 1237 | 30 | 0.001515 | gen_txt[30] | `\n\n` |
| 1238 | 30 | 0.001492 | gen_txt[30] | ▁the |

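`len(pd_results)` returning 8 matches `num_attention_heads` in the config, which suggests one frame per head, though the notebook does not state this. As a small sketch of what can be done with rows of this shape, the following totals the attention per key token for one generation step. It uses plain dictionaries instead of pandas so that it runs anywhere; the rows are copied from the visible head and tail of `pd_results[0]`, and `attention_by_key` is a hypothetical helper.

```python
from collections import defaultdict

# Rows copied from the displayed head and tail of pd_results[0].
rows = [
    {"gen_step": 0, "attention": 1.000000, "query_token": "gen_txt[0]", "key_token": "<bos>"},
    {"gen_step": 0, "attention": 0.986816, "query_token": "gen_txt[0]", "key_token": "▁number"},
    {"gen_step": 0, "attention": 0.977051, "query_token": "gen_txt[0]", "key_token": "▁fell"},
    {"gen_step": 0, "attention": 0.959961, "query_token": "gen_txt[0]", "key_token": "▁fell"},
    {"gen_step": 0, "attention": 0.958984, "query_token": "gen_txt[0]", "key_token": "What"},
    {"gen_step": 30, "attention": 0.001690, "query_token": "gen_txt[30]", "key_token": "context[1643]"},
    {"gen_step": 30, "attention": 0.001671, "query_token": "gen_txt[30]", "key_token": "context[1642]"},
    {"gen_step": 30, "attention": 0.001515, "query_token": "gen_txt[30]", "key_token": "\n\n"},
    {"gen_step": 30, "attention": 0.001492, "query_token": "gen_txt[30]", "key_token": "▁the"},
]


def attention_by_key(rows, gen_step):
    """Sum attention per key token for one generation step, largest total first."""
    totals = defaultdict(float)
    for row in rows:
        if row["gen_step"] == gen_step:
            totals[row["key_token"]] += row["attention"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


for key_token, total in attention_by_key(rows, gen_step=0):
    print(f"{key_token!r:<12} total attn: {total:.6f}")
```

Repeated key tokens such as `▁fell` are merged here, so a total can exceed 1.0; whether merging is appropriate depends on whether the rows come from one query position or several.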