<a href="https://colab.research.google.com/github/daka13/HowLLMsWork/blob/main/LLMs_embeddings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%%capture
!pip install transformers
!pip install einops

# Inspect the embedding layers of LLMs

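All LLMs have an initial layer of *embeddings* that maps tokens to vectors. In this mini-project you will practice extracting the embedding layer from an LLM and exploring it: you will look up a token's nearest neighbors by cosine similarity and rank tokens by the norm of their embedding vectors.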
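
To make "maps tokens to vectors" concrete, here is a minimal sketch using a made-up table rather than a real model. The names `toy_embeddings` and `token_ids` are hypothetical. It shows that an embedding lookup is just row indexing, which is equivalent to multiplying one-hot vectors by the table.

```python
import numpy as np

# Toy embedding table: 5 hypothetical tokens, 3 dimensions each.
toy_embeddings = np.arange(15, dtype=np.float32).reshape(5, 3)

# A token sequence is a list of row indices into the table.
token_ids = [4, 0, 4]

# Looking up embeddings is plain row indexing: one row per token id.
vectors = toy_embeddings[token_ids]
print(vectors.shape)  # (3, 3)

# The same lookup written as one-hot vectors times the table.
one_hot = np.eye(len(toy_embeddings), dtype=np.float32)[token_ids]
assert np.array_equal(one_hot @ toy_embeddings, vectors)
```

Repeated token ids (here `4`) return the same row each time, which is why the embedding of a token does not depend on its position at this stage.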

In [None]:
# Load model directly
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)


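Inspect the model architecture to find the initial embedding layer. It is the layer with by far the largest weight matrix, because it maps every token in the vocabulary to a vector of the model's internal dimension. In the printout below it appears as `wte`, an `Embedding(51200, 2048)`.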

In [None]:
model

MixFormerSequentialForCausalLM(
  (layers): Sequential(
    (0): Embedding(
      (wte): Embedding(51200, 2048)
      (drop): Dropout(p=0.0, inplace=False)
    )
    (1): ParallelBlock(
      (ln): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)
      (resid_dropout): Dropout(p=0.0, inplace=False)
      (mixer): MHA(
        (rotary_emb): RotaryEmbedding()
        (Wqkv): Linear(in_features=2048, out_features=6144, bias=True)
        (out_proj): Linear(in_features=2048, out_features=2048, bias=True)
        (inner_attn): SelfAttention(
          (drop): Dropout(p=0.0, inplace=False)
        )
        (inner_cross_attn): CrossAttention(
          (drop): Dropout(p=0.0, inplace=False)
        )
      )
      (mlp): MLP(
        (fc1): Linear(in_features=2048, out_features=8192, bias=True)
        (fc2): Linear(in_features=8192, out_features=2048, bias=True)
        (act): NewGELUActivation()
      )
    )
    (2): ParallelBlock(
      (ln): LayerNorm((2048,), eps=1e-05, elementwis
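
The shapes in the printout above make the size gap concrete. The following is a rough sketch that only counts the layers visible in the first block. The helper `linear_params` is a name made up here, and the numbers are copied by hand from the printout, not read from the model.

```python
# Shapes copied by hand from the model printout above.
vocab_size, d_model = 51200, 2048

def linear_params(in_features, out_features, bias=True):
    """Parameter count of a Linear layer: weight matrix plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)

param_counts = {
    "wte": vocab_size * d_model,            # Embedding(51200, 2048)
    "Wqkv": linear_params(2048, 6144),
    "out_proj": linear_params(2048, 2048),
    "fc1": linear_params(2048, 8192),
    "fc2": linear_params(8192, 2048),
}

# Print the layers from largest to smallest.
for name, n in sorted(param_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>8}: {n:,}")
```

Under these assumptions the embedding table holds 104,857,600 parameters, roughly six times as many as the largest linear layer in a block.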

In [None]:
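# Pull the token-embedding matrix out of the first layer as a NumPy array
# (one row per token id, one column per embedding dimension)
embeddings = model.layers[0].wte.weight.detach().cpu().numpy()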

In [None]:
import numpy as np

# vocab[i] holds the decoded string for token id i
vocab = np.empty(tokenizer.vocab_size, dtype=object)
for i in range(tokenizer.vocab_size):
    vocab[i] = tokenizer.decode(i)

In [None]:
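# Sort the vocabulary according to the vector provided, print the words at both extremes
def sort_words(v, n=10):
    # v[i] is the score of token id i. zip stops at the shorter input, so any
    # rows of v beyond the end of vocab are ignored.
    words = sorted(zip(v, vocab), reverse=True)
    output = [[word for score, word in words[:n]], "...",
              [word for score, word in words[-n:]]]
    return output

# Cosine similarity between token w_id and every token in the embedding matrix
def cosine_sim(w_id):
    norms = np.linalg.norm(embeddings, axis=1)           # length of each row
    inner_products = embeddings @ embeddings[w_id, :]    # dot product with the query row
    inner_products /= (norms * norms[w_id])              # normalize by both lengths
    return inner_products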
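
As a sanity check on the cosine similarity formula, the same computation can be run on a tiny hand-made table where the answers are obvious. This is a sketch with hypothetical names (`toy`, `toy_cosine_sim`). It takes the table as an argument, unlike `cosine_sim` above, which reads the global `embeddings`.

```python
import numpy as np

# Hypothetical 4-token, 2-dimensional embedding table.
toy = np.array([[1.0, 0.0],
                [2.0, 0.0],    # same direction as row 0, different length
                [0.0, 3.0],    # orthogonal to row 0
                [-1.0, 0.0]])  # opposite direction to row 0

def toy_cosine_sim(table, w_id):
    """Cosine similarity between row w_id and every row of table."""
    norms = np.linalg.norm(table, axis=1)
    return (table @ table[w_id]) / (norms * norms[w_id])

sims = toy_cosine_sim(toy, 0)
print(sims)  # values 1, 1, 0, -1 for rows 0 to 3
```

Row 1 scores 1 even though it is twice as long as row 0: cosine similarity compares directions only, so the norm of an embedding has no effect on it.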

In [None]:
vocab[5500:5700]

array(['uts', ' Each', ' Jeff', ' stress', ' accounts', ' guarant',
       ' Ann', 'edia', ' honest', ' tree', ' African', ' Bush', '},',
       ' sch', ' Only', ' fif', 'igan', ' exercise', ' Exp',
       ' scientists', ' legislation', ' Work', ' Spr', 'Â', ' Human',
       ' �', ' survey', ' rich', 'rip', ' maintain', ' flo',
       ' leadership', 'stream', ' Islamic', ' 01', ' College', ' magic',
       ' Prime', ' figures', '2017', 'inder', 'xual', ' Dead',
       ' absolutely', ' fourth', ' presented', 'respond', 'rible',
       ' alcohol', 'ato', ' DE', 'porary', ' grab', ' vari', ' quant',
       ' Photo', ' plus', 'rick', 'arks', ' alternative', ' pil',
       ' approx', 'that', ' objects', ' Ro', ' Android', ' significantly',
       ' Road', 'kay', 'Read', 'avor', ' acknow', ' HD', ' Sing', 'Or',
       ' Mont', ' uns', 'prof', ' negoti', ' Arch', 'iki', ' television',
       ' Jewish', ' committee', ' motor', ' appearance', ' sitting',
       ' strike', ' Down', 'comp', ' His

In [None]:
tokenizer.encode(" College")

[5535]

In [None]:
sort_words(cosine_sim(5535))

[[' College',
  ' college',
  'College',
  ' colleges',
  'college',
  ' Colleges',
  ' University',
  ' School',
  ' university',
  'University'],
 '...',
 [' skirm',
  ' benign',
  'Thor',
  ' muttered',
  'Tact',
  'pull',
  ' puzz',
  '\x03',
  'ispers',
  ' ruth']]
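
In the output above the query token is its own nearest neighbor, since its cosine similarity with itself is 1. The function below is a sketch of one way to get only the top `k` neighbors without sorting the whole vocabulary, skipping the query itself. `top_k_neighbors` is a hypothetical helper, not part of the notebook, and it assumes `1 <= k < len(scores)`.

```python
import numpy as np

def top_k_neighbors(scores, query_id, k=3):
    """Return ids of the k highest scores, best first, skipping query_id.

    Assumes 1 <= k < len(scores).
    """
    scores = np.array(scores, dtype=float)   # copy so the caller's array is untouched
    scores[query_id] = -np.inf               # exclude the query token itself
    candidates = np.argpartition(scores, -k)[-k:]             # top k, in no particular order
    return candidates[np.argsort(scores[candidates])[::-1]]   # order those k, best first

toy_scores = [1.0, 0.2, 0.9, -0.5, 0.7]   # made-up similarities to token 0
print(top_k_neighbors(toy_scores, query_id=0, k=2))  # ids 2 and 4
```

With the real data, a call such as `vocab[top_k_neighbors(cosine_sim(5535)[:len(vocab)], 5535, k=10)]` would list the neighbors of `' College'`. The slice keeps the ids inside `vocab` in case the embedding matrix has more rows than the tokenizer has tokens.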

In [None]:
sort_words(np.linalg.norm(embeddings, axis=1))

'� �  TheNitrome  guiIcon  TheNitromeFan  guiActive  unfocusedRange channelAvailability GoldMagikarp  srfN ... -  by  that  as  on  to  for  from  with  in'
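
`sort_words` returns only the tokens, so the norms themselves are not visible in the ranking above. A small variant that keeps each score next to its label might look like this. It is a sketch on made-up data, and `rank_with_scores`, `toy_table` and `toy_labels` are hypothetical names.

```python
import numpy as np

def rank_with_scores(scores, labels, n=2):
    """Return (top n, bottom n) as lists of (label, score) pairs, highest score first."""
    m = min(len(scores), len(labels))            # ignore scores that have no label
    scores = np.asarray(scores[:m], dtype=float)
    order = np.argsort(scores)[::-1]             # indices from highest to lowest score
    pairs = [(labels[i], float(scores[i])) for i in order]
    return pairs[:n], pairs[-n:]

toy_table = np.array([[3.0, 4.0], [0.0, 1.0], [6.0, 8.0], [0.3, 0.4]])
toy_labels = ["a", "b", "c", "d"]                # hypothetical tokens
top, bottom = rank_with_scores(np.linalg.norm(toy_table, axis=1), toy_labels)
print(top, bottom)
```

Applied to the notebook's data, `rank_with_scores(np.linalg.norm(embeddings, axis=1), vocab, n=10)` would show how far apart the largest and smallest norms actually are.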