<a href="https://colab.research.google.com/github/Reese-Martin/MI_practice/blob/main/streamlit_TransformerLens_intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [3]:
# colab needs to have non-standard libraries reinstalled (because I am being lazy)
%pip install einops fancy_einsum torchtyping transformer_lens circuitsvis plot_utils

Collecting plot_utils
  Downloading plot_utils-0.6.14-py2.py3-none-any.whl (13.3 MB)
Installing collected packages: plot_utils
Successfully installed plot_utils-0.6.14


In [4]:
# imports come straight from the streamlit page
import os; os.environ["ACCELERATE_DISABLE_RICH"] = "1"
import plotly.express as px
import plotly.io as pio
pio.renderers.default = "notebook_connected" # or use "browser" if you want plots to open with browser
import plotly.graph_objects as go
import torch as t
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import einops
from fancy_einsum import einsum
from torchtyping import TensorType as TT
from typing import List, Optional, Tuple, Union
import functools
from tqdm import tqdm
from IPython.display import display

from transformer_lens.hook_points import HookPoint
from transformer_lens import utils, HookedTransformer, HookedTransformerConfig, FactoredMatrix, ActivationCache
import circuitsvis as cv

import tests
import plot_utils

# Saves computation time, since we don't need it for the contents of this notebook
t.set_grad_enabled(False)

MAIN = __name__ == "__main__"

def imshow(tensor, xaxis="", yaxis="", caxis="", **kwargs):
    return px.imshow(utils.to_numpy(tensor), color_continuous_midpoint=0.0, color_continuous_scale="RdBu", labels={"x":xaxis, "y":yaxis, "color":caxis}, **kwargs)

def line(tensor, xaxis="", yaxis="", **kwargs):
    return px.line(utils.to_numpy(tensor), labels={"x":xaxis, "y":yaxis}, **kwargs)

def scatter(x, y, xaxis="", yaxis="", caxis="", **kwargs):
    x = utils.to_numpy(x)
    y = utils.to_numpy(y)
    return px.scatter(y=y, x=x, labels={"x":xaxis, "y":yaxis, "color":caxis}, **kwargs)

In [21]:
device = t.device("cuda" if t.cuda.is_available() else "cpu")

gpt2_small = HookedTransformer.from_pretrained("gpt2-small", device=device)

Loaded pretrained model gpt2-small into HookedTransformer


In [20]:
# EXERCISE: inspect the model's number of layers, heads per layer, and maximum context length
# print model.cfg to see every parameter, or model.cfg.<name> to see a single one

print('Layers: ', gpt2_small.cfg.n_layers)
print('Heads/Layer: ', gpt2_small.cfg.n_heads)
print('Max context: ', gpt2_small.cfg.n_ctx)

# note: "gpt2-small" is TransformerLens's alias for the original GPT-2 checkpoint
# ("gpt2" on Hugging Face), so the right model was loaded. 12 layers, 12 heads per
# layer and a 1024-token context are the expected values for GPT-2 Small

Layers:  12
Heads/Layer:  12
Max context:  1024


In [22]:
# digging in to running the model and loss
model_description_text = '''## Loading Models

HookedTransformer comes loaded with over 40 open source GPT-style models. You can load any of them in with `HookedTransformer.from_pretrained(MODEL_NAME)`. Each model is loaded into the consistent HookedTransformer architecture, designed to be clean, consistent and interpretability-friendly.

For this demo notebook we'll look at GPT-2 Small, an 80M parameter model. To try the model the model out, let's find the loss on this paragraph!'''

loss = gpt2_small(model_description_text, return_type="loss")
print("Model loss:", loss)

Model loss: tensor(4.3204, device='cuda:0')
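
As far as I understand it, `return_type="loss"` gives the mean cross-entropy (in nats) for predicting each token from the tokens before it. Here is a minimal sketch of that calculation in plain PyTorch. It uses random stand-in logits and token IDs, the sizes and variable names are made up for illustration, and it is not TransformerLens's own code.

```python
import torch as t
import torch.nn.functional as F

t.manual_seed(0)

# Hypothetical sizes, chosen only to keep the example small
batch, seq, d_vocab = 1, 5, 11
logits = t.randn(batch, seq, d_vocab)           # stand-in for model output
tokens = t.randint(0, d_vocab, (batch, seq))    # stand-in for input token IDs

# The logits at position i are a prediction for the token at position i + 1,
# so drop the last position's logits and the first token
log_probs = F.log_softmax(logits, dim=-1)
pred_log_probs = log_probs[:, :-1].gather(-1, tokens[:, 1:, None])[..., 0]

# Average negative log-probability of the correct next token
loss = -pred_log_probs.mean()

# The same quantity via the built-in cross-entropy function
loss_ce = F.cross_entropy(
    logits[:, :-1].reshape(-1, d_vocab),
    tokens[:, 1:].reshape(-1),
)

print("manual loss:", loss.item())
print("F.cross_entropy loss:", loss_ce.item())
```

On that reading, a loss of about 4.32 means the model assigned the correct next token a (geometric) average probability of roughly exp(-4.32), a little over 1%.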


Digression: the difference between parameters and activations.

Parameters: the weights and biases of the trained model. They do not change when the model input changes, and they are accessible directly from the model.

Activations: temporary values calculated during the forward pass. They are functions of the input and are normally inaccessible, so hooks are needed to read them during a forward pass. **Attention scores and patterns are activations.**
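
To make the distinction concrete, here is a small sketch using a plain PyTorch forward hook on a single linear layer. This is a stand-in for illustration only: TransformerLens has its own hook points, which are not used here, and the names `save_output` and `captured` are made up.

```python
import torch as t
import torch.nn as nn

t.manual_seed(0)

layer = nn.Linear(4, 3)   # parameters: layer.weight and layer.bias
captured = {}             # hypothetical store for the hooked activation

def save_output(module, inputs, output):
    # Runs on every forward pass and records the layer's output (an activation)
    captured["out"] = output.detach()

handle = layer.register_forward_hook(save_output)

weight_before = layer.weight.detach().clone()

x1 = t.randn(2, 4)
x2 = t.randn(2, 4)

layer(x1)
act1 = captured["out"]
layer(x2)
act2 = captured["out"]

handle.remove()  # detach the hook once we are done

# The parameters are identical across both passes; the activations are not
print("weights unchanged:", t.equal(layer.weight, weight_before))
print("activations equal:", t.equal(act1, act2))
```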

Useful shortcuts:
- You can access the weights of the model in two ways:
  - model.blocks[n].attn.W_Q returns the nth block's query weights.
  - model.W_Q returns the query weights of the entire model, stacked into shape [nLayers, nHeads, d_model, d_head]. Similar shortcuts exist for the W_E, W_U, and W_pos matrices.
- Models containing MLP layers also have W_in and W_out for the linear layers.
- Most of these have bias counterparts (for example b_Q, b_in, b_out, b_U).

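The model stores its tokenizer, accessible via model.tokenizer. A few helper methods wrap it:
- model.to_str_tokens(text) converts a string into a list of tokens-as-strings.
- model.to_tokens(text) converts a string into a tensor of token IDs, with a leading batch dimension.
- model.to_string(tokens) converts a tensor (or list) of token IDs into a string.

By default, to_str_tokens and to_tokens prepend the beginning-of-sequence token, which is <|endoftext|> for GPT-2.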

In [23]:
# tokenizer examples
print(gpt2_small.to_str_tokens("gpt2"))
print(gpt2_small.to_tokens("gpt2"))
print(gpt2_small.to_string([50256,70,457,17]))

['<|endoftext|>', 'g', 'pt', '2']
tensor([[50256,    70,   457,    17]], device='cuda:0')
<|endoftext|>gpt2
