# Weight Loading Deep Dive

This notebook walks through the `load_weights` function introduced in chapter 5 and referenced in the `weight-loading-hf-transformers` supplementary material. We first import the required dependencies and print their versions.

In [None]:
from importlib.metadata import version
import torch
import numpy as np
import transformers

pkgs = ["numpy", "torch", "transformers"]
for p in pkgs:
    print(f"{p} version: {version(p)}")

In [None]:
from transformers import GPT2Model


# Supported model names, mapped to their Hugging Face Hub repository IDs
model_names = {
    "gpt2-small (124M)": "openai-community/gpt2",
    "gpt2-medium (355M)": "openai-community/gpt2-medium",
    "gpt2-large (774M)": "openai-community/gpt2-large",
    "gpt2-xl (1558M)": "openai-community/gpt2-xl"
}

CHOOSE_MODEL = "gpt2-small (124M)"

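# Load the pretrained weights; downloads are cached under "checkpoints"
gpt_hf = GPT2Model.from_pretrained(model_names[CHOOSE_MODEL], cache_dir="checkpoints")
# Evaluation mode disables dropout
gpt_hf.eval()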

Next, we create the model configurations. They are stored as JSON files in the `model_config` directory rather than defined inline.

In [2]:
import json

with open("model_config/base_config.json", "r") as base_file:
    BASE_CONFIG = json.load(base_file)

with open("model_config/model_configs.json", "r") as model_file:
    model_configs = json.load(model_file)

with open("model_config/model_names.json", "r") as names_file:
    model_names = json.load(names_file)

We will work with the `gpt2-small (124M)` model, so we use the `CHOOSE_MODEL` variable to index into the corresponding configurations.

In [None]:
CHOOSE_MODEL = "gpt2-small (124M)"
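
The indexing step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the dictionaries `example_base_config` and `example_model_configs` are hypothetical stand-ins for the contents of the JSON files, and their key names (`emb_dim`, `n_layers`, and so on) and the helper `build_config` are assumptions made for this example. The real files may be structured differently.

```python
# Hypothetical stand-ins for the JSON file contents. The key names are
# assumptions for illustration; check the actual files in model_config/.
example_base_config = {
    "vocab_size": 50257,     # GPT-2 vocabulary size
    "context_length": 1024,  # GPT-2 maximum sequence length
    "drop_rate": 0.0,
    "qkv_bias": True,
}

example_model_configs = {
    "gpt2-small (124M)": {"emb_dim": 768, "n_layers": 12, "n_heads": 12},
    "gpt2-medium (355M)": {"emb_dim": 1024, "n_layers": 24, "n_heads": 16},
}

CHOOSE_MODEL = "gpt2-small (124M)"


def build_config(base, per_model, name):
    """Return a new dict combining the shared settings with the
    model-specific settings selected by `name` (hypothetical helper)."""
    if name not in per_model:
        raise KeyError(f"Unknown model name: {name!r}")
    # Merge into a fresh dict so the shared base settings stay unmodified;
    # model-specific values win if a key appears in both.
    return {**base, **per_model[name]}


config = build_config(example_base_config, example_model_configs, CHOOSE_MODEL)
print(config)
```

Merging into a new dictionary, rather than updating the base configuration in place, keeps the shared settings reusable if a different model is selected later.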