# Saving and loading the GPT-2 model

After training, we saved the model's weights (its `state_dict`) to `model.pth` with:

```python
torch.save(model.state_dict(), 'model.pth')
```

Next, we look at how to work with saved weights, using the pretrained GPT-2 (124M) weights downloaded below.
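Loading a file written by `torch.save(model.state_dict(), ...)` generally means recreating the same architecture and then copying the stored tensors into it. The sketch below is only an illustration: `nn.Linear(4, 2)` is a hypothetical stand-in for the trained GPT model, and the temporary file name `model_demo.pth` is an assumption chosen so that the real `model.pth` is not overwritten.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in for the trained model (the real one is a GPT model).
model = nn.Linear(4, 2)

# Save only the parameters, as in the cell above, but to a temporary demo file.
path = os.path.join(tempfile.gettempdir(), "model_demo.pth")
torch.save(model.state_dict(), path)

# To load: build a model with the same architecture, then copy the weights in.
restored = nn.Linear(4, 2)
state_dict = torch.load(path, map_location="cpu")
restored.load_state_dict(state_dict)
restored.eval()  # switch to inference mode (disables dropout, if any)
```

Because only the parameters are stored, the class definition and its constructor arguments have to be available at load time and must match the ones used when saving.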

In [1]:
import urllib.request

In [2]:
url = (
    "https://raw.githubusercontent.com/rasbt/"
    "LLMs-from-scratch/main/ch05/"
    "01_main-chapter-code/gpt_download.py"
)
filename = url.split('/')[-1]
urllib.request.urlretrieve(url, filename)


('gpt_download.py', <http.client.HTTPMessage at 0x105f4f590>)
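Re-running the cell above downloads the helper script again each time. One possible way to avoid that, sketched here under the assumption that a file with the same name is an acceptable cached copy, is to check for the file first. The function name `fetch_if_missing` and its `dest_dir` parameter are hypothetical and not part of the original notebook.

```python
import os
import urllib.request


def fetch_if_missing(url, dest_dir="."):
    """Download `url` into `dest_dir` unless a file with that name already exists.

    Hypothetical helper: it trusts any existing file with the same name
    and does not verify its contents.
    """
    filename = os.path.join(dest_dir, url.split("/")[-1])
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)
    return filename
```

Calling `fetch_if_missing(url)` with the `url` defined above would return the local path and only hit the network on the first run.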

In [3]:
from gpt_download import download_and_load_gpt2

settings, params = download_and_load_gpt2(
    model_size="124M", models_dir="gpt2"
)

checkpoint: 100%|██████████| 77.0/77.0 [00:00<00:00, 37.8kiB/s]
encoder.json: 100%|██████████| 1.04M/1.04M [00:00<00:00, 1.91MiB/s]
hparams.json: 100%|██████████| 90.0/90.0 [00:00<00:00, 65.4kiB/s]
model.ckpt.data-00000-of-00001: 100%|██████████| 498M/498M [06:00<00:00, 1.38MiB/s] 
model.ckpt.index: 100%|██████████| 5.21k/5.21k [00:00<00:00, 2.12MiB/s]
model.ckpt.meta: 100%|██████████| 471k/471k [00:00<00:00, 1.14MiB/s]
vocab.bpe: 100%|██████████| 456k/456k [00:00<00:00, 1.50MiB/s]


In [4]:
print("Settings", settings)
print("Parameter dictionary keys:", params.keys())

Settings {'n_vocab': 50257, 'n_ctx': 1024, 'n_embd': 768, 'n_head': 12, 'n_layer': 12}
Parameter dictionary keys: dict_keys(['blocks', 'b', 'g', 'wpe', 'wte'])
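The settings printed above already determine several sizes in the model. As a rough sanity check, the sketch below derives the per-head dimension and the sizes of the two embedding tables from them. The variable names are illustrative, and reading `wpe` as the positional embedding table follows the usual GPT-2 naming rather than anything printed in this notebook.

```python
# Settings copied from the output above.
settings = {"n_vocab": 50257, "n_ctx": 1024, "n_embd": 768, "n_head": 12, "n_layer": 12}

# Each attention head works on an equal slice of the embedding dimension.
head_dim = settings["n_embd"] // settings["n_head"]

# Token embeddings ("wte"): one row of n_embd values per vocabulary entry.
token_emb_params = settings["n_vocab"] * settings["n_embd"]

# Positional embeddings ("wpe", assumed naming): one row per context position.
pos_emb_params = settings["n_ctx"] * settings["n_embd"]

print("Head dimension:", head_dim)                        # 64
print("Token embedding parameters:", token_emb_params)    # 38597376
print("Positional embedding parameters:", pos_emb_params) # 786432
```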


In [5]:
print(params["wte"])
print("Token embedding weight tensor dimensions:", params["wte"].shape)

[[-0.11010301 -0.03926672  0.03310751 ... -0.1363697   0.01506208
   0.04531523]
 [ 0.04034033 -0.04861503  0.04624869 ...  0.08605453  0.00253983
   0.04318958]
 [-0.12746179  0.04793796  0.18410145 ...  0.08991534 -0.12972379
  -0.08785918]
 ...
 [-0.04453601 -0.05483596  0.01225674 ...  0.10435229  0.09783269
  -0.06952604]
 [ 0.1860082   0.01665728  0.04611587 ... -0.09625227  0.07847701
  -0.02245961]
 [ 0.05135201 -0.02768905  0.0499369  ...  0.00704835  0.15519823
   0.12067825]]
Token embedding weight tensor dimensions: (50257, 768)
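The shape `(50257, 768)` means one 768-dimensional row per vocabulary entry, so looking up a token's embedding amounts to indexing rows by token ID. The sketch below shows this with a small random matrix standing in for `params["wte"]`. The sizes `10` and `4` and the token IDs are made up to keep the example light.

```python
import numpy as np

# Tiny stand-in for params["wte"]; the real matrix has shape (50257, 768).
rng = np.random.default_rng(0)
vocab_size, emb_dim = 10, 4
wte = rng.normal(size=(vocab_size, emb_dim)).astype(np.float32)

# Made-up token IDs; in practice these come from the tokenizer.
token_ids = np.array([3, 0, 7])

# Integer-array indexing picks out one embedding row per token ID.
token_vectors = wte[token_ids]

print(token_vectors.shape)  # (3, 4)
```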
