**LLM Workshop 2024 by Sebastian Raschka**

<br>
<br>
<br>
<br>

# 5) Loading pretrained weights (part 2; using LitGPT)

- In this part, we load pretrained weights using LitGPT, an open-source library
- LitGPT follows the same basic design as the LLM code we implemented previously, but it is much more sophisticated and supports more than 20 different LLMs (Mistral, Gemma, Llama, Phi, and more)

# ⚡ LitGPT

**20+ high-performance LLMs with recipes to pretrain, finetune, deploy at scale.**

<pre>
✅ From scratch implementations     ✅ No abstractions    ✅ Beginner friendly   
✅ Flash attention                  ✅ FSDP               ✅ LoRA, QLoRA, Adapter
✅ Reduce GPU memory (fp4/8/16/32)  ✅ 1-1000+ GPUs/TPUs  ✅ 20+ LLMs            
</pre>

## Basic usage:

```
# litgpt [action] [model]
litgpt  download  meta-llama/Meta-Llama-3-8B-Instruct
litgpt  chat      meta-llama/Meta-Llama-3-8B-Instruct
litgpt  evaluate  meta-llama/Meta-Llama-3-8B-Instruct
litgpt  finetune  meta-llama/Meta-Llama-3-8B-Instruct
litgpt  pretrain  meta-llama/Meta-Llama-3-8B-Instruct
litgpt  serve     meta-llama/Meta-Llama-3-8B-Instruct
```


- You can learn more about LitGPT in the [corresponding GitHub repository](https://github.com/Lightning-AI/litgpt), which contains many tutorials, use cases, and examples


In [None]:
# pip install litgpt

In [2]:
from importlib.metadata import version

pkgs = ["litgpt", "torch"]
for p in pkgs:
    print(f"{p} version: {version(p)}")

litgpt version: 0.5.8
torch version: 2.6.0
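
- If a package is missing, `version()` raises an error. The variant below is a minimal sketch that reports missing packages instead of stopping; the helper name `report_versions` is made up for this example and is not part of LitGPT

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(pkgs):
    """Return a {package: version-or-None} mapping without raising."""
    found = {}
    for name in pkgs:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment
            found[name] = None
    return found


versions = report_versions(["litgpt", "torch"])
for name, ver in versions.items():
    print(f"{name} version: {ver if ver is not None else 'not installed'}")
```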


- First, let's see which LLMs are supported

In [3]:
!litgpt download list

Please specify --repo_id <repo_id>. Available values:
allenai/OLMo-1B-hf
allenai/OLMo-7B-hf
allenai/OLMo-7B-Instruct-hf
BSC-LT/salamandra-2b
BSC-LT/salamandra-2b-instruct
BSC-LT/salamandra-7b
BSC-LT/salamandra-7b-instruct
codellama/CodeLlama-13b-hf
codellama/CodeLlama-13b-Instruct-hf
codellama/CodeLlama-13b-Python-hf
codellama/CodeLlama-34b-hf
codellama/CodeLlama-34b-Instruct-hf
codellama/CodeLlama-34b-Python-hf
codellama/CodeLlama-70b-hf
codellama/CodeLlama-70b-Instruct-hf
codellama/CodeLlama-70b-Python-hf
codellama/CodeLlama-7b-hf
codellama/CodeLlama-7b-Instruct-hf
codellama/CodeLlama-7b-Python-hf
deepseek-ai/DeepSeek-R1-Distill-Llama-70B
deepseek-ai/DeepSeek-R1-Distill-Llama-8B
EleutherAI/pythia-1.4b
EleutherAI/pythia-1.4b-deduped
EleutherAI/pythia-12b
EleutherAI/pythia-12b-deduped
EleutherAI/pythia-14m
EleutherAI/pythia-160m
EleutherAI/pythia-160m-deduped
EleutherAI/pythia-1b
EleutherAI/pythia-1b-deduped
EleutherAI/pythia-2.8b
EleutherAI/pythia-2.8b-deduped
EleutherAI/pythia-31m
El
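
- The list is long, so it can help to narrow it down by model size. The sketch below is a rough heuristic, not a LitGPT feature: it assumes the parameter count is encoded in the repo name (for example `7B`, `1.4b`, `160m`) and skips names that do not follow that pattern. The function name `size_in_billions` is hypothetical

```python
import re

# Matches a size token such as "7B", "1.4b", or "160m" in a repo name
SIZE_PATTERN = re.compile(
    r"(?<![A-Za-z0-9.])(\d+(?:\.\d+)?)([bm])(?![a-z0-9])", re.IGNORECASE
)


def size_in_billions(repo_id):
    """Guess the parameter count (in billions) from a repo name, or None."""
    model_name = repo_id.split("/")[-1]
    match = SIZE_PATTERN.search(model_name)
    if match is None:
        return None
    value, unit = float(match.group(1)), match.group(2).lower()
    return value if unit == "b" else value / 1000


repo_ids = [
    "allenai/OLMo-7B-hf",
    "codellama/CodeLlama-13b-hf",
    "EleutherAI/pythia-160m",
    "EleutherAI/pythia-1.4b",
    "microsoft/phi-2",  # no size in the name, so it is skipped
]

small_models = [
    r for r in repo_ids
    if (size := size_in_billions(r)) is not None and size <= 7
]
print(small_models)
```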

- We can then download an LLM with the `litgpt download` command

In [4]:
!litgpt download microsoft/phi-2

It is recommended to install hf_transfer for faster checkpoint download speeds: `pip install hf_transfer`
Fetching 7 files:   0%|                                   | 0/7 [00:00<?, ?it/s]
tokenizer_config.json:   0%|                        | 0.00/7.34k [00:00<?, ?B/s]
config.json: 100%|██████████████████████████████| 735/735 [00:00<00:00, 751kB/s]
tokenizer_config.json: 100%|███████████████| 7.34k/7.34k [00:00<00:00, 2.87MB/s]
Fetching 7 files:  14%|███▊                       | 1/7 [00:00<00:00,  7.19it/s]
generation_config.json: 100%|███████████████████| 124/124 [00:00<00:00, 129kB/s]
tokenizer.json:   0%|                               | 0.00/2.11M [00:00<?, ?B/s]
model.safetensors.index.json: 100%|████████| 35.7k/35.7k [00:00<00:00, 7.12MB/s]
tokenizer.json: 100%|██████████████████████| 2.11M/2.11M [00:00<00:00, 17.2MB/s]
model-00001-of-00002.safetensors:   0%|             | 0.00/5.00G [00:00<?, ?B/s]
model-00002-of-00002.safetensors:   0%|            

- LitGPT also provides a Python API for loading and using the model

In [5]:
from litgpt import LLM

llm = LLM.load("microsoft/phi-2")

llm.generate("What do Llamas eat?")

' Humans: Llamas are herbivores and mainly eat grass, shrubs, and leaves.\n'

In [6]:
result = llm.generate("What do Llamas eat?", stream=True, max_new_tokens=200)
for e in result:
    print(e, end="", flush=True)

 Llamas are herbivores and mainly feed on grass, shrubs, and leaves.
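
- With `stream=True`, the result is consumed chunk by chunk, as in the loop above. If you also want the full text afterwards, you can collect the chunks while printing them. The sketch below uses a stand-in generator (`fake_stream`, hypothetical) so that it runs without a downloaded model; in the notebook you would pass the `result` object instead

```python
def print_and_collect(chunks):
    """Print text chunks as they arrive and return the joined text."""
    pieces = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        pieces.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(pieces)


def fake_stream():
    # Stand-in for a streaming generation result (assumption for this demo)
    yield from [" Llamas", " are", " herbivores", "."]


full_text = print_and_collect(fake_stream())
```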


In [None]:
del llm  # drop the reference so the model's memory can be reclaimed
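
- `del` only removes the name `llm`; the memory is freed once nothing else references the model. If GPU memory still appears occupied afterwards, a common follow-up is to run the garbage collector and clear PyTorch's CUDA cache. The helper below is a minimal sketch (the name `release_memory` is made up here) and it falls back gracefully when PyTorch or a GPU is not available

```python
import gc


def release_memory():
    """Run garbage collection and clear the CUDA cache if one is in use.

    Returns True if the CUDA cache was cleared, otherwise False.
    """
    gc.collect()
    try:
        import torch
    except ImportError:
        return False  # PyTorch is not installed, nothing more to do
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    return False


cleared = release_memory()
print("CUDA cache cleared:", cleared)
```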

<br>
<br>
<br>
<br>

# Exercise 2: Download an LLM

- Download and try out an LLM of your own choice (recommendation: 7B parameters or smaller)
- We will finetune the LLM in the next notebook
- You can also try out the `litgpt chat` command from the terminal
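
- To judge whether a model fits on your hardware, a back-of-the-envelope estimate is parameter count times bytes per parameter. The sketch below covers the weights only and is an approximation: activations, the KV cache, and (for finetuning) gradients and optimizer states need additional memory. The function name `weight_memory_gb` is hypothetical

```python
def weight_memory_gb(params_billion, bytes_per_param=2):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    bytes_per_param: 4 for 32-bit floats, 2 for 16-bit floats.
    """
    return params_billion * bytes_per_param


for precision, nbytes in [("32-bit", 4), ("16-bit", 2)]:
    print(f"7B model, {precision} weights: ~{weight_memory_gb(7, nbytes):.1f} GB")
```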