**LLM Workshop 2024 by Sebastian Raschka**

<br>
<br>
<br>
<br>

# 5) Loading pretrained weights (part 2; using LitGPT)

- Now we load pretrained weights using LitGPT, an open-source library
- LitGPT is fundamentally similar to the LLM code we implemented previously, but it is much more sophisticated and supports more than 20 different LLMs (Mistral, Gemma, Llama, Phi, and more)

# ⚡ LitGPT

**20+ high-performance LLMs with recipes to pretrain, finetune, deploy at scale.**

<pre>
✅ From scratch implementations     ✅ No abstractions    ✅ Beginner friendly   
✅ Flash attention                  ✅ FSDP               ✅ LoRA, QLoRA, Adapter
✅ Reduce GPU memory (fp4/8/16/32)  ✅ 1-1000+ GPUs/TPUs  ✅ 20+ LLMs            
</pre>

## Basic usage:

```
# litgpt [action] [model]
litgpt  download  meta-llama/Meta-Llama-3-8B-Instruct
litgpt  chat      meta-llama/Meta-Llama-3-8B-Instruct
litgpt  evaluate  meta-llama/Meta-Llama-3-8B-Instruct
litgpt  finetune  meta-llama/Meta-Llama-3-8B-Instruct
litgpt  pretrain  meta-llama/Meta-Llama-3-8B-Instruct
litgpt  serve     meta-llama/Meta-Llama-3-8B-Instruct
```


- You can learn more about LitGPT in the [corresponding GitHub repository](https://github.com/Lightning-AI/litgpt), which contains many tutorials, use cases, and examples


In [1]:
# pip install litgpt

In [2]:
from importlib.metadata import version

pkgs = ["litgpt", "torch"]
for p in pkgs:
    print(f"{p} version: {version(p)}")

litgpt version: 0.4.3.dev0
torch version: 2.2.1+cu121
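
- `version()` raises `PackageNotFoundError` if a package is missing, which stops the cell at the first missing package. A minimal sketch of a more forgiving check follows; the helper name `report_versions` is made up for this example and is not part of LitGPT

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return a {name: version-or-None} mapping without raising on missing packages."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # package is not installed in this environment
    return found


for name, ver in report_versions(["litgpt", "torch"]).items():
    print(f"{name} version: {ver if ver is not None else 'not installed'}")
```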


- First, let's see what LLMs are supported

In [3]:
!litgpt download list

repo_id: list
Please specify --repo_id <repo_id>. Available values:
codellama/CodeLlama-13b-hf
codellama/CodeLlama-13b-Instruct-hf
codellama/CodeLlama-13b-Python-hf
codellama/CodeLlama-34b-hf
codellama/CodeLlama-34b-Instruct-hf
codellama/CodeLlama-34b-Python-hf
codellama/CodeLlama-70b-hf
codellama/CodeLlama-70b-Instruct-hf
codellama/CodeLlama-70b-Python-hf
codellama/CodeLlama-7b-hf
codellama/CodeLlama-7b-Instruct-hf
codellama/CodeLlama-7b-Python-hf
databricks/dolly-v2-12b
databricks/dolly-v2-3b
databricks/dolly-v2-7b
EleutherAI/pythia-1.4b
EleutherAI/pythia-1.4b-deduped
EleutherAI/pythia-12b
EleutherAI/pythia-12b-deduped
EleutherAI/pythia-14m
EleutherAI/pythia-160m
EleutherAI/pythia-160m-deduped
EleutherAI/pythia-1b
EleutherAI/pythia-1b-deduped
EleutherAI/pythia-2.8b
EleutherAI/pythia-2.8b-deduped
EleutherAI/pythia-31m
EleutherAI/pythia-410m
EleutherAI/pythia-410m-deduped
EleutherAI/pythia-6.9b
EleutherAI/pythia-6.9b-deduped
EleutherAI/pythia-70m
EleutherAI/pythia-70m-deduped
garage-bA

- We can then download an LLM via the following command

In [4]:
!litgpt download microsoft/phi-2

repo_id: microsoft/phi-2
Setting HF_HUB_ENABLE_HF_TRANSFER=1
For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/download#download-files-to-local-folder.
model-00001-of-00002.safetensors: 100%|█████| 5.00G/5.00G [00:18<00:00, 266MB/s]
model-00002-of-00002.safetensors: 100%|███████| 564M/564M [00:01<00:00, 335MB/s]
Converting .safetensor files to PyTorch binaries (.bin)
checkpoints/microsoft/phi-2/model-00001-of-00002.safetensors --> checkpoints/microsoft/phi-2/model-00001-of-00002.bin
checkpoints/microsoft/phi-2/model-00002-of-00002.safetensors --> checkpoints/microsoft/phi-2/model-00002-of-00002.bin
Converting checkpoint files to LitGPT format.
{'checkpoint_dir': PosixPath('checkpoints/microsoft/phi-2'),
 'debug_mode': False,
 'dtype': None,
 'model_name': None}
Loading weights: model-00002-of-00002.bin: 100%|████████| 00:08<00:00, 11.24it/s
Saving converted checkpoint to checkpoints/microsoft/phi-2
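
- According to the log above, the converted checkpoint is saved under `checkpoints/microsoft/phi-2`. A small sketch for inspecting what ended up on disk, using only the standard library; the helper `list_checkpoint_files` is hypothetical, and it makes no assumption about which file names LitGPT writes

```python
from pathlib import Path


def list_checkpoint_files(checkpoint_dir):
    """Return (relative path, size in MB) for every file under checkpoint_dir."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return []  # nothing downloaded yet, or a different download location was used
    return sorted(
        (str(p.relative_to(root)), round(p.stat().st_size / 1e6, 1))
        for p in root.rglob("*")
        if p.is_file()
    )


# Directory name taken from the download log; adjust it if you downloaded elsewhere
for name, size_mb in list_checkpoint_files("checkpoints/microsoft/phi-2"):
    print(f"{size_mb:>10.1f} MB  {name}")
```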


- LitGPT also provides a Python API for using the model

In [6]:
from litgpt import LLM

llm = LLM.load("microsoft/phi-2")

llm.generate("What do Llamas eat?")

' What do Llamas typically feed on?\n'

In [7]:
result = llm.generate("What do Llamas eat?", stream=True, max_new_tokens=200)
for e in result:
    print(e, end="", flush=True)

 Llamas are herbivores and mainly feed on grass, leaves, and other vegetation. They have a grazing behavior and spend most of the day eating.
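
- The streaming loop above prints each piece as it arrives but does not keep the text. If you also want the full response as a string, one option is to collect the pieces while printing. This is a sketch: `print_and_collect` and `fake_stream` are hypothetical names, and `fake_stream` only stands in for `llm.generate(..., stream=True)` so that the example runs without a model

```python
def print_and_collect(chunks):
    """Print streamed text pieces as they arrive and return the full string."""
    pieces = []
    for piece in chunks:
        print(piece, end="", flush=True)
        pieces.append(piece)
    print()  # finish the line after the stream ends
    return "".join(pieces)


# Hypothetical stand-in for llm.generate(..., stream=True)
def fake_stream():
    yield from [" Llamas", " are", " herbivores", "."]


text = print_and_collect(fake_stream())
```

- With a loaded model, the same helper could be called as `print_and_collect(llm.generate("What do Llamas eat?", stream=True, max_new_tokens=200))`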


<br>
<br>
<br>
<br>

# Exercise 2: Download an LLM

- Download and try out an LLM of your own choice (recommendation: 7B parameters or smaller)
- We will finetune the LLM in the next notebook
- You can also try out the `litgpt chat` command from the terminal
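
- A rough way to judge whether a model fits on your hardware: the weights alone need about (number of parameters) × (bytes per parameter). The sketch below is only back-of-the-envelope arithmetic; it ignores activations, the KV cache, and any framework overhead, and `weight_memory_gb` is a made-up helper name

```python
# Bytes needed to store one parameter at common precisions
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="bf16"):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for num_params in (3e9, 7e9):
    for dtype in ("fp32", "bf16"):
        gb = weight_memory_gb(num_params, dtype)
        print(f"{num_params / 1e9:.0f}B parameters in {dtype}: ~{gb:.0f} GB")
```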

In [8]:
!litgpt download openlm-research/open_llama_3b

repo_id: openlm-research/open_llama_3b
Setting HF_HUB_ENABLE_HF_TRANSFER=1
For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/download#download-files-to-local-folder.
config.json: 100%|█████████████████████████████| 506/506 [00:00<00:00, 5.85MB/s]
generation_config.json: 100%|██████████████████| 137/137 [00:00<00:00, 1.19MB/s]
pytorch_model.bin: 100%|███████████████████| 6.85G/6.85G [01:47<00:00, 63.8MB/s]
tokenizer.model: 100%|███████████████████████| 534k/534k [00:00<00:00, 18.7MB/s]
tokenizer_config.json: 100%|███████████████████| 593/593 [00:00<00:00, 8.88MB/s]
Converting checkpoint files to LitGPT format.
{'checkpoint_dir': PosixPath('checkpoints/openlm-research/open_llama_3b'),
 'debug_mode': False,
 'dtype': None,
 'model_name': None}
Loading weights: pytorch_model.bin: 100%|███████████████| 00:11<00:00,  8.50it/s
Saving converted checkpoint to checkpoints/openlm-research/open_llama_3b


In [9]:
del llm
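
- `del llm` only removes the Python reference to the previous model. Before loading a second model, it can help to run garbage collection and ask PyTorch to release cached GPU memory. A hedged sketch follows; `release_memory` is a hypothetical helper, and whether this step is needed depends on your environment

```python
import gc


def release_memory():
    """Run garbage collection and, if CUDA is in use, release PyTorch's cached GPU memory.

    Returns True if the CUDA cache was emptied, False otherwise.
    """
    gc.collect()  # collect objects that are no longer referenced
    try:
        import torch
    except ImportError:
        return False  # PyTorch is not installed; nothing else to do
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return unused cached blocks to the GPU driver
        return True
    return False


release_memory()
```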

In [10]:
from litgpt import LLM

llm = LLM.load("openlm-research/open_llama_3b")

llm.generate("What do Llamas eat?")

'Well, that depends on who you ask. Llamas are grass-eating animals that naturally go onitorial on walking green pastures.\nWe do that too. But what do they eat in the pastures? The answer, really'