# Introduction

## Exercise 1: Downloading a Model with LitGPT

[LitGPT](https://github.com/Lightning-AI/litgpt) is a command-line tool to finetune Large Language Models (LLMs) on user-defined datasets.

`litgpt download list` shows all available models, which can be downloaded with `litgpt download <model_name>`.
An `m` in a model name stands for million (10^6) parameters, while a `b` stands for billion (10^9) parameters.
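To make the naming convention concrete, here is a minimal sketch that reads the size suffix out of a model name. The `parameter_count` helper and its regular expression are illustrative assumptions, not part of LitGPT or the `share` module.

```python
import re
from typing import Optional

# Multipliers for the size suffixes used in model names.
SUFFIX_MULTIPLIERS = {"m": 10**6, "b": 10**9}

# A number directly followed by 'm' or 'b', not embedded in a longer word,
# e.g. "1.1B" in "TinyLlama-1.1B-Chat-v1.0".
SIZE_PATTERN = re.compile(
    r"(?<![A-Za-z0-9.])(\d+(?:\.\d+)?)([mb])(?![A-Za-z0-9])", re.IGNORECASE
)


def parameter_count(model_name: str) -> Optional[int]:
    """Return the parameter count encoded in a model name, or None if absent."""
    match = SIZE_PATTERN.search(model_name)
    if match is None:
        return None
    number, suffix = match.groups()
    return round(float(number) * SUFFIX_MULTIPLIERS[suffix.lower()])


print(parameter_count("TinyLlama/TinyLlama-1.1B-Chat-v1.0"))
```

Names without a size suffix return `None`, so the helper can be used to filter the output of `litgpt download list` by hand.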

You can also download a different model and prompt it later. Be aware that models that are too large will not fit on your GPU, which is limited to 40 GB of memory. If you download another model, call the `share.convert_litgpt_pytorch` function on the model's directory path to convert it to PyTorch format.
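A rough way to judge whether a model fits is to multiply its parameter count by the bytes stored per parameter. The sketch below assumes the 40 GB limit means 40 × 10^9 bytes and counts the weights only; activations and the generation cache need additional memory, so treat the result as a lower bound. The helper name and the example sizes are illustrative.

```python
# The 40 GB limit from the text, assumed here to mean 40 * 10^9 bytes.
GPU_MEMORY_BYTES = 40 * 10**9

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_bytes(num_parameters: int, dtype: str = "float16") -> int:
    """Estimate the memory taken by the model weights alone."""
    return num_parameters * BYTES_PER_PARAMETER[dtype]


# Illustrative sizes: 1.1 billion, 7 billion and 70 billion parameters.
for label, num_parameters in [("1.1b", 1_100_000_000), ("7b", 7_000_000_000), ("70b", 70_000_000_000)]:
    needed = weight_memory_bytes(num_parameters)
    verdict = "fits" if needed < GPU_MEMORY_BYTES else "does not fit"
    print(f"{label}: about {needed / 10**9:.1f} GB of weights in float16, {verdict} in 40 GB")
```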

In [None]:
# local imports
import share

!litgpt download list
!litgpt download TinyLlama/TinyLlama-1.1B-Chat-v1.0
share.convert_litgpt_pytorch(share.TINYLLAMA_MODEL_DIR)

## Exercise 2: Loading a model

Run `nvidia-smi --query -d MEMORY` to check that no model is currently loaded on the GPU. Use the `share.load_model` function and the `share.*_MODEL_DIR` variables to load a model. Run the `nvidia-smi` command again to confirm that the model has been loaded onto the GPU.
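If you prefer to compare the before and after values in Python rather than by eye, `nvidia-smi` can also print memory usage as CSV. The following is a minimal sketch; the helper names are illustrative, and it assumes every GPU reports plain integer values in MiB.

```python
import shutil
import subprocess
from typing import List, Tuple


def parse_memory_csv(output: str) -> List[Tuple[int, int]]:
    """Parse 'used, total' lines (in MiB) into one (used, total) tuple per GPU."""
    rows = []
    for line in output.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory() -> List[Tuple[int, int]]:
    """Query nvidia-smi; return an empty list if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
    )
    return parse_memory_csv(result.stdout) if result.returncode == 0 else []


for index, (used, total) in enumerate(gpu_memory()):
    print(f"GPU {index}: {used} MiB used of {total} MiB")
```

Calling `gpu_memory()` before and after loading the model gives two values whose difference approximates the memory the model occupies.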

In [None]:
# standard library imports
import gc

# third party imports
import torch

!nvidia-smi --query -d MEMORY

model = share.load_model(...) # model to load here

!nvidia-smi --query -d MEMORY

# unload model
del model
gc.collect()
torch.cuda.empty_cache()

## Exercise 3: Prompting a model

The `share.load_tokenizer` function from the `share` module loads the matching tokenizer for the model.

Pass the model and the tokenizer along with your prompt to the `share.prompt` function. The `max_new_tokens` keyword argument sets the maximum number of new tokens the model will generate. Experiment with different values for this argument.
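To see why `max_new_tokens` is an upper bound rather than an exact length, consider the toy generation loop below. It is not how `share.prompt` is implemented; the `generate` function, the `EOS` marker and the toy model are hypothetical stand-ins that only illustrate the two stopping conditions.

```python
from typing import Callable, List

EOS = "<eos>"  # hypothetical end-of-sequence marker for this toy example
REPLY = ["How", "can", "I", "help", "?"]  # fixed reply the toy model replays


def generate(next_token: Callable[[List[str]], str], prompt_tokens: List[str], max_new_tokens: int) -> List[str]:
    """Append one token at a time until EOS appears or the cap is reached."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        if token == EOS:
            break  # the model decided to stop before reaching the cap
        tokens.append(token)
    return tokens


def make_toy_model(prompt_length: int) -> Callable[[List[str]], str]:
    """Toy stand-in for a model: replay REPLY token by token, then emit EOS."""
    def next_token(tokens: List[str]) -> str:
        position = len(tokens) - prompt_length
        return REPLY[position] if position < len(REPLY) else EOS
    return next_token


prompt = ["Hello", "world"]
model = make_toy_model(len(prompt))
for cap in (3, 10):
    print(f"max. {cap} new tokens: {' '.join(generate(model, prompt, cap))}")
```

With a cap of 3 the reply is cut off mid-sentence, while with a cap of 10 the loop ends early because the toy model emits its end marker after five tokens.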

In [None]:
# local imports
import share

model = share.load_model(share.TINYLLAMA_MODEL_DIR)
tokenizer = share.load_tokenizer(share.TINYLLAMA_MODEL_DIR)

In [None]:
text = "Hello, world!"
for max_new_tokens in (...): # max_new_tokens values here
    print(f"max. {max_new_tokens} new tokens: {share.prompt(model, tokenizer, text, max_new_tokens=max_new_tokens)}")

In [None]:
# unload model
del model
gc.collect()
torch.cuda.empty_cache()

## Exercise 4: Exploring a dataset

The `share` module provides pre-defined variables and helper functions for the exercises. In this exercise, the `enron_spam` dataset is loaded with the `share.load_test_dataset` function.

`test_dataset` is an instance of the [Dataset](https://huggingface.co/docs/datasets/package_reference/main_classes#datasets.Dataset) class from the [datasets](https://huggingface.co/docs/datasets/index) library.

The `num_columns`, `num_rows` and `column_names` properties let you explore the dataset. The exercise code uses the `sort` method to sort by the `Spam/Ham` column and prints the `text` column of the first 3 spam e-mails. Slicing selects values from multiple rows at once (e.g. `test_dataset[:2]["Spam/Ham"]`).
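A slice of a `Dataset` is returned as a dictionary that maps column names to lists of values, which is why a column name is applied after the slice. The plain-Python sketch below mimics that column-oriented shape without using the `datasets` library; the helper functions and the four sample rows are made up for illustration.

```python
from typing import Dict, List

# Toy column-oriented table; the values are invented for this example.
columns: Dict[str, List[str]] = {
    "text": ["meeting at 10", "win a prize now", "quarterly report", "cheap pills"],
    "Spam/Ham": ["ham", "spam", "ham", "spam"],
}


def sort_rows(table: Dict[str, List[str]], by: str, reverse: bool = False) -> Dict[str, List[str]]:
    """Reorder every column by the values of one column."""
    order = sorted(range(len(table[by])), key=lambda i: table[by][i], reverse=reverse)
    return {name: [values[i] for i in order] for name, values in table.items()}


def slice_rows(table: Dict[str, List[str]], rows: slice) -> Dict[str, List[str]]:
    """Select rows from every column, keeping the dict-of-columns shape."""
    return {name: values[rows] for name, values in table.items()}


# "spam" sorts after "ham", so reverse=True puts the spam rows first.
first_two = slice_rows(sort_rows(columns, "Spam/Ham", reverse=True), slice(0, 2))
print(first_two["Spam/Ham"])
print(first_two["text"])
print(list(first_two))  # iterating over the slice yields column names, not rows
```

Because iterating over such a slice yields column names, select a column first when you want to loop over individual messages.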

In [None]:
# standard library imports
import os

# local imports
import share

test_dataset = share.load_test_dataset(share.ENRON_SPAM_TEST_DATASET)

print(f"Number of rows: {test_dataset.num_rows}")
print(f"Number of columns: {test_dataset.num_columns}")
print("Column names: " + ", ".join(f"'{column_name}'" for column_name in test_dataset.column_names))

# a Dataset slice is a dict of columns, so select the "text" column before iterating
for i, message in enumerate(test_dataset.sort("Spam/Ham", reverse=True)[:3]["text"], start=1):
    print(f"{os.linesep}Message #{i}: {message}")