# Warm-Up Notebook
---

You've made it to the first notebook in this workshop! As a good way to check that everything is running correctly, let's preload the model weights to save time later.

In [None]:
import torch
from transformers import pipeline

We'll be using [StableLM](https://huggingface.co/stabilityai/stablelm-tuned-alpha-7b) throughout this workshop. To trigger the full download of the weights, set up a pipeline that caches the model on the fast NVMe storage on Anyscale.

In [None]:
p = pipeline(model="stabilityai/stablelm-tuned-alpha-7b", task="text-generation",
             model_kwargs={"device_map": "auto", "torch_dtype": torch.float16,
                           "cache_dir": "/mnt/local_storage/"})
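
To confirm that the weights actually landed in the cache directory, you can total the file sizes under it. The following is a minimal sketch using only the standard library; `dir_size_bytes` is a hypothetical helper name, and it skips symlinks on the assumption that the Hugging Face cache links snapshot files to a shared blob store, so following them would count the same bytes twice.

```python
import os


def dir_size_bytes(root):
    """Return the total size in bytes of regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

For example, `dir_size_bytes("/mnt/local_storage/") / 1e9` gives the cache size in gigabytes. A missing directory simply yields 0.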

Verify that the model is loaded into GPU memory.

In [None]:
! nvidia-smi
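
If you would rather compare numbers than read the full table, `nvidia-smi` can also emit CSV. The sketch below queries used and total memory per GPU and parses the result. The helper names `parse_gpu_memory` and `gpu_memory_mib` are hypothetical, and the parser assumes every field is a plain integer in MiB (some devices may report a non-numeric value instead).

```python
import subprocess

# Ask nvidia-smi for "used, total" memory in MiB, one CSV row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows into a list of (used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory_mib():
    """Run nvidia-smi and return per-GPU memory; empty list if it is unavailable."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_gpu_memory(result.stdout)
```

Calling `gpu_memory_mib()` now and again after the cleanup step gives two readings you can compare directly.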

In production, this memory is freed when the process exits. In a notebook (or any other long-running development process), however, it can be useful to release memory you no longer need explicitly.

To do that, delete the pipeline object, then use 🤗 Accelerate to release any unused GPU memory.

In [None]:
del p

In [None]:
from accelerate import Accelerator

accelerator = Accelerator()
accelerator.free_memory()
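
As a rough picture of what this kind of cleanup involves, the sketch below runs Python's garbage collector and then asks PyTorch to return its cached CUDA blocks to the driver. It is an assumption that this approximates what `free_memory()` does, not a copy of Accelerate's implementation, and `free_gpu_memory` is a hypothetical helper name.

```python
import gc


def free_gpu_memory():
    """Collect unreachable Python objects, then empty PyTorch's CUDA cache if present.

    Returns the number of unreachable objects found by the garbage collector.
    """
    # Tensors are only released once nothing references them any more.
    collected = gc.collect()
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch there is no CUDA cache to empty.
        return collected
    if torch.cuda.is_available():
        # Hand cached, unused blocks back to the driver so nvidia-smi reflects the change.
        torch.cuda.empty_cache()
    return collected
```

The order matters: memory held by the model can only be released after the last reference to it (here, `p`) has been deleted and collected.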

Verify that the memory has been freed. You can also check the Ray Dashboard for Node GPU memory time-series metrics.

In [None]:
! nvidia-smi