# Getting started

### Install PyTorch

I have an NVIDIA GPU, so I install the CUDA build of PyTorch for GPU acceleration.
See the [PyTorch Guide](https://pytorch.org/get-started/locally/) for the install command that matches your system.

To check your NVIDIA CUDA version on Windows:
1. Open NVIDIA Control Panel
2. Expand Help, and click System Information
3. Go to the Components Tab
4. Under 3D Settings, find the CUDA entry (for example 'NVCUDA64.DLL') and read the version from its Product Name, for example 'NVIDIA CUDA __12.8.51__ driver'
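
The version read in step 4 decides which wheel index to pass to `pip`. The following is a minimal sketch of that mapping, assuming the Product Name follows the 'NVIDIA CUDA 12.8.51 driver' pattern shown above. The helper names `cuda_tag` and `wheel_index_url` are hypothetical, and PyTorch publishes wheels only for certain CUDA versions, so confirm the resulting URL against the PyTorch Guide.

```python
import re


def cuda_tag(product_name: str) -> str:
    """Turn e.g. 'NVIDIA CUDA 12.8.51 driver' into the wheel tag 'cu128'."""
    match = re.search(r"CUDA\s+(\d+)\.(\d+)", product_name)
    if match is None:
        raise ValueError(f"No CUDA version found in {product_name!r}")
    major, minor = match.groups()
    return f"cu{major}{minor}"


def wheel_index_url(product_name: str) -> str:
    """Build the --index-url value for pip from the Product Name string."""
    return f"https://download.pytorch.org/whl/{cuda_tag(product_name)}"


print(wheel_index_url("NVIDIA CUDA 12.8.51 driver"))
# prints https://download.pytorch.org/whl/cu128
```

The driver reports the highest CUDA version it supports, so a wheel built for the same or an older CUDA version should generally work.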

In [None]:
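# Run the command below in your system terminal, not in this notebook.
# The cu128 suffix selects the CUDA 12.8 build; change it to match your CUDA version.

# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128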

### Verify PyTorch Is Installed

In [None]:
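import torch

# Print the result explicitly: a notebook cell only displays its last expression.
print("CUDA available:", torch.cuda.is_available())

# Create a random 5x3 tensor to confirm that PyTorch itself works.
x = torch.rand(5, 3)
print(x)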
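
If the check reports that CUDA is unavailable, it helps to know whether PyTorch is missing, installed as a CPU-only build, or installed with CUDA but unable to see the GPU. The sketch below gathers those facts in one place. The function name `torch_report` and the dictionary keys are hypothetical; the `torch` attributes it reads are part of PyTorch.

```python
import importlib.util


def torch_report() -> dict:
    """Collect basic facts about the local PyTorch install."""
    # Avoid an ImportError if PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    report = {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built for; None on CPU-only builds.
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
    return report


print(torch_report())
```

A `built_for_cuda` value of `None` suggests the CPU-only wheel was installed, in which case reinstalling with the `--index-url` option shown earlier is the likely fix.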

### Import Libraries

_`torch` is commented out because it was already imported above._

In [None]:
#import torch
import transformers
import os
import time
import json

### Authenticate Hugging Face

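The Llama models are gated, so you must accept the license on the model page and authenticate with a Hugging Face access token. If you do not authenticate, the download fails with a 401 error.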

In [None]:
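# VBox is not used directly; importing it fails early if ipywidgets,
# which the notebook login widget relies on, is not installed.
from ipywidgets import VBox
from huggingface_hub import login

# Prompts for your Hugging Face access token.
login()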
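
When the notebook runs somewhere the interactive prompt cannot appear, such as a scheduled job, the token can be passed to `login` directly. The following is a minimal sketch, assuming the token is stored in an environment variable named `HF_TOKEN`. The helpers `resolve_token` and `authenticate` are hypothetical names; `login(token=...)` is the `huggingface_hub` call.

```python
import os


def resolve_token(env=None):
    """Return the token stored in HF_TOKEN, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def authenticate(env=None):
    """Log in with HF_TOKEN when present, otherwise fall back to the prompt."""
    # Imported lazily so resolve_token can be used without huggingface_hub.
    from huggingface_hub import login

    token = resolve_token(env)
    if token:
        login(token=token)
    else:
        login()
```

Calling `authenticate()` then takes the place of the bare `login()` call. Keep the token out of the notebook itself so it is not saved or shared with the file.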

### Choose the model

You can download the model manually from [Hugging Face](https://huggingface.co/meta-llama), or let `transformers` download and cache it the first time the pipeline is created. The following code block is taken from the [Llama 3.3 70B Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) model example.

In [None]:
import transformers
import torch

model_id = "meta-llama/Llama-3.3-70B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    # system: defines how the LLM behaves
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    # user: defines the input provided by the user
    {"role": "user", "content": "Who are you?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
)
# generated_text holds the whole conversation; the last entry is the model's reply.
print(outputs[0]["generated_text"][-1])
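
The final `print` shows the whole reply message, role included. To keep only the reply text, index one level further into that message. The sketch below uses stand-in data in the same shape the cell above indexes into, so it runs without loading the model. The helper name `last_reply_text` and the sample content are hypothetical, not real model output.

```python
def last_reply_text(outputs) -> str:
    """Return the text of the final message in the first generation."""
    conversation = outputs[0]["generated_text"]
    final_message = conversation[-1]
    return final_message["content"]


# Stand-in data shaped like the pipeline result for chat-style input.
sample_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "<model reply would appear here>"},
        ]
    }
]

print(last_reply_text(sample_outputs))
```

With the real pipeline, `last_reply_text(outputs)` would return just the assistant's text, which is easier to log or pass to later cells.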