# LLM Inference with AMD Instinct™ MI300X Accelerators
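This tutorial shows how to run large language model (LLM) inference with Hugging Face Transformers on AMD Instinct GPUs, using a ROCm PyTorch Docker container.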

## Prerequisites
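### 1. Hardware Requirements
- An AMD GPU supported by ROCm (e.g., MI210, MI300X).
- A system that meets the System Requirements, including ROCm 6.0+ and Ubuntu 22.04.
### 2. Docker
- Install Docker with GPU support.
- Ensure your user has permission to access the GPU (on ROCm systems this typically means membership in the `video` and `render` groups). The command below runs `rocm-smi` in a container as a quick check: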

```bash
docker run --rm --device=/dev/kfd --device=/dev/dri rocm/pytorch:rocm6.1_ubuntu22.04_py3.10_pytorch_2.1.2 rocm-smi
```
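
The same check can be made from Python inside a container that has PyTorch installed. ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` API, so a minimal sketch could look like this (the helper name `describe_gpus` is illustrative and not part of any library):

```python
def describe_gpus():
    """Return one description string per GPU visible to PyTorch (empty list if none)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return []
    if not torch.cuda.is_available():
        # No GPU is visible, e.g. the /dev/kfd and /dev/dri devices were not passed in.
        return []
    return [
        f"{i}: {torch.cuda.get_device_name(i)}"
        for i in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    gpus = describe_gpus()
    print("\n".join(gpus) if gpus else "No GPU visible to PyTorch")
```

An empty result usually points to a missing `--device` flag or a permissions problem rather than a PyTorch issue.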
### 3. Hugging Face API Access
- Obtain an API token from Hugging Face for downloading models.
- Make sure the account behind the token has been approved for access to Meta's Llama checkpoints, which are gated.

## Prepare Inference Environment
### 1. Pull the Docker Image and Start a Container
```bash
# Host machine
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  --shm-size 1G \
  --security-opt seccomp=unconfined --security-opt apparmor=unconfined \
  -v "$(pwd)":/workspace \
  --env HUGGINGFACE_HUB_CACHE=/workspace \
  rocm/pytorch:latest

# Inside the container 
cd /workspace 
export HF_TOKEN="Your hugging face token to access gated models" 
pip install accelerate transformers 
```
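
Because the `export` line above ships with placeholder text, it is easy to start a download with no usable token. The following is a minimal sketch of a sanity check, assuming the token is passed through the `HF_TOKEN` environment variable as shown; the helper name `token_looks_configured` is hypothetical, and the check does not contact Hugging Face or verify access to gated models:

```python
import os

# Placeholder text copied from the export line in this tutorial.
PLACEHOLDER = "Your hugging face token to access gated models"


def token_looks_configured(env=None):
    """Return True if HF_TOKEN is set to something other than the placeholder."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return bool(token) and token != PLACEHOLDER


if __name__ == "__main__":
    if token_looks_configured():
        print("HF_TOKEN is set")
    else:
        print("HF_TOKEN is missing or still set to the placeholder text")
```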

### 2. Install and Launch Jupyter
Inside the Docker container, install Jupyter using the following commands:
```bash
pip install --upgrade pip setuptools wheel
pip install jupyter
```
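Start the Jupyter server. The `docker run` command above does not publish any ports, so to reach the server at `localhost:8888` from the host, add `-p 8888:8888` (or `--network host`) when starting the container: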
```bash
jupyter-lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root
```
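
Once the server is running, one way to confirm that it is accepting connections is a small TCP check run from a second shell. This is a minimal sketch using only the Python standard library; the helper name `port_is_open` is hypothetical, and the default port simply mirrors the `--port=8888` flag above:

```python
import socket


def port_is_open(host="127.0.0.1", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"Jupyter port 8888 is {state}")
```

A successful connection only shows that something is listening on the port; it does not validate the Jupyter login token.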
### 3. Run a Sample LLM
Create a file named `hf_transformers.py` inside the Docker container:

```python
# hf_transformers.py
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a chatbot in the online shopping mall!"},
    {"role": "user", "content": "How can I get a refund of this product?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=10,
)

print(outputs[0]["generated_text"][-1])
```

Run the script:

```bash
python hf_transformers.py
```

This gives:

```
{'role': 'assistant', 'content': "I'd be happy to help you with the refund"}
```
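
The script prints `outputs[0]["generated_text"][-1]` because, when the pipeline is given a list of chat messages, `generated_text` holds the whole conversation with the new assistant message appended at the end. The sketch below illustrates that indexing on a hand-written stand-in for the pipeline's return value; no model is run, and `mock_outputs` and `last_message` are illustrative names only:

```python
def last_message(outputs):
    """Return the final message of the first generated conversation."""
    return outputs[0]["generated_text"][-1]


# Hand-written stand-in for the pipeline's return value (no model is run here).
mock_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a chatbot in the online shopping mall!"},
            {"role": "user", "content": "How can I get a refund of this product?"},
            {"role": "assistant", "content": "I'd be happy to help you with the refund"},
        ]
    }
]

reply = last_message(mock_outputs)
print(reply["content"])  # I'd be happy to help you with the refund
```

The reply stops mid-sentence because the script sets `max_new_tokens=10`; a larger value allows a complete answer.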
