# 🚀 NIM Workshop Setup Guide

Welcome to the NVIDIA NIM Workshop! This notebook will help you set up everything needed for working with **Llama 3.2 1B Instruct**.

## 📋 Quick Start

1. **Run cell 1**: Set up your API keys
2. **Run cell 2**: Check prerequisites  
3. **Run cell 3**: Download the Llama 3.2 1B model and the NIM container
4. **Run cell 4**: Verify setup

## 🤖 Model Information

**Llama 3.2 1B Instruct** (2.3 GB)
- Perfect size for demos and LoRA fine-tuning
- NeMo 2 format with distributed checkpoints
- Works with standard NGC access

## 🛠️ Prerequisites

- **NGC Account**: Free account at [ngc.nvidia.com](https://ngc.nvidia.com)
- **NGC API Key**: Generate at [ngc.nvidia.com/setup/api-key](https://ngc.nvidia.com/setup/api-key)
- **NVIDIA API Key**: For cloud NIMs from [build.nvidia.com](https://build.nvidia.com)
- **Docker**: For local NIM deployment
- **15 GB+ disk space**: For the model and container

Let's get started! 🚀


## Step 1: Set Up Your API Keys

You'll need two API keys for this workshop:

1. **NGC API Key** - To download the model
2. **NVIDIA API Key** - To use cloud-hosted NIMs

Run the cell below and enter your keys when prompted:


In [1]:
import os
import getpass

print("🔐 API Key Setup\n")

# Get NGC API Key
print("Enter your NGC API Key (for model downloads):")
print("Get one at: https://ngc.nvidia.com/setup/api-key")
ngc_key = getpass.getpass("NGC API Key: ")

# Get NVIDIA API Key for cloud NIMs
print("\nEnter your NVIDIA API Key (for cloud NIMs):")
print("Get one at: https://build.nvidia.com")
nvidia_key = getpass.getpass("NVIDIA API Key: ")

# Save to environment
os.environ['NGC_API_KEY'] = ngc_key
os.environ['NGC_CLI_API_KEY'] = ngc_key  # New environment variable name
os.environ['NVIDIA_API_KEY'] = nvidia_key

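# Save to .env file (it holds secrets, so keep it out of version control)
with open('.env', 'w') as f:
    f.write(f"NGC_API_KEY={ngc_key}\n")
    f.write(f"NGC_CLI_API_KEY={ngc_key}\n")
    f.write(f"NVIDIA_API_KEY={nvidia_key}\n")
os.chmod('.env', 0o600)  # Restrict the file to the current user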

print("\n✅ API keys configured!")


🔐 API Key Setup

Enter your NGC API Key (for model downloads):
Get one at: https://ngc.nvidia.com/setup/api-key

Enter your NVIDIA API Key (for cloud NIMs):
Get one at: https://build.nvidia.com

✅ API keys configured!
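
The environment variables set in this cell disappear when the kernel restarts, but the `.env` file stays on disk. Below is a minimal sketch of reading it back, assuming the simple `KEY=value` layout this cell writes. The helper name `load_env_file` is hypothetical and not part of the workshop code.

```python
import os

def load_env_file(path=".env"):
    """Read KEY=value lines from `path` into os.environ and return them as a dict."""
    loaded = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            # Skip blank lines, comments, and anything without a '='
            if not line or line.startswith("#") or "=" not in line:
                continue
            # Split on the first '=' only, so values may contain '='
            key, value = line.split("=", 1)
            loaded[key.strip()] = value.strip()
    os.environ.update(loaded)
    return loaded
```

Calling `load_env_file()` at the top of a later notebook would restore `NGC_API_KEY`, `NGC_CLI_API_KEY`, and `NVIDIA_API_KEY` without prompting again.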


## Step 2: Check Prerequisites

Let's verify all required tools are installed:


In [2]:
import os
import shutil
import subprocess

print("🔍 Checking prerequisites...\n")

# 1. Check Docker
try:
    docker_version = subprocess.check_output(['docker', '--version'], text=True).strip()
    print(f"✅ Docker: {docker_version}")
except (OSError, subprocess.CalledProcessError):
    print("❌ Docker: Not installed - get it from https://docs.docker.com/get-docker/")

# 2. Check/Install NGC CLI
if os.path.exists('ngc-cli/ngc'):
    result = subprocess.run(['./ngc-cli/ngc', '--version'], capture_output=True, text=True)
    if result.returncode == 0:
        print(f"✅ NGC CLI: {result.stdout.strip()}")
    else:
        print("⚠️  NGC CLI found but not working")
else:
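    print("📥 Installing NGC CLI...")
    os.system('wget -q https://ngc.nvidia.com/downloads/ngccli_linux.zip')
    os.system('unzip -q ngccli_linux.zip')
    os.system('chmod +x ngc-cli/ngc')
    os.system('rm ngccli_linux.zip')
    # Only report success if the binary actually exists after the install steps
    if os.path.exists('ngc-cli/ngc'):
        print("✅ NGC CLI installed")
    else:
        print("❌ NGC CLI install failed - check that wget and unzip are available")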

# 3. Check GPU (optional)
try:
    gpu = subprocess.check_output(['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'], text=True).strip()
    print(f"✅ GPU: {gpu}")
except (OSError, subprocess.CalledProcessError):
    print("ℹ️  No GPU detected (you can still use cloud NIMs)")

# 4. Check disk space (15 GB+ is needed for the model and container)
free_gb = shutil.disk_usage("/").free // (2**30)
print(f"{'✅' if free_gb >= 15 else '⚠️ '} Disk space: {free_gb} GB free")

print("\n" + "="*50)


🔍 Checking prerequisites...

✅ Docker: Docker version 27.3.1, build ce12230
✅ NGC CLI: NGC CLI 3.160.1
✅ GPU: NVIDIA A100-SXM4-80GB
✅ Disk space: 307 GB free



## Step 3: Download Llama 3.2 1B Model and NIM Container

This will download:
- **Llama 3.2 1B Instruct** (2.3 GB) - The model for LoRA fine-tuning
- **NIM Docker Container** - For local deployment

⏱️ Takes 5-15 minutes depending on internet speed

### 📝 Note about NGC CLI Output
The NGC CLI shows detailed progress information. Don't worry about all the progress bars and symbols - just look for:
- `Download status: Completed` - This means success!
- The download summary at the bottom shows total files and size transferred
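
Here is a rough sketch of picking those summary lines out of captured CLI text. The sample string only mirrors the labels this notebook looks for; the real NGC CLI output has more lines and its exact layout is an assumption here. The function name `parse_download_summary` is illustrative.

```python
SUMMARY_LABELS = (
    "Download status:",
    "Total files downloaded:",
    "Total transferred:",
    "Duration taken:",
)

def parse_download_summary(cli_output):
    """Return a dict mapping each summary label found in the text to its value."""
    summary = {}
    for line in cli_output.splitlines():
        for label in SUMMARY_LABELS:
            idx = line.find(label)
            if idx != -1:
                # Everything after the label on that line is the value
                summary[label.rstrip(":")] = line[idx + len(label):].strip()
    return summary

# Illustrative text only, not a verbatim capture of NGC CLI output
sample = """
----------------------------------------
   Download status: Completed
   Total files downloaded: 10
   Total transferred: 2.32 GB
   Duration taken: 5s
----------------------------------------
"""
summary = parse_download_summary(sample)
print(summary["Download status"])  # prints "Completed"
```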


In [5]:
import os
import subprocess

print("📥 Downloading workshop assets...\n")

# Check prerequisites
if not os.environ.get('NGC_API_KEY'):
    print("❌ Please run the API key setup cell first")
else:
    # Configure NGC CLI
    os.system(f"./ngc-cli/ngc config set --api-key {os.environ['NGC_API_KEY']} >/dev/null 2>&1")
    
    # 1. Download Llama 3.2 1B model
    model_dir = "lora_tutorial/models/llama-3_2-1b-instruct"
    os.makedirs(model_dir, exist_ok=True)
    
    # Check if already downloaded
    model_subdir = f"{model_dir}/llama-3_2-1b-instruct_v2.0"
    if os.path.exists(model_subdir):
        print("✅ Model already downloaded!")
        print(f"📂 Location: {model_subdir}")
        # Check size
        total_size = 0
        for root, dirs, files in os.walk(model_subdir):
            for file in files:
                total_size += os.path.getsize(os.path.join(root, file))
        print(f"💾 Size: {total_size/(1024**3):.1f} GB")
    else:
        print("📥 Downloading Llama 3.2 1B Instruct model (2.3 GB)...")
        print("   This may take a few minutes depending on your connection speed...")
        print("   Please wait...\n")
        
        # Download with correct model name, capturing output to reduce verbosity
        cmd = f"cd {model_dir} && ../../../ngc-cli/ngc registry model download-version nvidia/nemo/llama-3_2-1b-instruct:2.0"
        
        # Run command and capture output
        process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        
        # Show simplified progress
        print("⏳ Download in progress...")
        stdout, stderr = process.communicate()
        result = process.returncode
        
        # Check if successful and show summary
        if result == 0:
            # Extract key information from output
            if "Download status: Completed" in stdout:
                # Parse the summary from stdout
                lines = stdout.split('\n')
                summary_started = False
                for line in lines:
                    if "----" in line and not summary_started:
                        summary_started = True
                    elif summary_started and ("Download status:" in line or 
                                            "Total files downloaded:" in line or 
                                            "Total transferred:" in line or
                                            "Duration taken:" in line):
                        print(f"   {line.strip()}")
            print("\n" + "="*60)
            print("✅ MODEL DOWNLOADED SUCCESSFULLY!")
            print("="*60)
            # The model will be in a subdirectory
            model_subdir = f"{model_dir}/llama-3_2-1b-instruct_v2.0"
            if os.path.exists(model_subdir):
                print(f"\n📂 Model location: {model_subdir}")
                # List what's in the model directory
                print("\n📁 Downloaded files:")
                for root, dirs, files in os.walk(model_subdir):
                    level = root.replace(model_subdir, '').count(os.sep)
                    indent = ' ' * 2 * level
                    print(f"{indent}{os.path.basename(root)}/")
                    subindent = ' ' * 2 * (level + 1)
                    for file in files[:5]:  # Show first 5 files
                        print(f"{subindent}{file}")
                    if len(files) > 5:
                        print(f"{subindent}... and {len(files)-5} more files")
        else:
            print("\n❌ Download failed - please check your NGC API key")
    
    # 2. Download Docker container
    print("\n📥 Checking Docker container...")
    image = "nvcr.io/nim/meta/llama-3.2-1b-instruct:latest"
    
    # Check if image exists
    if os.system(f"docker images -q {image} 2>/dev/null | grep -q .") == 0:
        print("✅ Docker container already downloaded")
    else:
        print("📥 Downloading NIM Docker container...")
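        # Docker login (the key goes in on stdin, so it never appears in the shell command line)
        subprocess.run(['docker', 'login', 'nvcr.io', '-u', '$oauthtoken', '--password-stdin'],
                       input=os.environ['NGC_API_KEY'], capture_output=True, text=True)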
        
        # Pull container with cleaner output
        print("⏳ Pulling Docker container...")
        pull_result = subprocess.run(f"docker pull {image}", shell=True, capture_output=True, text=True)
        if pull_result.returncode == 0:
            print("✅ Docker container downloaded")
        else:
            print("⚠️  Container download failed - check Docker and NGC access")

print("\n✅ Setup complete!")


📥 Downloading workshop assets...



📥 Downloading Llama 3.2 1B Instruct model (2.3 GB)...
   This may take a few minutes depending on your connection speed...
   Please wait...

⏳ Download in progress...
   Download status: Completed
   Total files downloaded: 10
   Total transferred: 2.32 GB
   Duration taken: 5s

✅ MODEL DOWNLOADED SUCCESSFULLY!

📂 Model location: lora_tutorial/models/llama-3_2-1b-instruct/llama-3_2-1b-instruct_v2.0

📁 Downloaded files:
llama-3_2-1b-instruct_v2.0/
  weights/
    .metadata
    __0_0.distcp
    __0_1.distcp
    metadata.json
    common.pt
  context/
    io.json
    model.yaml
    nemo_tokenizer/
      tokenizer_config.json
      tokenizer.json
      special_tokens_map.json

📥 Checking Docker container...
✅ Docker container already downloaded

✅ Setup complete!


## Step 4: Verify Setup

Let's make sure everything is ready for the workshop:


In [6]:
import os
import subprocess

print("🔍 Verifying setup...\n")

# Quick checks
checks = {
    "Model downloaded": os.path.exists("lora_tutorial/models/llama-3_2-1b-instruct/llama-3_2-1b-instruct_v2.0"),
    "Docker container": bool(subprocess.run(['docker', 'images', '-q', 'nvcr.io/nim/meta/llama-3.2-1b-instruct:latest'],
                                       capture_output=True, text=True).stdout.strip()),
    "NGC API Key": bool(os.environ.get('NGC_API_KEY')),
    "NVIDIA API Key": bool(os.environ.get('NVIDIA_API_KEY'))
}

# Print results
for item, status in checks.items():
    print(f"{'✅' if status else '❌'} {item}")

# Test cloud API connection
try:
    import requests
    headers = {"Authorization": f"Bearer {os.environ.get('NVIDIA_API_KEY', '')}"}
    response = requests.get("https://integrate.api.nvidia.com/v1/models", headers=headers, timeout=5)
    print(f"\n📡 Cloud API: {'✅ Connected' if response.status_code == 200 else f'⚠️  Status {response.status_code}'}")
except Exception:
    print("\n📡 Cloud API: ⚠️  Could not test connection")

# Summary
if all(checks.values()):
    print("\n🎉 All set! You're ready for the NIM workshop!")
    print("\n📂 Model location: lora_tutorial/models/llama-3_2-1b-instruct/llama-3_2-1b-instruct_v2.0/")
    print("🐳 Container: nvcr.io/nim/meta/llama-3.2-1b-instruct:latest")
else:
    print("\n⚠️  Some components missing - please check above")
    
# Create data directory for later use
os.makedirs("lora_tutorial/data", exist_ok=True)


🔍 Verifying setup...

✅ Model downloaded
✅ Docker container
✅ NGC API Key
✅ NVIDIA API Key

📡 Cloud API: ✅ Connected

🎉 All set! You're ready for the NIM workshop!

📂 Model location: lora_tutorial/models/llama-3_2-1b-instruct/llama-3_2-1b-instruct_v2.0/
🐳 Container: nvcr.io/nim/meta/llama-3.2-1b-instruct:latest
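
When the cloud check does not return 200, the verification cell prints only the bare status code. The sketch below turns common codes into a hint, based on standard HTTP semantics. How the NVIDIA endpoint uses each code is an assumption, and `explain_status` is a hypothetical helper.

```python
def explain_status(status_code):
    """Map an HTTP status code to a short troubleshooting hint."""
    hints = {
        200: "Connected",
        401: "Unauthorized: the NVIDIA API key is missing or invalid",
        403: "Forbidden: the key is valid but lacks access to this resource",
        429: "Too many requests: wait a moment and retry",
    }
    if status_code in hints:
        return hints[status_code]
    if 500 <= status_code < 600:
        return "Server error: the service may be temporarily unavailable"
    return f"Unexpected status {status_code}"

print(explain_status(401))
```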


## 🎯 Next Steps

### Model Information
- **Model**: Llama 3.2 1B Instruct 
- **Format**: NeMo 2 distributed checkpoint (`.distcp` files)
- **Location**: `lora_tutorial/models/llama-3_2-1b-instruct/llama-3_2-1b-instruct_v2.0/`

### What's Next?
1. **01_NIM_API_Tutorial.ipynb** - Learn to use cloud-hosted NIMs
2. **02_Local_NIM_Deployment.ipynb** - Deploy NIMs locally with Docker
3. **03_LoRA_Training.ipynb** - Fine-tune the model with LoRA
4. **04_Deploy_LoRA_with_NIM.ipynb** - Deploy your fine-tuned model

### Troubleshooting

**If download fails:**
- Verify your NGC API key is correct
- Check your internet connection
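- Run the download cell again. If a partial `llama-3_2-1b-instruct_v2.0` directory was left behind, delete it first, because the cell skips the download whenever that directory exists.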

**Docker issues:**
- Make sure Docker daemon is running
- On Linux: `sudo systemctl start docker`
- Test with: `docker run hello-world`
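
The daemon check can also be run from a notebook cell. This is a minimal sketch, assuming `docker info` exits with a non-zero code when the CLI cannot reach the daemon; the function name `docker_daemon_running` is illustrative.

```python
import subprocess

def docker_daemon_running(timeout=10):
    """Return True if `docker info` succeeds, i.e. the CLI can reach the daemon."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Docker CLI missing, not executable, or the call hung
        return False
    return result.returncode == 0

print("Docker daemon reachable:", docker_daemon_running())
```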

**Understanding the download output:**
The NGC CLI shows detailed progress with many symbols and progress bars. This is normal! 
The key indicators of success are:
- `Download status: Completed`
- Summary showing total files and GB transferred
- Exit code 0 (success)

**Model format:**
The Llama 3.2 model uses NeMo 2 format with distributed checkpoints:
- `weights/` - Model weights in `.distcp` format
- `context/` - Configuration files
- This is different from the older single `.nemo` file format
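
To confirm that a download produced this layout, a sketch like the following checks for the two directories and at least one `.distcp` file. It checks structure only, not file integrity, and `check_nemo2_layout` is a hypothetical helper name.

```python
import os

def check_nemo2_layout(model_path):
    """Return a list of problems found in a NeMo 2 style checkpoint directory."""
    problems = []
    weights_dir = os.path.join(model_path, "weights")
    context_dir = os.path.join(model_path, "context")

    if not os.path.isdir(weights_dir):
        problems.append("missing weights/ directory")
    elif not any(name.endswith(".distcp") for name in os.listdir(weights_dir)):
        problems.append("no .distcp files in weights/")

    if not os.path.isdir(context_dir):
        problems.append("missing context/ directory")

    return problems

model_path = "lora_tutorial/models/llama-3_2-1b-instruct/llama-3_2-1b-instruct_v2.0"
issues = check_nemo2_layout(model_path)
print("Layout OK" if not issues else f"Problems: {issues}")
```

An empty list means the expected structure is present. A non-empty list usually points to an interrupted or incomplete download.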

---

**Ready to start?** Open `01_NIM_API_Tutorial.ipynb` to begin the workshop! 🚀
