# 🚀 How to Build NVIDIA Launchables
## *The Definitive Interactive Tutorial*

Welcome to the **meta-launchable** - a tutorial that teaches you how to build launchables while demonstrating best practices through its own implementation!

---

### What You'll Build Today

By the end of this tutorial, you'll have:
- ✅ A working understanding of the launchables pattern
- ✅ A GPU-accelerated demo application
- ✅ The skills to create and deploy your own launchables
- ✅ A complete workflow from idea to production

### How This Works

This notebook is **interactive** - you'll run every cell, see real outputs, and build alongside the tutorial. Think of it as pair programming with Jensen Huang's vision of democratizing AI.

> **💡 Pro Tip**: Read each section carefully, run the code cells sequentially, and complete the exercises. By the end, you'll have built your first launchable!

---

### 📋 Quick Navigation

1. **Introduction & GPU Setup** ← *Start here*
2. Understanding the Launchables Structure
3. GPU-First Development
4. Building Interactive Demos
5. Prompt Engineering with Claude
6. Debugging & Testing
7. Git & GitHub Setup
8. Deploying to Brev
9. Your First Launchable Exercise
10. Resources & Next Steps

---

**Ready?** Let's verify your GPU and get started! 👇


# Section 1: Introduction & GPU Setup 🎯

## What is a Launchable?

A **launchable** is a self-contained, GPU-accelerated, interactive notebook that:
- 📦 **Packages everything** - Code, docs, and dependencies in one place
- 🚀 **Runs instantly** - No complex setup required
- 💻 **Uses GPU** - Accelerates AI/ML workloads
- 🌐 **Deploys easily** - One-click deployment to Brev
- 🎓 **Teaches by doing** - Interactive, hands-on learning

Think of it as a "runnable README" for AI projects.

## Why Build Launchables?

1. **Share Your Work** - Make your AI projects accessible to everyone
2. **Learn Faster** - Best way to learn is by building and sharing
3. **Build Portfolio** - Showcase your skills with deployable projects
4. **Contribute** - Help democratize AI development

## The Launchables Ecosystem

- **Repository**: [github.com/brevdev/launchables](https://github.com/brevdev/launchables)
- **Platform**: [Brev.dev](https://brev.dev) - One-click GPU deployment
- **Community**: Contributors building the future of AI tooling

---

## 🔥 CRITICAL: GPU Verification

**This is the most important cell in any launchable!**

Before we go further, we MUST verify that:
1. A GPU is available
2. CUDA is properly installed
3. PyTorch can access the GPU
4. We know what hardware we're working with

Run the cell below ⬇️


In [None]:
"""
🔥 CRITICAL GPU VERIFICATION
This cell MUST be the first executable cell in every launchable!
"""

import torch
import sys

print("=" * 70)
print("🔍 GPU VERIFICATION REPORT")
print("=" * 70)

# Check Python version
print(f"\n📌 Python Version: {sys.version.split()[0]}")

# Check PyTorch version
print(f"📌 PyTorch Version: {torch.__version__}")

# Check CUDA availability
cuda_available = torch.cuda.is_available()
print(f"\n{'✅' if cuda_available else '❌'} CUDA Available: {cuda_available}")

if cuda_available:
    # GPU Details
    gpu_count = torch.cuda.device_count()
    print(f"✅ Number of GPUs: {gpu_count}")

    for i in range(gpu_count):
        print(f"\n📊 GPU {i} Details:")
        print(f"   Name: {torch.cuda.get_device_name(i)}")
        print(f"   Compute Capability: {torch.cuda.get_device_capability(i)}")
        
        # Memory info
        total_memory = torch.cuda.get_device_properties(i).total_memory / 1e9
        print(f"   Total Memory: {total_memory:.2f} GB")
        
        # Current memory usage
        allocated = torch.cuda.memory_allocated(i) / 1e9
        reserved = torch.cuda.memory_reserved(i) / 1e9
        print(f"   Allocated Memory: {allocated:.2f} GB")
        print(f"   Reserved Memory: {reserved:.2f} GB")
    
    # Test GPU with a simple operation
    print("\n🧪 Testing GPU with sample tensor operation...")
    test_tensor = torch.randn(1000, 1000).cuda()
    result = torch.matmul(test_tensor, test_tensor)
    print(f"✅ GPU test successful! Result shape: {result.shape}")
    print(f"✅ Tensor is on device: {result.device}")
    
    # Clean up
    del test_tensor, result
    torch.cuda.empty_cache()
    
    print("\n" + "=" * 70)
    print("🎉 SUCCESS! Your GPU is ready for AI development!")
    print("=" * 70)

else:
    # Fallback message
    print("\n" + "=" * 70)
    print("⚠️  WARNING: No GPU detected!")
    print("=" * 70)
    print("\n🔧 Troubleshooting Steps:")
    print("1. Verify nvidia-smi works: Run 'nvidia-smi' in terminal")
    print("2. Check CUDA installation: Visit https://developer.nvidia.com/cuda-downloads")
    print("3. Reinstall PyTorch with CUDA: https://pytorch.org/get-started/locally/")
    print("4. Verify GPU drivers are up to date")
    print("\n💡 Common Issues:")
    print("   - Wrong PyTorch version (CPU-only)")
    print("   - CUDA version mismatch")
    print("   - GPU drivers not installed")
    print("\n⚠️  This launchable requires a GPU to run properly.")
    print("=" * 70)

# Set default device for rest of notebook
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"\n🎯 Default device set to: {device}")
print(f"✅ All future operations will use: {device.type.upper()}")


## Environment Setup Checklist

Before we continue, make sure you have:

**Required:**
- ✅ GPU detected (verified above)
- ✅ PyTorch with CUDA support installed
- ✅ Jupyter notebook running
- ✅ Git installed (`git --version` in terminal)
- ✅ GitHub account created

**Recommended:**
- 📝 Code editor (VSCode, Cursor, or similar)
- 🐙 Git configured with SSH keys
- 🌐 Brev.dev account (for deployment later)

### Quick Environment Check


In [None]:
"""
Quick environment check - verify all key dependencies
"""

import subprocess
import importlib

def check_import(package_name, display_name=None):
    """Check if a package can be imported and get its version"""
    if display_name is None:
        display_name = package_name
    try:
        module = importlib.import_module(package_name)
        version = getattr(module, '__version__', 'unknown')
        print(f"✅ {display_name}: {version}")
        return True
    except ImportError:
        print(f"❌ {display_name}: Not installed")
        return False

def check_command(command, name):
    """Check if a command is available"""
    try:
        result = subprocess.run([command, '--version'],
                              capture_output=True, text=True, timeout=5)
        version_line = result.stdout.split('\n')[0] if result.stdout else result.stderr.split('\n')[0]
        print(f"✅ {name}: {version_line}")
        return True
    except (subprocess.TimeoutExpired, FileNotFoundError):
        print(f"❌ {name}: Not found")
        return False

print("🔍 Checking Dependencies...\n")

# Python packages
check_import('torch', 'PyTorch')
check_import('transformers', 'Transformers')
check_import('numpy', 'NumPy')
check_import('matplotlib', 'Matplotlib')

print()

# System tools
check_command('git', 'Git')
check_command('nvidia-smi', 'nvidia-smi')

print("\n✅ Environment check complete!")


---

# Section 2: Understanding the Launchables Structure 📁

## The Launchables Pattern

A well-structured launchable follows this pattern:

```
your-launchable/
├── README.md                 # Overview, prerequisites, quick start
├── requirements.txt          # Python dependencies with versions
├── .gitignore                # Exclude cache, models, etc.
├── main-notebook.ipynb       # Your interactive tutorial
└── (optional) assets/        # Images, data files, etc.
```

### Why This Structure?

1. **README.md** - First thing people see. Must be compelling!
2. **requirements.txt** - Reproducible environment setup
3. **.gitignore** - Keep repo clean (no model checkpoints!)
4. **Notebook** - Self-contained learning experience
5. **Assets** - Supporting materials (keep them small!)
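
The structure above is easy to check mechanically. The following is a minimal sketch, not part of any official launchables tooling: the function name `check_launchable_layout` and the `REQUIRED_FILES` tuple are assumptions made for illustration, and the notebook is matched by extension because its filename differs per project.

```python
from pathlib import Path

# Files named in the structure above (illustrative constant, not an official list).
REQUIRED_FILES = ("README.md", "requirements.txt", ".gitignore")


def check_launchable_layout(root):
    """Return a list of problems; an empty list means the layout looks complete."""
    root = Path(root)
    problems = [
        f"missing {name}" for name in REQUIRED_FILES if not (root / name).is_file()
    ]
    # The notebook name varies, so accept any .ipynb file in the root directory.
    if not any(root.glob("*.ipynb")):
        problems.append("no .ipynb notebook in the root directory")
    return problems
```

Calling `check_launchable_layout(".")` from the repository root returns an empty list when all four pieces are present.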

## Examples from the Ecosystem

Let's look at real launchables:

### Example 1: Model Fine-tuning
```
fine-tune-llama/
├── README.md              # "Fine-tune Llama 2 in 30 minutes"
├── requirements.txt       # torch, transformers, datasets, peft
├── fine-tune.ipynb        # Step-by-step tutorial
└── sample-data/           # Small example dataset
```

### Example 2: Production Deployment
```
vllm-production/
├── README.md              # "Deploy LLMs at scale"
├── requirements.txt       # vllm, fastapi, uvicorn
├── deployment.ipynb       # Interactive setup guide
└── config/                # Sample configurations
```

### Example 3: This Tutorial!
```
how-to-build-launchables/
├── README.md              # What you're learning
├── requirements.txt       # All dependencies
├── .gitignore             # Clean repo
└── how-to-build-launchables.ipynb  # This file!
```

## Best Practices

### ✅ DO:
- Keep notebooks focused (1-2 hours to complete)
- Include working code examples
- Test on fresh environment before sharing
- Add clear error messages
- Use GPU verification at start
- Include progress indicators

### ❌ DON'T:
- Commit large model files (use `.gitignore`)
- Hardcode personal paths or tokens
- Skip GPU verification
- Make assumptions about environment
- Leave broken cells
- Forget to test end-to-end
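
One of the DON'Ts, hardcoded personal paths or tokens, can be caught with a quick scan of the notebook file before sharing. This is a rough sketch under stated assumptions: the patterns in `SUSPICIOUS_PATTERNS` are illustrative guesses (for example, that Hugging Face tokens start with `hf_`), and the function name `scan_notebook_source` is hypothetical. It reads the notebook as JSON and only inspects code cells.

```python
import json
import re

# Hypothetical patterns for illustration; extend them for your own project.
SUSPICIOUS_PATTERNS = {
    "home directory path": re.compile(r"(/home/|/Users/|C:\\Users\\)\w+"),
    "Hugging Face token": re.compile(r"hf_[A-Za-z0-9]{20,}"),
    "assigned secret": re.compile(
        r"(?i)(api_key|token|secret)\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_notebook_source(notebook_json):
    """Return (cell_index, label) pairs for code cells matching a pattern."""
    notebook = json.loads(notebook_json)
    findings = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # Notebook files store source as either one string or a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(text):
                findings.append((index, label))
    return findings
```

A typical call reads the file first, e.g. `scan_notebook_source(open("main-notebook.ipynb").read())`. An empty result does not prove the notebook is clean; it only means none of these simple patterns matched.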

---

## 🎯 Exercise: Understanding Structure

Look at this repository's structure. Can you identify:
1. Where are the dependencies listed?
2. What files are ignored by git?
3. How is this notebook organized?

**Answer**: Use `!ls -la` to explore!


In [None]:
# Explore the repository structure
import os

print("📁 Current Directory Structure:\n")
print("=" * 70)

# List files in current directory
files = os.listdir('.')
files.sort()

for file in files:
    if file.startswith('.'):
        icon = "🔒"  # Hidden file
    elif file.endswith('.ipynb'):
        icon = "📓"
    elif file.endswith('.md'):
        icon = "📝"
    elif file.endswith('.txt'):
        icon = "📄"
    elif file.endswith('.py'):
        icon = "🐍"
    elif os.path.isdir(file):
        icon = "📂"
    else:
        icon = "📄"

    size = "DIR" if os.path.isdir(file) else f"{os.path.getsize(file):,} bytes"
    print(f"{icon} {file:<40} {size}")

print("=" * 70)
print("\n💡 Notice:")
print("   - requirements.txt defines our dependencies")
print("   - .gitignore keeps repo clean")
print("   - This notebook is self-contained")
print("   - README.md provides overview")


---

# Section 3: GPU-First Development 🔥

## Why GPU-First Matters

A common and costly mistake in AI development: **assuming code runs on the GPU when it doesn't!**

Your code might:
- ✅ Run successfully (no errors)
- ✅ Produce correct results
- ❌ But run many times slower because it silently fell back to the CPU!

## The GPU Development Checklist

For EVERY operation with neural networks:

1. **Verify device at model load time**
2. **Verify device during inference**
3. **Monitor GPU memory usage**
4. **Check GPU utilization** (is it actually working?)
5. **Handle device mismatches gracefully**

## Device Management Pattern

Here's the pattern you should use in EVERY launchable:


In [None]:
"""
GOLD STANDARD: Device Management Pattern
Use this in every launchable!
"""

import torch

# 1. Detect and set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"🎯 Using device: {device}")

# 2. Check GPU properties if available
if device.type == "cuda":
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
    print(f"✅ Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("⚠️  Running on CPU - this will be slower!")

# 3. Create tensors on the correct device
# Method 1: Create then move
tensor1 = torch.randn(100, 100).to(device)
print(f"\n📊 Tensor 1 device: {tensor1.device}")

# Method 2: Create directly on device
tensor2 = torch.randn(100, 100, device=device)
print(f"📊 Tensor 2 device: {tensor2.device}")

# 4. Verify operations stay on GPU
result = torch.matmul(tensor1, tensor2)
print(f"📊 Result device: {result.device}")

# 5. Check memory usage (GPU only)
if device.type == "cuda":
    allocated = torch.cuda.memory_allocated(0) / 1e6
    print(f"\n💾 GPU Memory allocated: {allocated:.2f} MB")

print("\n✅ Device management verified!")


## Common GPU Pitfalls and Solutions

### ❌ Pitfall 1: Model on GPU, Data on CPU

```python
# BAD: Model and data on different devices
model = MyModel().cuda()
data = torch.randn(10, 10)  # Still on CPU!
output = model(data)  # ERROR: device mismatch
```

```python
# GOOD: Everything on same device
model = MyModel().to(device)
data = torch.randn(10, 10, device=device)
output = model(data)  # Works!
```

### ❌ Pitfall 2: Not Checking GPU Utilization

Just because your code runs doesn't mean it's using the GPU!

**Always verify with `nvidia-smi`:**
```bash
# In terminal, run:
watch -n 1 nvidia-smi

# Look for:
# - GPU Utilization > 0%
# - Memory Usage increasing
# - Your Python process listed
```
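
The same check can be run from Python when no terminal is handy. The sketch below assumes `nvidia-smi` is on the PATH and prints one comma-separated line per GPU for the query shown; the function names `parse_gpu_csv` and `query_gpus` are made up for this example. Some GPUs report values such as `[N/A]` for certain fields, which this simple parser does not handle.

```python
import subprocess

QUERY_FIELDS = "utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    gpus = []
    for line in csv_text.strip().splitlines():
        util, used, total = (field.strip() for field in line.split(","))
        gpus.append({
            "utilization_pct": int(util),
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi once; return [] if the tool is missing or fails."""
    try:
        result = subprocess.run(
            ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, timeout=5,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return []
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []
```

Calling `query_gpus()` while a cell is running and seeing `utilization_pct` above zero is a reasonable sign that the work is actually happening on the GPU.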

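### ❌ Pitfall 3: Holding References to GPU Tensors

PyTorch frees a tensor's memory as soon as nothing references it, so the real leak is keeping references around. `torch.cuda.empty_cache()` only returns unused cached blocks to the driver; it cannot free tensors that are still referenced.

```python
# BAD: Every tensor stays referenced, so GPU memory grows each iteration
history = []
for i in range(100):
    big_tensor = torch.randn(1000, 1000, device='cuda')
    history.append(big_tensor)  # Reference kept, tensor never freed!
```

```python
# GOOD: Keep only what you need, ideally off the GPU
totals = []
for i in range(100):
    big_tensor = torch.randn(1000, 1000, device='cuda')
    totals.append(big_tensor.sum().item())  # Store a Python float, not the tensor
    del big_tensor  # Drop the last reference so the memory can be reused
    if i % 10 == 0:
        torch.cuda.empty_cache()  # Optional: hand cached blocks back to the driver
```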

---

## 🚀 Live Demo: Load a Real Model on GPU

Let's load a small but real model and verify GPU usage at every step!


In [None]:
"""
Live Demo: Load DistilGPT-2 on GPU
This is a small model perfect for learning!
"""

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

print("🔄 Loading DistilGPT-2 model...\n")

# Step 1: Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"📍 Target device: {device}")

# Step 2: Load tokenizer (always on CPU)
print("\n🔄 Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
print("✅ Tokenizer loaded")

# Step 3: Load model and move to the target device
print("\n🔄 Loading model...")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
print("✅ Model loaded (currently on: CPU)")

print(f"\n🔄 Moving model to {device}...")
model = model.to(device)
print(f"✅ Model moved to: {device}")

# Step 4: Verify where the model actually lives
print("\n🔍 Verification:")
print(f"   Model device: {next(model.parameters()).device}")

# Step 5: Check memory usage
if device.type == "cuda":
    memory_allocated = torch.cuda.memory_allocated(0) / 1e6
    print(f"   GPU Memory used: {memory_allocated:.2f} MB")

    # Get GPU utilization using nvidia-smi
    print("\n💡 Tip: Open a terminal and run 'nvidia-smi' to see:")
    print("   - This process using GPU memory")
    print("   - Current GPU utilization")

# Report the real device so a CPU fallback is not announced as a GPU success
print(f"\n✅ Model successfully loaded on {device.type.upper()}!")
print("\n🎯 Next: Let's use this model to generate text...")


In [None]:
"""
Generate text with GPU acceleration
Watch the memory usage change!
"""

print(f"🎯 Generating text on {device.type.upper()}...\n")

# Step 1: Prepare input
text = "The future of AI is"
print(f"📝 Input: '{text}'")

# Step 2: Tokenize (tokenizer output starts on CPU)
print("\n🔄 Tokenizing...")
inputs = tokenizer(text, return_tensors="pt")
print(f"   Input tokens shape: {inputs['input_ids'].shape}")
print(f"   Currently on: {inputs['input_ids'].device}")

# Step 3: Move inputs to same device as model
print(f"\n🔄 Moving inputs to {device}...")
inputs = {k: v.to(device) for k, v in inputs.items()}
print(f"   Inputs now on: {inputs['input_ids'].device}")

# Step 4: Check memory before generation
if device.type == "cuda":
    memory_before = torch.cuda.memory_allocated(0) / 1e6
    print(f"\n💾 GPU Memory before generation: {memory_before:.2f} MB")

# Step 5: Generate text
print(f"\n🚀 Generating (this happens on {device})...")
with torch.no_grad():  # Disable gradient computation for inference
    outputs = model.generate(
        **inputs,  # Passes input_ids and attention_mask together
        max_length=50,  # Total length in tokens, prompt included
        num_return_sequences=1,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Step 6: Verify outputs are on the same device as the model
print("✅ Generation complete!")
print(f"   Output tensor device: {outputs.device}")

# Step 7: Check memory after generation
if device.type == "cuda":
    memory_after = torch.cuda.memory_allocated(0) / 1e6
    print(f"   GPU Memory after generation: {memory_after:.2f} MB")
    # This is the net change, not the peak reached while generating
    print(f"   Net change in allocated memory: {memory_after - memory_before:.2f} MB")

# Step 8: Decode and display result
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\n📄 Generated text:\n")
print("=" * 70)
print(generated_text)
print("=" * 70)

print("\n✅ Generation complete!")
print("💡 For larger models and batches, GPU generation is typically much faster than CPU.")


---

# Section 4: Building Interactive Demos 🎨

## The Art of Interactive Learning

A great launchable isn't just code - it's an **experience**. Users should:
- 🎯 Understand what they're building
- 🔧 Interact with working examples
- 💡 Learn by experimenting
- 🎉 Feel accomplished at the end

## Elements of a Great Demo

### 1. Clear Objectives
Tell users what they'll accomplish

### 2. Progressive Complexity
Start simple, gradually add features

### 3. Immediate Feedback
Show results right away

### 4. Interactivity
Let users modify and experiment

### 5. Visual Elements
Use progress bars, formatting, emojis

---

## 🎯 Interactive Demo: Text Generation Playground

Let's create an interactive text generation demo where users can experiment!


In [None]:
"""
Interactive Text Generation Playground
Users can customize the prompt and parameters!
"""

from tqdm.auto import tqdm
import torch

def generate_text_with_options(
    prompt,
    max_length=50,
    temperature=0.7,
    num_sequences=1,
    verbose=True
):
    """
    Generate text with customizable parameters
    
    Args:
        prompt: Starting text
        max_length: Maximum total length in tokens (prompt + generated)
        temperature: Creativity (0.1=conservative, 1.0=creative)
        num_sequences: Number of different generations
        verbose: Show progress and details
    """

    if verbose:
        print("🎯 Generating with:")
        print(f"   Prompt: '{prompt}'")
        print(f"   Max length: {max_length} tokens")
        print(f"   Temperature: {temperature}")
        print(f"   Sequences: {num_sequences}")
        print()

    # Tokenize and move the inputs to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    # Check GPU usage
    if device.type == "cuda" and verbose:
        mem_before = torch.cuda.memory_allocated(0) / 1e6
        print(f"💾 GPU Memory: {mem_before:.2f} MB")

    # Generate
    if verbose:
        print("🚀 Generating...\n")

    with torch.no_grad():
        outputs = model.generate(
            **inputs,  # Passes input_ids and attention_mask together
            max_length=max_length,
            temperature=temperature,
            num_return_sequences=num_sequences,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
            top_p=0.95,
            top_k=50
        )

    # Decode results
    results = []
    for i, output in enumerate(outputs):
        text = tokenizer.decode(output, skip_special_tokens=True)
        results.append(text)

        if verbose:
            print(f"📄 Result {i+1}:")
            print("─" * 70)
            print(text)
            print("─" * 70)
            print()

    if device.type == "cuda" and verbose:
        mem_after = torch.cuda.memory_allocated(0) / 1e6
        print(f"💾 GPU Memory after: {mem_after:.2f} MB")

    return results

# Example usage
print("=" * 70)
print("🎨 INTERACTIVE TEXT GENERATION PLAYGROUND")
print("=" * 70)
print()

# Try it out!
results = generate_text_with_options(
    prompt="Artificial intelligence will transform",
    max_length=60,
    temperature=0.8,
    num_sequences=2
)

print("✅ Generation complete!")


## 🎮 Try It Yourself!

**Exercise**: Modify the cell above to generate different text:

1. **Change the prompt** - Try: "In the year 2050,", "The best way to learn programming is", etc.
2. **Adjust temperature** - Low (0.3) = focused, High (1.2) = creative
3. **Generate multiple** - Set `num_sequences=3` for variety
4. **Make it longer** - Increase `max_length` (but watch GPU memory!)

**Pro Tips:**
- Temperature 0.7-0.9: Balanced creativity
- Temperature < 0.5: More deterministic and focused, can get repetitive
- Temperature > 1.0: Very creative, sometimes incoherent
- `top_p=0.95`: Nucleus sampling for quality
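
To build intuition for these knobs, here is a small pure-Python sketch of what temperature and nucleus (top-p) sampling do to a probability distribution. It is a conceptual illustration with made-up logits, not the code path that `model.generate` actually runs; the function names are invented for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # Subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]  # Made-up scores for four candidate tokens
for t in (0.3, 0.7, 1.2):
    probs = softmax_with_temperature(toy_logits, t)
    kept = sorted(top_p_filter(probs, 0.95))
    print(f"temperature={t}: probs={[round(p, 3) for p in probs]} kept={kept}")
```

Running it shows the pattern behind the tips above: a low temperature piles probability onto the top token, so the nucleus shrinks, while a higher temperature spreads probability out and lets more tokens through.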

---

## Adding Progress Indicators

For longer operations, always show progress!


In [None]:
"""
Demo: Progress indicators for better UX
"""

from tqdm.auto import tqdm
import time

print("🎯 Batch Text Generation with Progress Bar\n")

prompts = [
    "The future of technology",
    "Machine learning enables",
    "The most important skill",
]

results_list = []

# Progress bar for user feedback
for prompt in tqdm(prompts, desc="Generating", unit="prompt"):
    result = generate_text_with_options(
        prompt=prompt,
        max_length=40,
        temperature=0.7,
        num_sequences=1,
        verbose=False  # Suppress per-generation output
    )
    results_list.append((prompt, result[0]))
    
    # Small delay to see progress bar (remove in production)
    time.sleep(0.5)

print("\n" + "=" * 70)
print("📊 RESULTS SUMMARY")
print("=" * 70)

for i, (prompt, result) in enumerate(results_list, 1):
    print(f"\n{i}. Prompt: '{prompt}'")
    print(f"   Result: {result[:100]}...")

print("\n✅ Batch generation complete!")


---

# Section 5: Prompt Engineering with Claude 🤖

## The AI-Assisted Development Workflow

Here's a secret: **The best developers use AI to build AI!**

### The Claude → Cursor Pipeline

1. **Brainstorm with Claude** - Refine your launchable idea
2. **Generate Cursor Prompts** - Create detailed implementation prompts
3. **Implement with Cursor** - Let AI write the code
4. **Debug with Claude** - When stuck, describe the issue
5. **Iterate** - Repeat until perfect

## How to Create Effective Cursor Prompts

### ❌ Bad Prompt:
```
"Create a launchable for fine-tuning"
```
*Too vague - Cursor will guess and probably get it wrong*

### ✅ Good Prompt:
```
Create an NVIDIA Launchable for fine-tuning Llama 2 on custom datasets.

Requirements:
1. Structure: Jupyter notebook-based launchable in root directory
   - Filename: fine-tune-llama2.ipynb
   - Include requirements.txt, README.md, .gitignore

2. Technical Implementation:
   - Use LoRA/QLoRA for efficient fine-tuning
   - 4-bit quantization for memory efficiency
   - Support custom dataset loading
   - GPU verification as first cell
   
3. Content Flow (5-7 sections):
   - Section 1: Setup & GPU verification
   - Section 2: Load base model with quantization
   - Section 3: Prepare dataset
   - Section 4: Configure LoRA
   - Section 5: Train with progress monitoring
   - Section 6: Evaluate and save
   - Section 7: Inference examples

4. Quality Requirements:
   - All code must run on 16GB GPU
   - Include error handling
   - Add progress bars for training
   - Verify GPU usage throughout
   - Test on fresh environment

Deliverables:
- fine-tune-llama2.ipynb (working notebook)
- requirements.txt (pinned versions)
- README.md (prerequisites, quick start)
```

*Specific, actionable, complete!*

---

## 🎯 Exercise: Generate Your Own Cursor Prompt

Let's use Claude to create a prompt for YOUR launchable idea!


## 📝 Claude Prompt Template

**Copy this template to Claude.ai:**

```
I want to create an NVIDIA Launchable for [YOUR IDEA].

My launchable should:
- Target audience: [beginners/intermediate/advanced]
- Main goal: [what users will learn/build]
- Time to complete: [30 minutes/1 hour/2 hours]
- GPU requirements: [4GB/8GB/16GB+ VRAM]

Please help me create a detailed Cursor prompt that includes:
1. Clear structure and file organization
2. Technical requirements and dependencies
3. Detailed content flow (sections and what goes in each)
4. Quality standards and testing requirements
5. GPU-specific considerations

The prompt should be so detailed that Cursor can implement it end-to-end.
```

---

## Real Examples: What Makes Great Launchables

### Example 1: "Fine-tune Stable Diffusion"
- ✅ Clear value prop: "Train on your own images in 30 min"
- ✅ Reasonable scope: Uses LoRA, not full fine-tuning
- ✅ Interactive: Shows results after each epoch
- ✅ Practical: Saves model for reuse

### Example 2: "RAG System from Scratch"
- ✅ Teaches concepts: Explains embeddings, vector DBs
- ✅ Complete workflow: Ingest → Embed → Query → Generate
- ✅ Real use case: Build searchable documentation
- ✅ Extensible: Easy to add your own data

### Example 3: "LLM Performance Optimization"
- ✅ Comparative: Shows different optimization techniques
- ✅ Benchmarked: Measures speed and memory
- ✅ Production-ready: Actual deployment patterns
- ✅ Best practices: Error handling, monitoring

---

## Debugging with Claude

When you encounter issues in Cursor:

1. **Copy the error message**
2. **Copy relevant code (10-20 lines)**
3. **Ask Claude**:

```
I'm building a launchable and got this error:

[paste error]

Here's the relevant code:

[paste code]

Context: I'm trying to [what you're doing]

Can you:
1. Explain what's wrong
2. Provide the fix
3. Suggest how to prevent this in the future
```

Claude is excellent at debugging because it can:
- Understand full context
- Suggest multiple solutions
- Explain the underlying issue
- Recommend best practices


---

# Section 6: Debugging & Testing 🔧

## Common Launchable Errors

### Error 1: "CUDA out of memory"

**Cause**: Model/tensors too large for GPU

**Solutions**:
```python
import torch

# 1. Release cached blocks that are no longer in use
torch.cuda.empty_cache()

# 2. Use smaller batch sizes
batch_size = 1  # Start small

# 3. Enable gradient checkpointing (Hugging Face Transformers models, training only)
model.gradient_checkpointing_enable()

# 4. Use mixed precision
with torch.autocast(device_type="cuda", dtype=torch.float16):
    output = model(inputs)

# 5. Check free memory before operations
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"Available: {free_bytes / 1e9:.2f} GB of {total_bytes / 1e9:.2f} GB")
```
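
Solution 2 can also be automated: try a batch size, and if it runs out of memory, halve it and try again. The sketch below is one possible shape for that loop, not an established API. `step_fn` and `on_retry` are hypothetical callables you would supply, and the check relies on the assumption that PyTorch's CUDA out-of-memory error is a `RuntimeError` whose message contains "out of memory".

```python
def is_oom_error(error):
    """Heuristic: treat RuntimeErrors mentioning 'out of memory' as CUDA OOM."""
    return isinstance(error, RuntimeError) and "out of memory" in str(error).lower()


def run_with_batch_backoff(step_fn, batch_size, min_batch_size=1, on_retry=None):
    """Call step_fn(batch_size), halving the batch size after each OOM error.

    Returns (batch_size_that_worked, result). Any other error, or an OOM at
    min_batch_size, is re-raised unchanged.
    """
    while True:
        try:
            return batch_size, step_fn(batch_size)
        except RuntimeError as error:
            if not is_oom_error(error) or batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)
        # The retry happens outside the except block, after the exception
        # (and any tensors its traceback referenced) has been released.
        if on_retry is not None:
            on_retry()
```

In a notebook you might call `run_with_batch_backoff(train_one_step, 32, on_retry=torch.cuda.empty_cache)`, where `train_one_step` is your own function that takes a batch size.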

### Error 2: "RuntimeError: Expected all tensors to be on the same device"

**Cause**: Model on GPU, inputs on CPU (or vice versa)

**Solution**:
```python
# Always move inputs to model's device
device = next(model.parameters()).device
inputs = inputs.to(device)
```

### Error 3: "ModuleNotFoundError: No module named 'X'"

**Cause**: Missing dependency

**Solutions**:
```python
# 1. Check requirements.txt is complete
# 2. Install the missing package into the notebook kernel's environment
%pip install package-name

# 3. Verify installation
import importlib
importlib.import_module('package_name')
```
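
Step 1 can be checked in code rather than by eye. The following sketch compares the names in a `requirements.txt` against what is installed, using the standard library's `importlib.metadata`. The parsing is deliberately simple and is an assumption of this example: it handles plain lines such as `torch==2.1.0`, and skips comments and pip options, but not URL-style requirements.

```python
import re
from importlib import metadata


def parse_requirement_names(requirements_text):
    """Extract distribution names, skipping comments, blanks, and pip options."""
    names = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        # The name ends at the first version specifier, extra, or marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def find_missing_packages(requirements_text):
    """Return requirement names that are not installed in this environment."""
    missing = []
    for name in parse_requirement_names(requirements_text):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, `find_missing_packages(open("requirements.txt").read())` returns the packages you still need to install. Note that it checks distribution names (what you `pip install`), which can differ from import names.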

### Error 4: Model generates nonsense

**Causes & Solutions**:
- Temperature too high → Lower to 0.7-0.9
- No sampling → Enable `do_sample=True`
- Wrong tokenizer → Verify tokenizer matches model
- Not enough context → Provide better prompt

---

## GPU-Specific Debugging


In [None]:
"""
GPU Debugging Toolkit
Run this when something seems wrong with GPU
"""

import torch
import subprocess

def gpu_health_check():
    """Comprehensive GPU health check"""
    
    print("=" * 70)
    print("🔧 GPU HEALTH CHECK")
    print("=" * 70)

    # 1. CUDA availability
    print("\n1️⃣ CUDA Availability")
    cuda_available = torch.cuda.is_available()
    print(f"   {'✅' if cuda_available else '❌'} CUDA Available: {cuda_available}")

    if not cuda_available:
        print("\n   ⚠️  TROUBLESHOOTING:")
        print("   - Run 'nvidia-smi' in terminal")
        print("   - Check CUDA installation")
        print("   - Reinstall PyTorch with CUDA support")
        return

    # 2. GPU Details
    print("\n2️⃣ GPU Information")
    print(f"   Name: {torch.cuda.get_device_name(0)}")
    print(f"   Compute Capability: {torch.cuda.get_device_capability(0)}")

    # 3. Memory Status
    print("\n3️⃣ Memory Status")
    # mem_get_info reports driver-level numbers, so memory held by other
    # processes is counted as used (total - reserved would miss it)
    free_bytes, total_bytes = torch.cuda.mem_get_info(0)
    total_memory = total_bytes / 1e9
    allocated = torch.cuda.memory_allocated(0) / 1e9
    reserved = torch.cuda.memory_reserved(0) / 1e9
    free = free_bytes / 1e9

    print(f"   Total: {total_memory:.2f} GB")
    print(f"   Allocated: {allocated:.2f} GB")
    print(f"   Reserved: {reserved:.2f} GB")
    print(f"   Free: {free:.2f} GB")

    # Warning if low memory
    if free < 1.0:
        print("   ⚠️  LOW MEMORY! Consider:")
        print("      - torch.cuda.empty_cache()")
        print("      - Reduce batch size")
        print("      - Use smaller model")

    # 4. Test GPU computation
    print("\n4️⃣ GPU Computation Test")
    try:
        test = torch.randn(1000, 1000, device='cuda')
        result = torch.matmul(test, test)
        print("   ✅ Computation successful")
        print(f"   ✅ Result shape: {result.shape}")
        print(f"   ✅ Result device: {result.device}")
        del test, result
        torch.cuda.empty_cache()
    except Exception as e:
        print(f"   ❌ Computation failed: {e}")

    # 5. Check nvidia-smi
    print("\n5️⃣ nvidia-smi Status")
    try:
        result = subprocess.run(['nvidia-smi', '--query-gpu=utilization.gpu,memory.used,memory.total',
                               '--format=csv,noheader,nounits'],
                              capture_output=True, text=True, timeout=5)
        if result.returncode == 0:
            # nvidia-smi prints one line per GPU, so handle multi-GPU machines
            for index, line in enumerate(result.stdout.strip().splitlines()):
                gpu_util, mem_used, mem_total = (v.strip() for v in line.split(','))
                print(f"   GPU {index} Utilization: {gpu_util}%")
                print(f"   GPU {index} Memory Used: {mem_used} MiB / {mem_total} MiB")
        else:
            print("   ⚠️  nvidia-smi not available")
    except Exception as e:
        print(f"   ⚠️  Could not run nvidia-smi: {e}")

    print("\n" + "=" * 70)
    print("✅ Health check complete!")
    print("=" * 70)

# Run the health check
gpu_health_check()


## Testing Checklist

Before you share your launchable, test EVERYTHING:

### ✅ Pre-Launch Checklist

**Environment Testing:**
- [ ] Restart kernel and run all cells sequentially
- [ ] All cells complete without errors
- [ ] GPU is actually being used (check nvidia-smi)
- [ ] Memory usage is reasonable (<80% GPU memory)
- [ ] All imports work

**Code Quality:**
- [ ] No hardcoded paths or personal info
- [ ] All variables are defined before use
- [ ] Error messages are helpful
- [ ] Progress indicators for long operations
- [ ] GPU verification in first executable cell

**Documentation:**
- [ ] README.md is clear and complete
- [ ] requirements.txt has all dependencies
- [ ] .gitignore excludes large files
- [ ] Code has explanatory comments
- [ ] Examples work as described

**User Experience:**
- [ ] Instructions are easy to follow
- [ ] Exercises have solutions or hints
- [ ] Output is formatted and readable
- [ ] No broken links
- [ ] Appropriate emojis and formatting

**Final Test:**
- [ ] Fresh environment test (new venv)
- [ ] Test on minimum GPU requirements
- [ ] Ask someone else to try it
- [ ] Fix any issues they encounter

---

## Automated Testing Pattern
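
Parts of the checklist above can be automated by inspecting the notebook file itself. What follows is a minimal sketch under stated assumptions, not an official test harness: `check_notebook` is a hypothetical helper, and it treats the presence of `torch.cuda.is_available` in the first code cell as evidence of GPU verification. It works on the notebook's JSON text and needs only the standard library.

```python
import json


def check_notebook(notebook_json, gpu_marker="torch.cuda.is_available"):
    """Return a list of problems found in a notebook (empty list = checks passed)."""
    notebook = json.loads(notebook_json)
    code_cells = [
        cell for cell in notebook.get("cells", []) if cell.get("cell_type") == "code"
    ]
    if not code_cells:
        return ["notebook has no code cells"]

    def source_of(cell):
        # Notebook files store source as either one string or a list of lines.
        source = cell.get("source", "")
        return source if isinstance(source, str) else "".join(source)

    problems = []

    # Checklist item: GPU verification in the first executable cell.
    if gpu_marker not in source_of(code_cells[0]):
        problems.append("first code cell does not verify the GPU")

    for number, cell in enumerate(code_cells, start=1):
        if not source_of(cell).strip():
            problems.append(f"code cell {number} is empty")
        # A saved output of type "error" means the cell failed when last run.
        if any(out.get("output_type") == "error" for out in cell.get("outputs", [])):
            problems.append(f"code cell {number} has a saved error output")

    return problems
```

This static check complements, but does not replace, the "restart kernel and run all cells" test: executing the notebook end to end (for example with `jupyter nbconvert --to notebook --execute your-notebook.ipynb`) is still the only way to confirm every cell actually runs.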


---

# Section 7: Git & GitHub Setup 📦

## Version Control for Launchables

Every launchable should be in version control. Here's the workflow:

## Initial Setup

### 1. Initialize Git Repository

```bash
# Navigate to your launchable directory
cd /path/to/your-launchable

# Initialize git
git init

# Add .gitignore (prevents committing large files)
# Your .gitignore should already exist!
```

### 2. Verify SSH Authentication

Before pushing to GitHub, verify SSH is set up:

```bash
# Test GitHub SSH connection
ssh -T git@github.com

# Expected output:
# "Hi username! You've successfully authenticated..."
```

If this fails:
- Generate SSH key: `ssh-keygen -t ed25519 -C "your_email@example.com"`
- Add to GitHub: Settings → SSH and GPG keys → New SSH key
- Guide: https://docs.github.com/en/authentication/connecting-to-github-with-ssh

### 3. Create GitHub Repository

1. Go to https://github.com/new
2. Name it (e.g., `my-awesome-launchable`)
3. **Don't** initialize with README (we have one!)
4. Click "Create repository"

---

## Git Workflow


## Standard Git Commands for Launchables

```bash
# 1. Check status (what's changed?)
git status

# 2. Add all files
git add .

# Or add specific files
git add README.md requirements.txt your-notebook.ipynb

# 3. Commit with descriptive message
git commit -m "Initial commit: Working launchable with GPU verification"

# 4. Add remote (from GitHub's instructions)
git remote add origin git@github.com:YOUR-USERNAME/YOUR-REPO.git

# 5. Push to GitHub
git branch -M main
git push -u origin main

# For subsequent updates:
git add .
git commit -m "Update: Add interactive examples"
git push
```

## What to Commit vs Ignore

### ✅ DO Commit:
- Source code (`.ipynb`, `.py`)
- Documentation (`.md` files)
- Configuration (`requirements.txt`, `.gitignore`)
- Small assets (images < 1MB)

### ❌ DON'T Commit:
- Model weights (`.bin`, `.safetensors`, `.pth`)
- Virtual environments (`venv/`, `env/`)
- Cache files (`__pycache__/`, `.ipynb_checkpoints/`)
- Large datasets
- API keys or secrets

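**A `.gitignore` that lists these patterns keeps them out of your commits automatically.** It cannot catch secrets typed directly into a notebook or script, so review those by hand.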
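
As a last check before `git add .`, it can help to list files that exceed the 1 MB guideline above. The sketch below is one way to do that with the standard library; `find_large_files`, the skipped directory names, and the default limit are illustrative assumptions, and the script does not read your `.gitignore`.

```python
import os

# Directories that should never be committed, so they are not scanned (assumed list).
SKIP_DIRS = {".git", "venv", "env", "__pycache__", ".ipynb_checkpoints"}


def find_large_files(root, limit_bytes=1_000_000):
    """Return sorted (relative_path, size_in_bytes) pairs for files over the limit."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # ignore broken symlinks and special files
            size = os.path.getsize(path)
            if size > limit_bytes:
                large.append((os.path.relpath(path, root), size))
    return sorted(large)


if __name__ == "__main__":
    for path, size in find_large_files("."):
        print(f"{size / 1e6:8.1f} MB  {path}")
```

Anything it reports is a candidate for `.gitignore` or for hosting outside the repository.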

---

## Pro Git Tips

### Commit Message Best Practices

```bash
# Good messages
git commit -m "Add GPU memory optimization"
git commit -m "Fix: Handle CUDA out of memory error"
git commit -m "Update README with prerequisites"

# Bad messages
git commit -m "update"
git commit -m "fix stuff"
git commit -m "asdf"
```

### Useful Git Commands

```bash
# See commit history
git log --oneline

# Undo last commit (keep changes)
git reset --soft HEAD~1

# Discard local changes
git checkout -- filename

# Create a branch for experiments
git checkout -b experiment

# View what changed
git diff
```


---

# Section 8: Deploying to Brev 🚀

## Why Brev?

**Brev** makes GPU deployment effortless:
- 🚀 One-click deployment from GitHub
- 💻 Instant GPU access
- 🌐 Shareable links
- 💰 Pay only for what you use
- 🔄 Easy updates

## Prerequisites

1. **GitHub Repository** - Your launchable pushed to GitHub ✅
2. **Brev Account** - Sign up at [brev.dev](https://brev.dev)
3. **Tested Notebook** - Everything works locally ✅

---

## Deployment Checklist

### Before Deploying

- [ ] **Test locally**: Restart kernel, run all cells successfully
- [ ] **GPU verification**: First cell checks GPU
- [ ] **Dependencies**: `requirements.txt` is complete
- [ ] **Documentation**: README.md is clear
- [ ] **No secrets**: No API keys or passwords in code
- [ ] **Clean repo**: `.gitignore` excludes unnecessary files
- [ ] **Pushed to GitHub**: Latest version on main branch

### Deployment Steps

1. **Go to Brev.dev**
   - Log in with GitHub

2. **Create New Instance**
   - Click "New Instance"
   - Select GPU type (start with T4 or A10)

3. **Connect GitHub Repository**
   - Authorize Brev to access your repos
   - Select your launchable repository

4. **Configure Environment**
   - Brev auto-detects `requirements.txt`
   - Selects Python version
   - Installs dependencies

5. **Launch!**
   - Instance boots in 1-2 minutes
   - Jupyter opens automatically
   - GPU ready to use

6. **Share**
   - Get shareable link
   - Anyone can access and run your launchable
   - No setup required for users!

---

## Brev Configuration Tips

### Optimize for Fast Boot

```txt
# requirements.txt - Pin versions for reproducibility
torch==2.1.0
transformers==4.35.0
# ... rest of your deps
```
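
Pinning is easy to forget when dependencies are added one at a time. The following sketch flags requirement lines that lack an exact `==` pin; `find_unpinned` and its regular expression are illustrative assumptions and deliberately simple (for example, they do not handle URL requirements or environment markers).

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._\-\[\],]+==[^=\s]+")


def find_unpinned(requirements_text):
    """Return requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as "-r other.txt"
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


sample = """\
# requirements.txt
torch==2.1.0
transformers==4.35.0
tqdm
numpy>=1.24
"""

print(find_unpinned(sample))  # ['tqdm', 'numpy>=1.24']
```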

### Add a `brev.yaml` (Optional)

```yaml
# brev.yaml - Advanced configuration
python_version: "3.10"
gpu: "t4"
environment:
  TRANSFORMERS_CACHE: "/workspace/.cache"
```

### Post-Deployment Testing

Once deployed:
1. Open the Brev link
2. Run through the entire notebook
3. Verify GPU works
4. Test all interactive elements
5. Check that outputs are correct

---

## Updating Your Launchable

When you make changes:

```bash
# Local: Make improvements
# Edit notebook, test thoroughly

# Commit and push
git add .
git commit -m "Update: Improve error handling"
git push

# Brev: Restart instance
# Changes automatically pulled on restart
```

---

## Monitoring and Debugging on Brev

### Check GPU Usage

```bash
# In Brev terminal
watch -n 1 nvidia-smi
```

### View Logs

```bash
# See installation logs
cat /workspace/.brev/logs/install.log

# Python errors
# Visible in Jupyter notebook output
```

### Common Brev Issues

**Issue**: Dependencies fail to install
- **Fix**: Check `requirements.txt` syntax
- **Fix**: Pin versions explicitly

**Issue**: Notebook cells fail on Brev but work locally
- **Fix**: Different GPU type - adjust memory usage
- **Fix**: Check CUDA version compatibility

**Issue**: Slow boot time
- **Fix**: Minimize dependencies
- **Fix**: Use smaller base image

---

## Production Best Practices

### Resource Management

```python
# Clean up after intensive operations
import gc
import torch

def cleanup():
    """Call this after heavy GPU operations"""
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        
# Use it
# ... heavy operation ...
cleanup()
```

### Error Handling

```python
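try:
    # Your GPU code; load_model() is a placeholder for your own loading function
    model = load_model()
except RuntimeError as e:
    if "out of memory" in str(e).lower():
        print("💡 Try: Restart kernel or use smaller batch size")
    raise  # re-raise so the original traceback stays visible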
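
The "use smaller batch size" advice can also be automated. Below is a hedged sketch of that idea: a wrapper that halves the batch size and retries when a `RuntimeError` mentions "out of memory". The names `run_with_batch_backoff` and `fake_step` are hypothetical, and the demo simulates the error so that it runs without a GPU.

```python
def run_with_batch_backoff(step, batch_size, min_batch_size=1, on_retry=None):
    """Call step(batch_size); on an out-of-memory error, halve the batch and retry.

    Returns (result, batch_size_used). Other errors are re-raised unchanged.
    """
    while True:
        try:
            return step(batch_size), batch_size
        except RuntimeError as error:
            if "out of memory" not in str(error).lower() or batch_size <= min_batch_size:
                raise
            if on_retry is not None:
                on_retry()  # e.g. a cleanup helper that frees cached GPU memory
            batch_size = max(min_batch_size, batch_size // 2)
            print(f"Out of memory, retrying with batch_size={batch_size}")


# Stand-in for a real GPU workload: pretends anything above 8 items does not fit.
def fake_step(batch_size):
    if batch_size > 8:
        raise RuntimeError("CUDA out of memory (simulated)")
    return f"processed {batch_size} items"


result, used = run_with_batch_backoff(fake_step, batch_size=32)
print(result, "with batch size", used)
```

In a notebook, passing the `cleanup` function from the Resource Management example as `on_retry` would release cached memory between attempts.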

### User Guidance

Always include:
- Expected runtime for each cell
- What outputs users should see
- What to do if errors occur
- How to get help

---

## Resources

- **Brev Documentation**: [docs.brev.dev](https://docs.brev.dev)
- **Brev Discord**: Community support
- **GPU Pricing**: [brev.dev/pricing](https://brev.dev/pricing)
- **Example Deployments**: [github.com/brevdev/launchables](https://github.com/brevdev/launchables)


---

# Section 9: Your First Launchable Exercise 🎓

## Hands-On Project: Build a Simple Sentiment Analyzer

**Goal**: Create a working GPU-accelerated sentiment analysis launchable

**Time**: 15-20 minutes

**What You'll Build**:
- Load a pre-trained sentiment model (DistilBERT)
- Create an interactive analyzer
- Demonstrate GPU acceleration
- Show proper device management

This is a **complete mini-launchable** that you can expand and customize!

---

## Step 1: Load Sentiment Analysis Model


In [None]:
"""
Step 1: Load a pre-trained sentiment analysis model
This demonstrates best practices for model loading
"""

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

print("=" * 70)
print("🎯 SENTIMENT ANALYSIS LAUNCHABLE")
print("=" * 70)
print("\n📦 Loading Model...\n")

# Device setup (ALWAYS first!)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"🎯 Device: {device}")

# Load model and tokenizer
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
print(f"📥 Loading {model_name}...")

sentiment_tokenizer = AutoTokenizer.from_pretrained(model_name)
sentiment_model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Move to GPU
print(f"\n🔄 Moving model to {device}...")
sentiment_model = sentiment_model.to(device)
sentiment_model.eval()  # Set to evaluation mode

# Verify
model_device = next(sentiment_model.parameters()).device
print(f"✅ Model loaded on: {model_device}")

# Check memory
if device.type == "cuda":
    memory = torch.cuda.memory_allocated(0) / 1e6
    print(f"💾 GPU Memory: {memory:.2f} MB")

print("\n✅ Setup complete! Ready for sentiment analysis.")


## Step 2: Create Sentiment Analysis Function


In [None]:
"""
Step 2: Create reusable sentiment analysis function
"""

import torch.nn.functional as F

def analyze_sentiment(text, show_details=True):
    """
    Analyze sentiment of text using GPU-accelerated model
    
    Args:
        text: Input text to analyze
        show_details: Whether to print detailed info
        
    Returns:
        dict: Contains sentiment, confidence, label, and probabilities
    """
    
    if show_details:
        print(f"📝 Analyzing: '{text}'\n")
    
    # Tokenize
    inputs = sentiment_tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
        padding=True
    )
    
    # Move to device
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    if show_details:
        print(f"🎯 Inputs on: {inputs['input_ids'].device}")
    
    # Inference
    with torch.no_grad():
        outputs = sentiment_model(**inputs)
        logits = outputs.logits
        
    # Get probabilities
    probs = F.softmax(logits, dim=-1)
    confidence, predicted_class = torch.max(probs, dim=1)
    
    # Map to labels
    labels = ["NEGATIVE", "POSITIVE"]
    sentiment = labels[predicted_class.item()]
    confidence_score = confidence.item()
    
    # Display results
    if show_details:
        print(f"   Logits device: {logits.device}")
        print(f"\n{'='*70}")
        print(f"Result: {sentiment}")
        print(f"Confidence: {confidence_score:.2%}")
        print(f"{'='*70}\n")
        
        # Show both probabilities
        print("Full Distribution:")
        for label, prob in zip(labels, probs[0]):
            bar = "█" * int(prob.item() * 50)
            print(f"  {label:>8}: {prob.item():.2%} {bar}")
        print()
    
    return {
        "sentiment": sentiment,
        "confidence": confidence_score,
        "label": predicted_class.item(),
        "probabilities": probs[0].cpu().tolist()
    }

# Test it!
print("🧪 Testing sentiment analyzer...\n")
result = analyze_sentiment("This tutorial is amazing! I love learning about GPUs!")
print("✅ Test complete!")


## Step 3: Interactive Demo


In [None]:
"""
Step 3: Batch analysis with progress tracking
"""

from tqdm.auto import tqdm

print("=" * 70)
print("📊 BATCH SENTIMENT ANALYSIS")
print("=" * 70)
print()

# Sample texts to analyze
texts_to_analyze = [
    "NVIDIA GPUs are incredibly powerful for AI workloads!",
    "I'm frustrated with how long this is taking.",
    "The weather today is okay, nothing special.",
    "This is the worst experience I've ever had.",
    "Absolutely fantastic! Best decision ever.",
    "Launchables make AI development so much easier!",
]

results = []

print("🔄 Analyzing multiple texts...\n")

for text in tqdm(texts_to_analyze, desc="Processing"):
    result = analyze_sentiment(text, show_details=False)
    results.append({
        "text": text,
        **result
    })

# Display summary
print("\n" + "=" * 70)
print("📊 RESULTS SUMMARY")
print("=" * 70)

for i, result in enumerate(results, 1):
    sentiment = result['sentiment']
    confidence = result['confidence']
    text = result['text']
    
    # Emoji based on sentiment
    emoji = "😊" if sentiment == "POSITIVE" else "😞"
    
    print(f"\n{i}. {emoji} {sentiment} ({confidence:.1%})")
    print(f"   \"{text}\"")

print("\n" + "=" * 70)
print(f"✅ Analyzed {len(results)} texts on {device}!")
print("=" * 70)


## 🎮 Your Turn!

**Exercise**: Modify the sentiment analyzer above:

1. **Try your own text**: Change the texts in the batch analysis
2. **Add more examples**: Expand the list with your own sentences
3. **Visualize results**: Add a simple bar chart with matplotlib
4. **Compare performance**: Time CPU vs GPU execution

**Challenge**: Turn this into a full launchable!
1. Create a new directory `sentiment-analysis-launchable/`
2. Move this code to a new notebook
3. Add README.md and requirements.txt
4. Push to GitHub
5. Deploy to Brev!
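
For exercise item 4, a small timing helper keeps the CPU and GPU measurements comparable. This is a sketch: `benchmark` and its parameters are illustrative names. GPU work is launched asynchronously, so the optional `sync` argument lets you pass `torch.cuda.synchronize` to wait for queued work before the clock stops. The demo times a plain Python function so that it runs without a GPU.

```python
import statistics
import time


def benchmark(fn, repeats=5, warmup=1, sync=None):
    """Return the median wall-clock seconds of fn() over `repeats` timed runs."""
    for _ in range(warmup):
        fn()  # warm-up runs absorb one-time costs such as CUDA initialization
    if sync is not None:
        sync()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for queued GPU work before stopping the clock
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# CPU-only stand-in workload so the example runs anywhere.
def busy_work():
    return sum(i * i for i in range(100_000))


print(f"median: {benchmark(busy_work) * 1000:.2f} ms")

# With the notebook's analyzer (assumes torch and analyze_sentiment are defined):
# gpu_time = benchmark(
#     lambda: analyze_sentiment("Great tutorial!", show_details=False),
#     sync=torch.cuda.synchronize,
# )
```

Reporting the median rather than the mean makes the comparison less sensitive to a single slow run.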

---

## What You Just Learned

✅ **Device Management** - Proper GPU setup  
✅ **Model Loading** - Pre-trained models on GPU  
✅ **Inference** - GPU-accelerated predictions  
✅ **Batch Processing** - Running the analyzer over a list of inputs  
✅ **Progress Tracking** - User-friendly feedback  
✅ **Results Display** - Clear, formatted output  

**This is a complete launchable pattern!** You can use this as a template for any transformer-based model.
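
The loop in Step 3 sends one text at a time to the model. Hugging Face tokenizers also accept a list of strings, so several texts can share a single forward pass. The following is a sketch under that assumption: `chunked` and `analyze_batch` are hypothetical helpers, and `batch_size=16` is an arbitrary starting point to tune against your GPU memory.

```python
def chunked(items, batch_size):
    """Split a list into consecutive sublists of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def analyze_batch(texts, tokenizer, model, device, batch_size=16):
    """Classify texts in padded batches; returns one result dict per text."""
    import torch  # imported here so chunked() stays usable without PyTorch

    labels = model.config.id2label  # maps class index to label name
    results = []
    for batch in chunked(texts, batch_size):
        inputs = tokenizer(
            batch,
            return_tensors="pt",
            truncation=True,
            max_length=512,
            padding=True,  # pad to the longest text in this batch
        )
        inputs = {k: v.to(device) for k, v in inputs.items()}
        with torch.no_grad():
            logits = model(**inputs).logits
        probs = torch.softmax(logits, dim=-1)
        confidences, classes = probs.max(dim=-1)
        for text, conf, cls in zip(batch, confidences.tolist(), classes.tolist()):
            results.append({"text": text, "sentiment": labels[cls], "confidence": conf})
    return results


print(chunked(["a", "b", "c", "d", "e"], 2))  # [['a', 'b'], ['c', 'd'], ['e']]
```

Called as `analyze_batch(texts_to_analyze, sentiment_tokenizer, sentiment_model, device)`, it should return one dictionary per input text, in the original order.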


---

# Section 10: Resources & Next Steps 🌟

## Congratulations! 🎉

You've completed the **How to Build NVIDIA Launchables** tutorial!

You now know:
- ✅ The launchables pattern and structure
- ✅ GPU-first development practices
- ✅ How to build interactive demos
- ✅ AI-assisted development with Claude + Cursor
- ✅ Debugging and testing strategies
- ✅ Git workflow for launchables
- ✅ Deploying to Brev
- ✅ Building real GPU-accelerated applications

---

## 📚 Essential Resources

### Official Documentation
- **Launchables Repository**: [github.com/brevdev/launchables](https://github.com/brevdev/launchables)
- **Brev Documentation**: [docs.brev.dev](https://docs.brev.dev)
- **NVIDIA CUDA**: [developer.nvidia.com/cuda-toolkit](https://developer.nvidia.com/cuda-toolkit)
- **PyTorch**: [pytorch.org/docs](https://pytorch.org/docs)
- **Transformers**: [huggingface.co/docs/transformers](https://huggingface.co/docs/transformers)

### Learning Resources
- **GPU Programming**: [NVIDIA Deep Learning Institute](https://www.nvidia.com/en-us/training/)
- **Transformers Course**: [huggingface.co/course](https://huggingface.co/course)
- **MLOps Best Practices**: [ml-ops.org](https://ml-ops.org/)

### Community
- **Brev Discord**: Get help and share your launchables
- **Launchables Discussions**: GitHub Discussions on brevdev/launchables
- **HuggingFace Forums**: [discuss.huggingface.co](https://discuss.huggingface.co)

---

## 💡 Launchable Ideas to Build

Need inspiration? Here are ideas across different skill levels:

### Beginner-Friendly
1. **Image Classification** - Fine-tune ResNet on custom dataset
2. **Text Summarization** - Summarize articles with T5
3. **Question Answering** - Build a Q&A system with BERT
4. **Style Transfer** - Artistic style transfer with CNNs
5. **Named Entity Recognition** - Extract entities from text

### Intermediate
1. **Fine-tune LLaMA 2** - LoRA fine-tuning on custom data
2. **RAG System** - Retrieval-augmented generation from scratch
3. **Stable Diffusion Fine-tuning** - Train on your images
4. **Multi-modal Search** - CLIP-based image-text search
5. **Voice Cloning** - Text-to-speech with custom voices

### Advanced
1. **LLM Inference Optimization** - vLLM, TensorRT-LLM comparison
2. **Distributed Training** - Multi-GPU fine-tuning
3. **Custom CUDA Kernels** - Optimize specific operations
4. **Model Quantization** - INT8/INT4 optimization guide
5. **Production ML System** - End-to-end deployment pipeline

---

## 🚀 Your Next Steps

### 1. Build Your First Launchable (Today!)
- Pick an idea from above (or your own)
- Use Claude to generate a detailed Cursor prompt
- Implement it following this tutorial's pattern
- Test thoroughly
- Push to GitHub
- Deploy to Brev
- Share it!

### 2. Join the Community (This Week)
- Star the [brevdev/launchables](https://github.com/brevdev/launchables) repo
- Join Brev Discord
- Share your first launchable
- Get feedback and iterate

### 3. Keep Learning (Ongoing)
- Build one launchable per month
- Experiment with new models and techniques
- Read other people's launchables
- Contribute improvements
- Help beginners get started

---

## 🎯 Call to Action

**The launchables ecosystem needs YOU!**

The best way to learn is by building and sharing. Your launchable might:
- Help someone learn a new skill
- Showcase a novel technique
- Solve a real problem
- Inspire others to build

**Jensen's vision is to democratize AI development.** Every launchable you create makes AI more accessible.

---

## 📝 Quick Start Template

Ready to start? Copy this to Claude to begin your next launchable:

```
Create an NVIDIA Launchable for [YOUR IDEA].

Structure:
- Jupyter notebook in root directory
- Include requirements.txt, README.md, .gitignore
- GPU verification as first executable cell
- 5-7 interactive sections
- Working examples with real models
- Progress indicators and error handling
- Test on 8GB GPU

Make it beginner-friendly, well-documented, and deployable to Brev.
Follow the patterns from the "How to Build Launchables" tutorial.
```

---

## 🙏 Thank You!

Thank you for completing this tutorial! You're now part of the launchables community.

Remember:
- **Start simple** - Your first launchable doesn't need to be perfect
- **Ship fast** - Get it working, then improve
- **Share openly** - Others will learn from your work
- **Ask for help** - The community is here to support you
- **Pay it forward** - Help the next person getting started

---

## ⭐ One More Thing...

If this tutorial helped you, please:
1. ⭐ Star the repository
2. 🐦 Share on social media
3. 💬 Tell your friends
4. 🏗️ Build something amazing

**Now go build your launchable! The world is waiting to see what you create.** 🚀

---

*Tutorial created with ❤️ for the NVIDIA Launchables community*

*"The future of AI is being built by people like you, one launchable at a time."*


In [None]:
"""
Quick test suite for your launchable
Run this before sharing!
Assumes torch, model, tokenizer, and device are already defined by earlier cells.
"""

def test_launchable():
    """Run automated tests"""
    
    print("🧪 Running Launchable Tests...\n")
    
    tests_passed = 0
    tests_total = 0
    
    # Test 1: GPU Available
    tests_total += 1
    if torch.cuda.is_available():
        print("✅ Test 1: GPU available")
        tests_passed += 1
    else:
        print("❌ Test 1: GPU not available")
    
    # Test 2: Model loaded
    tests_total += 1
    try:
        if model is not None and hasattr(model, 'generate'):
            print("✅ Test 2: Model loaded correctly")
            tests_passed += 1
        else:
            print("❌ Test 2: Model not loaded")
    except NameError:
        print("❌ Test 2: Model not defined")
    
    # Test 3: Model on GPU
    tests_total += 1
    try:
        model_device = next(model.parameters()).device
        if model_device.type == 'cuda':
            print(f"✅ Test 3: Model on GPU ({model_device})")
            tests_passed += 1
        else:
            print(f"❌ Test 3: Model not on GPU (on {model_device})")
    except Exception as e:
        print(f"❌ Test 3: Could not check model device - {e}")
    
    # Test 4: Tokenizer loaded
    tests_total += 1
    try:
        if tokenizer is not None:
            test_text = tokenizer("test")
            print("✅ Test 4: Tokenizer working")
            tests_passed += 1
        else:
            print("❌ Test 4: Tokenizer not loaded")
    except Exception as e:
        print(f"❌ Test 4: Tokenizer error - {e}")
    
    # Test 5: Generation works
    tests_total += 1
    try:
        test_input = tokenizer("Test", return_tensors="pt").to(device)
        with torch.no_grad():
            test_output = model.generate(test_input['input_ids'], max_length=10)
        print("✅ Test 5: Text generation working")
        tests_passed += 1
    except Exception as e:
        print(f"❌ Test 5: Generation failed - {e}")
    
    # Test 6: Memory management
    tests_total += 1
    if torch.cuda.is_available():
        allocated = torch.cuda.memory_allocated(0) / 1e9
        total = torch.cuda.get_device_properties(0).total_memory / 1e9
        usage_percent = (allocated / total) * 100
        
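        if usage_percent < 90:
            print(f"✅ Test 6: Memory usage OK ({usage_percent:.1f}%)")
            tests_passed += 1
        else:
            print(f"⚠️  Test 6: High memory usage ({usage_percent:.1f}%)")
    else:
        # Without a GPU there is nothing to measure; report it instead of failing silently.
        print("❌ Test 6: Skipped memory check (no GPU)")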
    
    # Summary
    print("\n" + "=" * 70)
    print(f"📊 Test Results: {tests_passed}/{tests_total} passed")
    
    if tests_passed == tests_total:
        print("🎉 All tests passed! Your launchable is ready!")
    else:
        print("⚠️  Some tests failed. Review and fix before sharing.")
    print("=" * 70)

# Run tests
test_launchable()
