# 🚀 How to Build Brev Launchables

## TL;DR - Why This Matters

**The problem:** You built something awesome in AI and shared it. Most developers who find it never see it work, because they hit setup or GPU issues first. In the illustrative funnel in Section 1, that is 98% of them.

**The solution:** Turn your notebook into a Launchable. Developers click a link, land in a GPU notebook in 30 seconds, and **actually try your work**.

**The impact:** Up to 30x more developers trying your work in that funnel, which means more adoption, more GitHub stars, and more visibility.

**Time investment:** ~30 minutes to convert your existing notebook.

---

## What You'll Learn Today

This tutorial teaches you how to transform your AI notebooks into **shareable, GPU-backed experiences** that anyone can run with one click.

**You'll build:**
- ✅ GPU-verified interactive demos
- ✅ Shareable prototypes that eliminate setup friction
- ✅ Working examples that get TRIED, not just bookmarked
- ✅ Live showcases with automatic GPU access

## Who This Is For

**Built something in AI?** This is for you if you want to:
- Share your work with the community
- Make your innovations discoverable
- Help others build faster

## Tutorial Structure

1. **Why Build Launchables?** - Value proposition and use cases
2. **Your First Simple Demo** - Load and run a model on GPU
3. **Making It Interactive** - Add engagement and user controls
4. **Best Practices** - Patterns for great launchables
5. **Sharing Your Launchable** - GitHub and Brev deployment
6. **Your Challenge** - Build your own!

**Estimated time:** 45-60 minutes

---

Let's get started! 👇


In [None]:
# Cell 2: Install Required Packages
# This installs everything needed for this tutorial in your notebook's kernel

import sys
!{sys.executable} -m pip install -q torch transformers accelerate matplotlib numpy tqdm ipywidgets

print("=" * 60)
print("✅ Installation complete!")
print("=" * 60)
print("\n⚠️  IMPORTANT: Please restart the kernel now")
print("   - Click 'Kernel' → 'Restart' in the menu")
print("   - Then run all cells below")
print("=" * 60)


---

## ⚠️ Restart Kernel Required

**After running the installation cell above:**
1. Click `Kernel` → `Restart` in the menu
2. Run all cells below from this point forward

This ensures the newly installed packages are loaded correctly.
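
After the restart, a quick check can confirm that the install step actually landed. The sketch below is one way to do it with the standard library's `importlib.metadata`; the `REQUIRED` list mirrors the install cell above, and `installed_versions` is a hypothetical helper name. It reads installed package metadata without importing the packages, so it confirms installation, not that every import succeeds.

```python
from importlib import metadata

# Packages the install cell above asks pip for.
REQUIRED = ["torch", "transformers", "accelerate", "matplotlib",
            "numpy", "tqdm", "ipywidgets"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    status = version if version is not None else "MISSING - rerun the install cell"
    print(f"{name:<14} {status}")
```

Any line marked `MISSING` means the install cell needs another run before continuing.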

---


In [None]:
# Cell 4: GPU Verification
# This verifies your GPU is available and ready to use

import torch

print("🔍 GPU Verification Report")
print("=" * 50)

if torch.cuda.is_available():
    print("✅ CUDA Available: Yes")
    print(f"✅ GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"✅ GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    print(f"✅ CUDA Version: {torch.version.cuda}")
    device = torch.device("cuda")
    print("\n🎉 GPU is ready!")
else:
    print("⚠️ WARNING: No GPU detected")
    print("This tutorial requires a GPU-enabled instance.")
    print("Please deploy this launchable on a GPU-enabled environment.")
    device = torch.device("cpu")

print(f"\n🎯 Using device: {device}")
print("=" * 50)


---

# Section 1: Why Build Launchables? 🎯

## The Problem: Your Innovation Dies in Setup Hell

You've built something amazing - a faster training method, an optimized model, a breakthrough technique. You share it on GitHub with a README and demo notebook.

**Here's what happens:**

```
100 developers see your post
  ↓
  95 click your repo
    ↓
    50 start reading README
      ↓
      20 attempt "pip install"
        ↓
        5 actually get it working
          ↓
          2 try your demo
```

**Your innovation:** Buried in setup friction.  
**Your impact:** 2% of interested developers.

## The Solution: Launchables = Zero-Friction Demos

A **Launchable** is a self-contained, GPU-backed notebook that runs with one click.

**Same scenario with a launchable:**

```
100 developers see your post
  ↓
  95 click "Launch"
    ↓
    90 land in GPU notebook (30 seconds)
      ↓
      85 run your demo successfully
        ↓
        60 see your innovation work
          ↓
          20 star your repo, 10 tweet about it
```

**Your innovation:** Actually seen and experienced.  
**Your impact:** 60% of interested developers.  
**Result:** 30x more people trying your work.

## What Should You Turn Into a Launchable?

**Perfect for:**
- "2x faster fine-tuning" → Run benchmark in 60 seconds
- "New architecture" → Side-by-side comparison with baseline
- "Optimization tool" → Before/after demo with real metrics
- "Paper reproduction" → One-click run of your results
- "Library showcase" → Interactive tutorial with examples

**Real examples:**
- Unsloth: Shows QLoRA speedup vs. Hugging Face (users see 2x instantly)
- Axolotl: Compare training configs side-by-side (no setup required)
- Your tool: Gets tried instead of bookmarked and forgotten

## What's In It For You?

**As a developer/company, launchables give you:**

✅ **More adoption:** 30x more devs actually TRY your work  
✅ **More visibility:** Featured in Brev showcase, shared on social  
✅ **More credibility:** "It actually works" > "Trust me bro"  
✅ **More feedback:** Users engage when friction is zero  
✅ **More stars:** Working demos = GitHub stars + tweets

**Bottom line:** Your innovation gets USED, not just seen.

---

**In this section, you learned:** Why launchables can multiply your reach and impact (30x in the funnel above).


---

# Section 2: Your First Simple Demo 🚀

Let's build a minimal working example. You'll:
- Load a tiny model (DistilGPT2)
- Place it explicitly on GPU
- Run simple inference
- Show GPU memory usage

**This is the foundation of every launchable.**


In [None]:
# Load and run a simple model on GPU

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import time
import warnings
warnings.filterwarnings('ignore')

print("=" * 60)
print("🚀 Simple GPU Demo: Text Generation")
print("=" * 60)
print("\n🎯 This demo is optimized for: 1x NVIDIA L40S or A100 40GB")
print("💡 Works on: Any NVIDIA GPU with 8GB+ VRAM")

# Load model (suppressing verbose output)
print("\n📥 Loading DistilGPT2...")
model_name = "distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name, clean_up_tokenization_spaces=True)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move to GPU explicitly
print(f"🔄 Moving model to {device}...")
model = model.to(device)
_ = model.eval()

# Verify placement
model_device = next(model.parameters()).device
print(f"✅ Model is on: {model_device}")

# Show GPU memory
if torch.cuda.is_available():
    memory_used = torch.cuda.memory_allocated(0) / 1e9
    print(f"💾 GPU Memory Used: {memory_used:.2f} GB")

# Generate text
print("\n🎨 Generating text...")
prompt = "Brev launchables make AI development"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

start = time.time()
with torch.no_grad():
    outputs = model.generate(
        **inputs,  # pass input_ids and attention_mask together
        max_length=40,
        do_sample=True,
        temperature=0.8,
        pad_token_id=tokenizer.eos_token_id
    )
generation_time = time.time() - start

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"\n📝 Result: {generated_text}")
print(f"⚡ Time: {generation_time:.2f}s")

print("\n" + "=" * 60)
print("✅ Demo complete! Your GPU is working.")
print("=" * 60)


### ✅ What You Just Learned

You successfully:
1. Loaded a model
2. Moved it to GPU explicitly (`.to(device)`)
3. Verified GPU placement
4. Monitored GPU memory
5. Ran inference on GPU

**This pattern applies to ANY model you want to showcase!**

---


# Section 3: Making It Interactive 🎨

Great launchables aren't just code - they're **engaging experiences**. Let's add:
- User-editable parameters
- Visual feedback
- Timing metrics
- Clear outputs

**In this section, you'll:** Build an interactive sentiment analyzer.


In [None]:
# Interactive sentiment analysis demo

from transformers import pipeline
import torch
import time
import warnings
warnings.filterwarnings('ignore')

print("=" * 60)
print("🎭 Interactive Sentiment Analysis")
print("=" * 60)

# Load pipeline on GPU (suppressing verbose output)
device_id = 0 if torch.cuda.is_available() else -1
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    device=device_id
)

print(f"\n✅ Model loaded on: {'GPU' if device_id == 0 else 'CPU'}")

# User-editable test cases
test_texts = [
    "Launchables make sharing AI work so easy!",
    "Setup took forever and nothing worked.",
    "The tutorial is clear and examples run perfectly.",
]

print("\n💡 Try editing test_texts above and rerunning this cell!\n")

# Analyze each text
for i, text in enumerate(test_texts, 1):
    print(f"Text {i}: \"{text}\"")

    start = time.time()
    result = classifier(text)[0]
    inference_time = time.time() - start

    # Visual output
    sentiment = result['label']
    confidence = result['score'] * 100
    emoji = "😊" if sentiment == "POSITIVE" else "😞"

    print(f"  {emoji} {sentiment} ({confidence:.1f}% confident)")
    print(f"  ⚡ {inference_time*1000:.0f}ms\n")

print("=" * 60)
print("✅ Interactive demo complete!")
print("=" * 60)


### ✅ What You Just Learned

You built an interactive demo with:
1. **User-editable parameters** - Edit `test_texts` and rerun
2. **Visual feedback** - Emojis and formatted output
3. **Timing metrics** - Shows inference speed
4. **Clear results** - Easy to understand

**This pattern makes your launchables engaging and shareable!**
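
Single `time.time()` readings tend to be noisy, and the first call often includes one-time warm-up cost. One way to get steadier timing metrics is to warm up first and report the median of several runs. The helper below is a minimal sketch using only the standard library; `time_call` and its parameters are hypothetical names, not part of any library used in this tutorial.

```python
import statistics
import time


def time_call(fn, *, warmup=1, repeats=5, sync=None):
    """Return the median wall-clock seconds of fn() over `repeats` timed runs.

    warmup: untimed calls made first, so one-time setup cost is excluded.
    sync:   optional zero-argument callable run before each clock read,
            for workloads that finish asynchronously.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload so the sketch runs anywhere; swap in your own call.
median_s = time_call(lambda: sum(i * i for i in range(100_000)))
print(f"median: {median_s * 1000:.1f} ms")
```

On a GPU instance, something like `time_call(lambda: classifier(text), sync=torch.cuda.synchronize)` would make sure queued GPU work has finished before each clock read.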

---


# Section 4: Best Practices ⭐

## Essential Patterns for Great Launchables

### ✅ Always Include

- [ ] **GPU verification first** - Users need to know if GPU is available
- [ ] **Target hardware specs** - State which GPU your demo is optimized for
- [ ] **Explicit device placement** - Always use `.to(device)`
- [ ] **Clear outputs** - Show what's happening at each step
- [ ] **Timing metrics** - Prove GPU acceleration works
- [ ] **User invitation** - "Try changing X to see Y"

**Example:**
```python
print("🎯 This demo is optimized for: 1x NVIDIA A100 40GB")
print("💡 Works on: L40S, A100, H100, or better")
```

### ✅ Structure Your Launchable

1. **Title and intro** - What will users learn?
2. **Installation** - One cell with all dependencies
3. **GPU verification** - Verify hardware
4. **Simple demo first** - Show it works immediately
5. **Interactive demo** - Let users experiment
6. **Next steps** - How to build their own

### ✅ File Requirements

Your launchable repository needs:
```
your-launchable/
├── README.md              # Short overview (< 200 words)
├── requirements.txt       # All pip dependencies
├── your-notebook.ipynb    # Self-contained tutorial
└── .gitignore             # Standard Python/Jupyter
```
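
The layout above can be checked mechanically before you push. The following is a minimal sketch, assuming the four files sit at the repository root; `check_launchable_repo` is a hypothetical helper, not a Brev tool, and the 200-word limit is taken from the README comment in the tree.

```python
from pathlib import Path


def check_launchable_repo(repo_dir, max_readme_words=200):
    """Return a list of problems found; an empty list means the layout looks right."""
    repo = Path(repo_dir)
    problems = []
    for name in ("README.md", "requirements.txt", ".gitignore"):
        if not (repo / name).is_file():
            problems.append(f"missing {name}")
    if not list(repo.glob("*.ipynb")):
        problems.append("missing a .ipynb notebook")
    readme = repo / "README.md"
    if readme.is_file():
        words = len(readme.read_text(encoding="utf-8").split())
        if words >= max_readme_words:
            problems.append(
                f"README.md has {words} words (target: under {max_readme_words})"
            )
    return problems


# Check the current directory; prints the problem list or a success message.
print(check_launchable_repo(".") or "layout looks good")
```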

### ✅ Common Mistakes to Avoid

- ❌ **Don't** assume packages are installed
- ❌ **Don't** use terminal instructions (Brev opens notebook directly)
- ❌ **Don't** make it too complex (keep under 20 cells)
- ❌ **Don't** forget to test on fresh kernel
- ❌ **Don't** use "production" language (call it "shareable prototype")
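
The "under 20 cells" guideline is easy to check, because a `.ipynb` file is plain JSON. The sketch below assumes the common nbformat 4 layout, where cells live in a top-level `"cells"` list; `notebook_summary` is a hypothetical helper name.

```python
import json


def notebook_summary(path, max_cells=20):
    """Count cells in an nbformat-4 .ipynb file and flag notebooks at or over max_cells."""
    with open(path, encoding="utf-8") as f:
        cells = json.load(f).get("cells", [])
    counts = {"code": 0, "markdown": 0}
    for cell in cells:
        kind = cell.get("cell_type")
        if kind in counts:
            counts[kind] += 1
    return {"total": len(cells), **counts, "too_long": len(cells) >= max_cells}


# Example call (path is a placeholder for your own notebook):
# print(notebook_summary("your-notebook.ipynb"))
```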

### ✅ Language Guidelines

**Instead of:**
- "Production-ready deployment" → "Shareable prototype"
- "Enterprise-grade" → "Working demo"
- "Build AI systems" → "Share your AI work"

**Target audience:**
- "Developers who built something in AI"
- "Help others build 10x faster"
- "Share your work with the community"

---


# Section 5: Sharing Your Launchable 📤

Ready to share your launchable with the world? Here's how.

## Step 1: Prepare Your Repository

**Create these files:**

### `requirements.txt`
```txt
torch>=2.0.0
transformers>=4.30.0
accelerate>=0.20.0
```

### `README.md`
```markdown
# Your Launchable Title

One-sentence description of what this showcases.

## What You'll Build
- Bullet point 1
- Bullet point 2

## Prerequisites
- GPU-enabled environment
- Python 3.8+

## Get Started
Click "Open Notebook" to start.
```

### `.gitignore`
```gitignore
# Python
__pycache__/
*.pyc

# Jupyter
.ipynb_checkpoints/

# Models
*.pt
*.pth
*.bin
```

## Step 2: Push to GitHub

```bash
git init
git add .
git commit -m "Create launchable for [your project]"
git branch -M main   # make sure the branch is named main before pushing
git remote add origin https://github.com/yourusername/your-launchable.git
git push -u origin main
```

## Step 3: Choose Your GPU Configuration

**Important:** Launchables are fully configurable to any single-node GPU. Choose the **smallest GPU that runs your demo smoothly**.

### Available GPU Options

Brev supports configurations across multiple GPU types:
- **B200** - Latest NVIDIA Blackwell architecture
- **H200** - High-performance Hopper with HBM3e
- **H100** - Standard Hopper for demanding workloads
- **A100** - Versatile Ampere for most use cases
- **L40S** - Cost-effective for inference and visualization

**Configurations:** 1x, 2x, 4x, or 8x GPUs per instance

📊 **See full list:** [Brev Compute Options](https://brev.nvidia.com/compute)

### GPU Selection Best Practices

**✅ Start Small (Recommended)**

**For most launchables:**
- **1x L40S** or **1x A100 (40GB)** - Perfect for demos, tutorials, small models
- **Cost-effective** - Lower credits per hour
- **Fast startup** - More availability
- **Sufficient** - Most demos don't need massive compute

**When to scale up:**
- **2x-4x GPUs** - Only if your demo actually uses multiple GPUs (distributed training, large batch inference)
- **8x GPUs** - AVOID unless absolutely necessary (rare for demos)

### 🎯 Match GPU to Use Case

| Your Launchable Shows | Recommended GPU |
|----------------------|-----------------|
| Small models (< 7B params) | 1x L40S or 1x A100 40GB |
| Medium models (7-13B params) | 1x A100 80GB or 1x H100 |
| Large models (13-70B params) | 1x H100 or 1x H200 (models near 70B typically need quantization to fit on one GPU) |
| Multi-GPU techniques | 2x or 4x matching your demo |
| Cutting-edge features | 1x B200 |
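
As a rough cross-check on the table above, weight memory can be estimated as parameter count times bytes per parameter. The sketch below is a rule of thumb, not a Brev sizing rule: `weight_memory_gb` and the dtype table are illustrative names, and activations, KV cache, and framework overhead all come on top of the figure it returns.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(params_billions, dtype="fp16"):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives GB directly,
    because the two factors of 1e9 cancel.
    """
    return params_billions * BYTES_PER_PARAM[dtype]


for size in (7, 13, 70):
    fp16 = weight_memory_gb(size, "fp16")
    int4 = weight_memory_gb(size, "int4")
    print(f"{size}B params: ~{fp16:.1f} GB in fp16, ~{int4:.1f} GB in int4")
```

Comparing the result with the GPU's memory, and leaving headroom for everything besides the weights, is a quick way to pick the smallest card that fits.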

**💡 Pro Tip:** Include your target hardware in the notebook!

```python
print("🎯 This demo is optimized for: 1x NVIDIA A100 40GB")
print("💡 Works on: L40S, A100, H100, or better")
```

### Understanding Credits & Cost

**How it works:**
- **Self-service** - No fixed subscription period
- **Credit-based** - Purchase credits, use as needed
- **Auto-stop** - Instance stops when credits run out
- **Add anytime** - Top up credits to continue
- **Pay for usage** - Only charged when running

**Cost optimization:**
- Smaller GPUs = fewer credits per hour
- Efficient code = shorter runtime = lower cost
- Clear instructions = users don't waste credits debugging
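
The cost points above come down to simple arithmetic: credits used equals the hourly rate times the runtime in hours. The sketch below illustrates that with made-up rates; the two rate constants are hypothetical placeholders, not Brev prices, and the function names are illustrative.

```python
def session_credits(credits_per_hour, runtime_minutes):
    """Credits consumed by one run: hourly rate times runtime in hours."""
    return credits_per_hour * runtime_minutes / 60


def hours_remaining(credit_balance, credits_per_hour):
    """How long a credit balance lasts at a given hourly rate."""
    return credit_balance / credits_per_hour


# Hypothetical rates for illustration only; check Brev for real pricing.
SMALL_GPU_RATE = 1.0  # credits/hour, placeholder
LARGE_GPU_RATE = 4.0  # credits/hour, placeholder

for label, rate in (("small GPU", SMALL_GPU_RATE), ("large GPU", LARGE_GPU_RATE)):
    used = session_credits(rate, 15)
    print(f"{label}: a 15-minute demo uses {used:.2f} credits")
```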

## Step 4: Deploy on Brev

1. **Go to** [brev.nvidia.com](https://brev.nvidia.com)
2. **Connect** your GitHub repository
3. **Select GPU configuration** (remember: start with 1x L40S or 1x A100)
4. **Launch** and test your launchable
5. **Verify** it runs smoothly on your chosen GPU
6. **Share** the launch link with your community!

## Resources

- **Brev Documentation**: [docs.nvidia.com/brev/latest/launchables.html](https://docs.nvidia.com/brev/latest/launchables.html)
- **Launchables Ecosystem**: [github.com/brevdev/launchables](https://github.com/brevdev/launchables)
- **Examples**: Browse existing launchables for inspiration

---


# Section 6: Your Challenge 🎯

## Build Your Own Launchable!

You now have everything you need. Here are **concrete examples** to get you started:

### Example 1: "2x Faster Fine-Tuning" Demo (Unsloth-style)

**What to show:**
```python
import time

# YourOptimizedLoader, HuggingFaceLoader, and sample_data are placeholders:
# swap in your own loader, the standard baseline, and a small dataset.

# Load model with YOUR optimization
model = YourOptimizedLoader("llama-7b")

# Time YOUR method
start = time.time()
model.train(sample_data)
your_time = time.time() - start

# Time the standard method on the same data
baseline_model = HuggingFaceLoader("llama-7b")
start = time.time()
baseline_model.train(sample_data)
baseline_time = time.time() - start

# Show the speedup
print(f"Your method: {your_time:.2f}s")
print(f"Baseline: {baseline_time:.2f}s")
print(f"🚀 {baseline_time/your_time:.1f}x FASTER!")
```

**Result:** Developers SEE your speedup in 60 seconds.

### Example 2: "Side-by-Side Config Comparison" (Axolotl-style)

**What to show:**
- Load same model with 3 different configs
- Train for 10 steps each
- Show loss curves side-by-side with matplotlib
- Display memory usage for each config

**Result:** Developers understand which config to use.

### Example 3: "Novel Architecture Demo"

**What to show:**
- Load your architecture + baseline architecture
- Run same task on both
- Compare: inference time, memory, accuracy
- Display results in a table

**Result:** Developers see WHY your architecture matters.
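
For the "display results in a table" step, a plain-text table needs only the standard library. This is a minimal sketch: `results_table` is a hypothetical helper, and the `TODO` values in the example rows are placeholders to be replaced with your own measurements.

```python
def results_table(rows):
    """Render a list of dicts (same keys in each) as an aligned plain-text table."""
    headers = list(rows[0])
    cells = [[str(row[h]) for h in headers] for row in rows]
    # Each column is as wide as its longest header or value.
    widths = [max(len(h), *(len(r[i]) for r in cells)) for i, h in enumerate(headers)]

    def line(values):
        return " | ".join(v.ljust(w) for v, w in zip(values, widths)).rstrip()

    out = [line(headers), "-+-".join("-" * w for w in widths)]
    out += [line(r) for r in cells]
    return "\n".join(out)


# Placeholder rows: fill these in from your own timing and memory measurements.
print(results_table([
    {"model": "your architecture", "inference_ms": "TODO", "memory_gb": "TODO"},
    {"model": "baseline", "inference_ms": "TODO", "memory_gb": "TODO"},
]))
```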

### Checklist: Converting Your Existing Notebook

Take a notebook you already have:

- [ ] **Add Cell 1:** Installation (`pip install` all dependencies)
- [ ] **Add Cell 2:** GPU verification (copy from this tutorial)
- [ ] **Update code:** Add explicit `.to(device)` for all models
- [ ] **Add timing:** Wrap key operations with `time.time()`
- [ ] **Add comparison:** Show YOUR method vs baseline
- [ ] **Add visuals:** Print results clearly, use emojis for status
- [ ] **Test:** Run all cells on fresh kernel
- [ ] **Create README:** Copy template from Section 5
- [ ] **Push to GitHub:** `git push`
- [ ] **Deploy on Brev:** Connect repo, select GPU, launch

### How Developers Will Find Your Launchable

**Distribution options:**

1. **Direct link** - Share on Twitter, Discord, docs
   - "Try our 2x faster fine-tuning: [brev.nvidia.com/your-launchable]"
   
2. **Brev showcase** - Featured in launchables directory
   - Discoverable by developers browsing GPU tools
   
3. **Your website/docs** - Embed "Launch Demo" button
   - Users can try before they pip install
   
4. **GitHub README** - Add badge at the top
   - "🚀 Try this in 30 seconds" → instant credibility

**Best practice:** Post launch link when you announce your tool. "Here's our new optimization technique - try it live in 60 seconds [link]"

### Join the Ecosystem

When your launchable is ready:
- **Tweet it:** "#BrevLaunchables - Try [your tool] with zero setup"
- **Add to showcase:** [github.com/brevdev/launchables](https://github.com/brevdev/launchables)
- **Link in README:** Help others discover your work
- **Share here:** Help other devs with their launchables

---

## 🎉 Congratulations!

You've completed the tutorial. You now know how to:
- ✅ Verify GPU availability
- ✅ Build interactive demos
- ✅ Follow best practices
- ✅ Share your work as a launchable

---

## 💡 Meta Moment

**Notice something?** 

**THIS notebook you just completed IS ITSELF a launchable!**

You:
- Clicked "Open Notebook" and landed here
- Ran ONE install cell
- Restarted the kernel
- Ran all cells successfully
- Learned in a GPU-backed environment
- Can share this exact link with others

**That's the power of launchables.**

Now you can create the same experience for YOUR content!

---

**Go make your AI work discoverable and help others be 10x faster!** 🚀

---

*Tutorial for developers who want to share their AI work*  
*Making AI innovations accessible, one launchable at a time* 💚
