# Capstone Project Kickoff

**Module:** 4.3 - Capstone Project (Domain 4: Production AI)
**Time:** 2-3 hours
**Difficulty:** ⭐⭐⭐⭐

---

## 🎉 Congratulations on Reaching the Capstone!

You've completed an incredible journey through the DGX Spark AI Curriculum. From understanding the fundamentals of neural networks to fine-tuning 70B parameter models, from building RAG systems to deploying production APIs - you've acquired a remarkable set of skills.

Now it's time to put it all together.

---

## 🎯 Learning Objectives

By the end of this notebook, you will:
- [ ] Understand the four capstone project options
- [ ] Evaluate which project best matches your interests and goals
- [ ] Set up your project environment
- [ ] Complete your project proposal
- [ ] Create your project timeline

---

## 📚 Prerequisites

- Completed: All modules in Domains 1-4
- Knowledge of: LLM fine-tuning, RAG systems, agents, deployment
- Access to: DGX Spark with 128GB unified memory

---

## 🌍 Real-World Context

The capstone project mirrors what AI engineers do in industry every day: identify a problem, design a solution, implement it with production-quality code, and evaluate its effectiveness.

Teams at companies such as OpenAI, Anthropic, Google, and Meta typically follow a similar process when building AI products:

1. **Problem Definition** → What are we solving?
2. **Architecture Design** → How will we solve it?
3. **Implementation** → Build the solution
4. **Optimization** → Make it fast and efficient
5. **Evaluation** → Does it actually work?
6. **Documentation** → Can others use and extend it?

Your capstone follows this exact pattern, preparing you for real-world AI engineering roles.

---

## 🧒 ELI5: What is a Capstone Project?

> **Imagine you've been learning to cook for months.** You've mastered chopping vegetables, making sauces, baking bread, grilling meat, and plating dishes. Each skill was practiced in isolation.
>
> **Now, you're going to prepare a complete dinner party.** You need to plan a menu, prep all the ingredients, cook multiple dishes that complement each other, time everything so it's ready together, and present it beautifully.
>
> **That's a capstone.** It's not about learning one new thing - it's about combining everything you've learned into one impressive, complete creation.
>
> **In AI terms:** You've learned fine-tuning, RAG, agents, deployment, and more. Your capstone combines these into a complete, working AI system that solves a real problem.

---

## Part 1: Environment Verification

Before choosing your project, let's verify your DGX Spark environment is properly configured. Your capstone will push the hardware to its limits!

In [None]:
# Capstone Environment Verification
# This cell verifies your DGX Spark is ready for capstone development

import sys
import subprocess
from datetime import datetime

print("="*60)
print("🚀 DGX SPARK CAPSTONE ENVIRONMENT CHECK")
print(f"📅 Date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*60)

# Python version
print(f"\n🐍 Python Version: {sys.version}")

# Check critical packages
packages_status = []
critical_packages = [
    ("torch", "PyTorch"),
    ("transformers", "Transformers"),
    ("peft", "PEFT (LoRA)"),
    ("bitsandbytes", "BitsAndBytes"),
    ("sentence_transformers", "Sentence Transformers"),
    ("langchain", "LangChain"),
    ("fastapi", "FastAPI"),
    ("gradio", "Gradio"),
]

print("\n📦 Package Status:")
for pkg_name, display_name in critical_packages:
    try:
        module = __import__(pkg_name)
        version = getattr(module, '__version__', 'installed')
        print(f"  ✅ {display_name}: {version}")
        packages_status.append(True)
    except ImportError:
        print(f"  ❌ {display_name}: NOT INSTALLED")
        packages_status.append(False)

In [None]:
# GPU and Memory Check
import torch

print("\n🎮 GPU Status:")
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"  ✅ GPU: {gpu_name}")
    print(f"  ✅ GPU Memory: {gpu_memory:.1f} GB")

    # Check for Blackwell features
    compute_capability = torch.cuda.get_device_capability(0)
    print(f"  ✅ Compute Capability: {compute_capability[0]}.{compute_capability[1]}")

    # Memory allocation test
    print("\n💾 Memory Test:")
    print(f"  Current allocation: {torch.cuda.memory_allocated()/1e9:.2f} GB")
    print(f"  Current reserved: {torch.cuda.memory_reserved()/1e9:.2f} GB")
else:
    print("  ❌ CUDA not available!")
    print("  Make sure you're running in an NGC container with GPU access.")

# System memory (the "kB" unit in /proc/meminfo means 1024 bytes)
import os
try:
    with open('/proc/meminfo', 'r') as f:
        for line in f:
            if 'MemTotal' in line:
                mem_gb = int(line.split()[1]) * 1024 / 1e9
                print(f"\n🖥️ System Memory: {mem_gb:.1f} GB")
                if mem_gb > 100:
                    print("  ✅ Unified memory configuration detected!")
                break
except (OSError, ValueError, IndexError):
    # File missing (non-Linux host) or an unexpected line format
    print("\n🖥️ Could not read system memory info")

In [None]:
# Disk Space Check
import shutil

print("\n💿 Disk Space:")
paths_to_check = [
    ("/workspace", "Workspace"),
    (os.path.expanduser("~/.cache/huggingface"), "HuggingFace Cache"),
]

for path, name in paths_to_check:
    if os.path.exists(path):
        total, used, free = shutil.disk_usage(path)
        print(f"  {name} ({path}):")
        print(f"    Total: {total/1e9:.1f} GB")
        print(f"    Used: {used/1e9:.1f} GB")
        print(f"    Free: {free/1e9:.1f} GB ({'✅' if free/1e9 > 100 else '⚠️'})")
    else:
        print(f"  ⚠️ {name}: Path not found")

print("\n" + "="*60)
print("Environment check complete!")
print("="*60)

### 🔍 What Just Happened?

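We checked:
1. **Python Environment** - Whether each required package can be imported, and its version
2. **GPU Access** - Whether a CUDA GPU is visible, plus its name, memory, and compute capability
3. **System Memory** - Whether total system memory is consistent with the 128GB unified memory configuration
4. **Disk Space** - Whether each path has more than 100 GB free for models and data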

If any checks failed, resolve them before proceeding. You'll need full access to DGX Spark capabilities for your capstone.
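The package check above records one True/False per package in `packages_status` but never summarizes it. A minimal sketch of one way to reduce that to a single go/no-go result follows. The helper names `find_missing` and `environment_ready` are hypothetical, not part of the curriculum code, and `importlib.util.find_spec` is used here so that heavy packages are located without being imported.

```python
import importlib.util

def find_missing(package_names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in package_names
            if importlib.util.find_spec(name) is None]

def environment_ready(package_names):
    """Return True only when every listed package can be located."""
    missing = find_missing(package_names)
    if missing:
        print(f"Missing packages: {', '.join(missing)}")
    return not missing

# Illustrative call: two standard-library modules plus one made-up name
print(environment_ready(["json", "os"]))
print(find_missing(["json", "definitely_not_a_real_package_xyz"]))
```

In the notebook you would pass the import names from `critical_packages` instead of the illustrative list.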

---

## Part 2: Project Options Overview

You have four project options, each emphasizing different skills. All are designed to showcase DGX Spark's unique capabilities.

### Option A: Domain-Specific AI Assistant

**Build a complete AI assistant specialized for a specific domain.**

| Component | Technology | DGX Spark Advantage |
|-----------|------------|--------------------|
| Base Model | Llama 3.3 70B | Fits in 128GB memory when quantized (8-bit or 4-bit) |
| Fine-tuning | QLoRA | Fast training with unified memory |
| RAG | Vector DB + BM25 | Large knowledge base in memory |
| Tools | Custom functions | Low-latency tool calls |
| API | FastAPI + Streaming | Real-time responses |

**Best for:** Students interested in LLM customization, RAG systems, and building practical applications.

**Example domains:** DevOps assistant, financial advisor, medical literature search, legal document analysis.
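Whether a 70B model fits in 128GB depends on the precision of its weights. The arithmetic can be sketched as parameters times bytes per parameter. This is a rough estimate under stated assumptions: it counts weights only and ignores the KV cache, activations, and quantization overhead, and `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

PARAMS_70B = 70e9  # Nominal parameter count for a "70B" model

for label, bits in [("FP16/BF16", 16), ("INT8", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(PARAMS_70B, bits)
    verdict = "fits" if gb < 128 else "does not fit"
    print(f"{label:>9}: {gb:6.1f} GB of weights ({verdict} in 128 GB)")
```

By this estimate, 16-bit weights alone (about 140 GB) exceed 128 GB, while 8-bit (about 70 GB) and 4-bit (about 35 GB) leave headroom for the KV cache and a RAG index.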

---

### Option B: Multimodal Document Intelligence

**Build a system that understands complex documents with text, images, and diagrams.**

| Component | Technology | DGX Spark Advantage |
|-----------|------------|--------------------|
| Vision | LLaVA / Qwen-VL | Large VLM in memory |
| OCR | Tesseract + LayoutLM | Batch processing |
| Extraction | Schema-based | Complex document handling |
| QA | Multimodal RAG | Images + text together |
| Export | Structured JSON/CSV | Integration ready |

**Best for:** Students interested in computer vision, document processing, and multimodal AI.

**Example applications:** Invoice processing, research paper analysis, technical manual QA.
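The "schema-based extraction" and "structured JSON/CSV" rows can be sketched with a dataclass that acts as the target schema. Everything here is illustrative: `InvoiceRecord` and its fields are hypothetical, and the raw dictionary stands in for whatever the VLM/OCR stage would actually produce.

```python
import csv
import io
import json
from dataclasses import dataclass, asdict, fields

@dataclass
class InvoiceRecord:
    """Hypothetical target schema for one extracted invoice."""
    invoice_id: str
    vendor: str
    total: float

def validate(raw: dict) -> InvoiceRecord:
    """Coerce a raw extraction dict into the schema, failing loudly on gaps."""
    missing = [f.name for f in fields(InvoiceRecord) if f.name not in raw]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return InvoiceRecord(str(raw["invoice_id"]), str(raw["vendor"]), float(raw["total"]))

def to_json(records: list[InvoiceRecord]) -> str:
    """Serialize validated records as a JSON array."""
    return json.dumps([asdict(r) for r in records], indent=2)

def to_csv(records: list[InvoiceRecord]) -> str:
    """Serialize validated records as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(InvoiceRecord)])
    writer.writeheader()
    writer.writerows(asdict(r) for r in records)
    return buffer.getvalue()

# Stand-in for model output; a real system would parse this from the VLM/OCR stage
raw_outputs = [{"invoice_id": "INV-001", "vendor": "Acme", "total": "129.50"}]
records = [validate(r) for r in raw_outputs]
print(to_json(records))
print(to_csv(records))
```

Validating before export keeps malformed model output from silently reaching downstream systems.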

---

### Option C: AI Agent Swarm

**Build a multi-agent system where specialized agents collaborate.**

| Component | Technology | DGX Spark Advantage |
|-----------|------------|--------------------|
| Agents | 4+ specialized agents | Multiple models loaded |
| Coordination | LangGraph / custom | Complex workflows |
| Tools | Code exec, search, etc. | Parallel tool execution |
| Memory | Shared + individual | Large context windows |
| Safety | Human-in-the-loop | Real-time approval |

**Best for:** Students interested in agentic AI, planning systems, and complex automation.

**Example applications:** Research assistant, software development team, data analysis pipeline.
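The human-in-the-loop safety row can be sketched as an approval gate in front of tool execution. This is a minimal illustration in plain Python, not LangGraph's API. `ToolCall`, `RISKY_TOOLS`, and `run_with_approval` are hypothetical names, and the choice of which tools count as risky is an assumption.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolCall:
    """A tool invocation proposed by an agent (hypothetical structure)."""
    agent: str
    tool: str
    argument: str

# Assumption: tools that change state need sign-off; read-only tools do not
RISKY_TOOLS = {"code_exec", "file_write"}

def run_with_approval(
    call: ToolCall,
    tools: dict[str, Callable[[str], str]],
    approve: Callable[[ToolCall], bool],
) -> str:
    """Execute a tool call, asking the approver first when the tool is risky."""
    if call.tool not in tools:
        return f"unknown tool: {call.tool}"
    if call.tool in RISKY_TOOLS and not approve(call):
        return f"blocked: {call.tool} requested by {call.agent} was not approved"
    return tools[call.tool](call.argument)

# Toy tools; a real swarm would wrap a sandbox, a search API, and so on
tools = {
    "search": lambda q: f"results for {q!r}",
    "code_exec": lambda src: f"executed {len(src)} characters of code",
}

def deny_all(call: ToolCall) -> bool:
    """Fixed policy that keeps the demo non-interactive."""
    return False

print(run_with_approval(ToolCall("researcher", "search", "DGX Spark"), tools, deny_all))
print(run_with_approval(ToolCall("coder", "code_exec", "print(1)"), tools, deny_all))
```

In a notebook, the `approve` callback could wrap `input()` so that a person confirms each risky call.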

---

### Option D: Custom Training Pipeline

**Build infrastructure for continuous model improvement.**

| Component | Technology | DGX Spark Advantage |
|-----------|------------|--------------------|
| Data | Collection + curation | Large dataset processing |
| Training | SFT + DPO | 70B training feasible |
| Evaluation | Auto benchmarks | Fast model comparison |
| Versioning | MLflow / custom | Experiment tracking |
| Deployment | A/B testing | Multiple models served |

**Best for:** Students interested in MLOps, training infrastructure, and model development.

**Example applications:** Domain adaptation pipeline, preference learning system, model improvement loop.
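The A/B testing row implies splitting traffic between model versions. One common approach, sketched here as an assumption rather than the curriculum's prescribed method, is to hash a stable user ID so each user always sees the same variant. The function name and model names are hypothetical.

```python
import hashlib

def assign_variant(user_id: str, variants: dict[str, float]) -> str:
    """Deterministically map a user to a model variant by traffic share.

    variants maps variant name -> share of traffic; shares should sum to 1.0.
    """
    if not variants:
        raise ValueError("variants must not be empty")
    # Hash the user ID to a stable number in [0, 1)
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 16**8
    cumulative = 0.0
    for name, share in variants.items():
        cumulative += share
        if bucket < cumulative:
            return name
    return name  # Guard against floating-point shortfall in the shares

split = {"baseline-v1": 0.9, "candidate-v2": 0.1}  # Hypothetical model names
counts = {name: 0 for name in split}
for i in range(10_000):
    counts[assign_variant(f"user-{i}", split)] += 1
print(counts)
```

Hash-based assignment needs no stored state, so any serving replica routes a given user to the same model.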

---

## Part 3: Project Selection Decision Tree

Use this interactive decision helper to find your best project match.

In [None]:
# Project Selection Helper
# Answer these questions to find your best project match

def project_selector():
    """Interactive project selection based on interests and skills."""
    
    print("🎯 CAPSTONE PROJECT SELECTOR")
    print("="*50)
    print("Answer these questions to find your ideal project.")
    print("Rate each from 1 (not interested) to 5 (very interested)\n")
    
    questions = {
        "option_a": [
            "Building chatbots and conversational AI",
            "Fine-tuning LLMs for specific domains",
            "Building RAG systems with knowledge bases",
            "Creating API-based AI services",
        ],
        "option_b": [
            "Working with images and visual data",
            "Processing PDFs and documents",
            "Extracting structured data from unstructured sources",
            "Combining vision and language models",
        ],
        "option_c": [
            "Building autonomous AI agents",
            "Multi-step planning and reasoning",
            "Tool use and function calling",
            "Coordinating multiple AI systems",
        ],
        "option_d": [
            "Training and fine-tuning workflows",
            "Building ML infrastructure and pipelines",
            "Model evaluation and benchmarking",
            "Experiment tracking and versioning",
        ],
    }
    
    scores = {"option_a": 0, "option_b": 0, "option_c": 0, "option_d": 0}
    
    option_names = {
        "option_a": "A: Domain-Specific AI Assistant",
        "option_b": "B: Multimodal Document Intelligence",
        "option_c": "C: AI Agent Swarm",
        "option_d": "D: Custom Training Pipeline",
    }
    
    # Collect all responses
    all_questions = []
    for option, q_list in questions.items():
        for q in q_list:
            all_questions.append((option, q))
    
    # Shuffle so questions from the same option are not grouped together
    import random
    # Fixed seed gives a reproducible order without reseeding the global RNG
    random.Random(42).shuffle(all_questions)
    
    print("Rate your interest in each area (1-5):\n")
    
    for i, (option, question) in enumerate(all_questions, 1):
        while True:
            try:
                response = input(f"{i}. {question}: ")
                score = int(response)
                if 1 <= score <= 5:
                    scores[option] += score
                    break
                else:
                    print("   Please enter a number between 1 and 5.")
            except ValueError:
                print("   Please enter a number between 1 and 5.")
    
    # Calculate and display results
    print("\n" + "="*50)
    print("📊 YOUR RESULTS")
    print("="*50 + "\n")
    
    max_possible = len(questions["option_a"]) * 5
    sorted_scores = sorted(scores.items(), key=lambda x: x[1], reverse=True)
    
    for rank, (option, score) in enumerate(sorted_scores, 1):
        percentage = (score / max_possible) * 100
        bar = "█" * int(percentage / 5) + "░" * (20 - int(percentage / 5))
        medal = "🥇" if rank == 1 else "🥈" if rank == 2 else "🥉" if rank == 3 else "  "
        print(f"{medal} {option_names[option]}")
        print(f"   {bar} {percentage:.0f}% ({score}/{max_possible})\n")
    
    winner = sorted_scores[0][0]
    print("\n" + "="*50)
    print(f"🎯 RECOMMENDED PROJECT: {option_names[winner]}")
    print("="*50)
    
    return winner

# Uncomment to run interactively:
# recommended = project_selector()

In [None]:
# Quick selection if you already know your preference
# Uncomment and modify the line below:

# SELECTED_PROJECT = "A"  # Options: "A", "B", "C", or "D"

project_descriptions = {
    "A": {
        "name": "Domain-Specific AI Assistant",
        "summary": "Fine-tuned LLM + RAG + Tools + API",
        "notebook": "lab-4.3.2-option-a-ai-assistant.ipynb",
    },
    "B": {
        "name": "Multimodal Document Intelligence",
        "summary": "VLM + OCR + Extraction + QA",
        "notebook": "lab-4.3.3-option-b-document-intelligence.ipynb",
    },
    "C": {
        "name": "AI Agent Swarm",
        "summary": "Multi-agent + Planning + Tools + Safety",
        "notebook": "lab-4.3.4-option-c-agent-swarm.ipynb",
    },
    "D": {
        "name": "Custom Training Pipeline",
        "summary": "Data + Training + Evaluation + Deployment",
        "notebook": "lab-4.3.5-option-d-training-pipeline.ipynb",
    },
}

print("📋 PROJECT OPTIONS SUMMARY")
print("="*60)
for key, info in project_descriptions.items():
    print(f"\nOption {key}: {info['name']}")
    print(f"  Components: {info['summary']}")
    print(f"  Guide: {info['notebook']}")
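The interactive selector recommends whichever option sorts first, so a tie between two options is resolved silently. A minimal, non-interactive sketch of the same scoring that reports ties follows. `rank_options` is a hypothetical helper and the ratings are made up.

```python
def rank_options(ratings: dict[str, list[int]]) -> tuple[list[tuple[str, int]], list[str]]:
    """Sum 1-5 ratings per option; return the ranking and all top-scoring options."""
    for option, values in ratings.items():
        if not all(1 <= v <= 5 for v in values):
            raise ValueError(f"ratings for {option} must be between 1 and 5")
    totals = {option: sum(values) for option, values in ratings.items()}
    ranking = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    best = ranking[0][1]
    winners = [option for option, total in ranking if total == best]
    return ranking, winners

example = {  # Made-up ratings, four questions per option
    "A": [5, 4, 4, 3],
    "B": [2, 3, 2, 1],
    "C": [4, 4, 5, 3],
    "D": [3, 3, 2, 2],
}
ranking, winners = rank_options(example)
print(ranking)
print("Tie between:" if len(winners) > 1 else "Recommended:", ", ".join(winners))
```

When two options tie, comparing their required skills in Part 4 is a reasonable tiebreaker.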

---

## Part 4: Skills Assessment

Each project builds on skills from previous modules. Let's verify you have the prerequisites.

In [None]:
# Skills Assessment Matrix

skills_matrix = {
    "A": {
        "required": [
            ("Module 3.1: LLM Fine-tuning", "QLoRA implementation"),
            ("Module 3.4: AI Agents", "RAG pipeline construction"),
            ("Module 3.3: Deployment", "FastAPI + streaming"),
        ],
        "helpful": [
            "Module 3.2: Quantization",
            "Module 2.4: Hugging Face Ecosystem",
        ]
    },
    "B": {
        "required": [
            ("Module 2.2: Computer Vision", "Image processing"),
            ("Module 4.1: Multimodal", "VLM usage"),
            ("Module 3.4: AI Agents", "RAG fundamentals"),
        ],
        "helpful": [
            "Module 2.3: NLP & Transformers",
            "Module 3.3: Deployment",
        ]
    },
    "C": {
        "required": [
            ("Module 3.4: AI Agents", "Agent frameworks"),
            ("Module 2.3: NLP & Transformers", "LLM reasoning"),
            ("Module 3.3: Deployment", "Service architecture"),
        ],
        "helpful": [
            "Module 3.1: LLM Fine-tuning",
            "Module 4.1: Multimodal",
        ]
    },
    "D": {
        "required": [
            ("Module 3.1: LLM Fine-tuning", "SFT and DPO"),
            ("Module 4.2: MLOps", "Evaluation frameworks"),
            ("Module 2.4: Hugging Face Ecosystem", "Trainer API"),
        ],
        "helpful": [
            "Module 3.2: Quantization",
            "Module 3.3: Deployment",
        ]
    },
}

def display_skills_for_project(project: str):
    """Display required and helpful skills for a project."""
    print(f"\n📚 SKILLS FOR PROJECT {project}")
    print("="*50)

    info = skills_matrix[project]

    print("\n✅ Required Skills:")
    for module, skill in info["required"]:
        print(f"  • {module}")
        print(f"    Key skill: {skill}")

    print("\n📘 Helpful Background:")
    for module in info["helpful"]:
        print(f"  • {module}")

# Display for all projects
for project in ["A", "B", "C", "D"]:
    display_skills_for_project(project)
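The matrix above lists prerequisites but does not compare them with what you actually feel confident in. A small sketch of that comparison follows. `module_gaps` is a hypothetical helper, the required list mirrors the Option A entry of `skills_matrix`, and the self-assessment set is made up.

```python
def module_gaps(required: list[tuple[str, str]], confident: set[str]) -> list[str]:
    """Return required modules that are not in the learner's confident set."""
    return [module for module, _skill in required if module not in confident]

# Excerpt mirroring the Option A entry of skills_matrix
option_a_required = [
    ("Module 3.1: LLM Fine-tuning", "QLoRA implementation"),
    ("Module 3.4: AI Agents", "RAG pipeline construction"),
    ("Module 3.3: Deployment", "FastAPI + streaming"),
]
confident = {"Module 3.1: LLM Fine-tuning", "Module 3.3: Deployment"}  # Made-up self-assessment
print(module_gaps(option_a_required, confident))
```

Any module this returns is a candidate for review during Week 1.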

---

## Part 5: Project Timeline Planning

Your capstone spans 6 weeks. Here's how to structure your time effectively.

In [None]:
# Timeline Generator

from datetime import datetime, timedelta

def generate_timeline(start_date: str = None):
    """Generate a 6-week capstone timeline."""
    
    if start_date:
        start = datetime.strptime(start_date, "%Y-%m-%d")
    else:
        start = datetime.now()
    
    weeks = [
        {
            "week": 1,
            "name": "Planning",
            "hours": "6-8",
            "tasks": [
                "Complete project proposal",
                "Design system architecture",
                "Set up development environment",
                "Create project repository",
                "Identify data sources",
            ],
            "deliverable": "Approved project proposal + Architecture diagram"
        },
        {
            "week": 2,
            "name": "Foundation (Part 1)",
            "hours": "8-10",
            "tasks": [
                "Implement core component #1",
                "Set up data pipeline",
                "Create initial tests",
                "Document as you build",
            ],
            "deliverable": "Working prototype of core component"
        },
        {
            "week": 3,
            "name": "Foundation (Part 2)",
            "hours": "8-10",
            "tasks": [
                "Implement core component #2",
                "Model training/fine-tuning",
                "Basic integration tests",
                "Performance baseline",
            ],
            "deliverable": "All core components working independently"
        },
        {
            "week": 4,
            "name": "Integration",
            "hours": "8-10",
            "tasks": [
                "Connect all components",
                "Build API layer",
                "End-to-end testing",
                "Fix integration issues",
            ],
            "deliverable": "Complete integrated system"
        },
        {
            "week": 5,
            "name": "Optimization",
            "hours": "6-8",
            "tasks": [
                "Performance profiling",
                "Memory optimization",
                "Quantization (if applicable)",
                "Benchmark suite execution",
            ],
            "deliverable": "Optimized system with benchmarks"
        },
        {
            "week": 6,
            "name": "Documentation",
            "hours": "6-8",
            "tasks": [
                "Complete technical report",
                "Record demo video",
                "Prepare presentation",
                "Final code cleanup",
                "Repository polish",
            ],
            "deliverable": "All deliverables complete"
        },
    ]
    
    print("\n📅 YOUR CAPSTONE TIMELINE")
    print("="*60)
    
    for week_info in weeks:
        week_start = start + timedelta(weeks=week_info["week"]-1)
        week_end = week_start + timedelta(days=6)
        
        print(f"\n📌 Week {week_info['week']}: {week_info['name']}")
        print(f"   {week_start.strftime('%b %d')} - {week_end.strftime('%b %d')}")
        print(f"   Estimated hours: {week_info['hours']}")
        print("\n   Tasks:")
        for task in week_info["tasks"]:
            print(f"   [ ] {task}")
        print(f"\n   📦 Deliverable: {week_info['deliverable']}")
    
    final_date = start + timedelta(weeks=6)
    print("\n" + "="*60)
    print(f"🎯 Target completion: {final_date.strftime('%B %d, %Y')}")
    print("="*60)

# Generate timeline starting from today
generate_timeline()
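The timeline generator prints its schedule rather than returning it. If you want the same dates and hour estimates as data, for example to track progress, a sketch along these lines may help. `week_ranges` and `total_hours` are hypothetical helpers, and the start date is an arbitrary example.

```python
from datetime import date, timedelta

def week_ranges(start: date, num_weeks: int = 6) -> list[tuple[date, date]]:
    """Return (first_day, last_day) for each consecutive 7-day week."""
    return [
        (start + timedelta(weeks=i), start + timedelta(weeks=i, days=6))
        for i in range(num_weeks)
    ]

def total_hours(estimates: list[str]) -> tuple[int, int]:
    """Sum 'low-high' hour strings such as '6-8' into an overall (low, high)."""
    lows, highs = zip(*(map(int, e.split("-")) for e in estimates))
    return sum(lows), sum(highs)

ranges = week_ranges(date(2025, 1, 6))  # Arbitrary example start date
print(ranges[0], ranges[-1])
print(total_hours(["6-8", "8-10", "8-10", "8-10", "6-8", "6-8"]))
```

Summing the weekly estimates from the timeline gives a total budget of 42 to 54 hours.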

---

## Part 6: Project Setup

Let's create your project structure.

In [None]:
# Project Structure Generator

import os
from pathlib import Path

def create_project_structure(project_name: str, project_option: str, base_path: str = "/workspace"):
    """
    Create a complete project structure for the capstone.
    
    Args:
        project_name: Name of your project (e.g., "aws-assistant")
        project_option: One of "A", "B", "C", "D"
        base_path: Where to create the project
    """
    
    project_structures = {
        "A": {  # Domain-Specific AI Assistant
            "dirs": [
                "src/models",
                "src/rag",
                "src/tools",
                "src/api",
                "data/raw",
                "data/processed",
                "data/knowledge_base",
                "training/configs",
                "training/outputs",
                "evaluation/benchmarks",
                "evaluation/results",
                "notebooks",
                "tests",
                "docs",
            ],
            "files": {
                "src/__init__.py": "",
                "src/models/__init__.py": "",
                "src/rag/__init__.py": "",
                "src/tools/__init__.py": "",
                "src/api/__init__.py": "",
                "tests/__init__.py": "",
            }
        },
        "B": {  # Multimodal Document Intelligence
            "dirs": [
                "src/ingestion",
                "src/vision",
                "src/extraction",
                "src/qa",
                "src/export",
                "data/documents",
                "data/processed",
                "data/outputs",
                "models",
                "evaluation/datasets",
                "evaluation/results",
                "notebooks",
                "tests",
                "docs",
            ],
            "files": {
                "src/__init__.py": "",
                "src/ingestion/__init__.py": "",
                "src/vision/__init__.py": "",
                "src/extraction/__init__.py": "",
                "src/qa/__init__.py": "",
                "src/export/__init__.py": "",
                "tests/__init__.py": "",
            }
        },
        "C": {  # AI Agent Swarm
            "dirs": [
                "src/agents",
                "src/coordinator",
                "src/tools",
                "src/memory",
                "src/safety",
                "workflows",
                "evaluation/tasks",
                "evaluation/results",
                "notebooks",
                "tests",
                "docs",
            ],
            "files": {
                "src/__init__.py": "",
                "src/agents/__init__.py": "",
                "src/coordinator/__init__.py": "",
                "src/tools/__init__.py": "",
                "src/memory/__init__.py": "",
                "src/safety/__init__.py": "",
                "tests/__init__.py": "",
            }
        },
        "D": {  # Custom Training Pipeline
            "dirs": [
                "src/data",
                "src/training",
                "src/evaluation",
                "src/serving",
                "configs",
                "data/raw",
                "data/processed",
                "data/splits",
                "experiments",
                "models/checkpoints",
                "models/exported",
                "notebooks",
                "tests",
                "docs",
            ],
            "files": {
                "src/__init__.py": "",
                "src/data/__init__.py": "",
                "src/training/__init__.py": "",
                "src/evaluation/__init__.py": "",
                "src/serving/__init__.py": "",
                "tests/__init__.py": "",
            }
        },
    }
    
    structure = project_structures.get(project_option.upper())
    if not structure:
        print(f"❌ Invalid project option: {project_option}")
        return
    
    project_path = Path(base_path) / project_name
    
    print(f"\n🏗️ Creating project structure for: {project_name}")
    print(f"   Option: {project_option}")
    print(f"   Location: {project_path}")
    print("="*50)
    
    # Create directories
    for dir_path in structure["dirs"]:
        full_path = project_path / dir_path
        full_path.mkdir(parents=True, exist_ok=True)
        print(f"  📁 {dir_path}/")
    
    # Create files
    for file_path, content in structure["files"].items():
        full_path = project_path / file_path
        full_path.parent.mkdir(parents=True, exist_ok=True)
        full_path.write_text(content)
        print(f"  📄 {file_path}")
    
    # Create common files
    common_files = {
        "README.md": f"# {project_name}\n\nCapstone Project - Option {project_option}\n",
        "requirements.txt": "# Add your dependencies here\ntorch>=2.5.0\ntransformers>=4.46.0\n",
        ".gitignore": """# Python
__pycache__/
*.pyc
.ipynb_checkpoints/

# Data
data/raw/*
!data/raw/.gitkeep

# Models
*.bin
*.safetensors
models/checkpoints/*
!models/checkpoints/.gitkeep

# Misc
.env
*.log
""",
    }
    
    for file_name, content in common_files.items():
        full_path = project_path / file_name
        full_path.write_text(content)
        print(f"  📄 {file_name}")
    
    # Add .gitkeep to empty directories
    for dir_path in structure["dirs"]:
        gitkeep = project_path / dir_path / ".gitkeep"
        if not any((project_path / dir_path).iterdir()):
            gitkeep.touch()
    
    print("\n" + "="*50)
    print(f"✅ Project structure created at: {project_path}")
    print("\nNext steps:")
    print(f"  1. cd {project_path}")
    print("  2. git init")
    print("  3. Start coding!")
    
    return project_path

# Example usage (uncomment to run):
# project_path = create_project_structure("my-ai-assistant", "A")
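After generating a project, it can be useful to confirm the expected directories exist, for instance after cloning the repository on another machine. The sketch below shows one way to do that. `missing_dirs` is a hypothetical helper, and the demo builds a throwaway tree in a temporary directory instead of touching `/workspace`.

```python
import tempfile
from pathlib import Path

def missing_dirs(project_path: Path, expected: list[str]) -> list[str]:
    """Return the expected relative directories that do not exist under project_path."""
    return [d for d in expected if not (project_path / d).is_dir()]

# Demo in a temporary directory so nothing is written to the real workspace
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "demo-project"
    for d in ["src/api", "docs"]:
        (root / d).mkdir(parents=True)
    print(missing_dirs(root, ["src/api", "docs", "tests"]))
```

For a real project you would pass the `dirs` list for your chosen option as `expected`.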

---

## ✋ Try It Yourself: Project Proposal

Now it's time to create your project proposal! 

1. Choose your project option (A, B, C, or D)
2. Copy the proposal template from `templates/project-proposal.md`
3. Fill in all sections
4. Save as `docs/proposal.md` in your project folder

<details>
<summary>💡 Tips for a Strong Proposal</summary>

- **Be specific about your domain** - "AWS infrastructure assistant" is better than "general assistant"
- **Set measurable success criteria** - "Answer 80% of test questions correctly" not "work well"
- **Identify risks early** - What might go wrong? How will you handle it?
- **Leverage DGX Spark** - Explain how you'll use the 128GB memory and Blackwell features

</details>
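A measurable criterion such as "answer 80% of test questions correctly" only helps if you can compute it. A minimal sketch follows, assuming exact-match scoring after trimming and lowercasing. That metric is an assumption, the function name is hypothetical, and the question data is made up.

```python
def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions equal to the reference after trimming and lowercasing."""
    if len(predictions) != len(references) or not references:
        raise ValueError("need equally sized, non-empty lists")
    hits = sum(p.strip().lower() == r.strip().lower()
               for p, r in zip(predictions, references))
    return hits / len(references)

TARGET = 0.80  # Threshold from the example success criterion

# Made-up model answers and reference answers
predictions = ["s3 ls", "ec2 describe-instances", "iam list-users", "s3 cp", "sts whoami"]
references = ["s3 ls", "ec2 describe-instances", "iam list-users", "s3 cp", "sts get-caller-identity"]
accuracy = exact_match_accuracy(predictions, references)
print(f"accuracy={accuracy:.0%}, criterion met: {accuracy >= TARGET}")
```

Exact match is strict for free-form answers, so your proposal may need a softer metric, but the pattern of a fixed test set plus a threshold stays the same.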

---

## ⚠️ Common Mistakes

### Mistake 1: Scope Creep
```python
# ❌ Wrong: Trying to do everything
project_goals = [
    "Fine-tune 70B model",
    "Build RAG with 1M documents",
    "Add multimodal support",
    "Create mobile app",
    "Deploy to cloud",
    "Build training pipeline",
]

# ✅ Right: Focused scope with clear boundaries
project_goals = [
    "Fine-tune 70B model for AWS CLI help",
    "Build RAG with AWS documentation (1000 pages)",
    "Create FastAPI endpoint with streaming",
]
stretch_goals = ["Add multimodal diagrams", "Gradio UI"]  # Only if time permits
```
**Why:** A complete, polished project is better than an ambitious, unfinished one.

---

### Mistake 2: Not Testing Early
```python
# ❌ Wrong: Wait until Week 5 to test
week_5_tasks = ["Integration testing", "Fix all bugs", "Benchmark"]  # Too late!

# ✅ Right: Test continuously
every_week_tasks = [
    "Unit tests for new code",
    "Manual testing of new features",
    "Quick performance check",
]
```
**Why:** Finding bugs early is much cheaper than finding them late.
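What "unit tests for new code" can look like in practice is sketched below with the standard-library `unittest` module. `chunk_words` is a hypothetical RAG helper invented for this example. The suite is run explicitly so the snippet also works inside a notebook cell.

```python
import unittest

def chunk_words(text: str, max_words: int) -> list[str]:
    """Split text into chunks of at most max_words words (hypothetical RAG helper)."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

class ChunkWordsTest(unittest.TestCase):
    def test_splits_on_word_budget(self):
        self.assertEqual(chunk_words("a b c d e", 2), ["a b", "c d", "e"])

    def test_empty_text_gives_no_chunks(self):
        self.assertEqual(chunk_words("", 3), [])

    def test_rejects_non_positive_budget(self):
        with self.assertRaises(ValueError):
            chunk_words("a b", 0)

# Load and run the suite directly instead of calling unittest.main()
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChunkWordsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("All tests passed:", result.wasSuccessful())
```

Tests this small take minutes to write in Week 2 and catch regressions when components are wired together in Week 4.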

---

### Mistake 3: Ignoring Documentation
```python
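from dataclasses import dataclass, field

# Placeholder types so the example runs; substitute your project's own classes
@dataclass
class Document:
    text: str

@dataclass
class ProcessingOptions:
    top_k: int = 5

@dataclass
class QueryResult:
    answer: str
    sources: list[Document] = field(default_factory=list)

# ❌ Wrong: "I'll document it later"
def process_query(q, ctx, opts):
    # Complex logic with no explanation
    ...

# ✅ Right: Document as you code
def process_query(
    query: str,
    context: list[Document],
    options: ProcessingOptions
) -> QueryResult:
    """
    Process a user query using RAG.

    Args:
        query: The user's natural language question
        context: Retrieved documents for context
        options: Processing configuration

    Returns:
        QueryResult with answer and sources
    """
    ...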
```
**Why:** You WILL forget why you did things. Future you will thank present you.

---

## 🎉 Checkpoint

You've completed the capstone kickoff! You should now have:

- ✅ Verified your DGX Spark environment
- ✅ Understood all four project options
- ✅ Selected your project
- ✅ Created your project structure
- ✅ Planned your 6-week timeline

---

## 🚀 Next Steps

1. **Complete your project proposal** using the template
2. **Open the guide for your chosen project:**
   - Option A: `lab-4.3.2-option-a-ai-assistant.ipynb`
   - Option B: `lab-4.3.3-option-b-document-intelligence.ipynb`
   - Option C: `lab-4.3.4-option-c-agent-swarm.ipynb`
   - Option D: `lab-4.3.5-option-d-training-pipeline.ipynb`
3. **Review the planning notebook:** `lab-4.3.1-project-planning.ipynb`

---

## 📖 Further Reading

- [How to Write a Good Project Proposal](https://www.atlassian.com/work-management/project-management/project-proposal)
- [NVIDIA DGX Spark Documentation](https://docs.nvidia.com/dgx/)
- [Best Practices for ML Projects](https://neptune.ai/blog/how-to-organize-ml-project)

---

In [None]:
# 🧹 Cleanup (no cleanup needed for this notebook)
print("✅ No cleanup needed - ready to proceed!")
print("\n🎯 Next: Choose your project and open the corresponding guide notebook.")