# Task 4.5.4: Deployment Options - Sharing Your Demos

**Module:** 4.5 - Demo Building & Prototyping  
**Time:** 1-2 hours  
**Difficulty:** ⭐⭐☆☆☆

---

## 🎯 Learning Objectives

By the end of this notebook, you will:
- [ ] Deploy Gradio apps to Hugging Face Spaces
- [ ] Deploy Streamlit apps to Streamlit Cloud
- [ ] Understand alternative hosting options
- [ ] Configure secrets and environment variables
- [ ] Set up persistent storage for demos

---

## 📚 Prerequisites

- Completed: Tasks 4.5.1-4.5.3
- Accounts: GitHub, Hugging Face (free)
- Optional: Streamlit Cloud account

---

## 🌍 Real-World Context

Your demo is working locally. Now you need to share it with:
- Remote team members who can't SSH into your machine
- Stakeholders who want a link they can bookmark
- Potential employers viewing your portfolio

This task covers the free, easy options for hosting AI demos.

---

## 🧒 ELI5: Deployment

> **Local development** is like cooking in your kitchen - only you (and people in your house) can taste the food.
>
> **Deployment** is like opening a restaurant - now anyone can come and eat. You need:
> - A location (server/hosting)
> - A sign outside (URL)
> - A kitchen that works without you watching (runs 24/7)
>
> **Hugging Face Spaces** and **Streamlit Cloud** are like food trucks with pre-built kitchens. You bring your recipe (code), they handle the rest!

---

## Deployment Options Comparison

| Platform | Best For | Free Tier | GPU Support | Custom Domain |
|----------|----------|-----------|-------------|---------------|
| **Hugging Face Spaces** | Gradio/Streamlit AI demos | ✅ Generous | ✅ (Paid) | ✅ |
| **Streamlit Cloud** | Streamlit apps | ✅ 3 apps | ❌ | ✅ (Paid) |
| **Gradio Share** | Quick sharing (72h) | ✅ | ❌ | ❌ |
| **Railway** | Any Python app | ✅ $5/month | ❌ | ✅ |
| **Render** | Any web app | ✅ | ❌ | ✅ |
| **Self-hosted** | Full control | Your costs | ✅ DGX Spark | ✅ |

---

## Part 1: Hugging Face Spaces

### 🧒 ELI5: What is Hugging Face Spaces?

> Hugging Face is like GitHub for AI. Spaces is their free hosting for demos. You push your code, they run it. Magic! ✨

### Why Use Spaces?

- **Free** (with generous limits)
- **GPU option** (for paid tiers)
- **Automatic HTTPS**
- **Easy secret management**
- **HuggingFace model integration**
- **Community visibility**

### Creating a Hugging Face Space

#### Step 1: Create the Space

1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Choose a name (e.g., `my-rag-demo`)
3. Select SDK: **Gradio** or **Streamlit**
4. Choose visibility: **Public** or **Private**
5. Click "Create Space"

#### Step 2: File Structure

In [None]:
import os

# Create a sample Hugging Face Space structure
space_dir = '/tmp/hf_space_demo'
os.makedirs(space_dir, exist_ok=True)

# README.md - Required metadata file
readme_content = '''---
title: RAG Document Q&A
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
license: mit
---

# RAG Document Q&A Demo

Ask questions about uploaded documents using RAG (Retrieval-Augmented Generation).

## Features
- Upload PDF, TXT, or MD files
- Ask natural language questions
- Get answers with source citations

## Usage
1. Upload your documents in the "Documents" tab
2. Switch to "Chat" tab
3. Ask questions!

## Tech Stack
- Gradio for the interface
- ChromaDB for vector storage
- Sentence Transformers for embeddings
'''

with open(f'{space_dir}/README.md', 'w') as f:
    f.write(readme_content)

print("Created README.md with Space metadata")
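Spaces reads the YAML block between the `---` lines to configure the build, so a typo there can break a deployment. The following is a minimal sketch of a pre-push sanity check. It assumes simple `key: value` pairs and is not a full YAML parser; `parse_front_matter` and `missing_keys` are hypothetical helper names, and the keys checked are simply the ones used in the README above, not an official list of required fields.

```python
def parse_front_matter(text):
    """Return the simple `key: value` pairs found between the leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("README must start with a '---' line")
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    raise ValueError("front matter is missing its closing '---' line")


# Keys used in the README above (an assumption, not an official required list).
EXPECTED_KEYS = ("title", "sdk", "sdk_version", "app_file")


def missing_keys(text, expected=EXPECTED_KEYS):
    """List expected keys that are absent from the front matter."""
    meta = parse_front_matter(text)
    return [key for key in expected if key not in meta]


sample = "---\ntitle: RAG Document Q&A\nsdk: gradio\napp_file: app.py\n---\n# Demo\n"
print(missing_keys(sample))  # sdk_version is absent in this sample
```

Running the check on `readme_content` before pushing gives early feedback without waiting for the Space to build.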

In [None]:
# app.py - Main application file
app_content = '''import gradio as gr
import os

# Access secrets (set in Space settings)
HF_TOKEN = os.environ.get("HF_TOKEN", "")
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "")

# Your demo code
def process_documents(files):
    """Process uploaded documents."""
    if not files:
        return "Please upload at least one file."
    
    # In a real app, you would:
    # 1. Parse the documents
    # 2. Create embeddings
    # 3. Store in vector database
    
    file_names = [f.name for f in files]
    return f"Successfully processed {len(files)} files: {file_names}"

def answer_question(question, history):
    """Answer a question using RAG."""
    if not question:
        return history
    
    # In a real app, you would:
    # 1. Create query embedding
    # 2. Retrieve relevant chunks
    # 3. Generate response with LLM
    
    response = f"Based on the documents, here\'s what I found about: {question}"
    history.append((question, response))
    return history

# Build the interface
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("# 📚 RAG Document Q&A")
    
    with gr.Tabs():
        # Documents tab
        with gr.TabItem("📁 Documents"):
            files = gr.File(
                label="Upload Documents",
                file_count="multiple",
                file_types=[".pdf", ".txt", ".md"]
            )
            process_btn = gr.Button("Process Documents", variant="primary")
            status = gr.Textbox(label="Status", interactive=False)
            
            process_btn.click(process_documents, [files], [status])
        
        # Chat tab
        with gr.TabItem("💬 Chat"):
            chatbot = gr.Chatbot(height=400)
            msg = gr.Textbox(
                label="Ask a Question",
                placeholder="What would you like to know?"
            )
            msg.submit(answer_question, [msg, chatbot], [chatbot])
    
    gr.Markdown("---")
    gr.Markdown("*Built with Gradio on Hugging Face Spaces*")

# Launch (Spaces handles this automatically)
if __name__ == "__main__":
    demo.launch()
'''

with open(f'{space_dir}/app.py', 'w') as f:
    f.write(app_content)

print("Created app.py")

In [None]:
# requirements.txt - Dependencies
requirements = '''gradio>=4.44.0
transformers>=4.35.0
sentence-transformers>=2.2.0
chromadb>=0.4.0
pypdf>=3.0.0
'''

with open(f'{space_dir}/requirements.txt', 'w') as f:
    f.write(requirements)

print("Created requirements.txt")
print(f"\nSpace files created in: {space_dir}")
print("\nTo deploy:")
print("1. Create a new Space on huggingface.co")
print("2. Clone the Space repository")
print("3. Copy these files to the repository")
print("4. git add, commit, and push!")
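The clone, copy, and push steps printed above can also be scripted. The sketch below assumes the `huggingface_hub` package is installed and that you are already logged in (for example via `huggingface-cli login`). The repo ID `your-username/my-rag-demo` is a placeholder, and `missing_space_files` and `deploy_space` are hypothetical helper names.

```python
from pathlib import Path

# Files this task creates for a Space (an assumption based on the cells above).
SPACE_FILES = ("README.md", "app.py", "requirements.txt")


def missing_space_files(space_dir):
    """Return the expected Space files that are not present in space_dir."""
    root = Path(space_dir)
    return [name for name in SPACE_FILES if not (root / name).is_file()]


def deploy_space(space_dir, repo_id, private=False):
    """Create the Space if needed and upload the folder contents to it."""
    missing = missing_space_files(space_dir)
    if missing:
        raise FileNotFoundError(f"Missing files in {space_dir}: {missing}")

    # Imported lazily so the file check above works without the package.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(
        repo_id=repo_id,
        repo_type="space",
        space_sdk="gradio",
        private=private,
        exist_ok=True,
    )
    api.upload_folder(folder_path=str(space_dir), repo_id=repo_id, repo_type="space")


# Example call (placeholder repo ID; requires network access and a login):
# deploy_space("/tmp/hf_space_demo", "your-username/my-rag-demo")
```

Checking for missing files first means a half-finished folder fails locally instead of producing a broken Space.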

### Managing Secrets on Spaces

Never hardcode API keys! Use Space secrets:

1. Go to your Space → Settings → Variables and secrets (older versions of the UI call this section "Repository secrets")
2. Add secrets like `HF_TOKEN`, `OPENAI_API_KEY`, etc.
3. Access them in code with `os.environ.get("SECRET_NAME")`

In [None]:
# Example: Accessing secrets safely

secrets_example = '''
import os

def get_config():
    """Get configuration from environment variables."""
    return {
        # Required secrets (will fail if not set)
        "hf_token": os.environ["HF_TOKEN"],
        
        # Optional secrets (with defaults)
        "ollama_host": os.environ.get("OLLAMA_HOST", "http://localhost:11434"),
        "model_name": os.environ.get("MODEL_NAME", "llama3.1:8b"),
        "debug_mode": os.environ.get("DEBUG", "false").lower() == "true",
    }

# Usage
config = get_config()
print(f"Using model: {config['model_name']}")
'''

print("Secret management pattern:")
print(secrets_example)
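`os.environ["HF_TOKEN"]` raises a bare `KeyError` on the first missing secret, which can be hard to spot in a build log. A small sketch of an alternative that reports every missing name at once follows. `require_env` is a hypothetical helper, and the environment is passed in as a mapping so the function can be exercised without touching real secrets.

```python
import os


def require_env(names, env=None):
    """Return {name: value} for all names, or raise listing every missing one."""
    env = os.environ if env is None else env
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required secrets: " + ", ".join(missing)
            + ". Set them in the platform's secret settings."
        )
    return {name: env[name] for name in names}


# Demonstration with a fake environment instead of real secrets.
fake_env = {"HF_TOKEN": "hf_example"}
print(require_env(["HF_TOKEN"], env=fake_env))

try:
    require_env(["HF_TOKEN", "OPENAI_API_KEY"], env=fake_env)
except RuntimeError as err:
    print(err)
```

Calling this once at startup makes a misconfigured deployment fail immediately with a readable message rather than on the first user request.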

### GPU on Spaces (Paid Feature)

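Hardware is selected in the Space's Settings page, not in the README. The README metadata can, however, record a recommendation for anyone who duplicates the Space:

```yaml
---
suggested_hardware: t4-small   # e.g. cpu-basic, t4-small, t4-medium, a10g-small
---
```

Available hardware options include the following (hourly prices change over time, so check the Hugging Face pricing page for current rates):
- `cpu-basic` - Free, 2 vCPU, 16GB RAM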
- `t4-small` - $0.40/hour, T4 GPU, 4 vCPU
- `t4-medium` - $0.60/hour, T4 GPU, 8 vCPU
- `a10g-small` - $1.05/hour, A10G GPU
- `a10g-large` - $3.15/hour, A10G GPU, 24 vCPU
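
Because GPU hardware is billed per hour, it helps to estimate a monthly cost before upgrading. A rough sketch using the hourly rates listed above follows; treat the rates as assumptions and verify them against current pricing. `monthly_cost` is a hypothetical helper.

```python
# Hourly rates in USD copied from the list above; verify against current pricing.
HOURLY_RATE = {
    "t4-small": 0.40,
    "t4-medium": 0.60,
    "a10g-small": 1.05,
    "a10g-large": 3.15,
}


def monthly_cost(hardware, hours_per_day, days=30):
    """Estimate the cost in USD of running one Space for `days` days."""
    return round(HOURLY_RATE[hardware] * hours_per_day * days, 2)


print(monthly_cost("t4-small", hours_per_day=24))  # always on
print(monthly_cost("t4-small", hours_per_day=2))   # awake only for a 2-hour demo window
```

The gap between the two figures is the argument for letting a GPU Space sleep when nobody is using it.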

---

## Part 2: Streamlit Cloud

### 🧒 ELI5: Streamlit Cloud

> If Hugging Face is the food truck of AI demos, Streamlit Cloud is the food truck specifically for Streamlit apps. One-click deployment from GitHub!

### Key Features

- **GitHub integration** - Deploy straight from your repo
- **Auto-rebuild** - Push to GitHub → App updates
- **Secret management** - Built-in
- **3 free apps** - Generous free tier

In [None]:
# Create a Streamlit Cloud-ready app structure
streamlit_cloud_dir = '/tmp/streamlit_cloud_demo'
os.makedirs(f'{streamlit_cloud_dir}/.streamlit', exist_ok=True)

# Main app
app_content = '''import streamlit as st
import os

st.set_page_config(
    page_title="AI Agent Playground",
    page_icon="🤖",
    layout="wide"
)

# Access secrets (from Streamlit Cloud)
# In local development, use .streamlit/secrets.toml
def get_secret(key, default=""):
    try:
        return st.secrets[key]
    except Exception:  # no secrets file, or the key is not defined in it
        return os.environ.get(key, default)

# Initialize session state
if "messages" not in st.session_state:
    st.session_state.messages = []

st.title("🤖 AI Agent Playground")

# Sidebar
with st.sidebar:
    st.header("Settings")
    model = st.selectbox("Model", ["GPT-4", "Claude", "Llama"])
    temperature = st.slider("Temperature", 0.0, 1.0, 0.7)
    
    if st.button("Clear Chat"):
        st.session_state.messages = []
        st.rerun()

# Chat interface
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])

if prompt := st.chat_input("Ask something..."):
    st.session_state.messages.append({"role": "user", "content": prompt})
    st.chat_message("user").write(prompt)
    
    # Mock response (replace with real LLM call)
    response = f"[{model}] Response to: {prompt}"
    st.session_state.messages.append({"role": "assistant", "content": response})
    st.chat_message("assistant").write(response)
'''

with open(f'{streamlit_cloud_dir}/app.py', 'w') as f:
    f.write(app_content)

print("Created app.py for Streamlit Cloud")

In [None]:
# requirements.txt
requirements = '''streamlit>=1.30.0
openai>=1.0.0
'''

with open(f'{streamlit_cloud_dir}/requirements.txt', 'w') as f:
    f.write(requirements)

# .streamlit/config.toml - Theme and settings
config = '''[theme]
primaryColor = "#FF4B4B"
backgroundColor = "#FFFFFF"
secondaryBackgroundColor = "#F0F2F6"
textColor = "#262730"
font = "sans serif"

[server]
maxUploadSize = 200
'''

with open(f'{streamlit_cloud_dir}/.streamlit/config.toml', 'w') as f:
    f.write(config)

# .streamlit/secrets.toml - Local secrets (DO NOT COMMIT!)
# This is for local development only
secrets_template = '''# Local development secrets
# DO NOT commit this file! Add to .gitignore

OPENAI_API_KEY = "sk-your-key-here"
DATABASE_URL = "postgresql://..."

[custom_section]
api_key = "your-key"
base_url = "https://api.example.com"
'''

print("\nFor Streamlit Cloud secrets:")
print("1. Go to your app → Settings → Secrets")
print("2. Add secrets in TOML format")
print("3. Access with st.secrets['KEY']")
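
The template above warns against committing `secrets.toml`, but nothing in the cell enforces it. A minimal sketch that adds the path to `.gitignore` only when it is not already listed follows; `ensure_gitignored` is a hypothetical helper, and the project directory is whichever folder holds your app.

```python
from pathlib import Path


def ensure_gitignored(project_dir, entry=".streamlit/secrets.toml"):
    """Add entry to project_dir/.gitignore unless it is already there.

    Returns True if the file was modified.
    """
    gitignore = Path(project_dir) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the existing file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    gitignore.write_text(text + prefix + entry + "\n")
    return True


# Example (directory name taken from the cells above):
# ensure_gitignored("/tmp/streamlit_cloud_demo")
```

The function is idempotent, so it is safe to call from a setup script every time.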

### Deploying to Streamlit Cloud

1. **Push to GitHub** - Make sure your app is in a public (or connected private) repo
2. **Go to [share.streamlit.io](https://share.streamlit.io)**
3. **Click "New app"**
4. **Select repository and branch**
5. **Choose the main file** (e.g., `app.py`)
6. **Add secrets** in Advanced settings
7. **Deploy!**

Your app will be available at: `https://your-app.streamlit.app`

---

## Part 3: Quick Sharing with Gradio

For quick, temporary sharing (72 hours), Gradio has a built-in option:

In [None]:
import gradio as gr

def echo(text):
    return f"Echo: {text}"

demo = gr.Interface(fn=echo, inputs="text", outputs="text")

# This creates a temporary public URL!
# demo.launch(share=True)

print("To create a temporary public URL:")
print("  demo.launch(share=True)")
print("")
print("This will output something like:")
print("  Running on public URL: https://abc123xyz.gradio.live")
print("")
print("⚠️ Note: The link expires after 72 hours!")
print("For permanent hosting, use Hugging Face Spaces.")

---

## Part 4: Self-Hosting on DGX Spark

For full control and GPU access, host on your own DGX Spark!

### Running as a Background Service

In [None]:
# Self-hosting options on DGX Spark

self_host_guide = '''
# ============================================================================
# OPTION 1: Screen/tmux (Quick & Dirty)
# ============================================================================

# Start a screen session
screen -S demo

# Run your app
streamlit run app.py --server.port 8501 --server.address 0.0.0.0

# Detach with Ctrl+A, D
# Reattach with: screen -r demo

# ============================================================================
# OPTION 2: Systemd Service (Production)
# ============================================================================

# Create service file: /etc/systemd/system/my-demo.service
"""
[Unit]
Description=My AI Demo
After=network.target

[Service]
Type=simple
User=ubuntu
WorkingDirectory=/home/ubuntu/my-demo
ExecStart=/usr/bin/python3 -m streamlit run app.py --server.port 8501 --server.address 0.0.0.0
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
"""

# Enable and start
# sudo systemctl daemon-reload
# sudo systemctl enable my-demo
# sudo systemctl start my-demo

# ============================================================================
# OPTION 3: Docker Compose (Recommended)
# ============================================================================

# docker-compose.yml
"""
version: '3.8'

services:
  demo:
    build: .
    ports:
      - "8501:8501"
    volumes:
      - ./data:/app/data
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped
"""

# Run with: docker-compose up -d
'''

print(self_host_guide)
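
Whichever option you pick, it is useful to confirm from a script that the service is actually answering. A small sketch using only the standard library follows. It assumes a Streamlit app on port 8501, which exposes a health endpoint at `/_stcore/health` in recent versions; `is_healthy` is a hypothetical helper.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=3.0):
    """Return True if url answers with HTTP 200 within timeout seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, DNS failure, timeouts, and HTTP error codes.
        return False


# Example (assumed local address; adjust host and port to match your service):
# print(is_healthy("http://localhost:8501/_stcore/health"))
```

A check like this can run from cron or a monitoring job to alert you when the demo goes down.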

### Setting Up a Reverse Proxy (nginx)

For HTTPS and custom domains:

In [None]:
nginx_config = '''
# /etc/nginx/sites-available/my-demo

server {
    listen 80;
    server_name demo.yourdomain.com;
    
    # Redirect HTTP to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl;
    server_name demo.yourdomain.com;
    
    # SSL certificates (use Let's Encrypt)
    ssl_certificate /etc/letsencrypt/live/demo.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/demo.yourdomain.com/privkey.pem;
    
    location / {
        proxy_pass http://localhost:8501;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 86400;
    }
    
    # For Streamlit websockets
    location /_stcore/stream {
        proxy_pass http://localhost:8501/_stcore/stream;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 86400;
    }
}
'''

print("Nginx reverse proxy configuration:")
print(nginx_config)

---

## Part 5: Alternative Platforms

### Railway

Great for Python apps with easy scaling.

In [None]:
# Railway deployment files

railway_dir = '/tmp/railway_demo'
os.makedirs(railway_dir, exist_ok=True)

# Procfile - tells Railway how to run your app
procfile = 'web: python -m streamlit run app.py --server.port $PORT --server.address 0.0.0.0\n'

with open(f'{railway_dir}/Procfile', 'w') as f:
    f.write(procfile)

# railway.toml - Railway configuration
railway_config = '''[build]
builder = "nixpacks"

[deploy]
startCommand = "python -m streamlit run app.py --server.port $PORT --server.address 0.0.0.0"
restartPolicyType = "ON_FAILURE"
restartPolicyMaxRetries = 10
'''

with open(f'{railway_dir}/railway.toml', 'w') as f:
    f.write(railway_config)

print("Railway deployment:")
print("1. Install Railway CLI: npm i -g @railway/cli")
print("2. Login: railway login")
print("3. Initialize: railway init")
print("4. Deploy: railway up")
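
Both the Procfile and `railway.toml` rely on the platform injecting a `PORT` environment variable. The same idea can be expressed as a small Python launcher, which also gives you a default for local runs. This is a sketch, assuming the entry point is `app.py`; `streamlit_command` is a hypothetical helper, and 8501 is Streamlit's default port.

```python
import os
import sys


def streamlit_command(env=None, app_file="app.py", default_port=8501):
    """Build the argument list for launching Streamlit on the platform's port."""
    env = os.environ if env is None else env
    port = int(env.get("PORT", default_port))
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT out of range: {port}")
    return [
        sys.executable, "-m", "streamlit", "run", app_file,
        "--server.port", str(port),
        "--server.address", "0.0.0.0",
    ]


print(streamlit_command(env={"PORT": "9000"})[-4:])
```

To actually start the app, pass the list to `subprocess.run(streamlit_command(), check=True)`.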

### Render

Similar to Railway, with a generous free tier.

In [None]:
# render.yaml - Render configuration (Blueprint)
render_config = '''services:
  - type: web
    name: my-ai-demo
    runtime: python
    buildCommand: pip install -r requirements.txt
    startCommand: python -m streamlit run app.py --server.port $PORT --server.address 0.0.0.0
    envVars:
      - key: OPENAI_API_KEY
        sync: false  # Set in Render dashboard
    autoDeploy: true
'''

print("Render deployment:")
print("1. Push your code to GitHub")
print("2. Connect repo on render.com")
print("3. Render auto-detects Python and deploys!")
print("\nrender.yaml (optional - for advanced config):")
print(render_config)

---

## Deployment Checklist

Before deploying any demo:

### Security
- [ ] No hardcoded secrets in code
- [ ] Secrets stored in platform's secret manager
- [ ] `.gitignore` includes secret files
- [ ] No debug mode in production

### Reliability
- [ ] Error handling for all external calls
- [ ] Friendly error messages for users
- [ ] Timeouts for long operations
- [ ] Health check endpoint (if applicable)
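
The first three items can be combined in one wrapper around any slow external call. A minimal sketch using `concurrent.futures` follows; `call_with_timeout` and the message texts are hypothetical. Note that a timed-out worker thread keeps running in the background, because Python threads cannot be cancelled once started.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(fn, *args, timeout=10.0, **kwargs):
    """Run fn in a worker thread; return (ok, result_or_friendly_message)."""
    future = _executor.submit(fn, *args, **kwargs)
    try:
        return True, future.result(timeout=timeout)
    except FutureTimeout:
        return False, "The request took too long. Please try again."
    except Exception as exc:
        # Log the technical detail; show the user something readable.
        print(f"[error] {type(exc).__name__}: {exc}")
        return False, "Something went wrong while contacting the model."


def slow_call():
    """Stand-in for a slow external request."""
    time.sleep(0.5)
    return "done"


print(call_with_timeout(slow_call, timeout=2.0))
print(call_with_timeout(slow_call, timeout=0.1))
```

In a Gradio or Streamlit handler, the second element of the tuple can be displayed directly to the user in either case.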

### Performance
- [ ] Models cached properly
- [ ] Static files optimized
- [ ] Resource limits configured
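
"Models cached properly" means the expensive load happens once per process rather than once per request. A framework-agnostic sketch with `functools.lru_cache` follows. Here `load_model` is a stand-in that only simulates loading and `demo-model` is a made-up name; in Streamlit, the `@st.cache_resource` decorator serves the same purpose.

```python
from functools import lru_cache

load_count = 0  # counts how often the expensive loader actually runs


@lru_cache(maxsize=1)
def load_model(name):
    """Pretend to load a model; the body runs only on a cache miss."""
    global load_count
    load_count += 1
    return {"name": name}  # stand-in for a real model object


def predict(text, model_name="demo-model"):
    """Handle one request, reusing the cached model."""
    model = load_model(model_name)
    return f"{model['name']} saw {len(text)} characters"


predict("hello")
predict("hello again")
print(load_count)  # the loader ran once for two requests
```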

### User Experience
- [ ] Loading states visible
- [ ] Mobile-responsive (test with `share=True` on phone)
- [ ] Clear instructions/examples provided

---

## ⚠️ Common Mistakes

### Mistake 1: Committing Secrets
```python
# ❌ NEVER do this
OPENAI_API_KEY = "sk-abc123..."

# ✅ Always use environment variables
import os
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
```

### Mistake 2: Wrong Port Configuration
```python
# ❌ Hardcoded port (won't work on many platforms)
demo.launch(server_port=7860)

# ✅ Read the port from the environment and listen on all interfaces
import os
port = int(os.environ.get("PORT", 7860))
demo.launch(server_name="0.0.0.0", server_port=port)
```

### Mistake 3: Missing Dependencies
```bash
# ❌ Forgot to include all packages
# requirements.txt is missing chromadb

# ✅ Generate from environment
pip freeze > requirements.txt
# Or better, use pip-tools:
pip-compile requirements.in
```
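
`pip freeze` captures everything in the environment, including packages the app never imports. To check the opposite direction (is everything listed in `requirements.txt` actually installed?), here is a sketch with `importlib.metadata`. The parsing handles only plain `name>=version` style lines, and both helper names are hypothetical.

```python
import re
from importlib import metadata


def requirement_names(text):
    """Extract package names from simple requirements.txt content."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and pip options such as -r
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def not_installed(text):
    """Return the listed packages that are missing from this environment."""
    missing = []
    for name in requirement_names(text):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


sample = "gradio>=4.44.0\nchromadb>=0.4.0  # vector store\n\npypdf>=3.0.0\n"
print(requirement_names(sample))
```

Running `not_installed(open("requirements.txt").read())` in a fresh virtual environment after `pip install -r requirements.txt` should return an empty list.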

### Mistake 4: Large Files in Git
```bash
# ❌ Model files in repo (GitHub rejects files over 100 MB)
git add models/llama-7b.bin

# ✅ Download at runtime or use external storage
# In app.py (repo_id and filename below are placeholders):
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(repo_id="meta/llama", filename="model.bin")
```
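
A quick scan before `git add` catches oversized files early. This sketch assumes the 100 MB per-file limit mentioned above; `find_large_files` is a hypothetical helper, and it skips the `.git` directory.

```python
import os

GITHUB_LIMIT_BYTES = 100 * 1024 * 1024  # 100 MB per-file limit noted above


def find_large_files(root, limit=GITHUB_LIMIT_BYTES):
    """Return sorted (relative_path, size_in_bytes) for files larger than limit."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # do not descend into .git
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # broken symlink or unreadable file
            if size > limit:
                large.append((os.path.relpath(path, root), size))
    return sorted(large)


# Example: report offenders in the current project directory.
# for path, size in find_large_files("."):
#     print(f"{path}: {size / 1e6:.1f} MB")
```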

---

## 🎉 Checkpoint

You've learned:
- ✅ Deploying to Hugging Face Spaces (Gradio/Streamlit)
- ✅ Deploying to Streamlit Cloud
- ✅ Quick sharing with Gradio's `share=True`
- ✅ Self-hosting on DGX Spark
- ✅ Alternative platforms (Railway, Render)
- ✅ Secret management best practices

---

## 🚀 Challenge (Optional)

Deploy your favorite demo from this module to TWO platforms:
1. Hugging Face Spaces (for community visibility)
2. Streamlit Cloud or self-hosted (for your portfolio)

Compare:
- Setup difficulty
- Performance
- URL aesthetics
- Update workflow

---

## 📖 Further Reading

- [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
- [Streamlit Cloud Documentation](https://docs.streamlit.io/streamlit-community-cloud)
- [Railway Documentation](https://docs.railway.app/)
- [Render Documentation](https://render.com/docs)

---

## 🧹 Cleanup

In [None]:
# List all deployment files created
import os

dirs_created = [
    '/tmp/hf_space_demo',
    '/tmp/streamlit_cloud_demo',
    '/tmp/railway_demo'
]

print("Deployment template directories created:")
for d in dirs_created:
    if os.path.exists(d):
        files = os.listdir(d)
        print(f"\n{d}/")
        for f in files:
            print(f"  - {f}")

print("\n✅ Ready to deploy!")