# SignSpeak Inference Server - Google Colab Deployment

Deploy the SignSpeak ML backend on Google Colab with **T4 GPU** for zero-cost inference.

## ⚡ Quick Start
1. **Runtime** → **Change runtime type** → Select **T4 GPU**
2. Run all cells in order (Ctrl+F9 or Runtime → Run all)
3. Copy the tunnel URL from Cell 4 output
4. Update `COLAB_TUNNEL_URL` in Cloudflare Pages

In [None]:
# Cell 1: Install Dependencies
print("📦 Installing dependencies...")

!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
!pip install -q fastapi "uvicorn[standard]" pydantic python-multipart
!pip install -q pycloudflared

print("✅ Dependencies installed successfully!")

In [None]:
# Cell 2: Clone Repository
import os

print("📥 Cloning SignSpeak repository...")

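# Return to Colab's default working directory first, so that re-running
# this cell does not clone a nested copy inside the previous checkout
%cd /content

# Remove existing directory if present
if os.path.exists('SignSpeak-Final-Year-Project'):
    !rm -rf SignSpeak-Final-Year-Project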

!git clone https://github.com/KathiravanKOffl/SignSpeak-Final-Year-Project.git
%cd SignSpeak-Final-Year-Project/backend

print("✅ Repository cloned!")
print(f"📂 Current directory: {os.getcwd()}")

In [None]:
# Cell 3: Verify GPU
import torch

print("🔍 Checking GPU availability...\n")
print(f"🖥️  Device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}")
print(f"💾 CUDA Available: {torch.cuda.is_available()}")
print(f"🔥 PyTorch Version: {torch.__version__}")

if torch.cuda.is_available():
    print("✅ GPU ready for inference!")
else:
    print("⚠️  GPU not available - using CPU (slower)")

In [None]:
# Cell 4: Start Cloudflare Tunnel
from pycloudflared import try_cloudflare
import threading
import time

print("🌐 Starting Cloudflare Tunnel...\n")

tunnel_url = None

def start_tunnel():
    global tunnel_url
    try:
        url_obj = try_cloudflare(port=8000, verbose=True)
        tunnel_url = url_obj.tunnel
        print("\n" + "=" * 70)
        print("✅ TUNNEL ACTIVE!")
        print("\n📋 COPY THIS URL:")
        print(f"    {tunnel_url}")
        print("\n" + "=" * 70)
        print("\n🔧 Next Steps:")
        print("   1. Copy the URL above")
        print("   2. Go to Cloudflare Pages dashboard")
        print("   3. Settings → Environment variables")
        print("   4. Update COLAB_TUNNEL_URL with this URL")
        print("   5. Redeploy your Pages project")
        print("\n" + "=" * 70)
    except Exception as e:
        print(f"❌ Tunnel error: {e}")

# Start tunnel in background
tunnel_thread = threading.Thread(target=start_tunnel, daemon=True)
tunnel_thread.start()

# Wait up to 30 seconds for the tunnel thread to report a URL
print("⏳ Waiting for tunnel to start...")
tunnel_thread.join(timeout=30)

if tunnel_url:
    print("\n✅ Tunnel is ready!")
    print(f"🌐 URL: {tunnel_url}")
else:
    print("\n⏳ Tunnel URL not available yet. Check the output above for the URL or an error")

In [None]:
# Cell 5: Start FastAPI Server
print("🚀 Starting FastAPI inference server...")
print("⚠️  This cell will run indefinitely - that's normal!")
print("📝 Server logs will appear below")
print("🛑 Press the stop button to shut down the server")
print("="*70)

!python -m uvicorn api.inference_server:app --host 0.0.0.0 --port 8000 --reload

In [None]:
# Cell 6: Keep-Alive Monitor (Optional)
# Run this cell if Colab tends to disconnect.
# Note: Colab executes one cell at a time, so this loop cannot run while
# Cell 5 is blocking on the server; it starts only once Cell 5 is stopped.

import time
import torch
from IPython.display import clear_output

print("⏰ Keep-alive monitor active...")
counter = 0  # full minutes elapsed since this cell started

try:
    while True:
        clear_output(wait=True)
        print(f"⏰ Monitor uptime: {counter} minutes")
        print(f"🌐 Tunnel URL: {tunnel_url if tunnel_url else 'Check Cell 4 output'}")
        print("💚 Colab session: ACTIVE")
        print(f"🖥️  GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}")
        print("\n💡 Tip: Keep this tab open to prevent session timeout")
        print("⏱️  Colab free tier: up to 12 hours per session")
        time.sleep(60)
        counter += 1
except KeyboardInterrupt:
    print("\n🛑 Keep-alive stopped.")

## 🧪 Testing

After starting the server, test these endpoints:

### Health Check
```bash
curl https://your-tunnel-url.trycloudflare.com/health
```
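
Right after a restart the tunnel may need a few seconds before it routes traffic, so a single `curl` can fail spuriously. The sketch below polls `/health` a few times from Python instead. It assumes that an HTTP 200 response means the server is healthy; `wait_for_health` and its parameters are illustrative names, not part of the SignSpeak codebase.

```python
import time
import urllib.request


def wait_for_health(base_url, attempts=10, delay=3.0, opener=urllib.request.urlopen):
    """Poll <base_url>/health until it answers with HTTP 200.

    Returns True on the first 200 response, False once all attempts fail.
    `opener` can be swapped out for testing without network access.
    """
    url = base_url.rstrip("/") + "/health"
    for attempt in range(1, attempts + 1):
        try:
            with opener(url, timeout=10) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # URLError, HTTPError and socket timeouts all derive from OSError
            pass
        if attempt < attempts:
            time.sleep(delay)
    return False
```

For example, `wait_for_health("https://your-tunnel-url.trycloudflare.com")` returns `True` once the endpoint responds, or `False` after roughly 30 seconds of failed attempts with the defaults.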

### Prediction
```bash
curl -X POST https://your-tunnel-url.trycloudflare.com/predict \
  -H "Content-Type: application/json" \
  -d '{"landmarks": {...}}'
```
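
The same request can be sent from Python using only the standard library. This is a sketch under assumptions: the structure of `landmarks` is elided above, so the caller supplies it as a plain dict, and the response is assumed to be JSON. `build_predict_request` and `predict` are hypothetical helper names.

```python
import json
import urllib.request


def build_predict_request(base_url, landmarks):
    """Build a POST request for <base_url>/predict with a JSON body."""
    url = base_url.rstrip("/") + "/predict"
    body = json.dumps({"landmarks": landmarks}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, landmarks, timeout=30.0):
    """Send the request and decode the response (assumed to be JSON)."""
    req = build_predict_request(base_url, landmarks)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating request construction from sending makes it possible to inspect the URL, headers and body without touching the network.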

## 📊 Resource Usage

- **GPU**: T4 (16GB VRAM)
- **RAM**: ~12GB available
- **Disk**: ~100GB available
- **Session**: 12 hours max (free tier)
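
The figures above are approximate and can differ between sessions. A minimal sketch for checking what the current runtime actually provides, assuming a Linux runtime that exposes `/proc/meminfo` (as Colab does); the helper names are illustrative only.

```python
import shutil

GIB = 1024 ** 3


def mem_total_gib(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text in GiB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the kernel reports this value in KiB
            return kib * 1024 / GIB
    return None


def print_resources():
    """Print disk, RAM and GPU memory for the current runtime."""
    disk = shutil.disk_usage("/")
    print(f"Disk: {disk.free / GIB:.1f} GiB free of {disk.total / GIB:.1f} GiB")
    try:
        with open("/proc/meminfo") as fh:
            ram = mem_total_gib(fh.read())
    except OSError:
        ram = None
    print("RAM: unknown" if ram is None else f"RAM: {ram:.1f} GiB total")
    try:
        import torch  # optional: only needed for the GPU line
    except ImportError:
        print("GPU: torch not installed")
        return
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        print(f"GPU: {props.name}, {props.total_memory / GIB:.1f} GiB VRAM")
    else:
        print("GPU: not available")
```

Calling `print_resources()` in a notebook cell prints one line each for disk, RAM and GPU.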

## 🔄 Session Management

If the session expires or disconnects:
1. Runtime → Restart runtime
2. Run all cells again (Ctrl+F9)
3. Copy the new tunnel URL
4. Update the `COLAB_TUNNEL_URL` environment variable in Cloudflare Pages
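
Because the tunnel URL changes on every restart, it is easy to paste a stale or malformed value. The sketch below tidies a copied URL before it is pasted into the environment variable, assuming quick tunnels use the `*.trycloudflare.com` host pattern shown in the Testing section; `normalize_tunnel_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def normalize_tunnel_url(raw):
    """Strip whitespace and any path, and verify the URL looks like a quick tunnel.

    Raises ValueError if the scheme is not https or the host does not end
    with .trycloudflare.com (an assumption based on the example URLs).
    """
    parsed = urlparse(raw.strip())
    host = parsed.hostname or ""
    if parsed.scheme != "https" or not host.endswith(".trycloudflare.com"):
        raise ValueError(f"not a trycloudflare.com HTTPS URL: {raw!r}")
    return f"https://{host}"
```

The returned value has no trailing slash, so it can be joined with paths such as `/health` without producing a double slash.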

## 🎯 Production Tips

For 24/7 availability, consider:
- Google Colab Pro ($10/month) - longer sessions
- Paperspace Gradient (free tier available)
- Lightning AI (free tier available)
- Cloud Run or AWS Lambda (note that Lambda offers no GPU, so inference there runs on CPU)
