# Run kangtani.ai Backend on Google Colab with ngrok

This notebook sets up the FastAPI backend for kangtani.ai, runs Ollama (CPU-only), and exposes the API via ngrok.

**Note:** This setup runs Ollama on CPU with Colab's limited RAM, so model responses will be slow. Use it for demos and testing only.


In [ ]:
#@title 1. Install system dependencies
!apt-get update
!apt-get install -y ffmpeg git curl unzip
!pip install --upgrade pip
!pip install fastapi "uvicorn[standard]" httpx python-multipart openai-whisper PyPDF2 pdfplumber python-docx pyngrok


In [ ]:
#@title 2. Download and setup Ollama (CPU-only)
import os
OLLAMA_URL = 'https://github.com/jmorganca/ollama/releases/download/v0.1.32/ollama-linux-amd64'
OLLAMA_BIN = '/usr/local/bin/ollama'
if not os.path.exists(OLLAMA_BIN):
    !curl -L $OLLAMA_URL -o ollama
    !chmod +x ollama
    !mv ollama $OLLAMA_BIN

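# Start Ollama in the background. Its output goes to a log file so that an
# unread pipe buffer cannot fill up and stall the server.
import subprocess
ollama_log = open('ollama.log', 'w')
ollama_proc = subprocess.Popen(['ollama', 'serve'], stdout=ollama_log, stderr=subprocess.STDOUT)
print('Ollama server process launched (output in ollama.log).')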
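
Launching the process does not guarantee that the server is ready for the next cell. A minimal sketch of a readiness check follows; it assumes Ollama listens on its default port 11434, and the helper name `wait_for_port` is hypothetical, not part of the kangtani.ai code.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False


# Example (assumes Ollama's default port):
# wait_for_port('127.0.0.1', 11434)
```

The same helper could be pointed at port 8000 after the FastAPI cell, since that server also starts in the background.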


In [ ]:
#@title 3. Pull the Gemma model (may take a while)
!ollama pull gemma
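
To confirm that the pull succeeded before starting the backend, one option is to query Ollama's `/api/tags` endpoint, which lists locally available models. The sketch below assumes the server is reachable at `http://127.0.0.1:11434` and that the response has the shape `{"models": [{"name": ...}]}`; the function names are hypothetical.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list:
    """Extract model names from a parsed /api/tags response."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str) -> bool:
    """True if `wanted` matches a full name ('gemma:latest') or its base name ('gemma')."""
    return any(
        name == wanted or name.split(":", 1)[0] == wanted
        for name in model_names(tags_payload)
    )


def fetch_tags(base_url: str = "http://127.0.0.1:11434") -> dict:
    """Fetch the model list from a running Ollama server (assumed default address)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return json.load(resp)


# Example: has_model(fetch_tags(), 'gemma')
```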


In [ ]:
#@title 4. Download kangtani.ai backend code
# If you have a repo, clone it. Otherwise, download/upload the backend code manually.
REPO_URL = 'https://github.com/yourusername/kangtani.ai' # <-- CHANGE THIS if public repo exists
if not os.path.exists('kangtani.ai'):
    !git clone $REPO_URL
%cd kangtani.ai/backend


In [ ]:
#@title 5. Start FastAPI backend (port 8000)
import threading
def run_uvicorn():
    import uvicorn
    uvicorn.run('main:app', host='0.0.0.0', port=8000, reload=False)

threading.Thread(target=run_uvicorn, daemon=True).start()
print('FastAPI backend started.')


In [ ]:
#@title 6. Expose FastAPI backend with ngrok
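from pyngrok import ngrok
# ngrok requires an account authtoken; uncomment and fill in your own:
# ngrok.set_auth_token('YOUR_NGROK_AUTHTOKEN')
tunnel = ngrok.connect(8000, 'http')
public_url = tunnel.public_url  # connect() returns an NgrokTunnel object, not a string
print('ngrok tunnel:', public_url)
print('You can now access the API at:', public_url + '/chat')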


---
## Usage
- Use the public ngrok URL (shown above) as your backend endpoint in the frontend or for API testing.
- Example: `POST {ngrok_url}/chat` with the JSON body `{ "message": "your question" }`
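
For API testing from Python, the request described above could be sketched as follows. It uses only the standard library; the helper names are hypothetical, and because the notebook does not specify the response format, the raw response text is returned unparsed.

```python
import json
import urllib.request


def build_chat_request(base_url: str, message: str) -> urllib.request.Request:
    """Build a POST request to {base_url}/chat with a JSON body {"message": ...}."""
    url = base_url.rstrip("/") + "/chat"
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(base_url: str, message: str, timeout: float = 120.0) -> str:
    """Send the request and return the raw response text (format not assumed)."""
    with urllib.request.urlopen(build_chat_request(base_url, message), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example: print(send_chat(public_url, 'your question'))
```

The generous default timeout is a guess that allows for slow CPU-only generation; adjust it to what you observe.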

---
## Notes
- Ollama on Colab is for demo/testing only (slow, CPU-only, limited RAM).
- For production, run on your own server with more resources.
