# 📚 RAG Knowledge Base - Google Colab

This notebook installs and starts the RAG knowledge base and exposes it publicly through a Cloudflare Tunnel.

## Prerequisites
- **OpenAI API key** stored as a Colab secret (name: `OPENAI_API_KEY`)

### Setting up the secret
1. Click the 🔑 icon in the left sidebar
2. Select "Add new secret"
3. Name: `OPENAI_API_KEY`, value: your API key
4. Enable "Notebook access"

## 1️⃣ Clone the repository

In [1]:
!git clone https://github.com/janschachtschabel/simple-document-rag.git
%cd simple-document-rag

Cloning into 'simple-document-rag'...
remote: Enumerating objects: 26, done.
remote: Counting objects: 100% (26/26), done.
remote: Compressing objects: 100% (22/22), done.
Receiving objects: 100% (26/26), 57.05 KiB | 2.19 MiB/s, done.
remote: Total 26 (delta 3), reused 23 (delta 3), pack-reused 0 (from 0)
Resolving deltas: 100% (3/3), done.
/content/simple-document-rag


## 2️⃣ Install dependencies

In [3]:
# Install the app's dependencies
!pip install -q -r requirements.txt

# Install Cloudflare Tunnel (cloudflared)
!wget -q https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb
!sudo dpkg -i cloudflared-linux-amd64.deb

Selecting previously unselected package cloudflared.
(Reading database ... 117540 files and directories currently installed.)
Preparing to unpack cloudflared-linux-amd64.deb ...
Unpacking cloudflared (2026.1.2) ...
Setting up cloudflared (2026.1.2) ...
Processing triggers for man-db (2.10.2-1) ...
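
After installation, it can help to confirm that both command-line tools are on `PATH` before later cells launch them. The following is a minimal sketch using `shutil.which`; the helper name `missing_tools` is hypothetical and not part of the repository.

```python
import shutil

def missing_tools(names):
    """Return the command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]

# The two executables that later cells start via subprocess
missing = missing_tools(["streamlit", "cloudflared"])
if missing:
    print(f"❌ Not found on PATH: {', '.join(missing)}")
else:
    print("✅ streamlit and cloudflared are available")
```

If a tool is reported as missing, re-running the installation cell is usually the first thing to try.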


## 3️⃣ Load the API key and configure models

In [4]:
import os
from google.colab import userdata

# Load the API key from Colab Secrets
try:
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
    print("✅ OpenAI API key loaded")
except Exception:
    print("❌ Secret 'OPENAI_API_KEY' could not be read (missing, or notebook access is disabled)!")
    print("   🔑 icon on the left → New secret → OPENAI_API_KEY")
    raise  # re-raise with the original traceback

# Models (adjust here if desired)
OPENAI_MODEL = "gpt-4.1-mini"
EMBEDDING_MODEL = "text-embedding-ada-002"

# Settings passed to the app through environment variables
os.environ["OPENAI_MODEL"] = OPENAI_MODEL
os.environ["EMBEDDING_MODEL"] = EMBEDDING_MODEL
os.environ["CHROMA_PERSIST_DIRECTORY"] = "./chroma_db"
os.environ["CHUNK_SIZE"] = "1000"
os.environ["CHUNK_OVERLAP"] = "200"
os.environ["TOP_K_RETRIEVAL"] = "5"

print(f"✅ LLM: {OPENAI_MODEL}")
print(f"✅ Embedding: {EMBEDDING_MODEL}")

✅ OpenAI API key loaded
✅ LLM: gpt-4.1-mini
✅ Embedding: text-embedding-ada-002
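
The configuration above sets `CHUNK_SIZE` to 1000 and `CHUNK_OVERLAP` to 200, but the notebook does not show how the app's splitter interprets them. Assuming a plain character-based sliding window (an assumption, since the app may split on tokens or separators instead), the effect of the two values can be sketched as follows; `chunk_text` is a hypothetical helper.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # distance between consecutive window starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks

# Same values as in the configuration cell, applied to a 2500-character dummy text
sample = "x" * 2500
chunks = chunk_text(sample, chunk_size=1000, chunk_overlap=200)
print([len(c) for c in chunks])  # → [1000, 1000, 900]
```

Under this assumption, each chunk starts 800 characters after the previous one, so neighbouring chunks share 200 characters of context.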


## 4️⃣ Start the FastAPI server

In [5]:
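import subprocess
import time
import requests

# Write the server output to a log file (the name api.log is chosen here);
# an unread PIPE can fill up and block the server process
api_log = open("api.log", "w")
api_process = subprocess.Popen(
    ["python", "main.py"],
    stdout=api_log,
    stderr=subprocess.STDOUT
)

print("⏳ Starting API...")
time.sleep(20)  # fixed startup delay

try:
    r = requests.get("http://localhost:8000/health", timeout=5)
    r.raise_for_status()  # treat HTTP error codes as "not reachable" too
    print("✅ API running at http://localhost:8000")
except requests.RequestException:
    print("❌ API not reachable")

⏳ Starting API...
❌ API not reachable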
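
In the recorded run above, the API was not yet reachable after the fixed 20-second wait, although the later status cell reports it as OK. One common alternative to a fixed delay is to poll until the service answers. The sketch below is a generic helper, not code from the repository; `wait_until_ready` and `api_is_healthy` are hypothetical names, and the health URL is the one used in the cell above.

```python
import time

def wait_until_ready(probe, timeout=60.0, interval=1.0):
    """Call probe() repeatedly until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # any probe failure simply means "not ready yet"
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

def api_is_healthy():
    """Probe for the API health endpoint used in this notebook."""
    import requests  # imported here so wait_until_ready itself has no dependencies
    return requests.get("http://localhost:8000/health", timeout=5).ok
```

A call such as `wait_until_ready(api_is_healthy, timeout=60)` would then return as soon as the endpoint responds, or `False` after a minute.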


## 5️⃣ Start Streamlit + Cloudflare Tunnel

After running this cell, a **public URL** is displayed.

In [8]:
import subprocess
import re
import time

# Start Streamlit in the background; its output goes to a log file
# (the name streamlit.log is chosen here) so an unread PIPE cannot block it
streamlit_log = open("streamlit.log", "w")
streamlit_process = subprocess.Popen(
    ["streamlit", "run", "app.py", "--server.port", "8501", "--server.headless", "true"],
    stdout=streamlit_log,
    stderr=subprocess.STDOUT
)
print("⏳ Starting Streamlit...")
time.sleep(5)

# Start the Cloudflare Tunnel and extract its URL from the log output
def start_cloudflare_tunnel(port):
    print(f"🌐 Starting Cloudflare Tunnel for port {port}...")
    process = subprocess.Popen(
        ["cloudflared", "tunnel", "--url", f"http://localhost:{port}"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True
    )
    # The tunnel URL appears in the log lines that cloudflared writes to stderr
    for line in process.stderr:
        print(f"Cloudflare: {line.strip()}")
        match = re.search(r'https://[\w-]+\.trycloudflare\.com', line)
        if match:
            return match.group(0), process
    return None, process

tunnel_url, tunnel_process = start_cloudflare_tunnel(8501)

if tunnel_url:
    print("\n" + "=" * 60)
    print("🎉 RAG KNOWLEDGE BASE IS ONLINE!")
    print("=" * 60)
    print(f"🔗 Public URL: {tunnel_url}")
    print("=" * 60)
else:
    print("❌ No tunnel URL found in the cloudflared output")

⏳ Starting Streamlit...
🌐 Starting Cloudflare Tunnel for port 8501...
Cloudflare: 2026-02-05T09:55:18Z INF Thank you for trying Cloudflare Tunnel. Doing so, without a Cloudflare account, is a quick way to experiment and try it out. However, be aware that these account-less Tunnels have no uptime guarantee, are subject to the Cloudflare Online Services Terms of Use (https://www.cloudflare.com/website-terms/), and Cloudflare reserves the right to investigate your use of Tunnels for violations of such terms. If you intend to use Tunnels in production you should use a pre-created named tunnel by following: https://developers.cloudflare.com/cloudflare-one/connections/connect-apps
Cloudflare: 2026-02-05T09:55:18Z INF Requesting new quick Tunnel on trycloudflare.com...
Cloudflare: 2026-02-05T09:55:21Z INF +--------------------------------------------------------------------------------------------+
Cloudflare: 2026-02-05T09:55:21Z INF |  Your quick Tunnel has been created! Visit it at (it m
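
The log above mentions `trycloudflare.com` in a line that does not yet contain the tunnel address, which is why the cell searches each line with a regular expression. The sketch below applies the same pattern to two made-up log lines; the hostname `example-quick-tunnel` is invented for illustration, and `extract_tunnel_url` is a hypothetical helper name.

```python
import re

# Same pattern as in the cell above: a single-label subdomain of trycloudflare.com
TUNNEL_URL_PATTERN = re.compile(r"https://[\w-]+\.trycloudflare\.com")

def extract_tunnel_url(line):
    """Return the first quick-tunnel URL in a log line, or None if there is none."""
    match = TUNNEL_URL_PATTERN.search(line)
    return match.group(0) if match else None

# Made-up log lines for illustration only; the hostname is not a real tunnel
sample_lines = [
    "INF Requesting new quick Tunnel on trycloudflare.com...",
    "INF |  https://example-quick-tunnel.trycloudflare.com  |",
]
for line in sample_lines:
    print(extract_tunnel_url(line))
```

The first line yields `None` because it names the domain without an `https://` address; only the second line produces a URL.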

## 6️⃣ Check status

In [7]:
import requests

print("📊 Status")
print("-" * 30)

try:
    r = requests.get("http://localhost:8000/health", timeout=5)
    r.raise_for_status()
    total = r.json().get('statistics', {}).get('total_documents', 0)
    print(f"✅ API: OK ({total} documents)")
except (requests.RequestException, ValueError):
    print("❌ API: offline")

try:
    requests.get("http://localhost:8501", timeout=5).raise_for_status()
    print("✅ Streamlit: OK")
except requests.RequestException:
    print("❌ Streamlit: offline")

# tunnel_process only exists if the tunnel cell has been run
if 'tunnel_process' in globals() and tunnel_process.poll() is None:
    print("✅ Tunnel: active")
else:
    print("❌ Tunnel: inactive")

📊 Status
------------------------------
✅ API: OK (0 documents)
✅ Streamlit: OK
✅ Tunnel: active
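
When one of the HTTP checks reports a service as offline, it can be useful to know whether anything is listening on the port at all. The following sketch uses only the standard library; `is_port_open` is a hypothetical helper, and the ports are the ones used in this notebook.

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports used by the API and Streamlit cells in this notebook
for name, port in [("API", 8000), ("Streamlit", 8501)]:
    state = "listening" if is_port_open("127.0.0.1", port) else "closed"
    print(f"{name} port {port}: {state}")
```

A closed port suggests that the process never started or has exited, whereas an open port with a failing HTTP check points to a problem inside the application.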


## 🛑 Stop processes

In [None]:
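# Terminate each background process that was started in this session
processes = [("API", "api_process"), ("Streamlit", "streamlit_process"), ("Tunnel", "tunnel_process")]
for label, name in processes:
    process = globals().get(name)  # None if the corresponding cell was never run
    if process is not None:
        process.terminate()
        print(f"✅ {label} stopped")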
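
`terminate()` only requests a shutdown and returns immediately. A process that ignores the request keeps running. A slightly more defensive pattern waits for the exit and falls back to `kill()`. This is a sketch under that assumption, with `stop_process` as a hypothetical helper, demonstrated on a throwaway child process rather than the notebook's servers.

```python
import subprocess
import sys

def stop_process(proc, grace_seconds=5.0):
    """Terminate proc, wait up to grace_seconds, then kill it if it is still alive."""
    if proc.poll() is not None:
        return proc.returncode  # already exited
    proc.terminate()
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()

# Demonstration with a child process that would otherwise sleep for a minute
demo = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_process(demo)
print(f"Child exited with code {exit_code}")
```

The same helper could be applied to `api_process`, `streamlit_process`, and `tunnel_process` in place of the bare `terminate()` calls.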

---
## 📝 Notes

- **Runtime**: up to 12 hours (free tier)
- **Documents**: lost when the session ends
- **Tunnel URL**: changes on every restart
- **Confluence**: configure in the app under 🔷 Confluence
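
Because documents are lost when the session ends, one possible workaround is to copy the Chroma directory (`./chroma_db`, as set in the configuration cell) to persistent storage before the session closes. The sketch below assumes that a plain directory copy is sufficient for the app's data, which the notebook does not confirm; `backup_directory` is a hypothetical helper.

```python
import shutil
from pathlib import Path

def backup_directory(source, destination):
    """Copy the source directory tree into destination, overwriting same-named files."""
    source, destination = Path(source), Path(destination)
    if not source.is_dir():
        raise FileNotFoundError(f"No such directory: {source}")
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return destination
```

In Colab, Google Drive can be mounted with `from google.colab import drive; drive.mount("/content/drive")`, after which a call such as `backup_directory("./chroma_db", "/content/drive/MyDrive/chroma_db_backup")` would write the copy to Drive. The target folder name is only an example. Stopping the API first avoids copying files while they are being written.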