# Med-I-C · AMR-Guard
### Infection Lifecycle Orchestrator — Kaggle Demo

| Agent | Role | Model |
|---|---|---|
| 1 · Intake Historian | Patient data, CrCl, MDR risk | MedGemma 4B IT |
| 2 · Vision Specialist | Lab report → structured JSON | MedGemma 4B IT |
| 3 · Trend Analyst | MIC creep, resistance velocity | MedGemma 27B Text IT ¹ |
| 4 · Clinical Pharmacologist | Final Rx + safety check | MedGemma 4B IT + TxGemma 9B ¹ |

> ¹ Substituted with smaller variants on Kaggle T4 (16 GB GPU) — see Section 3.

**Steps:** Clone → Install → Authenticate → Download models → Init KB → Launch app

## 1 · Environment

In [None]:
import subprocess, torch

gpu_info = subprocess.run(
    ['nvidia-smi', '--query-gpu=name,memory.total', '--format=csv,noheader'],
    capture_output=True, text=True
).stdout.strip()
print(f"GPU  : {gpu_info}")
print(f"Torch: {torch.__version__} · CUDA {torch.cuda.is_available()}")

In [None]:
%%bash
if [ ! -d /kaggle/working/Med-I-C ]; then
    git clone https://github.com/benghita/Med-I-C.git /kaggle/working/Med-I-C
else
    echo "Repo already present — pulling latest"
    git -C /kaggle/working/Med-I-C pull
fi

In [None]:
%%capture
!pip install -q \
    "langgraph>=0.0.15" "langchain>=0.3.0" langchain-text-splitters langchain-community \
    "chromadb>=0.4.0" sentence-transformers \
    "transformers>=4.50.0" accelerate bitsandbytes \
    streamlit huggingface_hub \
    "pydantic>=2.0" python-dotenv openpyxl pypdf "pandas>=2.0" jq

## 2 · Hugging Face Authentication

Add your token to **Kaggle → Add-ons → Secrets** as `HF_TOKEN`.

Accept the model licences **before** running this notebook:
- MedGemma 4B IT → https://huggingface.co/google/medgemma-4b-it
- TxGemma 2B → https://huggingface.co/google/txgemma-2b-predict

In [None]:
import os
from huggingface_hub import login

try:
    from kaggle_secrets import UserSecretsClient
    hf_token = UserSecretsClient().get_secret("HF_TOKEN")
    print("Token loaded from Kaggle secrets")
except Exception:
    hf_token = os.getenv("HF_TOKEN", "")
    print("Token loaded from environment" if hf_token else "WARNING: No HF_TOKEN found")

if hf_token:
    login(token=hf_token, add_to_git_credential=False)

## 3 · Download Models

| Model | Agent | VRAM (4-bit) | Kaggle T4 |
|---|---|---|---|
| `google/medgemma-4b-it` | 1, 2, 4 primary | ~3 GB | ✓ |
| `google/medgemma-27b-text-it` | 3 (Trend Analyst) | ~14 GB | marginal — using 4B sub |
| `google/txgemma-9b-predict` | 4 safety check | ~5 GB | ✓ (stacked with 4B: ~8 GB) |
| `google/txgemma-2b-predict` | 4 safety fallback | ~1.5 GB | ✓ |

**Kaggle strategy:** download only `medgemma-4b-it` and `txgemma-2b-predict`.
Agent 3 is pointed at `medgemma-4b-it` in place of the 27B model, and the safety check at `txgemma-2b-predict` in place of the 9B model.
To run the full 27B / 9B models on a machine with ≥ 24 GB VRAM, edit the `.env` cell in Section 4.
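
The VRAM column and the 24 GB rule above can be approximated with a little arithmetic. The following is a minimal sketch, not part of Med-I-C: it assumes 4-bit quantization stores about half a byte per parameter, takes the parameter counts from the nominal sizes in the repo names, and uses hypothetical helper names (`weight_gb_4bit`, `pick_models`, `detect_vram_gb`). It counts weights only, so real usage (activations, KV cache, quantization metadata) will be higher.

```python
# Rough 4-bit sizing and variant selection (illustrative sketch, not part of Med-I-C).
# Assumptions: parameter counts are the nominal sizes in the repo names, and
# 24 GB is the cut-off for the full models, as stated in the text.
NOMINAL_PARAMS_BILLIONS = {
    "google/medgemma-4b-it": 4,
    "google/medgemma-27b-text-it": 27,
    "google/txgemma-9b-predict": 9,
    "google/txgemma-2b-predict": 2,
}

BYTES_PER_PARAM_4BIT = 0.5  # 4 bits = half a byte


def weight_gb_4bit(params_billions: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * BYTES_PER_PARAM_4BIT / 1e9


def pick_models(total_vram_gb: float, full_threshold_gb: float = 24.0) -> dict:
    """Return repo ids for Agent 3 and the Agent 4 safety check."""
    if total_vram_gb >= full_threshold_gb:
        return {
            "agent3": "google/medgemma-27b-text-it",
            "safety": "google/txgemma-9b-predict",
        }
    return {
        "agent3": "google/medgemma-4b-it",
        "safety": "google/txgemma-2b-predict",
    }


def detect_vram_gb() -> float:
    """Total memory of GPU 0 in GB, or 0.0 when torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return 0.0
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.get_device_properties(0).total_memory / 1e9


for repo_id, size in NOMINAL_PARAMS_BILLIONS.items():
    print(f"{repo_id:32s} ~{weight_gb_4bit(size):.1f} GB of weights")

print(pick_models(detect_vram_gb()))
```

For the 27B model this gives 13.5 GB of weights, in line with the ~14 GB in the table; the smaller models come out below their table values, which is expected for a weights-only estimate. On a 16 GB T4 the selection falls back to the 4B and 2B repos.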

In [None]:
from huggingface_hub import snapshot_download

MEDGEMMA_4B   = "google/medgemma-4b-it"       # Agents 1, 2, 4 + Agent 3 sub
TXGEMMA_2B    = "google/txgemma-2b-predict"   # Agent 4 safety sub

# Full models for high-VRAM machines (uncomment to use instead)
# MEDGEMMA_27B = "google/medgemma-27b-text-it"  # Agent 3 — needs ~14 GB in 4bit
# TXGEMMA_9B   = "google/txgemma-9b-predict"    # Agent 4 safety — needs ~5 GB in 4bit

# Fetch each repo into the local Hugging Face cache, skipping *.gguf and *.ot files
for repo_id in [MEDGEMMA_4B, TXGEMMA_2B]:
    print(f"Downloading {repo_id} …")
    snapshot_download(repo_id=repo_id, ignore_patterns=["*.gguf", "*.ot"])
    print("  done")

print("\nAll models downloaded.")

In [None]:
# Embedding model for RAG (CPU-only, no licence needed)
from sentence_transformers import SentenceTransformer
SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
print("Embedding model ready")

## 4 · Configure & Initialise
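The first cell in this section writes a `.env` file of `MEDIC_*` settings. Once it has run, a quick sanity check can catch a missing or empty entry before the app starts. The sketch below is an illustration only: `parse_env` and `missing_keys` are hypothetical helpers with a deliberately simplified parser (not python-dotenv's), and the list of required keys is an assumption taken from the keys that cell writes.

```python
from pathlib import Path

# Assumption: the keys written by the .env cell are the ones the app needs.
REQUIRED_KEYS = [
    "MEDIC_LOCAL_MEDGEMMA_4B_MODEL",
    "MEDIC_LOCAL_MEDGEMMA_27B_MODEL",
    "MEDIC_LOCAL_TXGEMMA_9B_MODEL",
    "MEDIC_LOCAL_TXGEMMA_2B_MODEL",
    "MEDIC_EMBEDDING_MODEL",
    "MEDIC_DATA_DIR",
    "MEDIC_CHROMA_DB_DIR",
]


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or have an empty value."""
    parsed = parse_env(text)
    return [key for key in required if not parsed.get(key)]


env_path = Path("/kaggle/working/Med-I-C/.env")
if env_path.exists():
    print("Missing keys:", missing_keys(env_path.read_text()) or "none")
```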

In [None]:
# Write .env. To use the full models on a high-VRAM machine, point
# MEDIC_LOCAL_MEDGEMMA_27B_MODEL / MEDIC_LOCAL_TXGEMMA_9B_MODEL at the 27B / 9B repos.
env = f"""
MEDIC_ENV=kaggle
MEDIC_DEFAULT_BACKEND=local
MEDIC_USE_VERTEX=false
MEDIC_QUANTIZATION=4bit

# Agent 1, 2, 4 — MedGemma 4B IT
MEDIC_LOCAL_MEDGEMMA_4B_MODEL={MEDGEMMA_4B}

# Agent 3 — MedGemma 27B Text IT  (subbed with 4B for Kaggle T4)
# To use full 27B: set to google/medgemma-27b-text-it
MEDIC_LOCAL_MEDGEMMA_27B_MODEL={MEDGEMMA_4B}

# Agent 4 safety — TxGemma 9B  (subbed with 2B for Kaggle T4)
# To use full 9B: set to google/txgemma-9b-predict
MEDIC_LOCAL_TXGEMMA_9B_MODEL={TXGEMMA_2B}
MEDIC_LOCAL_TXGEMMA_2B_MODEL={TXGEMMA_2B}

MEDIC_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
MEDIC_DATA_DIR=/kaggle/working/Med-I-C/data
MEDIC_CHROMA_DB_DIR=/kaggle/working/Med-I-C/data/chroma_db
""".strip()

with open("/kaggle/working/Med-I-C/.env", "w") as f:
    f.write(env)

print(".env written")
print(env)

In [None]:
import sys
sys.path.insert(0, "/kaggle/working/Med-I-C")

# Populate SQLite + ChromaDB knowledge base from data files
!python /kaggle/working/Med-I-C/setup_demo.py

## 5 · Launch the App
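Launching the app means starting Streamlit in the background and waiting until it accepts connections before opening a tunnel. As a dependency-free alternative to polling over HTTP, the sketch below waits for a TCP port using only the standard library. `wait_for_port` is a hypothetical helper, not part of Med-I-C or Streamlit, and it only confirms that something is listening on the port, not that the page renders.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0,
                  interval_s: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means a server is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly, then retry.
            time.sleep(interval_s)
    return False
```

For this notebook the call would be `wait_for_port("localhost", 8501)`, 8501 being the port passed to `streamlit run` in the launch cell.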

In [None]:
%%capture
# localtunnel is an npm package, not a pip one; the tunnel cell runs it via npx
!npm install localtunnel

In [None]:
import subprocess, time, requests

streamlit_proc = subprocess.Popen(
    ["streamlit", "run", "/kaggle/working/Med-I-C/app.py",
     "--server.port", "8501",
     "--server.headless", "true",
     "--server.enableCORS", "false"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)

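# Poll for up to ~30 s; sleep after every failed attempt, including non-200 responses
for _ in range(15):
    try:
        if requests.get("http://localhost:8501", timeout=2).status_code == 200:
            print("Streamlit running on :8501")
            break
    except requests.RequestException:
        pass  # server is not accepting connections yet
    time.sleep(2)
else:
    print("Streamlit may still be starting…")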

In [None]:
tunnel_proc = subprocess.Popen(
    ["npx", "localtunnel", "--port", "8501"],
    stdout=subprocess.PIPE,
    stderr=subprocess.DEVNULL,
    text=True,
)

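# Read the tunnel's output until it reports the public URL
for line in tunnel_proc.stdout:
    if "https://" in line:
        # Keep only the URL in case the line carries a text prefix
        url = line[line.index("https://"):].strip()
        print("\n" + "="*50)
        print(f"  App URL: {url}")
        print("="*50)
        break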