# Graph Writing Studio: Google Colab

Run the full **Graph Writing Studio** pipeline in the cloud, with no local GPU, Docker, or Neo4j installation required.

**What this notebook does:**
1. Installs Ollama and starts it as a background service
2. Installs Neo4j Community Edition directly (no Docker needed)
3. Clones the project and installs Python dependencies
4. Walks through the complete entity and conversation workflows

**Hardware requirements:**
- Works on **CPU-only** Colab instances (free tier)
- Faster with a **T4 or L4 GPU** (Colab free / Pro)
- Neo4j requires ~512 MB RAM; a free Colab instance has 12–13 GB, which is plenty

**Runtime:** `Runtime → Change runtime type → Python 3` (GPU optional but recommended)

---
> **Note:** Each Colab session starts fresh. Run cells top-to-bottom on a new session.
> To avoid re-downloading large models, see **Step 12: (Optional) Persist Ollama Models to Google Drive**.

## Step 1: Install Ollama

In [None]:
# Install system utilities and Ollama
# pciutils / lshw are used to detect whether a GPU is present
!sudo apt-get update -qq && sudo apt-get install -y -q pciutils lshw
!curl -fsSL https://ollama.com/install.sh | sh
print("✅ Ollama installed.")

## Step 2: Start Ollama Server

In [None]:
import subprocess
import time
import requests

def start_ollama_server():
    """Kill any stale process, then start ollama serve in the background."""
    subprocess.run("pkill ollama", shell=True, stderr=subprocess.DEVNULL)
    time.sleep(1)
    with open("/tmp/ollama.log", "w") as log:
        subprocess.Popen(["ollama", "serve"], stdout=log, stderr=log)
    time.sleep(3)

def wait_for_ollama(timeout=60):
    """Poll the Ollama API until it responds or the timeout expires."""
    for i in range(timeout):
        try:
            r = requests.get("http://localhost:11434/api/tags", timeout=2)
            if r.status_code == 200:
                return True
        except requests.exceptions.RequestException:
            pass
        time.sleep(1)
    return False

start_ollama_server()

if wait_for_ollama():
    print("✅ Ollama server is running at http://localhost:11434")
else:
    print("❌ Ollama failed to start. Check /tmp/ollama.log")
    !tail -20 /tmp/ollama.log
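The same poll-until-ready pattern is used again for Neo4j in Step 4, so it can be factored out. The following is a minimal sketch, assuming a hypothetical helper named `wait_until` (not part of the project) that accepts any zero-argument check and backs off between attempts instead of polling once per second:

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 60.0,
               initial_delay: float = 0.5, max_delay: float = 5.0) -> bool:
    """Call `check` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        try:
            if check():
                return True
        except Exception:
            pass  # an error from the check is treated as "not ready yet"
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Sleep, doubling the delay each round up to max_delay
        time.sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

With `requests` imported as in the cell above, a call such as `wait_until(lambda: requests.get("http://localhost:11434/api/tags", timeout=2).status_code == 200)` would behave much like `wait_for_ollama()`.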

## Step 3: Choose and Pull a Model

The cell below auto-detects whether a GPU is present and suggests the best model for your hardware.

| Model | VRAM / RAM | Speed | Quality |
|---|---|---|---|
| `llama3.2:1b` | ~1 GB | ⚡ Very fast (CPU OK) | Basic |
| `llama3.2:3b` | ~2 GB | Fast (CPU OK) | **Good** |
| `phi3:mini` | ~2.3 GB | Fast (CPU OK) | Good |
| `llama3.1:8b` | ~5 GB | Moderate | Better |
| `llama3.1:70b` | ~40 GB | Slow (needs GPU) | Best |

You can override the auto-selected model in the form below.

In [None]:
import subprocess

def detect_gpu():
    """Return (True, gpu_name) if nvidia-smi reports a GPU, else (False, None)."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=5
        )
        if result.returncode == 0 and result.stdout.strip():
            return True, result.stdout.strip().splitlines()[0]
    except (FileNotFoundError, subprocess.TimeoutExpired):
        pass
    return False, None

has_gpu, gpu_name = detect_gpu()

if has_gpu:
    print(f"🟢 GPU detected: {gpu_name}")
    print("   Recommended model: llama3.1:8b  (upgrade to llama3.1:70b for best quality)")
    suggested_model = "llama3.1:8b"
else:
    print("🟡 No GPU detected, running on CPU")
    print("   Recommended model: llama3.2:3b  (good quality, ~2 GB download)")
    suggested_model = "llama3.2:3b"

print(f"\nSuggested model: {suggested_model}")
print("Override this in the next cell if you prefer a different model.")

In [None]:
# @title Model selection { run: "auto" }
# Choose a model. llama3.2:3b works well on CPU. llama3.1:8b needs a GPU or patience.
MODEL_NAME = "llama3.2:3b"  # @param ["llama3.2:1b", "llama3.2:3b", "phi3:mini", "llama3.1:8b", "llama3.1:70b"]

print(f"Pulling {MODEL_NAME}. This may take several minutes on first run...")
!ollama pull {MODEL_NAME}
print(f"\n✅ {MODEL_NAME} is ready.")
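The table in this step can also be turned into a simple rule for choosing a model from a memory budget. This is only a sketch: the footprints are the approximate figures from the table, while `pick_model` and its `headroom` parameter are hypothetical names introduced here for illustration. "Largest model that fits" is a rough stand-in for quality, not a guarantee.

```python
# Approximate memory footprints in GB, copied from the table in this step.
MODEL_FOOTPRINT_GB = {
    "llama3.2:1b": 1.0,
    "llama3.2:3b": 2.0,
    "phi3:mini": 2.3,
    "llama3.1:8b": 5.0,
    "llama3.1:70b": 40.0,
}


def pick_model(available_gb: float, headroom: float = 0.25) -> str:
    """Return the largest listed model that fits in `available_gb`.

    `headroom` reserves a fraction of memory for the context window and the
    rest of the notebook; 0.25 is an arbitrary illustrative value.
    """
    budget = available_gb * (1.0 - headroom)
    fitting = [(size, name) for name, size in MODEL_FOOTPRINT_GB.items()
               if size <= budget]
    if not fitting:
        return "llama3.2:1b"  # fall back to the smallest option
    return max(fitting)[1]
```

The result could be assigned to `MODEL_NAME` in place of the form value, for example `MODEL_NAME = pick_model(12)` on a CPU-only instance.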

## Step 4: Install and Start Neo4j

Graph Writing Studio stores all extracted entities and relationships in **Neo4j**.  
We install Neo4j Community Edition 5.x directly into the Colab VM, so no Docker is needed.

> The second cell below also installs the **Graph Data Science (GDS)** plugin, which is required for community detection (Louvain / Leiden algorithms).

In [None]:
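# Add Neo4j's apt repository and install Community Edition
# (the GDS plugin is added in the next cell).
# --yes lets gpg overwrite the keyring without prompting if this cell is re-run.
!curl -fsSL https://debian.neo4j.com/neotechnology.gpg.key \
    | gpg --yes --dearmor -o /usr/share/keyrings/neo4j.gpg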
!echo "deb [signed-by=/usr/share/keyrings/neo4j.gpg] https://debian.neo4j.com stable 5" \
    > /etc/apt/sources.list.d/neo4j.list
!apt-get update -qq
!apt-get install -y -q neo4j
print("✅ Neo4j installed.")

In [None]:
import subprocess
import time
import requests

NEO4J_PASSWORD = "graphstudio"  # Must match the .env written later

# ── Set initial password (required before first start) ──────────────────────
result = subprocess.run(
    ["neo4j-admin", "dbms", "set-initial-password", NEO4J_PASSWORD],
    capture_output=True, text=True
)
if result.returncode != 0:
    # Password may already be set if the cell is re-run
    print(f"Password note: {result.stderr.strip() or result.stdout.strip()}")

# ── Download and install the GDS plugin ─────────────────────────────────────
# GDS jar must live in the Neo4j plugins directory
# The GDS version must be compatible with the Neo4j 5.x release that apt
# installed; check the GDS compatibility matrix if Neo4j fails to load it.
GDS_VERSION = "2.6.8"
GDS_JAR = f"neo4j-graph-data-science-{GDS_VERSION}.jar"
GDS_URL = f"https://graphdatascience.ninja/neo4j-graph-data-science-{GDS_VERSION}.jar"
PLUGINS_DIR = "/var/lib/neo4j/plugins"

print(f"Downloading GDS plugin v{GDS_VERSION}...")
!wget -q "{GDS_URL}" -O "/tmp/{GDS_JAR}"
!cp "/tmp/{GDS_JAR}" "{PLUGINS_DIR}/{GDS_JAR}"
!chown neo4j:neo4j "{PLUGINS_DIR}/{GDS_JAR}"
print("✅ GDS plugin installed.")

# ── Allow GDS in neo4j.conf ──────────────────────────────────────────────────
conf_path = "/etc/neo4j/neo4j.conf"
gds_settings = [
    "dbms.security.procedures.unrestricted=gds.*",
    "dbms.security.procedures.allowlist=gds.*",
]
with open(conf_path) as f:
    active_lines = {line.strip() for line in f if not line.lstrip().startswith("#")}
# Append only the missing lines so re-running this cell does not duplicate settings
missing = [s for s in gds_settings if s not in active_lines]
if missing:
    with open(conf_path, "a") as f:
        f.write("\n# Graph Data Science plugin\n" + "\n".join(missing) + "\n")
print("✅ neo4j.conf updated for GDS.")

# ── Start Neo4j ──────────────────────────────────────────────────────────────
subprocess.run(["neo4j", "start"], capture_output=True)
print("Starting Neo4j...")

def wait_for_neo4j(timeout=90):
    """Poll the Neo4j HTTP endpoint until it is ready."""
    for i in range(timeout):
        try:
            r = requests.get("http://localhost:7474", timeout=2)
            if r.status_code in (200, 401):
                return True
        except requests.exceptions.RequestException:
            pass
        time.sleep(1)
        if i % 10 == 9:
            print(f"  Still waiting... ({i+1}s)")
    return False

if wait_for_neo4j():
    print("✅ Neo4j is running.")
    print("   Browser UI:  http://localhost:7474  (use the ngrok cell below to access it)")
    print("   Bolt:        bolt://localhost:7687")
else:
    print("❌ Neo4j did not start in time. Check logs below")
    !neo4j status
    !tail -30 /var/log/neo4j/neo4j.log
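Because `wget -q` is silent, a failed download can leave an empty or non-archive file in the plugins directory while the cell still prints a success message. A jar is a ZIP archive, so a cheap sanity check is possible. The sketch below assumes a hypothetical helper `looks_like_jar`; the 1 KB minimum size is an arbitrary illustrative threshold.

```python
import zipfile
from pathlib import Path


def looks_like_jar(path: str, min_bytes: int = 1024) -> bool:
    """Return True if `path` is an existing, non-trivial file in ZIP format."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size < min_bytes:
        return False
    # Jar files are ZIP archives, so the ZIP signature check applies
    return zipfile.is_zipfile(str(p))
```

In this notebook the check could be run as `looks_like_jar(f"{PLUGINS_DIR}/{GDS_JAR}")` before starting Neo4j.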

## Step 5: Clone the Project and Install Dependencies

In [None]:
import os

REPO_DIR = "/content/graph-writing-studio"

if not os.path.isdir(REPO_DIR):
    !git clone https://github.com/dms-killa/graph-writing-studio "{REPO_DIR}"
else:
    print("Repo already cloned. Pulling latest changes.")
    !git -C "{REPO_DIR}" pull

os.chdir(REPO_DIR)
print(f"\n✅ Working directory: {os.getcwd()}")

In [None]:
!pip install -q -r requirements.txt
print("✅ Python dependencies installed.")

## Step 6: Configure Environment

The project reads settings from a `.env` file. We write it here so that `config.py` picks up the right Ollama model and Neo4j credentials automatically.

In [None]:
import os

# MODEL_NAME was set in Step 3. Keep that choice and fall back to the default
# only if that cell was skipped.
MODEL_NAME = globals().get("MODEL_NAME") or "llama3.2:3b"

env_content = f"""# Graph Writing Studio: Colab configuration
# Generated automatically by graph_writing_studio_colab.ipynb

# Ollama settings
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL={MODEL_NAME}
OLLAMA_TEMPERATURE=0.1
OLLAMA_TIMEOUT=300.0

# Neo4j settings
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=graphstudio
"""

with open(".env", "w") as f:
    f.write(env_content)

# Also export to the process environment so subprocesses inherit them
os.environ["OLLAMA_BASE_URL"] = "http://localhost:11434"
os.environ["OLLAMA_MODEL"]    = MODEL_NAME
os.environ["NEO4J_URI"]       = "bolt://localhost:7687"
os.environ["NEO4J_USER"]      = "neo4j"
os.environ["NEO4J_PASSWORD"]  = "graphstudio"

print("✅ .env written.")
print(f"   Ollama model : {MODEL_NAME}")
print( "   Neo4j URI    : bolt://localhost:7687")
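To confirm that the file written above contains every setting, it can be read back and checked. This is a minimal sketch with hypothetical helpers `parse_env` and `missing_keys`. It handles only plain `KEY=VALUE` lines (no quoting or variable expansion), and how the project's `config.py` actually parses the file is not shown here.

```python
REQUIRED_KEYS = (
    "OLLAMA_BASE_URL", "OLLAMA_MODEL",
    "NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> list:
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_keys(open(".env").read())` should return an empty list after this step.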

## Step 7: Verify Connections

Confirm that both Ollama and Neo4j are reachable before running any workflow steps.

In [None]:
import requests
from neo4j import GraphDatabase

# ── Ollama ───────────────────────────────────────────────────────────────────
try:
    r = requests.get("http://localhost:11434/api/tags", timeout=5)
    models = [m["name"] for m in r.json().get("models", [])]
    print(f"✅ Ollama: available models: {models}")
except Exception as e:
    print(f"❌ Ollama connection failed: {e}")

# ── Neo4j ────────────────────────────────────────────────────────────────────
try:
    driver = GraphDatabase.driver(
        "bolt://localhost:7687",
        auth=("neo4j", "graphstudio")
    )
    with driver.session() as session:
        result = session.run("RETURN 'connected' AS status")
        record = result.single()
        print(f"✅ Neo4j: {record['status']}")
    driver.close()
except Exception as e:
    print(f"❌ Neo4j connection failed: {e}")
    print("   If this is the first run, wait 30 seconds and retry.")
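The Ollama check above lists the available models but does not confirm that the configured one is among them. A small sketch of that comparison follows, assuming a hypothetical helper `model_is_pulled` that takes the parsed JSON body of `/api/tags`:

```python
def model_is_pulled(tags_payload: dict, wanted: str) -> bool:
    """Check whether `wanted` appears in an Ollama /api/tags response body."""
    names = {m.get("name", "") for m in tags_payload.get("models", [])}
    # Ollama lists a model pulled without an explicit tag as "<name>:latest"
    candidates = {wanted} if ":" in wanted else {wanted, f"{wanted}:latest"}
    return bool(names & candidates)
```

In the cell above this could be called as `model_is_pulled(r.json(), MODEL_NAME)` to warn early when Step 3 was skipped.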

---
## Step 8: Entity Workflow

Run the full pipeline on the bundled contact sample files.

### 8a: Ingest Contact Records

In [None]:
# Dry-run first: extract entities without writing to Neo4j
print("=== Dry-run extraction (no Neo4j write) ===")
!python main.py ingest --source samples/john_smith.txt --dry-run

print("\n=== Full ingestion ===")
!python main.py ingest --source samples/john_smith.txt
!python main.py ingest --source samples/sarah_chen.txt
print("\n✅ Both contacts ingested.")

### 8b: Discover the Outline via Community Detection

In [None]:
# @title Outline algorithm { run: "auto" }
OUTLINE_ALGORITHM = "louvain"  # @param ["louvain", "leiden"]

!python main.py outline --algorithm {OUTLINE_ALGORITHM}

import json, pathlib
outline_path = pathlib.Path("outline.json")
if outline_path.exists():
    outline = json.loads(outline_path.read_text())
    print(f"\n✅ Outline saved: {len(outline)} section(s) detected.")
    for i, section in enumerate(outline):
        members = section.get("members", [])
        print(f"  Section {i}: {members}")
else:
    print("outline.json not found. Check the output above for errors.")
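When the outline has many sections, it can help to see which are largest before choosing a `SECTION_INDEX` to draft. The sketch below assumes the same shape the cell above relies on (a list of objects, each with an optional `members` list); `rank_sections` is a hypothetical helper, not a project function.

```python
def rank_sections(outline: list) -> list:
    """Return (section_index, member_count) pairs, largest section first."""
    sizes = [(i, len(section.get("members", [])))
             for i, section in enumerate(outline)]
    # sorted() is stable, so sections of equal size keep their original order
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_sections(outline)[0][0]` gives the index of the section with the most members.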

### 8c: Generate a Drafting Prompt

In [None]:
# @title Draft section { run: "auto" }
SECTION_INDEX = 0  # @param {type:"integer"}

!python main.py draft --section {SECTION_INDEX}

import pathlib
prompt_path = pathlib.Path(f"drafts/section_{SECTION_INDEX}_prompt.txt")
if prompt_path.exists():
    print(f"\n✅ Prompt saved to {prompt_path}")
    print("\n--- Prompt preview (first 40 lines) ---")
    lines = prompt_path.read_text().splitlines()
    print("\n".join(lines[:40]))
    if len(lines) > 40:
        print(f"\n... ({len(lines) - 40} more lines)")

### 8d: Send the Prompt to Ollama (Optional)

Pipe the generated prompt directly to the local model for an end-to-end draft.

In [None]:
# @title Generate draft with Ollama { run: "auto" }
# NOTE: On CPU this may take several minutes. Monitor /tmp/ollama.log if it hangs.
SECTION_TO_DRAFT = 0  # @param {type:"integer"}

import subprocess, pathlib

prompt_file = pathlib.Path(f"drafts/section_{SECTION_TO_DRAFT}_prompt.txt")
if not prompt_file.exists():
    print(f"Run step 8c (Generate a Drafting Prompt) for section {SECTION_TO_DRAFT} first.")
else:
    print(f"Sending {prompt_file} to {MODEL_NAME}...")
    result = subprocess.run(
        ["ollama", "run", MODEL_NAME],
        input=prompt_file.read_text(),
        capture_output=True, text=True, timeout=600
    )
    if result.returncode != 0:
        # Do not save an empty draft when the model call fails
        print(f"❌ ollama run failed: {result.stderr.strip()}")
    else:
        draft_output = result.stdout.strip()
        print("\n=== Draft Output ===")
        print(draft_output)

        output_path = pathlib.Path(f"drafts/section_{SECTION_TO_DRAFT}_draft.txt")
        output_path.write_text(draft_output)
        print(f"\n✅ Draft saved to {output_path}")
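As an alternative to piping the prompt through the `ollama run` CLI, the same request can go to the server's HTTP API, which returns plain JSON. The sketch below uses only the standard library. The function names `build_generate_request` and `generate` are hypothetical, and `generate` needs the server from Step 2 to be running.

```python
import json
import urllib.request


def build_generate_request(model: str, prompt: str,
                           base_url: str = "http://localhost:11434"
                           ) -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 600.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the variables from the cell above, `generate(MODEL_NAME, prompt_file.read_text())` would return the draft text as a string.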

### 8e: Store Editorial Feedback

In [None]:
# @title Feedback form
FEEDBACK_TYPE   = "AVOID_TOPIC"        # @param ["AVOID_TOPIC", "PREFER_STYLE", "CORRECT_FACT", "MERGE_ENTITIES", "CONFIRM_RELATION", "REJECT_RELATION"]
TARGET_ENTITY   = "John Smith"         # @param {type:"string"}
INSTRUCTION     = "Don't mention salary or compensation details"  # @param {type:"string"}

!python main.py feedback \
    --type        "{FEEDBACK_TYPE}" \
    --entity      "{TARGET_ENTITY}" \
    --instruction "{INSTRUCTION}"
print("\n✅ Feedback stored as a graph node.")
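The `!python ...` form above interpolates the form values into a shell command, so an instruction containing a double quote or a `$` would be misread by the shell. Passing an argument list to `subprocess.run` avoids shell parsing entirely. This is a sketch that reuses the `--type`, `--entity`, and `--instruction` flags shown above; `feedback_command` and `preview` are hypothetical helper names.

```python
import shlex


def feedback_command(feedback_type: str, entity: str, instruction: str) -> list:
    """Build the argv list for `python main.py feedback` without shell quoting."""
    return [
        "python", "main.py", "feedback",
        "--type", feedback_type,
        "--entity", entity,
        "--instruction", instruction,
    ]


def preview(cmd: list) -> str:
    """Return a shell-safe rendering of the command, for logging only."""
    return shlex.join(cmd)
```

The list can then be run with `subprocess.run(feedback_command(FEEDBACK_TYPE, TARGET_ENTITY, INSTRUCTION), check=True)`.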

---
## Step 9: Conversation Workflow

Graph Writing Studio can also ingest conversation transcripts, detect rhetorical tactics, and cluster messages into sections using community detection.

### 9a: Ingest Conversation

In [None]:
CONVERSATION_SOURCE = "samples/democratic_backsliding_chat.md"
CONVERSATION_ID     = "democratic_backsliding_chat"

!python main.py ingest \
    --source       "{CONVERSATION_SOURCE}" \
    --conversation
print("\n✅ Conversation ingested.")

### 9b: Conversation Outline

In [None]:
# @title Conversation outline algorithm { run: "auto" }
CONV_ALGORITHM = "leiden"  # @param ["leiden", "louvain"]

!python main.py outline-conversation "{CONVERSATION_ID}" --algorithm {CONV_ALGORITHM}

import json, pathlib
conv_outline_path = pathlib.Path("outline_conversation.json")
if conv_outline_path.exists():
    conv_outline = json.loads(conv_outline_path.read_text())
    print(f"\n✅ Conversation outline: {len(conv_outline)} section(s)")
    for i, section in enumerate(conv_outline):
        msg_ids = section.get("message_ids", [])
        print(f"  Section {i}: {len(msg_ids)} message(s): {msg_ids}")

### 9c: Draft Conversation Section

In [None]:
# @title Draft conversation section { run: "auto" }
CONV_SECTION = 0  # @param {type:"integer"}

!python main.py draft-conversation \
    --section       {CONV_SECTION} \
    --conversation  "{CONVERSATION_ID}"

import pathlib
conv_prompt_path = pathlib.Path(f"drafts/conversation_section_{CONV_SECTION}_prompt.txt")
if conv_prompt_path.exists():
    print(f"\n✅ Prompt saved to {conv_prompt_path}")
    lines = conv_prompt_path.read_text().splitlines()
    print("\n--- Prompt preview ---")
    print("\n".join(lines[:40]))
    if len(lines) > 40:
        print(f"\n... ({len(lines) - 40} more lines)")

---
## Step 10: Ingest Your Own Files

Upload your own text files or conversation transcripts and run the full pipeline.

In [None]:
# @title Upload and ingest a custom file
from google.colab import files
import os

uploaded = files.upload()  # Opens a file picker dialog

for filename, data in uploaded.items():
    dest = os.path.join("samples", filename)
    with open(dest, "wb") as f:
        f.write(data)
    print(f"Saved to {dest}")

print("\nRun the cells in Step 8 or 9 with the path above to ingest your file.")

In [None]:
# @title Ingest custom file { run: "auto" }
CUSTOM_SOURCE      = "samples/your_file.txt"  # @param {type:"string"}
IS_CONVERSATION    = False                     # @param {type:"boolean"}
MIN_CONFIDENCE     = 0.5                       # @param {type:"number"}

# Pass --conversation only for transcripts; an empty string adds no flag
conv_flag = "--conversation" if IS_CONVERSATION else ""

!python main.py ingest \
    --source           "{CUSTOM_SOURCE}" \
    --min-confidence   {MIN_CONFIDENCE} \
    {conv_flag}
print("\n✅ Done. Now run outline → draft.")
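The form's default path, `samples/your_file.txt`, is a placeholder, so the ingest command fails if it is left unchanged. A quick pre-check can report that before the command runs. The sketch below uses a hypothetical helper `ingest_input_problems`; treating the confidence threshold as a value between 0 and 1 is an assumption based on the default of 0.5, not something the project documents here.

```python
import os


def ingest_input_problems(source: str, min_confidence: float) -> list:
    """Return human-readable problems; an empty list means inputs look usable."""
    problems = []
    if not os.path.isfile(source):
        problems.append(f"source file not found: {source}")
    elif os.path.getsize(source) == 0:
        problems.append(f"source file is empty: {source}")
    # Assumption: the threshold is a probability-like score in [0, 1]
    if not 0.0 <= min_confidence <= 1.0:
        problems.append(f"min confidence outside [0, 1]: {min_confidence}")
    return problems
```

Calling `ingest_input_problems(CUSTOM_SOURCE, MIN_CONFIDENCE)` before the ingest command and printing any returned messages makes a wrong path obvious.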

---
## Step 11: (Optional) Expose Neo4j Browser via Ngrok

If you want to inspect the graph visually using the Neo4j Browser, you can expose port 7474 with **ngrok** (free plan, no credit card required).

1. Sign up at https://ngrok.com and copy your auth token
2. Paste it in the form below and run the cell

In [None]:
# @title Expose Neo4j Browser
NGROK_AUTH_TOKEN = ""  # @param {type:"string"}

if not NGROK_AUTH_TOKEN:
    print("Paste your ngrok auth token above and re-run this cell.")
else:
    !pip install -q pyngrok
    from pyngrok import ngrok, conf
    conf.get_default().auth_token = NGROK_AUTH_TOKEN
    tunnel = ngrok.connect(7474, "http")
    print(f"✅ Neo4j Browser: {tunnel.public_url}")
    print(f"   Login: neo4j / graphstudio")
    print(f"   Bolt connection string: bolt://localhost:7687")

---
## Step 12: (Optional) Persist Ollama Models to Google Drive

Ollama models are 2–40 GB. Re-downloading them on every session is slow.  
This cell mounts your Google Drive and symlinks the Ollama model cache so downloads persist across sessions.

In [None]:
import subprocess, time
from google.colab import drive

# Mount Google Drive
drive.mount("/content/drive")

DRIVE_MODELS_DIR = "/content/drive/MyDrive/ollama_models"
OLLAMA_MODELS    = "/root/.ollama/models"

# Stop Ollama before moving the model directory
subprocess.run("pkill ollama", shell=True, stderr=subprocess.DEVNULL)
time.sleep(2)

import os, shutil
os.makedirs(DRIVE_MODELS_DIR, exist_ok=True)

if os.path.islink(OLLAMA_MODELS):
    os.unlink(OLLAMA_MODELS)
    print("Removed old symlink.")
elif os.path.isdir(OLLAMA_MODELS):
    # Merge local models into the Drive folder, then remove the local copy.
    # shutil.move would nest them in a "models" subdirectory because
    # DRIVE_MODELS_DIR already exists, and would fail on later sessions.
    shutil.copytree(OLLAMA_MODELS, DRIVE_MODELS_DIR, dirs_exist_ok=True)
    shutil.rmtree(OLLAMA_MODELS)
    print(f"Copied models to {DRIVE_MODELS_DIR}")

os.makedirs(os.path.dirname(OLLAMA_MODELS), exist_ok=True)
os.symlink(DRIVE_MODELS_DIR, OLLAMA_MODELS)
print(f"✅ Symlinked {OLLAMA_MODELS} → {DRIVE_MODELS_DIR}")

# Restart Ollama with the Drive-backed model cache
start_ollama_server()  # defined in Step 2
if wait_for_ollama():
    print("✅ Ollama restarted with persistent model cache.")
else:
    print("❌ Ollama failed to restart. Re-run the Step 2 cell.")
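Since the model cache can reach tens of gigabytes, it may be worth comparing its size with the free space at the destination before running the cell above. This is a sketch with hypothetical helpers `dir_size_bytes` and `fits_on`. Whether the mounted Drive reports your actual storage quota through `shutil.disk_usage` is an assumption to verify.

```python
import os
import shutil


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of regular files under `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def fits_on(target_dir: str, source_dir: str, reserve_bytes: int = 1 << 30) -> bool:
    """True if `source_dir` fits in the free space of `target_dir`.

    `reserve_bytes` (1 GiB by default, an arbitrary value) is kept free.
    """
    free = shutil.disk_usage(target_dir).free
    return dir_size_bytes(source_dir) + reserve_bytes <= free
```

For this notebook, `fits_on("/content/drive/MyDrive", "/root/.ollama/models")` would be the relevant call once Drive is mounted.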

---
## Troubleshooting

| Symptom | Fix |
|---|---|
| `Ollama failed to start` | Run `!tail -30 /tmp/ollama.log`; the cause is often a port conflict. Re-run Step 2. |
| `Neo4j connection failed` | Wait 30 s and retry. Check `!neo4j status` and `!tail -30 /var/log/neo4j/neo4j.log`. |
| `GDS plugin not found` | Confirm the jar is in `/var/lib/neo4j/plugins/` and `neo4j.conf` has the allowlist lines. |
| Extraction is very slow | Switch to `llama3.2:1b` in Step 3 for faster (less accurate) extraction on CPU. |
| `ollama run` hangs | Increase `OLLAMA_TIMEOUT` in `.env` (300 s as written in Step 6). The Step 8d cell has its own `timeout=600` on `subprocess.run`. On CPU, large prompts can exceed either limit. |
| Colab disconnects | Re-run all setup cells (Steps 1–7); the VM state is lost on disconnect. Models on Drive (Step 12) survive. |
| `ModuleNotFoundError` | Make sure the working directory is `/content/graph-writing-studio` (set by `os.chdir` in Step 5), then re-run the `pip install` cell in Step 5. |

### Useful log commands

```python
!tail -30 /tmp/ollama.log           # Ollama server log
!tail -30 /var/log/neo4j/neo4j.log  # Neo4j server log
!neo4j status                       # Neo4j process status
!ollama list                        # Confirm pulled models
```