# Tutorial: Lux CRM Quickstart (Docker + Real Data)

Audience:
- Operator/developer running the Lux CRM agent locally with real contacts and live connectors.

Prerequisites:
- Docker Desktop running
- Repo cloned at `/Users/dobrien/code/Lux/Lux_CRM_agent`
- A Postgres DSN (`NEON_PG_DSN`)
- Google credentials for n8n (Gmail, Sheets, optionally Drive)

Learning goals:
- Start the stack in Docker
- Seed your contact list
- Run a small Gmail backfill
- Enable live Gmail ingest
- Optionally enable Slack/transcript ingest
- Verify end-to-end outputs


## How To Use This Notebook

- Markdown cells explain what each step does.
- Code cells use `%%bash` where commands are runnable from Jupyter.
- n8n workflow import and credential setup are manual UI steps.
- Destructive reset steps are included as optional and clearly marked.

Recommended first run setting:
- `COGNEE_ENABLE_HEURISTIC_FALLBACK=false` (strict, production-like behavior)


## Minimal First Real Run (Short Version)

1. Configure `.env`
2. Start Docker services
3. Run Postgres migrations
4. Initialize Neo4j schema + ontology + SHACL
5. Smoke check API/worker
6. Import `sheets_sync`, `gmail_contact_backfill`, `gmail_ingest` into n8n
7. Configure Google credentials in n8n
8. Run `sheets_sync` once
9. Run a tiny Gmail backfill (`MAX_CONTACTS_PER_RUN=1`)
10. Watch worker logs
11. Verify `/v1/cases/*` and `/v1/scores/*`
12. Activate live `gmail_ingest`


In [None]:
from pathlib import Path
REPO = Path('/Users/dobrien/code/Lux/Lux_CRM_agent')
print('Repo exists:', REPO.exists())
print(REPO)


## Step 1. Configure `.env`

What this does:
- Sets runtime config for API/worker/Neo4j/n8n.
- Enables strict extraction mode when `COGNEE_ENABLE_HEURISTIC_FALLBACK=false`.

After running the next cell, edit `.env` manually if values are missing/wrong.


In [None]:
%%bash
set -euo pipefail
cd /Users/dobrien/code/Lux/Lux_CRM_agent
[ -f .env ] || cp .env.example .env
# grep -E is used so the check works without ripgrep installed.
grep -nE '^(NEON_PG_DSN|NEO4J_URI|NEO4J_USER|NEO4J_PASSWORD|N8N_WEBHOOK_SECRET|N8N_PORT|QUEUE_MODE|COGNEE_ENABLE_HEURISTIC_FALLBACK|GRAPH_V2_ENABLED|GRAPH_V2_DUAL_WRITE)=' .env || true


Key `.env` values to verify:
- `NEON_PG_DSN=postgresql://...`
- `NEO4J_URI=neo4j://neo4j:7687`
- `NEO4J_USER=neo4j`
- `NEO4J_PASSWORD=...`
- `N8N_WEBHOOK_SECRET=<long-random-secret>`
- `QUEUE_MODE=redis`
- `COGNEE_ENABLE_HEURISTIC_FALLBACK=false`
- `GRAPH_V2_ENABLED=true`
- `GRAPH_V2_DUAL_WRITE=true`
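
The checklist above can also be verified programmatically. The following is a minimal sketch, not part of the repo: `parse_env` and `check_env` are hypothetical helper names, and the parser only handles plain `KEY=VALUE` lines (no multi-line values or variable expansion). Keys mapped to `None` are only checked for being non-empty, so secrets are never printed.

```python
from pathlib import Path

# Expected values copied from the checklist above.
# None means "must be present and non-empty" (value is not compared or printed).
EXPECTED = {
    "NEON_PG_DSN": None,
    "NEO4J_URI": "neo4j://neo4j:7687",
    "NEO4J_USER": "neo4j",
    "NEO4J_PASSWORD": None,
    "N8N_WEBHOOK_SECRET": None,
    "QUEUE_MODE": "redis",
    "COGNEE_ENABLE_HEURISTIC_FALLBACK": "false",
    "GRAPH_V2_ENABLED": "true",
    "GRAPH_V2_DUAL_WRITE": "true",
}


def parse_env(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so DSNs containing "=" stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env(text, expected=EXPECTED):
    """Return a list of problems; an empty list means every check passed."""
    values = parse_env(text)
    problems = []
    for key, want in expected.items():
        got = values.get(key, "")
        if not got:
            problems.append(f"{key} is missing or empty")
        elif want is not None and got != want:
            problems.append(f"{key}={got!r}, expected {want!r}")
    return problems


# Usage (path from the prerequisites):
# print(check_env(Path("/Users/dobrien/code/Lux/Lux_CRM_agent/.env").read_text()))
```

An empty list means the file matches the checklist; anything else names the key to fix by hand.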


## Step 2. Start Docker Services

What this does:
- Starts the API, worker, Neo4j, Redis, and n8n containers.
- Rebuilds images so running containers match your current code.


In [None]:
%%bash
set -euo pipefail
cd /Users/dobrien/code/Lux/Lux_CRM_agent
docker compose up -d --build api worker neo4j redis n8n


## Step 3. Run Postgres Migrations

What this does:
- Applies Alembic migrations so Postgres schema matches current code.


In [None]:
%%bash
set -euo pipefail
cd /Users/dobrien/code/Lux/Lux_CRM_agent
docker compose exec -T api sh -lc 'cd /workspace/apps/api && /opt/venv/bin/alembic -c alembic.ini upgrade head'


## Step 4. Initialize Neo4j Schema + Ontology + SHACL

What this does:
- Creates constraints/indexes for the graph.
- Loads ontology and SHACL shapes used by the CRM/Case/Evidence graph model.


In [6]:
%%bash
set -euo pipefail
cd /Users/dobrien/code/Lux/Lux_CRM_agent
docker compose exec -T api sh -lc 'cd /workspace/apps/api && /opt/venv/bin/python /workspace/scripts/init_neo4j_schema.py && /opt/venv/bin/python /workspace/scripts/load_ontology_and_shacl.py'


Applied: CREATE CONSTRAINT contact_id_unique IF NOT EXISTS FOR (c:Contact) REQUIRE c.contact_id IS UNIQUE
Applied: CREATE CONSTRAINT interaction_id_unique IF NOT EXISTS FOR (i:Interaction) REQUIRE i.interaction_id IS UNIQUE
Applied: CREATE CONSTRAINT claim_id_unique IF NOT EXISTS FOR (cl:Claim) REQUIRE cl.claim_id IS UNIQUE
Applied: CREATE CONSTRAINT evidence_id_unique IF NOT EXISTS FOR (e:Evidence) REQUIRE e.evidence_id IS UNIQUE
Applied: CREATE CONSTRAINT entity_id_unique IF NOT EXISTS FOR (e:Entity) REQUIRE e.entity_id IS UNIQUE
Applied: CREATE INDEX contact_primary_email IF NOT EXISTS FOR (c:Contact) ON (c.primary_email)
Applied: CREATE INDEX entity_normalized_name IF NOT EXISTS FOR (e:Entity) ON (e.normalized_name)
Applied: CREATE INDEX relation_predicate_norm IF NOT EXISTS FOR ()-[r:RELATES_TO]-() ON (r.predicate_norm)
Applied: CREATE CONSTRAINT crm_contact_external_id_unique IF NOT EXISTS FOR (c:CRMContact) REQUIRE c.external_id IS UNIQUE
Applied: CREATE CONSTRAINT crm_company_e
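
The schema script prints one `Applied:` line per statement, as in the output above. A small sketch for tallying that output follows; `summarize_applied` is a hypothetical helper, and the regular expression assumes every line follows the `Applied: CREATE CONSTRAINT|INDEX <name>` shape seen here.

```python
import re
from collections import Counter

# Matches lines such as:
#   Applied: CREATE CONSTRAINT contact_id_unique IF NOT EXISTS ...
#   Applied: CREATE INDEX contact_primary_email IF NOT EXISTS ...
APPLIED = re.compile(r"^Applied: CREATE (CONSTRAINT|INDEX) (\w+)")


def summarize_applied(output):
    """Count applied constraints and indexes and collect their names in order."""
    counts = Counter()
    names = []
    for line in output.splitlines():
        match = APPLIED.match(line.strip())
        if match:
            counts[match.group(1)] += 1
            names.append(match.group(2))
    return dict(counts), names
```

Paste the captured cell output into `summarize_applied(...)` to compare the counts against what you expect after a wipe and reload.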

## Step 5. Smoke Check API + Worker

What this does:
- Confirms the API is reachable and the worker is running before you connect real accounts.


In [3]:
%%bash
set -euo pipefail
cd /Users/dobrien/code/Lux/Lux_CRM_agent
curl -s http://localhost:8000/v1/health
printf "\n\n"
docker compose ps
printf "\n\nWorker tail:\n"
docker compose logs --tail=50 worker


{"status":"ok","timestamp":"2026-02-22T21:37:14.020357+00:00"}

NAME                     IMAGE                     COMMAND                  SERVICE   CREATED             STATUS             PORTS
lux_crm_agent-api-1      lux-crm-agent-api:local   "uvicorn app.main:ap…"   api       About an hour ago   Up About an hour   0.0.0.0:8000->8000/tcp, [::]:8000->8000/tcp
lux_crm_agent-n8n-1      n8nio/n8n:latest          "tini -- /docker-ent…"   n8n       2 hours ago         Up 2 hours         0.0.0.0:5680->5680/tcp, [::]:5680->5680/tcp
lux_crm_agent-neo4j-1    neo4j:5.26-community      "tini -g -- /startup…"   neo4j     2 hours ago         Up 2 hours         0.0.0.0:7477->7474/tcp, [::]:7477->7474/tcp, 0.0.0.0:7690->7687/tcp, [::]:7690->7687/tcp
lux_crm_agent-redis-1    redis:7-alpine            "docker-entrypoint.s…"   redis     2 hours ago         Up 2 hours         0.0.0.0:6379->6379/tcp, [::]:6379->6379/tcp
lux_crm_agent-ui-1       node:20-alpine            "docker-entrypoint.s…"   ui      
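
Right after `docker compose up`, the API may need a few seconds before `/v1/health` answers. The sketch below polls until the body reports `"status": "ok"`, matching the response shown above. `fetch_health` and `wait_for_health` are hypothetical names, and the retry count and delay are arbitrary defaults rather than values from the repo.

```python
import json
import time
import urllib.request

HEALTH_URL = "http://localhost:8000/v1/health"  # same URL as the curl call above


def fetch_health(url=HEALTH_URL, timeout_s=2.0):
    """Return the decoded response body from the health endpoint."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read().decode("utf-8")


def wait_for_health(fetch=fetch_health, attempts=30, delay_s=2.0, sleep=time.sleep):
    """Poll until the body reports status 'ok'.

    Returns True on success, False once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            if json.loads(fetch()).get("status") == "ok":
                return True
        except (OSError, ValueError, AttributeError):
            # OSError: API not reachable yet; ValueError: body was not JSON;
            # AttributeError: JSON was not an object.
            pass
        if attempt < attempts - 1:
            sleep(delay_s)
    return False


# Usage:
# print("API healthy:", wait_for_health())
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running stack.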

## Optional Reset / Clear Data (Use Only If You Want a Clean Slate)

What this does:
- Removes prior test/example data before your first real run.

Common options (see `quickstart.md` for full commands):
- Reset Docker volumes: `docker compose down -v` then restart services
- Clear Postgres app tables: `raw_events`, `interactions`, `chunks`, `embeddings`, `drafts`, `resolution_tasks`, `contact_cache`
- Clear Neo4j graph: `MATCH (n) DETACH DELETE n;` then rerun schema/ontology load
- Clear n8n runtime DB (`n8n/database.sqlite*`) and restart `n8n`

### SQL to Clear Postgres Tables in Neon

```sql
BEGIN;

TRUNCATE TABLE
  embeddings,
  chunks,
  drafts,
  resolution_tasks,
  interactions,
  raw_events,
  contact_cache
RESTART IDENTITY CASCADE;

COMMIT;
```

## Step 6. Import n8n Workflows (Minimum Set)

What this does:
- Loads the workflows for contact sync, Gmail backfill, and live Gmail ingest.

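Open n8n at `http://localhost:5679` (use your `N8N_PORT` value if it differs; the `docker compose ps` output in Step 5 shows n8n published on port 5680) and import these files from `/Users/dobrien/code/Lux/Lux_CRM_agent/n8n/workflows/`: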
- `sheets_sync.json`
- `gmail_contact_backfill.json`
- `gmail_ingest.json`

Later (optional):
- `slack_ingest.json`
- `transcript_folder_monitor.json`
- `transcript_ingest.json`
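
Before importing, it can help to confirm the workflow exports are actually present in the repo. A minimal sketch, assuming the file names listed above; `missing_workflows` is a hypothetical helper, not a repo script.

```python
from pathlib import Path

# File names taken from the lists above.
REQUIRED = ("sheets_sync.json", "gmail_contact_backfill.json", "gmail_ingest.json")
OPTIONAL = ("slack_ingest.json", "transcript_folder_monitor.json", "transcript_ingest.json")


def missing_workflows(directory, names=REQUIRED):
    """Return the names from `names` that are not regular files in `directory`."""
    root = Path(directory)
    return [name for name in names if not (root / name).is_file()]


# Usage:
# workflows = "/Users/dobrien/code/Lux/Lux_CRM_agent/n8n/workflows"
# print("Missing required:", missing_workflows(workflows))
# print("Missing optional:", missing_workflows(workflows, OPTIONAL))
```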

To import all repo workflows into the running n8n container from the command line instead:

```bash
cd /Users/dobrien/code/Lux/Lux_CRM_agent && \
docker compose exec -T n8n n8n import:workflow --separate --input=/home/node/.n8n/workflows && \
docker compose exec -T n8n n8n list:workflow
```


## Step 7. Configure n8n Credentials (Manual)

What this does:
- Authorizes n8n to read your Gmail inbox and Google Sheet contact list (and optionally Google Drive for transcripts).

Create/assign in n8n:
- Gmail OAuth2 (for `gmail_ingest`, `gmail_contact_backfill`)
- Google Sheets OAuth2 (for `sheets_sync`, `gmail_contact_backfill`)
- Google Drive OAuth2 (optional, for `transcript_folder_monitor`)

Note:
- Update hardcoded/placeholder `sheetId` values in imported workflow nodes.


## Step 8. Load Your Contact List (Recommended First)

What this does:
- Seeds canonical contacts before email ingestion so known participants resolve correctly.

Preferred path: n8n `sheets_sync`
1. Open `sheets_sync`
2. Set `Read Contacts Sheet` -> `sheetId` and `range` (`Contacts!A:Z`)
3. Run manually once

Required columns:
- `contact_id`
- `primary_email`

Recommended columns:
- `display_name`, `company`, `owner_user_id`, `notes`, `use_sensitive_in_drafts`
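
A quick pre-flight check on the sheet rows can catch missing required columns before `sheets_sync` runs. The sketch below assumes rows are available as dictionaries keyed by column name (for example from a CSV export of the sheet). `invalid_rows` and `duplicate_values` are hypothetical helpers, and the duplicate check is a precaution suggested by the `contact_id_unique` constraint in Step 4, not a documented requirement of the sync endpoint.

```python
REQUIRED_COLUMNS = ("contact_id", "primary_email")


def invalid_rows(rows, required=REQUIRED_COLUMNS):
    """Return (row_index, missing_columns) for rows lacking a non-empty required value."""
    problems = []
    for index, row in enumerate(rows):
        missing = [col for col in required if not str(row.get(col) or "").strip()]
        if missing:
            problems.append((index, missing))
    return problems


def duplicate_values(rows, column):
    """Return values of `column` that occur more than once (exact match after trimming)."""
    seen = set()
    dupes = []
    for row in rows:
        value = str(row.get(column) or "").strip()
        if not value:
            continue
        if value in seen and value not in dupes:
            dupes.append(value)
        seen.add(value)
    return dupes


# Usage with a CSV export of the sheet:
# import csv
# with open("contacts.csv", newline="") as handle:
#     rows = list(csv.DictReader(handle))
# print(invalid_rows(rows), duplicate_values(rows, "contact_id"))
```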


### Optional: Seed One Contact Directly via API

What this does:
- Lets you test contact sync without configuring Sheets first.


In [5]:
%%bash
set -euo pipefail
# Replace <your-webhook-secret> with your real secret before running; never commit the real value.
curl -sS -X POST http://localhost:8000/v1/contacts/sync \
  -H "Content-Type: application/json" \
  -H "X-Webhook-Secret: <your-webhook-secret>" \
  -d '{
    "mode": "push",
    "rows": [
      {
        "contact_id": "contact_001",
        "primary_email": "person@example.com",
        "display_name": "Example Person",
        "company": "Example Co",
        "use_sensitive_in_drafts": false
      }
    ]
  }'


{"mode":"push","upserted":1}
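
The same request can be built from Python, which keeps the secret out of the notebook source. This is a sketch under assumptions: `build_contact_sync_request` is a hypothetical helper, and reading the secret from an `N8N_WEBHOOK_SECRET` environment variable assumes the API expects the same value that `.env` defines under that name.

```python
import json
import urllib.request


def build_contact_sync_request(rows, secret, base_url="http://localhost:8000"):
    """Build (but do not send) the POST request for /v1/contacts/sync."""
    body = json.dumps({"mode": "push", "rows": rows}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/contacts/sync",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Webhook-Secret": secret},
    )


# Sending it (assumes the secret is exported in the environment):
# import os
# rows = [{"contact_id": "contact_001", "primary_email": "person@example.com"}]
# request = build_contact_sync_request(rows, os.environ["N8N_WEBHOOK_SECRET"])
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.read().decode("utf-8"))
```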

## Step 9. Run a Tiny Gmail Backfill (Safe First Pass)

What this does:
- Pulls a small slice of historical Gmail messages for your contacts and sends them to `/v1/ingest/interaction_event`.

In n8n `gmail_contact_backfill`, update the `Build Contact Queue` code node for a minimal first run:
- `YEARS_BACK = 1`
- `WINDOW_SIZE_MONTHS = 1`
- `MAX_CONTACTS_PER_RUN = 1`
- `BACKFILL_CONTACT_MODE = 'skip_previously_processed'`

Then run the workflow manually once.
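
To get a feel for how much work these settings imply, the sketch below enumerates month-aligned date windows for a given `YEARS_BACK` and `WINDOW_SIZE_MONTHS`. This is an illustration only: how the `Build Contact Queue` node actually slices time is an assumption here, and `add_months` and `backfill_windows` are hypothetical helpers, not code from the workflow.

```python
from datetime import date


def add_months(d, months):
    """Return the first day of the month that is `months` away from d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def backfill_windows(today, years_back, window_size_months):
    """Return [start, end) date pairs covering `years_back` years up to `today`.

    Assumed semantics: windows start on month boundaries and the last
    window is clipped at `today`.
    """
    if window_size_months < 1:
        raise ValueError("window_size_months must be at least 1")
    windows = []
    current = add_months(today, -12 * years_back)
    while current < today:
        following = min(add_months(current, window_size_months), today)
        windows.append((current, following))
        current = following
    return windows


# Usage:
# for start, end in backfill_windows(date.today(), years_back=1, window_size_months=1):
#     print(start, "->", end)
```

Under these assumed semantics, one year with one-month windows gives roughly a dozen windows per contact, which is why `MAX_CONTACTS_PER_RUN = 1` keeps the first pass small.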


## Step 10. Watch Worker Logs During Backfill (Important in Strict Mode)

What this does:
- Shows extraction failures, retries, and processing outcomes in real time.
- In strict mode (`COGNEE_ENABLE_HEURISTIC_FALLBACK=false`), some interactions may ingest but later fail during extraction if Cognee fails.


In [None]:
%%bash
cd /Users/dobrien/code/Lux/Lux_CRM_agent
# -f streams until interrupted; stop the cell (Kernel > Interrupt) when you are done watching.
docker compose logs -f worker
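
When the log stream is busy, filtering a bounded tail for likely problem lines can be easier than watching it live. A minimal sketch: `flag_lines` is a hypothetical helper, and the keywords are guesses, since the worker's actual log format is not documented here.

```python
# Assumed markers of trouble; adjust to whatever the worker really logs.
KEYWORDS = ("failed", "retry", "error", "traceback")


def flag_lines(lines, keywords=KEYWORDS):
    """Return lines containing any keyword (case-insensitive), newline stripped."""
    lowered = tuple(keyword.lower() for keyword in keywords)
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line.lower() for keyword in lowered)
    ]


# Usage against a bounded tail of the worker log:
# import subprocess
# result = subprocess.run(
#     ["docker", "compose", "logs", "--tail=200", "worker"],
#     cwd="/Users/dobrien/code/Lux/Lux_CRM_agent",
#     capture_output=True, text=True, check=True,
# )
# print("\n".join(flag_lines(result.stdout.splitlines())))
```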


## Step 11. Verify End-to-End Outputs

What this does:
- Confirms ingestion is producing case entities and score outputs.


In [None]:
%%bash
set -euo pipefail
curl -s http://localhost:8000/v1/health
printf "\n\nScores today:\n"
curl -s http://localhost:8000/v1/scores/today
printf "\n\nOpen case contacts:\n"
curl -s "http://localhost:8000/v1/cases/contacts?status=open"
printf "\n\nOpen case opportunities:\n"
curl -s "http://localhost:8000/v1/cases/opportunities?status=open"
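
The raw curl output can be hard to scan. The sketch below summarizes each response body without assuming anything about its schema, since the response shapes of these endpoints are not shown in this notebook. `describe_body` is a hypothetical helper; the endpoint paths are the ones from the cell above.

```python
import json

ENDPOINTS = (
    "/v1/scores/today",
    "/v1/cases/contacts?status=open",
    "/v1/cases/opportunities?status=open",
)


def describe_body(body):
    """Describe a response body: list length, object keys, scalar, or not JSON."""
    try:
        data = json.loads(body)
    except ValueError:
        return "not JSON"
    if isinstance(data, list):
        return f"list with {len(data)} item(s)"
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    return f"scalar: {data!r}"


# Usage:
# import urllib.request
# for path in ENDPOINTS:
#     with urllib.request.urlopen("http://localhost:8000" + path, timeout=10) as response:
#         print(path, "->", describe_body(response.read().decode("utf-8")))
```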


## Step 12. Enable Live Gmail Listening

What this does:
- Activates minute-level Gmail polling for new messages and sends them into the CRM ingestion API.

Manual steps in n8n:
1. Open `gmail_ingest`
2. Assign your Gmail OAuth2 credential
3. Activate the workflow
4. Send/receive a test email
5. Re-check worker logs and the API outputs above


## Optional: Slack and Transcript Ingestion

What this does:
- Adds non-email interactions so the CRM can capture more motivators, context, and opportunity signals.

Slack:
- Use `slack_ingest` (webhook workflow)
- Webhook URL: `http://localhost:5679/webhook/slack-ingest`

Transcripts:
- `transcript_ingest` (webhook) for your transcription pipeline
- `transcript_folder_monitor` for Google Drive text files (hourly poll)
- Note: this repo's folder monitor targets Google Drive, not a local filesystem folder


In [None]:
%%bash
set -euo pipefail
# Example Slack test payload (requires slack_ingest workflow to be imported + active)
curl -sS -X POST http://localhost:5679/webhook/slack-ingest \
  -H "Content-Type: application/json" \
  -d '{
    "external_id": "slack-test-001",
    "timestamp": "2026-02-22T12:00:00Z",
    "thread_id": "slack-thread-001",
    "channel": "customer-pilot",
    "from": {"email": "person@example.com", "name": "Example Person"},
    "to": [{"email": "owner@luxcrm.ai", "name": "Owner"}],
    "text": "We need a lightweight onboarding path and want to decide this month."
  }'
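
For repeated tests it may be convenient to generate the Slack payload with a fresh `external_id` and timestamp each time. The sketch below mirrors the field names from the curl example above. `slack_test_payload` is a hypothetical helper, and deriving `external_id` from the timestamp is only a convenience; whether the pipeline deduplicates on `external_id` is not stated here.

```python
from datetime import datetime, timezone


def slack_test_payload(text, sender_email, sender_name, now=None):
    """Build a test payload shaped like the curl example, stamped in UTC."""
    stamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "external_id": "slack-test-" + stamp.strftime("%Y%m%d%H%M%S"),
        "timestamp": stamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "thread_id": "slack-thread-001",
        "channel": "customer-pilot",
        "from": {"email": sender_email, "name": sender_name},
        "to": [{"email": "owner@luxcrm.ai", "name": "Owner"}],
        "text": text,
    }


# Usage:
# import json
# print(json.dumps(slack_test_payload("Quick test", "person@example.com", "Example Person"), indent=2))
```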


In [None]:
%%bash
set -euo pipefail
# Example transcript webhook payload (requires transcript_ingest workflow to be imported + active)
curl -sS -X POST http://localhost:5679/webhook/transcript-ingest \
  -H "Content-Type: application/json" \
  -d '{
    "external_id": "transcript-test-001",
    "timestamp": "2026-02-22T12:15:00Z",
    "thread_id": "meeting-001",
    "subject": "Acme pilot discovery call",
    "participants": {
      "from": [{"email": "owner@luxcrm.ai", "name": "Owner"}],
      "to": [{"email": "person@example.com", "name": "Example Person"}],
      "cc": []
    },
    "body_plain": "Customer wants short weekly updates, quick decisions, and a clear milestone owner/date."
  }'


## Optional: Synthetic E2E Validation Harness

What this does:
- Runs a fixture-based end-to-end validation to confirm the pipeline behavior independently of your live data sources.


In [None]:
%%bash
set -euo pipefail
cd /Users/dobrien/code/Lux/Lux_CRM_agent
NEO4J_ASSERT=on ./run_e2e_validation.sh


## Stop Services

What this does:
- Stops local containers.
- Use `docker compose down -v` only if you intend to remove local Docker volumes (Neo4j/Redis state).


In [None]:
%%bash
set -euo pipefail
cd /Users/dobrien/code/Lux/Lux_CRM_agent
docker compose down


## Troubleshooting Notes

- In strict mode, API ingest may return `enqueued` while the worker later marks the interaction `failed` if Cognee extraction fails.
- After wiping Neo4j, rerun schema + ontology + SHACL load before new ingest.
- If you change worker code, rebuild containers (`docker compose up -d --build api worker`) instead of only restarting.
- `gmail_contact_backfill` and `sheets_sync` workflow exports include hardcoded/placeholder sheet IDs; update them in n8n after import.
