Troubleshooting

Matt Dula edited this page Apr 18, 2026 · 1 revision

Fast diagnosis for the things that commonly go wrong.

"I can't hit /mcp from Claude Desktop"

Probably one of:

  1. URL is wrong. Local: http://localhost:8000/mcp. Remote: the public URL of your Railway deployment. No trailing slash.
  2. Bearer token missing. Claude Desktop's custom connector config must include Authorization: Bearer nk_<key>.
  3. Key is revoked or expired. Check GET /workspace/api-keys.
  4. Firewall. If Nakatomi is on a VPC, the MCP client needs egress to your app.

Quick sanity:

curl -s http://localhost:8000/health
curl -s http://localhost:8000/contacts -H "Authorization: Bearer nk_..."

Both should return JSON. If they do and MCP still doesn't connect, the issue is on the MCP client side — check its logs.
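
The URL mistakes in item 1 can be caught before you open Claude Desktop. A minimal sketch, using only the rules listed above; check_mcp_url is a hypothetical helper, not something Nakatomi ships:

```python
from typing import List
from urllib.parse import urlparse


def check_mcp_url(url: str) -> List[str]:
    """Return the problems found in an MCP endpoint URL (empty list if none)."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parsed.netloc:
        problems.append("missing host")
    if parsed.path.endswith("/"):
        # The connector URL must not end in a slash.
        problems.append("trailing slash")
    return problems


# Usage: an empty list means the URL passes these basic checks.
print(check_mcp_url("http://localhost:8000/mcp"))
print(check_mcp_url("http://localhost:8000/mcp/"))
```

This only validates the shape of the URL. It doesn't prove the server is reachable, so run the curl checks as well.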

401 on every request

  • Missing X-Workspace when using a user JWT. API keys don't need it; JWTs do. See Authentication.
  • Typo in the header. Must be Authorization: Bearer <token> — no quoting, no Basic.
  • Key deleted. A revoked key returns 401 immediately.
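
The header rules above can be sketched as one helper. build_headers is hypothetical, and it assumes any token that doesn't start with nk_ is a user JWT. That matches the key format shown on this page but isn't a documented guarantee.

```python
from typing import Dict, Optional


def build_headers(token: str, workspace: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for an API key or a user JWT."""
    if not token or token != token.strip() or token[0] in "\"'":
        raise ValueError("token must be non-empty, unquoted, and unpadded")
    headers = {"Authorization": f"Bearer {token}"}
    if token.startswith("nk_"):
        # API keys don't need X-Workspace.
        return headers
    # Assumption: everything else is a user JWT, which does need it.
    if not workspace:
        raise ValueError("user JWTs need an X-Workspace header")
    headers["X-Workspace"] = workspace
    return headers
```

Failing early on a quoted token or a missing workspace turns a vague 401 into a specific error on the client side.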

429 "rate limit exceeded"

The key has a rate_limit_per_minute and you've hit it. Either:

  • Raise the limit: revoke the key, then mint a new one with a higher rate_limit_per_minute.
  • Wait: the Retry-After header says how many seconds remain until the window resets.
  • Disable globally: unset API_KEY_RATE_LIMIT_PER_MINUTE and remove any per-key overrides.

See Rate-Limiting.
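
If you choose to wait, the client can honor Retry-After itself. A minimal sketch: call_with_retry and its send callable are hypothetical names, and send is assumed to return a (status, headers) pair from whatever HTTP client you use.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    default_wait: float = 1.0,
) -> Response:
    """Call send(), sleeping for Retry-After seconds after each 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        if status != 429 or attempt == max_attempts:
            break
        try:
            # Plain dict lookups are case-sensitive; normalize if your client doesn't.
            wait = float(headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait  # header present but not a number of seconds
        sleep(max(wait, 0.0))
    return status, headers
```

The sleep parameter exists so the backoff can be tested without real delays. If the last attempt still returns 429, that response is handed back to the caller unchanged.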

Migrations fail on startup

Most common: you upgraded past a migration that adds a non-null column with no default, which fails on any table that already has rows. Check the logs for the failing migration and its Alembic revision.

Recovery:

docker compose exec app alembic current
docker compose exec app alembic heads
docker compose exec app alembic downgrade -1
# inspect, fix, retry
docker compose exec app alembic upgrade head

Webhook deliveries stuck as pending

Check:

  1. Worker running? docker compose logs app | grep webhook. Look for "webhook worker started".
  2. WEBHOOK_WORKER_ENABLED set to false? Default is true. Only tests should turn it off.
  3. Target unreachable? GET /webhooks/<id>/deliveries — look at error and response_body.
  4. Target always returning 5xx? Fix the target; Nakatomi will retry up to WEBHOOK_MAX_RETRIES then mark the row dead.

See Webhooks.
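
When a webhook has many deliveries, a quick tally makes check 3 easier to read. A rough sketch, assuming the deliveries endpoint returns a JSON list of objects and that each has a status field. Only error and response_body are confirmed above, and summarize_deliveries is a hypothetical helper.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_deliveries(deliveries: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Count deliveries per status and list the most frequent error strings."""
    # Assumption: each row has a "status" key; rows without one count as "unknown".
    by_status = Counter(d.get("status", "unknown") for d in deliveries)
    # Rows with a null or missing error are skipped.
    errors = Counter(d["error"] for d in deliveries if d.get("error"))
    return {"by_status": dict(by_status), "top_errors": errors.most_common(3)}
```

One error string dominating the tally usually points at the target. A spread of unrelated errors is more likely a network problem.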

Agent creates duplicate contacts

Almost always a missing external_id. The bulk upsert and normal create paths both dedupe by external_id first. Without one, two creates with the same email still dedupe, but two creates from different sources (one with an email, one without) don't.

Fix: set external_id on every write that traces back to a source system. See the AgentLab anti-patterns.
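
One way to keep external_id stable is to derive it from the source system and that system's own record ID. A minimal sketch: the source:id format and the make_external_id name are illustrative assumptions, not a Nakatomi convention.

```python
def make_external_id(source: str, record_id: object) -> str:
    """Build a deterministic external_id such as 'hubspot:42'."""
    source_part = source.strip().lower()
    record_part = str(record_id).strip()
    if not source_part or not record_part:
        raise ValueError("both source and record_id are required")
    # Same inputs always give the same ID, so a repeated write dedupes.
    return f"{source_part}:{record_part}"


print(make_external_id("HubSpot", 42))  # hubspot:42
```

Whatever format you choose, compute it the same way on every write path so retries and re-imports produce the same ID.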

Import returns 422 "unsupported schema_version"

Usually your export doc was produced by a newer Nakatomi than the target. Check schema_version in the doc:

jq '.schema_version' nakatomi-dump.json

Current supported: 1. If yours is higher, upgrade the target. If it's lower, file an issue: older exports should stay importable, so a failure means we broke backward compatibility without noticing.
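
The same decision can be scripted before you attempt an import. A small sketch, assuming schema_version is a top-level integer as the jq command suggests. check_schema_version is a hypothetical helper, and the supported value of 1 comes from this page.

```python
from typing import Any, Dict

SUPPORTED_SCHEMA_VERSION = 1  # current supported version, per this page


def check_schema_version(doc: Dict[str, Any]) -> str:
    """Return the next step for an export doc based on its schema_version."""
    version = doc.get("schema_version")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(version, int) or isinstance(version, bool):
        return "missing or non-integer schema_version"
    if version > SUPPORTED_SCHEMA_VERSION:
        return "upgrade the target"
    if version < SUPPORTED_SCHEMA_VERSION:
        return "file an issue"
    return "ok"
```

Load the dump with json.load and pass the resulting dict in. Update the constant if the target later supports a newer version.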

Memory recall returns nothing

Possible causes:

  1. No connectors enabled. Check GET /memory/connectors.
  2. Connector API key is wrong. Check the app logs on startup for "memory connector 'x' failed to initialize".
  3. Nothing's been written to that connector yet. store_event is fired on CRM mutations; on a fresh workspace there's nothing to recall.
  4. The query is too specific. Try broader queries first.

Dashboard shows "enter API key" forever

The cookie is path-scoped to /dashboard, so navigating to a different path and back doesn't lose it. If you see the prompt every time, one of these is likely:

  • Cookie is being cleared by a 401 response from the API (wrong key).
  • Browser has cookies disabled for the origin.
  • You're on HTTPS but the cookie was set on HTTP — mint a new session.

Tests fail locally with "column 'workspace_id' does not exist"

You're on an old schema. Rebuild:

docker compose down -v   # wipes Postgres volume
docker compose up -d
docker compose exec app alembic upgrade head
pytest

Postgres connection pool exhausted

The error reads "QueuePool limit of size 10 overflow 20 reached". You've got more concurrent requests than the pool allows. Fix in app/db.py:

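from sqlalchemy import create_engine

engine = create_engine(
    settings.DATABASE_URL,
    pool_size=20,          # persistent connections per process (up from 10)
    max_overflow=40,       # extra connections allowed under burst (up from 20)
    pool_pre_ping=True,    # test each connection on checkout, replacing stale ones
)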

Or scale horizontally — more app processes, each with their own pool.
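
Both fixes raise the number of connections Postgres has to accept, so check the total before deploying. A back-of-the-envelope sketch: the function names are hypothetical, and the max_connections default of 100 is PostgreSQL's stock setting, which your deployment may have changed.

```python
def connections_needed(processes: int, pool_size: int, max_overflow: int) -> int:
    """Worst-case connections opened by all app processes together."""
    return processes * (pool_size + max_overflow)


def fits(processes: int, pool_size: int, max_overflow: int,
         max_connections: int = 100) -> bool:
    """True if the worst case stays within the server's max_connections."""
    return connections_needed(processes, pool_size, max_overflow) <= max_connections


print(connections_needed(1, 10, 20))  # 30, the original settings
print(connections_needed(1, 20, 40))  # 60, the raised settings
print(fits(2, 20, 40))                # False: 120 exceeds 100
```

If the total doesn't fit, lower the per-process pool, raise max_connections on the server, or put a connection pooler in front of Postgres.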

"module 'mcp' has no attribute 'server'"

Wrong mcp package. You installed the Microsoft one, not the Anthropic one. Pin:

mcp>=1.2.0

(Already pinned in requirements.txt.)
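
To confirm which package your environment actually resolves, check for the submodule and the installed version. A minimal sketch using only the standard library; has_submodule and installed_version are hypothetical helper names.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def has_submodule(package: str, submodule: str) -> bool:
    """True if package.submodule can be located in this environment."""
    try:
        return importlib.util.find_spec(f"{package}.{submodule}") is not None
    except ModuleNotFoundError:
        # Raised when the parent package is missing or is not a package.
        return False


def installed_version(distribution: str) -> Optional[str]:
    """Installed version of a distribution, or None if it isn't installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print(has_submodule("mcp", "server"), installed_version("mcp"))
```

If the first value is False or the version is below the pin, reinstall from requirements.txt in a clean virtualenv.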

Still stuck?

  • Check GitHub Issues — someone may have hit the same thing.
  • Include:
    • Nakatomi version (GET /health)
    • Exact command or API call
    • Full error (logs, HTTP response)
    • Deployment (Railway / Docker Compose / native)
  • For security issues, see SECURITY.md — don't post them publicly.
