feat: multi-graph support with schema isolation and RBAC by charlie83Gs · Pull Request #156 · openktree/knowledge-tree

charlie83Gs · 2026-04-05T17:50:00Z

Summary

Adds full multi-graph infrastructure: Graph, GraphMember, DatabaseConnection models with graph_type (versioned, for backward compat) and byok_enabled (honors early adopter BYOK access)
GraphSessionResolver caches per-graph session factories using PostgreSQL search_path for schema isolation or separate DB connections
Graph management API with CRUD, member roles (reader/writer/admin), and synchronous schema provisioning
Graph-scoped data endpoints via GraphContext dependency (/api/v1/graphs/{slug}/nodes/...)
Parameterized Qdrant collection names for per-graph vector isolation
GraphAwareMixin on all key Hatchet workflow inputs (backward compatible — graph_id=None means default graph)
Sync worker iterates all active graphs per cycle
Multi-schema Alembic migration runner (ALEMBIC_SCHEMA env override)
Frontend /graphs page with create form, detail/member management page
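
The reader/writer/admin roles above imply an ordering that graph-scoped endpoints can check against a minimum required role. A minimal sketch of that comparison, assuming a simple linear hierarchy; `GraphRole` and `has_min_role` are hypothetical names, not taken from the PR:

```python
from enum import IntEnum


class GraphRole(IntEnum):
    """Hypothetical role ordering: a higher value means more privileges."""

    READER = 1
    WRITER = 2
    ADMIN = 3


def has_min_role(member_role: str, min_role: str) -> bool:
    """Return True if member_role is at least as privileged as min_role."""
    return GraphRole[member_role.upper()] >= GraphRole[min_role.upper()]


print(has_min_role("admin", "writer"))   # admin satisfies a writer requirement
print(has_min_role("reader", "writer"))  # reader does not
```

An unknown role name raises `KeyError` here, which fails closed; the real dependency may well report this differently.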

Test plan

Verify all existing tests pass (40 API, 64 Qdrant, 19 Hatchet, 123 frontend — all green locally)
Run alembic upgrade head for both graph-db and write-db migrations
Create a graph via POST /api/v1/graphs and verify schema + Qdrant collections are created
Add members with different roles and verify access control on graph-scoped endpoints
Verify default graph endpoints (/api/v1/nodes) continue to work unchanged
Verify sync worker syncs both default and non-default graphs
Test BYOK-enabled graph creation and verify the flag persists

🤖 Generated with Claude Code

Introduces the foundation for multiple isolated knowledge graphs:

- Graph, GraphMember, DatabaseConnection models with graph_type and byok_enabled flags
- GraphSessionResolver for per-graph session factory caching (schema or database isolation)
- GraphRepository for CRUD + member management
- Graph management API (CRUD, member roles, synchronous provisioning)
- Graph-scoped data endpoints via GraphContext dependency
- Parameterized Qdrant collection names (per-graph isolation)
- GraphAwareMixin on all key Hatchet workflow inputs (backward compatible)
- Multi-graph sync worker (iterates all active graphs)
- Multi-schema Alembic migration runner with ALEMBIC_SCHEMA env override
- Frontend /graphs page with create form, detail page, and member management
- GraphDatabaseConfig in Settings for named DB connection pairs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Add graph_slugs column to ApiToken for per-token graph restriction
- Add graph parameter to all 8 MCP tools (default: "default")
- GraphSessionResolver in MCP dependencies for per-graph session routing
- OAuth tokens carry graph:{slug} scopes from API token graph_slugs
- GraphContext checks token graph scope before granting access
- Frontend token creation form with graph selector
- Token list shows graph scope (all graphs vs specific)
- Updated MCP instructions to document multi-graph support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Critical:
- #1: Validate schema names with strict ^[a-z0-9_]+$ regex before DDL
- #2: Escape ILIKE special chars (%, _, \) in graph_nodes search
- #3: Replace cached Graph ORM instances with frozen GraphInfo dataclass to prevent DetachedInstanceError

High:
- #4: Reuse system session factories for default graph (no duplicate pools) via default_graph_session_factory/default_write_session_factory params
- #5: Add 23 unit tests: GraphInfo, GraphSessions, GraphSessionResolver, slug/schema validation, CreateGraphRequest, role validation
- #6: Scope sync watermarks by graph_slug: SyncEngine now passes graph_slug to _get_watermark/_set_watermark, composite PK on (table_name, graph_slug)

Medium:
- #7: Replace N+1 member count queries with batch GROUP BY
- #8: Replace catch { // ignore } with console.error in frontend
- #9: Engine pool disposal on GraphSessionResolver.invalidate()
- #10: Run Alembic migrations during graph provisioning
- #11: (node_count in list deferred; requires cross-schema queries)

Low:
- #13: Replace "Cycle Role" button with role dropdown
- #14: require_writer/require_graph_admin kept for future endpoints

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
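
For the ILIKE fix above, the idea is to escape the wildcard characters in user input before wrapping it in `%...%`, so a search for `50%_off` matches that literal string. A minimal sketch, assuming backslash as the escape character (PostgreSQL's default for LIKE/ILIKE); `escape_like` is a hypothetical helper name, not the function used in the PR:

```python
def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE/ILIKE wildcards so user input is matched literally."""
    # Escape the escape character first; otherwise the backslashes added
    # for % and _ would themselves be doubled by a later replacement.
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


# Wrap the escaped term in wildcards for a "contains" search.
pattern = f"%{escape_like('50%_off')}%"
print(pattern)
```

The resulting pattern would then be passed as a bound parameter, never interpolated into the SQL text.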

Fixes TypeScript type error: ApiTokenRead now requires graph_slugs field. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- GraphProvider context wrapping the app layout, persists active graph to localStorage, syncs to api module via setActiveGraphSlug()
- GraphPicker component in sidebar (dropdown expanded, icon collapsed) auto-hides when only one graph exists
- graphRequest() helper in api.ts routes through /graphs/{slug}/... for non-default graphs, falls back to standard paths for default
- setActiveGraphSlug/getActiveGraphSlug exports for module-level state

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Critical:
- #1: Remove dead quote_ident call; regex is the sole injection guard
- #2: Add ^[a-z0-9_]+$ validation for ALEMBIC_SCHEMA in both env.py files

High:
- #3: Derive kt_db_root from kt_db package location instead of fragile parents[5]
- #4: Document MCP omits default_write_session_factory intentionally (read-only)
- #5: GraphContext now uses GraphInfo (frozen dataclass) instead of ORM Graph

Medium:
- #6: Replace user._token_graph_slugs monkey-patching with request.state
- #7: Fix remaining catch { // ignore } in graphs/page.tsx
- #9: Document MCP graph access check limitation, planned for follow-up

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Critical:
- #1: Invalidate resolver cache after provisioning (both success and error) so subsequent resolve() picks up fresh status
- #2: Combine status="active" + add_member in single commit to prevent orphaned graphs on crash

High:
- #3: Run Alembic migrations via asyncio.to_thread() to avoid blocking the event loop during HTTP requests
- #5: Store AsyncEngine references in GraphSessions for proper disposal instead of accessing sessionmaker.kw["bind"] internals

Medium:
- #7: Replace silent .catch(() => {}) with console.error in tokens page
- migrate.py path comment clarified for consistency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
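
The `asyncio.to_thread()` change can be illustrated in isolation. In this sketch a sleeping function stands in for the synchronous Alembic upgrade, and a heartbeat task shows that the event loop keeps running while the migration executes. `run_migrations_blocking` and `provision` are hypothetical names; only `asyncio.to_thread` itself comes from the commit message:

```python
import asyncio
import time


def run_migrations_blocking(schema: str) -> str:
    """Stand-in for a synchronous Alembic upgrade (hypothetical helper)."""
    time.sleep(0.05)  # simulates blocking migration work
    return f"migrated:{schema}"


async def provision(schema: str) -> str:
    # Offload the blocking call to a worker thread so the loop stays responsive.
    return await asyncio.to_thread(run_migrations_blocking, schema)


async def main() -> tuple[str, int]:
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    hb = asyncio.create_task(heartbeat())
    result = await provision("graph_a")
    hb.cancel()  # stop the heartbeat once provisioning has finished
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If `provision` called `run_migrations_blocking` directly, the heartbeat would not tick until the migration returned.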

Critical:
- #1: Validate schema_name in GraphRepository.create() (data layer guard)
- #2: Enforce graph:{slug} scopes in MCP _get_graph_factory via get_access_token(); tokens without matching scope are denied
- #3: Disallow hyphens in slugs to prevent schema name collisions (my_graph and my-graph can no longer coexist)

High:
- #4: Add asyncio.Lock to GraphSessionResolver.resolve/resolve_by_slug with double-check pattern to prevent duplicate engine pool creation
- #5: Evict from cache on graph deletion (invalidate in delete_graph)

Medium:
- #8: Last-admin protection: prevent removing or demoting the last admin
- #9: Defense-in-depth schema_name validation in _make_session_factory

Low:
- Validate stored graph slug still exists in GraphProvider (reset to default)
- Update tests and frontend for no-hyphens slug policy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
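
The double-check pattern mentioned for GraphSessionResolver works roughly as follows: look in the cache without the lock, and if the entry is missing, take the lock and look again before building. A minimal sketch with a hypothetical `SessionFactoryCache` class; the real resolver builds engine pools, which a short sleep simulates here:

```python
import asyncio


class SessionFactoryCache:
    """Hypothetical cache illustrating double-checked locking with asyncio."""

    def __init__(self) -> None:
        self._cache: dict[str, object] = {}
        self._lock = asyncio.Lock()
        self.builds = 0  # counts expensive constructions, for demonstration

    async def _build(self, graph_id: str) -> object:
        self.builds += 1
        await asyncio.sleep(0.01)  # simulates creating an engine pool
        return object()

    async def resolve(self, graph_id: str) -> object:
        cached = self._cache.get(graph_id)  # first check, without the lock
        if cached is not None:
            return cached
        async with self._lock:
            cached = self._cache.get(graph_id)  # second check, under the lock
            if cached is None:
                cached = await self._build(graph_id)
                self._cache[graph_id] = cached
            return cached


async def demo() -> int:
    cache = SessionFactoryCache()
    results = await asyncio.gather(*(cache.resolve("g1") for _ in range(5)))
    assert all(r is results[0] for r in results)  # every caller shares one entry
    return cache.builds


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the second check, each of the five concurrent callers would build its own pool once it acquired the lock.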

Replace single serial sync_wf with two workflows:

- sync_dispatch_wf (cron every minute): fans out one sync per graph
- sync_graph_wf (on-demand): syncs a single graph with per-graph concurrency (max_runs=1 keyed by input.graph_slug)

This prevents high-activity bursts on one graph from starving sync for other graphs. Each graph syncs independently and in parallel.

Worker slots increased from 1 to 10 to allow parallel graph syncs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
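
The dispatch/fan-out shape can be approximated with plain asyncio. This is an analogue only: it does not use Hatchet's API and does not model the per-graph `max_runs=1` concurrency key. A semaphore stands in for the 10 worker slots, and the function names are hypothetical:

```python
import asyncio


async def sync_graph(graph_slug: str, slots: asyncio.Semaphore, log: list[str]) -> None:
    """Sync one graph while holding one of the limited worker slots."""
    async with slots:
        await asyncio.sleep(0.01)  # simulates the sync work for this graph
        log.append(graph_slug)


async def sync_dispatch(graph_slugs: list[str], worker_slots: int = 10) -> list[str]:
    """Fan out one sync per graph; at most worker_slots run at the same time."""
    slots = asyncio.Semaphore(worker_slots)
    log: list[str] = []
    await asyncio.gather(*(sync_graph(slug, slots, log) for slug in graph_slugs))
    return log


if __name__ == "__main__":
    print(sorted(asyncio.run(sync_dispatch(["default", "research", "sandbox"]))))
```

Because each graph gets its own task, a slow sync on one graph delays only that graph, which is the starvation problem the commit describes.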

Critical:
- #1: Control-plane migrations (zzai, zzaj) now skip when ALEMBIC_SCHEMA is non-public; prevents duplicating graphs/graph_members/api_tokens tables in per-graph schemas

High:
- #3: Replace global asyncio.Lock with per-graph locks via _locks dict + lightweight _meta_lock for dict insertion only
- #7: Default graph now enforces min_role for write operations (PUT /graphs/default requires admin)

Medium:
- #9: Validate storage_mode=database requires connection key at creation time (422 instead of confusing ValueError at resolve)
- #12: Fix SyncWatermark docstring (defaults to "default", not NULL)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
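
The move from one global lock to per-graph locks could look roughly like this. The `_locks` and `_meta_lock` names come from the commit message; the surrounding `PerGraphLocks` class and the demo are assumptions made for illustration:

```python
import asyncio


class PerGraphLocks:
    """Hypothetical holder for one asyncio.Lock per graph id."""

    def __init__(self) -> None:
        self._locks: dict[str, asyncio.Lock] = {}
        self._meta_lock = asyncio.Lock()  # guards insertion into _locks only

    async def lock_for(self, graph_id: str) -> asyncio.Lock:
        lock = self._locks.get(graph_id)
        if lock is None:
            async with self._meta_lock:
                # setdefault keeps the first lock if another task inserted one.
                lock = self._locks.setdefault(graph_id, asyncio.Lock())
        return lock


async def demo() -> int:
    locks = PerGraphLocks()
    active = 0
    peak = 0

    async def build(graph_id: str) -> None:
        nonlocal active, peak
        async with await locks.lock_for(graph_id):
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # simulates building a session factory
            active -= 1

    # Two builds for graph "a" serialize; the build for "b" overlaps with them.
    await asyncio.gather(build("a"), build("a"), build("b"))
    return peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The meta lock is held only for the dictionary insertion, so a slow build for one graph never blocks resolution of another.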

…rors

- #1: Extract validate_schema_name() into kt_db.keys as single source of truth. Remove duplicate regex from graphs.py, repositories/graphs.py, graph_sessions.py, and both alembic env.py files. Remove redundant double-quotes in SET search_path.
- #3: Provisioning no longer caches via resolver; uses temporary write session factory for DDL, avoiding stale cached engines mid-migration.
- #5: Qdrant collection failures now propagate (not swallowed), causing graph to go "error" instead of "active" without collections.
- #7: GraphProvider gates listGraphs() on auth loading complete + user !== null, preventing race with AuthProvider.
- #11: Replace <a> with Next.js <Link> on graphs list page.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
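
A shared validator along the lines of validate_schema_name() might look like the sketch below. The regex is the one quoted in this PR; the reserved-name check reflects a later commit in the same PR. The signature, return value, and error type are assumptions, not the actual kt_db.keys implementation:

```python
import re

# Pattern quoted in the PR: lowercase letters, digits, and underscores only.
_SCHEMA_NAME_RE = re.compile(r"^[a-z0-9_]+$")

# Reserved names listed in a later commit; pg_* is handled by the prefix check.
_RESERVED_NAMES = frozenset({"public", "information_schema"})


def validate_schema_name(name: str) -> str:
    """Return name unchanged if it is safe to use in DDL, else raise ValueError."""
    # fullmatch rejects a trailing newline, which "$" alone would let through.
    if not _SCHEMA_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid schema name: {name!r}")
    if name in _RESERVED_NAMES or name.startswith("pg_"):
        raise ValueError(f"reserved schema name: {name!r}")
    return name


print(validate_schema_name("graph_research"))
```

Because schema names end up in DDL strings, a single strict validator is the only injection guard, which is why the commit removes the scattered copies of the regex.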

Critical:
- #1: Add SECURITY comments to all DDL f-strings tying them to validate_schema_name() regex; future-proofs against regex loosening
- #2: Add POST /graphs/{slug}/retry-provision endpoint for graphs stuck in "error" status. Idempotent (CREATE SCHEMA IF NOT EXISTS + Alembic upgrade head). Also adds admin member if none exist.

High:
- #3: MCP now requires explicit graph:{slug} scopes for non-default graphs; tokens without graph scopes are denied (not silently allowed)
- #4: Document default graph policy: open reads, superuser-only writes
- #5: Use one-off engine with dispose() for write-db DDL during provisioning (no leaked connection pools)

Medium:
- #8: Document connection budget math in sync worker slots comment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- #3: Deny non-default graph access when get_access_token() returns None (SKIP_AUTH or missing auth context)
- #4: Use SELECT ... FOR UPDATE on admin members during role demotion and member removal to prevent concurrent last-admin race
- #5: Add _slug_to_id index to GraphSessionResolver for O(1) slug lookups instead of linear cache scan. Maintained in _build_and_cache and invalidate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
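
The `_slug_to_id` index is a secondary dictionary that must be kept in step with the primary cache on both insert and invalidate. A minimal sketch, with a hypothetical `GraphCache` class and a plain dict standing in for the cached session objects:

```python
from typing import Optional


class GraphCache:
    """Hypothetical id-keyed cache with a slug index for O(1) slug lookups."""

    def __init__(self) -> None:
        self._by_id: dict[str, dict] = {}
        self._slug_to_id: dict[str, str] = {}

    def put(self, graph_id: str, slug: str, sessions: object) -> None:
        # Update both maps together so they never disagree.
        self._by_id[graph_id] = {"slug": slug, "sessions": sessions}
        self._slug_to_id[slug] = graph_id

    def get_by_slug(self, slug: str) -> Optional[dict]:
        graph_id = self._slug_to_id.get(slug)
        return self._by_id.get(graph_id) if graph_id is not None else None

    def invalidate(self, graph_id: str) -> None:
        entry = self._by_id.pop(graph_id, None)
        if entry is not None:
            # Drop the index entry too, or stale slugs would keep resolving.
            self._slug_to_id.pop(entry["slug"], None)


cache = GraphCache()
cache.put("id-1", "research", sessions="<sessions>")
print(cache.get_by_slug("research") is not None)
cache.invalidate("id-1")
print(cache.get_by_slug("research") is None)
```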

…ease

- #1: Lock admin members unconditionally before checking role; prevents race where two concurrent requests both see admin_count=2 before lock
- #5: Release control session before acquiring per-graph lock in resolve_by_slug to avoid holding pool slot during lock wait
- #7: require_writer now enforces superuser-only for default graph writes, matching the documented policy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- #2/#14: Fix MCP scope check: empty scopes = unrestricted access (graph_slugs=null tokens and SKIP_AUTH both work again). Only tokens with explicit graph:* scopes are restricted to those graphs.
- #3: Block reserved PG schema names (public, pg_catalog, pg_toast, pg_temp, information_schema, pg_*) in validate_schema_name()
- #4: Fix scalar_one() → scalar_one_or_none() in resolve_by_slug (introduced by session-release refactor in prior commit)
- #10: Sync raises RuntimeError instead of returning error dict when graph_resolver is unavailable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
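
The corrected scope rule (no graph scopes means unrestricted, explicit graph:{slug} scopes restrict access to those graphs) can be expressed as a small pure function. `can_access_graph` is a hypothetical name, and treating the default graph like any other slug once explicit scopes are present is an assumption the PR text does not settle:

```python
GRAPH_SCOPE_PREFIX = "graph:"


def can_access_graph(scopes: list[str], slug: str) -> bool:
    """Apply the rule: empty graph scopes are unrestricted, explicit ones restrict."""
    graph_scopes = {s for s in scopes if s.startswith(GRAPH_SCOPE_PREFIX)}
    if not graph_scopes:
        return True  # token carries no graph:* scopes, so it is not restricted
    # Assumption: with explicit scopes, every graph (default included) must be listed.
    return f"{GRAPH_SCOPE_PREFIX}{slug}" in graph_scopes


print(can_access_graph([], "research"))                  # unrestricted token
print(can_access_graph(["graph:research"], "research"))  # matching scope
print(can_access_graph(["graph:research"], "sandbox"))   # scope for another graph
```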

Workers:
- Add resolve_sessions() helper to WorkerState for one-line graph_id resolution to (graph_sf, write_sf) tuple
- All input models now extend GraphAwareMixin (graph_id on every input)
- worker-bottomup: _open_sessions + _build_agent_context accept graph_id, all 9 call sites updated
- worker-ingest: _open_sessions + _build_agent_context accept graph_id
- worker-nodes: HatchetPipeline constructor accepts graph_id, resolves sessions in _open_sessions and _build_ctx
- worker-search: direct session factory calls resolve per graph_id
- worker-synthesis: synthesizer + super-synthesizer resolve graph_id to per-graph ReadGraphEngine + write session factories

Frontend:
- Switch 30 graph-scoped API methods from request() to graphRequest() (nodes, edges, facts, sources, seeds, conversations, syntheses, etc.)
- Non-graph-scoped methods (auth, config, members, usage) unchanged
- Graph picker now actually affects all data queries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Critical:
- Fix 38 remaining API methods still using request() instead of graphRequest(): node sub-resources (dimensions, facts, edges, history, convergence), conversations, seeds, edge candidates, syntheses, sources. All graph-scoped data now routes correctly.

High:
- Add graph_pool_size / graph_max_overflow settings (defaults 5/10) for schema-mode non-default graphs, replacing hardcoded values

Low:
- migrate.py path derivation now uses kt_db.__file__ (matches graphs.py)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

systemSettings, members, waitlist, and invites are global/control-plane resources without graph-scoped backend routes. Using graphRequest() would 404 on non-default graphs. Reverted 7 methods to request().

Graph updated_at confirmed working: onupdate=_utcnow on the ORM column handles auto-update on every flush.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-06T14:27:05Z

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.

I have read the CLA Document and I hereby sign the CLA

You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

charlie83Gs and others added 19 commits April 5, 2026 11:49

fix: add missing graph_slugs to inline token object in handleCreated

8a93c6b

fix: sort imports in test_graph_schemas.py (ruff I001)

184c2ca

charlie83Gs mentioned this pull request Apr 6, 2026

Multi-graph: follow-up items from PR #156 #162

Closed

24 tasks

charlie83Gs merged commit 8eecab7 into main Apr 6, 2026
18 checks passed

charlie83Gs deleted the worktree-feat+multigraph branch April 6, 2026 14:26

charlie83Gs mentioned this pull request Apr 6, 2026

fix(kt-db): merge alembic heads after multi-graph PR #163

Merged

4 tasks
