23 changes: 23 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -129,6 +129,17 @@ frontend/src/
| `AZURE_OPENAI_API_KEY` | — | Azure OpenAI key |
| `AZURE_OPENAI_API_VERSION` | `2024-10-21` | Azure OpenAI API version |
| `AZURE_OPENAI_DEPLOYMENT` | — | Azure OpenAI embedding deployment name |
| `DISABLE_AUTH` | `false` | Local-dev escape hatch — treat every request as the default admin (no login). **Never enable in production** |
| `AUTH_PROVIDER` | `local` | Interactive login backend: `local` (password + magic-link), `magic_link`, or `oidc` (registered seam, not yet implemented) |
| `JWT_SECRET` | `dev-jwt-secret-change-in-production` | HS256 signing secret for session + magic-link JWTs |
| `JWT_ACCESS_TTL_MINUTES` | `720` | Session lifetime (minutes) |
| `MAGIC_LINK_TTL_MINUTES` | `15` | Magic-link token lifetime (minutes) |
| `AUTH_COOKIE_NAME` | `qw_session` | Session cookie name (HTTP-only) |
| `AUTH_COOKIE_SECURE` | `false` | Set `true` behind TLS (HTTPS-only cookie) |
| `AUTH_COOKIE_SAMESITE` | `lax` | Session cookie SameSite (`lax`/`strict`/`none`) |
| `DEFAULT_ORG_SLUG` | `default` | Slug of the auto-created default organization |
| `DEFAULT_ADMIN_EMAIL` | `admin@querywise.local` | Bootstrapped admin user (created on boot + in migration 004) |
| `DEFAULT_ADMIN_PASSWORD` | — | If set, the bootstrapped admin gets this local-login password |
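
The auth variables above could be loaded and sanity-checked roughly as follows. This is a minimal stdlib sketch, not the project's actual settings module: `AuthSettings`, `load_auth_settings`, and `production_problems` are hypothetical names, and the pre-flight checks are an assumption about what "never enable in production" might mean in practice. Only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass

# Default taken from the table above; it is a placeholder, not a real secret.
DEV_JWT_SECRET = "dev-jwt-secret-change-in-production"


def _flag(env, name, default):
    """Parse a boolean-like environment value ("true", "1", "yes")."""
    return env.get(name, default).strip().lower() in {"1", "true", "yes"}


@dataclass(frozen=True)
class AuthSettings:  # hypothetical container, not the project's Settings class
    disable_auth: bool
    auth_provider: str
    jwt_secret: str
    jwt_access_ttl_minutes: int
    magic_link_ttl_minutes: int
    cookie_name: str
    cookie_secure: bool
    cookie_samesite: str


def load_auth_settings(env=None):
    """Read the auth variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return AuthSettings(
        disable_auth=_flag(env, "DISABLE_AUTH", "false"),
        auth_provider=env.get("AUTH_PROVIDER", "local"),
        jwt_secret=env.get("JWT_SECRET", DEV_JWT_SECRET),
        jwt_access_ttl_minutes=int(env.get("JWT_ACCESS_TTL_MINUTES", "720")),
        magic_link_ttl_minutes=int(env.get("MAGIC_LINK_TTL_MINUTES", "15")),
        cookie_name=env.get("AUTH_COOKIE_NAME", "qw_session"),
        cookie_secure=_flag(env, "AUTH_COOKIE_SECURE", "false"),
        cookie_samesite=env.get("AUTH_COOKIE_SAMESITE", "lax").lower(),
    )


def production_problems(settings):
    """Hypothetical pre-flight check: list settings unsafe for production."""
    problems = []
    if settings.disable_auth:
        problems.append("DISABLE_AUTH must not be enabled in production")
    if settings.jwt_secret == DEV_JWT_SECRET:
        problems.append("JWT_SECRET still has the development default")
    if not settings.cookie_secure:
        problems.append("AUTH_COOKIE_SECURE should be true behind TLS")
    return problems
```

Calling `production_problems(load_auth_settings())` at startup and refusing to boot on a non-empty list is one possible way to use it; whether the application does anything like this is not stated in the PR.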

## Ollama (Local LLM)

@@ -223,3 +234,15 @@ dependencies degrade gracefully — the app boots without `structlog` /
- **Health** (`app/api/v1/endpoints/health.py`): `GET /health/live` (process) and `GET /health/ready` (DB + job queue + LLM provider, 503 on failure) for K8s probes.
- **LLM endpoints:** Azure OpenAI provider (`azure_openai`) added so the pipeline can run inside a customer VPC; registered in `provider_registry`.
- **Tests/CI:** unit tests in `backend/tests/` (no DB/LLM needed); `.github/workflows/ci.yml` runs pytest (gating) + ruff/mypy/frontend build (advisory until pre-existing lint debt is cleared). Optional deps: `pip install -e ".[observability,jobs]"`.

## Identity & auth (Phase 1)

Real users, teams, roles, and ownership. Single-tenant per deployment; isolation is by `workspace_id` (a `Team`) within the auto-created default `Organization`. `organization_id` is carried on every core table so a future managed-SaaS fleet needs no migration. Migration `004` creates the identity tables, seeds the default org/workspace/admin, backfills all existing rows, and promotes the free-text `created_by`/`user_id` columns to real `User` FKs.

- **Identity models** (`app/db/models/`): `Organization`, `User`, `Team` (= workspace), `Membership` (role `admin|editor|viewer`, ranked in `ROLE_RANK`), `ApiKey` (only the SHA-256 hash stored).
- **Primitives** (`app/core/security.py`): PBKDF2 password hashing (stdlib), HS256 JWTs with a `purpose` claim (`session` / `magic_link`), and API-key gen/hash. Dependency-light + unit-tested.
- **Request plumbing** (`app/core/auth.py`): `get_current_user` (API key → Bearer → HTTP-only `qw_session` cookie), `get_org_context` → `AuthContext` (active workspace via `X-Workspace-Id` header, else earliest membership), and `require_role(...)`. `DISABLE_AUTH=true` short-circuits to the bootstrapped admin for local dev.
- **Login** (`app/services/auth_service.py`, `app/api/v1/endpoints/auth.py`): password + magic-link, both issuing a session-cookie JWT. Magic-link delivery (email/Slack) lands in Phase 4 — the token is logged and, outside production, returned by `POST /auth/magic-link`. `app/core/auth_providers.py` is a name-keyed seam (`local`/`magic_link`/`oidc`); **OIDC is registered but not implemented**.
- **AuthZ in services** (per the existing convention): `connection_service` scopes by org+workspace and enforces role; metadata endpoints authorize through the connection (the cascade root) via `app/api/v1/deps.py` (`require_connection_read/write`, `require_column_read/write`). Non-request entry points — startup auto-setup, the MCP server, the seed script via `DISABLE_AUTH` — act under `identity_service.system_context()` (admin in the default workspace).
- **Endpoints:** `/auth/*` (login, register, magic-link request/verify, logout, me, providers), `/teams` + `/teams/{id}/members` (admin-managed), `/api-keys` (per-user, plaintext shown once).
- **Heads-up:** once auth is enforced, the current (pre-auth) frontend gets 401s — run with `DISABLE_AUTH=true` until the Phase 1 frontend (login + auth context + workspace switcher) lands.
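
The primitives described above (PBKDF2 password hashing, HS256 JWTs carrying a `purpose` claim, and hashed API keys) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the contents of `app/core/security.py`: the function names, the stored-hash format, the iteration count, and the `qw_` key prefix are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time


def hash_password(password, iterations=200_000):
    """PBKDF2-HMAC-SHA256 with a random salt (iteration count is an assumption)."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password, stored):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input, secret):
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def issue_token(user_id, purpose, ttl_minutes, secret, now=None):
    """Build an HS256 JWT whose `purpose` is "session" or "magic_link"."""
    issued = int(time.time() if now is None else now)
    claims = {"sub": user_id, "purpose": purpose, "iat": issued,
              "exp": issued + ttl_minutes * 60}
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_token(token, secret, purpose, now=None):
    """Verify signature, algorithm, purpose, and expiry; raise ValueError otherwise."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("purpose") != purpose:
        raise ValueError("wrong purpose")  # a magic-link token is not a session
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("expired")
    return claims


def generate_api_key():
    """Return (plaintext, sha256_hex); only the hash would be persisted."""
    plaintext = "qw_" + secrets.token_urlsafe(32)  # "qw_" prefix is hypothetical
    return plaintext, hashlib.sha256(plaintext.encode()).hexdigest()
```

Checking the `purpose` claim on decode is what keeps a 15-minute magic-link token from being replayed as a 12-hour session token even though both are signed with the same `JWT_SECRET`. The 64-character hex digest returned by `generate_api_key` fits the `key_hash` column width used in migration `004`.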
252 changes: 252 additions & 0 deletions backend/alembic/versions/004_identity_and_ownership.py
@@ -0,0 +1,252 @@
"""Identity, teams & ownership (Phase 1)

Revision ID: 004
Revises: 003
Create Date: 2026-06-05

Adds the identity layer (organizations, users, teams, memberships, api_keys),
re-keys all core tables with organization_id, scopes connections to a workspace
(team) + owner, and promotes the free-text created_by / user_id columns to real
User foreign keys.

Migration strategy per the roadmap: add nullable → create the default org /
workspace / admin → backfill every existing row → enforce NOT NULL. Rollback is
``DISABLE_AUTH=true`` + this downgrade (+ pg_dump restore if needed).
"""

from collections.abc import Sequence
from typing import Union

import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects.postgresql import JSONB, UUID

from app.config import settings

# revision identifiers, used by Alembic.
revision: str = "004"
down_revision: str = "003"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


# Core tables that gain organization_id (SaaS-ready scoping).
ORG_SCOPED_TABLES = [
    "database_connections",
    "glossary_terms",
    "metric_definitions",
    "sample_queries",
    "knowledge_documents",
    "query_executions",
]
# Tables whose free-text created_by becomes a created_by_id User FK.
CREATED_BY_TABLES = ["glossary_terms", "metric_definitions", "sample_queries"]


def _q(value: str) -> str:
    """Single-quote-escape a trusted config string for inline SQL."""
    return value.replace("'", "''")


def upgrade() -> None:
    org_slug = _q(settings.default_org_slug)
    admin_email = _q(settings.default_admin_email)

    # --- Identity tables ---------------------------------------------------
    op.create_table(
        "organizations",
        sa.Column("id", UUID(as_uuid=True), primary_key=True, server_default=sa.text("gen_random_uuid()")),
        sa.Column("name", sa.String(255), nullable=False),
        sa.Column("slug", sa.String(255), nullable=False, unique=True),
        sa.Column("settings", JSONB, server_default=sa.text("'{}'::jsonb")),
        sa.Column("created_at", sa.DateTime(timezone=True), server_default=sa.func.now()),
        sa.Column("updated_at", sa.DateTime(timezone=True), server_default=sa.func.now()),
    )
    op.create_table(
        "users",
        sa.Column("id", UUID(as_uuid=True), primary_key=True, server_default=sa.text("gen_random_uuid()")),
        sa.Column("email", sa.String(320), nullable=False, unique=True),
        sa.Column("name", sa.String(255), nullable=True),
        sa.Column("sso_subject", sa.String(255), nullable=True, unique=True),
        sa.Column("password_hash", sa.String(255), nullable=True),
        sa.Column("status", sa.String(20), nullable=False, server_default="active"),
        sa.Column("last_login_at", sa.DateTime(timezone=True), nullable=True),
        sa.Column("created_at", sa.DateTime(timezone=True), server_default=sa.func.now()),
        sa.Column("updated_at", sa.DateTime(timezone=True), server_default=sa.func.now()),
    )
    op.create_table(
        "teams",
        sa.Column("id", UUID(as_uuid=True), primary_key=True, server_default=sa.text("gen_random_uuid()")),
        sa.Column(
            "organization_id",
            UUID(as_uuid=True),
            sa.ForeignKey("organizations.id", ondelete="CASCADE"),
            nullable=False,
        ),
        sa.Column("name", sa.String(255), nullable=False),
        sa.Column("slug", sa.String(255), nullable=False),
        sa.Column("created_at", sa.DateTime(timezone=True), server_default=sa.func.now()),
        sa.Column("updated_at", sa.DateTime(timezone=True), server_default=sa.func.now()),
    )
    op.create_table(
        "memberships",
        sa.Column("id", UUID(as_uuid=True), primary_key=True, server_default=sa.text("gen_random_uuid()")),
        sa.Column(
            "user_id", UUID(as_uuid=True), sa.ForeignKey("users.id", ondelete="CASCADE"), nullable=False
        ),
        sa.Column(
            "team_id", UUID(as_uuid=True), sa.ForeignKey("teams.id", ondelete="CASCADE"), nullable=False
        ),
        sa.Column("role", sa.String(20), nullable=False, server_default="viewer"),
        sa.Column("created_at", sa.DateTime(timezone=True), server_default=sa.func.now()),
        sa.UniqueConstraint("user_id", "team_id", name="uq_membership_user_team"),
    )
    op.create_table(
        "api_keys",
        sa.Column("id", UUID(as_uuid=True), primary_key=True, server_default=sa.text("gen_random_uuid()")),
        sa.Column(
            "user_id", UUID(as_uuid=True), sa.ForeignKey("users.id", ondelete="CASCADE"), nullable=False
        ),
        sa.Column("name", sa.String(255), nullable=False),
        sa.Column("key_hash", sa.String(64), nullable=False, unique=True),
        sa.Column("key_prefix", sa.String(16), nullable=False),
        sa.Column("permissions", JSONB, server_default=sa.text("'{}'::jsonb")),
        sa.Column("expires_at", sa.DateTime(timezone=True), nullable=True),
        sa.Column("last_used_at", sa.DateTime(timezone=True), nullable=True),
        sa.Column("revoked_at", sa.DateTime(timezone=True), nullable=True),
        sa.Column("created_at", sa.DateTime(timezone=True), server_default=sa.func.now()),
    )

    # --- Seed the default org / workspace / admin --------------------------
    op.execute(
        f"INSERT INTO organizations (name, slug) "
        f"VALUES ('{_q(settings.default_org_name)}', '{org_slug}')"
    )
    op.execute(
        f"INSERT INTO teams (organization_id, name, slug) "
        f"SELECT id, '{_q(settings.default_workspace_name)}', 'default-workspace' "
        f"FROM organizations WHERE slug = '{org_slug}'"
    )
    op.execute(
        f"INSERT INTO users (email, name, status) "
        f"VALUES ('{admin_email}', 'Administrator', 'active')"
    )
    op.execute(
        f"INSERT INTO memberships (user_id, team_id, role) "
        f"SELECT u.id, t.id, 'admin' FROM users u, teams t "
        f"JOIN organizations o ON o.id = t.organization_id "
        f"WHERE u.email = '{admin_email}' AND o.slug = '{org_slug}'"
    )

    org_subq = f"(SELECT id FROM organizations WHERE slug = '{org_slug}')"
    team_subq = (
        f"(SELECT t.id FROM teams t JOIN organizations o ON o.id = t.organization_id "
        f"WHERE o.slug = '{org_slug}' ORDER BY t.created_at LIMIT 1)"
    )
    admin_subq = f"(SELECT id FROM users WHERE email = '{admin_email}')"

    # --- organization_id on every core table -------------------------------
    for table in ORG_SCOPED_TABLES:
        op.add_column(table, sa.Column("organization_id", UUID(as_uuid=True), nullable=True))
        op.execute(f"UPDATE {table} SET organization_id = {org_subq}")
        op.alter_column(table, "organization_id", nullable=False)
        op.create_foreign_key(
            f"fk_{table}_organization_id",
            table,
            "organizations",
            ["organization_id"],
            ["id"],
            ondelete="CASCADE",
        )
        op.create_index(f"ix_{table}_org_created", table, ["organization_id", "created_at"])

    # --- database_connections: workspace + owner + privacy -----------------
    op.add_column("database_connections", sa.Column("workspace_id", UUID(as_uuid=True), nullable=True))
    op.add_column("database_connections", sa.Column("owner_id", UUID(as_uuid=True), nullable=True))
    op.add_column(
        "database_connections",
        sa.Column("is_private", sa.Boolean(), nullable=False, server_default=sa.false()),
    )
    op.execute(
        f"UPDATE database_connections SET workspace_id = {team_subq}, owner_id = {admin_subq}"
    )
    op.alter_column("database_connections", "workspace_id", nullable=False)
    op.create_foreign_key(
        "fk_database_connections_workspace_id",
        "database_connections",
        "teams",
        ["workspace_id"],
        ["id"],
        ondelete="CASCADE",
    )
    op.create_foreign_key(
        "fk_database_connections_owner_id",
        "database_connections",
        "users",
        ["owner_id"],
        ["id"],
        ondelete="SET NULL",
    )

    # --- created_by → created_by_id User FK --------------------------------
    for table in CREATED_BY_TABLES:
        op.add_column(table, sa.Column("created_by_id", UUID(as_uuid=True), nullable=True))
        # Existing rows were created by the system; attribute them to the admin.
        op.execute(f"UPDATE {table} SET created_by_id = {admin_subq} WHERE created_by IS NOT NULL")
        op.create_foreign_key(
            f"fk_{table}_created_by_id",
            table,
            "users",
            ["created_by_id"],
            ["id"],
            ondelete="SET NULL",
        )
        op.drop_column(table, "created_by")

    # --- query_executions.user_id: free-text string → User FK --------------
    # Old free-text values cannot be mapped to real users; they become NULL.
    op.drop_column("query_executions", "user_id")
    op.add_column("query_executions", sa.Column("user_id", UUID(as_uuid=True), nullable=True))
    op.create_foreign_key(
        "fk_query_executions_user_id",
        "query_executions",
        "users",
        ["user_id"],
        ["id"],
        ondelete="SET NULL",
    )


def downgrade() -> None:
    # query_executions.user_id back to free-text
    op.drop_constraint("fk_query_executions_user_id", "query_executions", type_="foreignkey")
    op.drop_column("query_executions", "user_id")
    op.add_column("query_executions", sa.Column("user_id", sa.String(255), nullable=True))

    # created_by_id → created_by string
    for table in CREATED_BY_TABLES:
        op.drop_constraint(f"fk_{table}_created_by_id", table, type_="foreignkey")
        op.drop_column(table, "created_by_id")
        op.add_column(table, sa.Column("created_by", sa.String(255), nullable=True))

    # database_connections extras
    op.drop_constraint("fk_database_connections_owner_id", "database_connections", type_="foreignkey")
    op.drop_constraint(
        "fk_database_connections_workspace_id", "database_connections", type_="foreignkey"
    )
    op.drop_column("database_connections", "is_private")
    op.drop_column("database_connections", "owner_id")
    op.drop_column("database_connections", "workspace_id")

    # organization_id on core tables
    for table in ORG_SCOPED_TABLES:
        op.drop_index(f"ix_{table}_org_created", table_name=table)
        op.drop_constraint(f"fk_{table}_organization_id", table, type_="foreignkey")
        op.drop_column(table, "organization_id")

    # Identity tables
    op.drop_table("api_keys")
    op.drop_table("memberships")
    op.drop_table("teams")
    op.drop_table("users")
    op.drop_table("organizations")
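
The "add nullable, seed, backfill, enforce NOT NULL" strategy named in the migration docstring can be tried out in isolation. The sketch below is an assumption-laden stand-in: it uses the stdlib `sqlite3` module with integer keys instead of Alembic on PostgreSQL with UUIDs, `backfill_organization` is a hypothetical helper, and it stops at verifying that no NULLs remain, because SQLite cannot tighten an existing column to NOT NULL the way `op.alter_column(..., nullable=False)` does on PostgreSQL. It also passes the slug as a bound parameter rather than escaping it inline with `_q`, which is an alternative to the migration's approach, not what the migration does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organizations (
        id INTEGER PRIMARY KEY, name TEXT NOT NULL, slug TEXT NOT NULL UNIQUE
    );
    CREATE TABLE glossary_terms (id INTEGER PRIMARY KEY, term TEXT NOT NULL);
    INSERT INTO glossary_terms (term) VALUES ('ARR'), ('churn');
    """
)


def backfill_organization(conn, table, org_slug="default"):
    """Return the number of rows still lacking an organization after backfill.

    `table` must be a trusted constant (it is interpolated into the SQL);
    `org_slug` is passed as a bound parameter.
    """
    # 1. Add the column as nullable so existing rows stay valid.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN organization_id INTEGER")
    # 2. Seed the default organization (idempotent thanks to the UNIQUE slug).
    conn.execute(
        "INSERT OR IGNORE INTO organizations (name, slug) VALUES (?, ?)",
        ("Default", org_slug),
    )
    # 3. Backfill every existing row with the default organization's id.
    conn.execute(
        f"UPDATE {table} SET organization_id = "
        "(SELECT id FROM organizations WHERE slug = ?)",
        (org_slug,),
    )
    # 4. Verify the precondition for the NOT NULL step: no NULLs remain.
    return conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE organization_id IS NULL"
    ).fetchone()[0]


remaining = backfill_organization(conn, "glossary_terms")
```

Only once `remaining` is zero is it safe to enforce the constraint; on PostgreSQL that final step would fail outright if any row had been missed by the backfill.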
75 changes: 75 additions & 0 deletions backend/app/api/v1/deps.py
@@ -0,0 +1,75 @@
"""Shared FastAPI dependencies for authorization.

Metadata endpoints (glossary, metrics, dictionary, sample queries, knowledge,
schema) are keyed by ``connection_id`` — the workspace cascade root. These
dependencies resolve the caller's :class:`AuthContext` and assert access to the
connection in the path, so handlers can stay thin.
"""

from __future__ import annotations

import uuid

from fastapi import Depends
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

from app.core.auth import AuthContext, get_org_context
from app.core.exceptions import NotFoundError
from app.db.models.schema_cache import CachedColumn, CachedTable
from app.db.session import get_db
from app.services import connection_service


async def require_connection_read(
    connection_id: uuid.UUID,
    ctx: AuthContext = Depends(get_org_context),
    db: AsyncSession = Depends(get_db),
) -> AuthContext:
    """Caller must be able to read the connection in the path."""
    await connection_service.get_connection(db, connection_id, ctx)
    return ctx


async def require_connection_write(
    connection_id: uuid.UUID,
    ctx: AuthContext = Depends(get_org_context),
    db: AsyncSession = Depends(get_db),
) -> AuthContext:
    """Caller must be an editor (or above) on the connection in the path."""
    await connection_service.get_connection(db, connection_id, ctx, write=True)
    return ctx


async def _connection_id_for_column(db: AsyncSession, column_id: uuid.UUID) -> uuid.UUID:
    result = await db.execute(
        select(CachedTable.connection_id)
        .join(CachedColumn, CachedColumn.table_id == CachedTable.id)
        .where(CachedColumn.id == column_id)
    )
    connection_id = result.scalar_one_or_none()
    if connection_id is None:
        raise NotFoundError("Column", str(column_id))
    return connection_id


async def require_column_read(
    column_id: uuid.UUID,
    ctx: AuthContext = Depends(get_org_context),
    db: AsyncSession = Depends(get_db),
) -> AuthContext:
    """Caller must be able to read the connection owning the column in the path."""
    connection_id = await _connection_id_for_column(db, column_id)
    await connection_service.get_connection(db, connection_id, ctx)
    return ctx


async def require_column_write(
    column_id: uuid.UUID,
    ctx: AuthContext = Depends(get_org_context),
    db: AsyncSession = Depends(get_db),
) -> AuthContext:
    """Caller must be an editor (or above) on the connection owning the column in the path."""
    connection_id = await _connection_id_for_column(db, column_id)
    await connection_service.get_connection(db, connection_id, ctx, write=True)
    return ctx
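
The dependencies above all authorize through the owning connection. A framework-free sketch of that pattern follows, assuming behaviour the diff does not show: the real `connection_service.get_connection` is not part of this excerpt, so the in-memory stores, the `ForbiddenError` class, the numeric `ROLE_RANK` values, and the choice to answer "not found" for connections in another workspace are all hypothetical.

```python
import uuid
from dataclasses import dataclass

# Role names come from the PR description; the numeric ranks are assumed.
ROLE_RANK = {"viewer": 0, "editor": 1, "admin": 2}


class NotFoundError(Exception):
    """Stand-in for app.core.exceptions.NotFoundError."""


class ForbiddenError(Exception):
    """Hypothetical error for an insufficient role."""


@dataclass(frozen=True)
class AuthContext:  # simplified stand-in for app.core.auth.AuthContext
    user_id: uuid.UUID
    workspace_id: uuid.UUID
    role: str


@dataclass(frozen=True)
class Connection:
    id: uuid.UUID
    workspace_id: uuid.UUID


CONNECTIONS = {}            # connection_id -> Connection
COLUMN_TO_CONNECTION = {}   # column_id -> connection_id (the cascade root)


def get_connection(connection_id, ctx, write=False):
    """Scope by workspace first, then enforce the role for writes."""
    conn = CONNECTIONS.get(connection_id)
    if conn is None or conn.workspace_id != ctx.workspace_id:
        # Assumed: other workspaces' connections look nonexistent.
        raise NotFoundError(f"Connection {connection_id}")
    if write and ROLE_RANK[ctx.role] < ROLE_RANK["editor"]:
        raise ForbiddenError("editor role or above required")
    return conn


def require_column_write(column_id, ctx):
    """Resolve column -> connection, then reuse the connection check."""
    connection_id = COLUMN_TO_CONNECTION.get(column_id)
    if connection_id is None:
        raise NotFoundError(f"Column {column_id}")
    get_connection(connection_id, ctx, write=True)
    return ctx


workspace = uuid.uuid4()
conn_id, column_id = uuid.uuid4(), uuid.uuid4()
CONNECTIONS[conn_id] = Connection(conn_id, workspace)
COLUMN_TO_CONNECTION[column_id] = conn_id

editor = AuthContext(uuid.uuid4(), workspace, "editor")
viewer = AuthContext(uuid.uuid4(), workspace, "viewer")
outsider = AuthContext(uuid.uuid4(), uuid.uuid4(), "admin")

require_column_write(column_id, editor)  # passes: editor in the same workspace
```

Routing every metadata check through one function means the workspace and role rules live in a single place; the column helpers only add the lookup from column to connection.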