bordumb · Jan 22, 2026
@@ -0,0 +1,13 @@
+{
+  "branch_name": "fn-24",
+  "created_at": "2026-01-22T18:01:43.795974Z",
+  "depends_on_epics": [],
+  "id": "fn-24",
+  "next_task": 1,
+  "plan_review_status": "unknown",
+  "plan_reviewed_at": null,
+  "spec_path": ".flow/specs/fn-24.md",
+  "status": "open",
+  "title": "Month 0-2: Triage + Policy Engine",
+  "updated_at": "2026-01-22T18:01:43.796354Z"
+}
@@ -0,0 +1,44 @@
+# fn-24: Month 0-2: Triage + Policy Engine
+
+## Goal
+Ship policy-driven triage and investigation automation for data platform teams, with per-team rules, dataset overrides, queueing, and measurable activation/usage metrics.
+
+## Scope
+- Team policy engine with dataset overrides and per-team rate limits.
+- Integrations -> issues pipeline that applies policy actions (auto, review, issue-only).
+- Investigation queue/batch executor with Redis rate limiting per team.
+- Redis-backed SSE event storage + API rate limiting.
+- Policy editor UI in Settings > Teams.
+- Activation + weekly usage analytics.
+
+## Non-Goals
+- Automated fixing.
+- SCIM provisioning.
+- SAML (planned for Month 4-6).
+
+## Approach
+- Add policy tables and repository methods for team rules and dataset overrides.
+- Implement a policy evaluator that returns an action + queue config for each issue.
+- Route integration events through policy evaluation and trigger issue creation + optional investigations.
+- Add Redis-backed queue + rate limiter for investigations and SSE event replay storage.
+- Build a team policy editor in Settings > Teams with dataset overrides.
+- Instrument issue/investigation lifecycle events and weekly usage metrics.
+
+## Quick commands
+- `just test`
+
+## Acceptance
+- [ ] Team policies with dataset overrides are persisted and can be read/written via API.
+- [ ] Integration ingestion applies policy actions (auto, review, issue-only).
+- [ ] Auto investigations are queued and rate-limited per team via Redis.
+- [ ] SSE events and API rate limiting are no longer in-memory.
+- [ ] Team policy editor UI is usable in Settings > Teams.
+- [ ] Activation + weekly usage metrics are available via API or queries.
+
+## References
+- `python-packages/dataing/src/dataing/entrypoints/api/routes/integrations.py`
+- `python-packages/dataing/src/dataing/entrypoints/api/routes/runs.py`
+- `python-packages/dataing/src/dataing/entrypoints/api/middleware/rate_limit.py`
+- `frontend/app/src/features/issues/IssueList.tsx`
+- `frontend/app/src/features/issues/IssueWorkspace.tsx`
+- `frontend/app/src/features/settings/teams/teams-settings.tsx`
@@ -0,0 +1,23 @@
+{
+  "assignee": "bordumbb@gmail.com",
+  "claim_note": "",
+  "claimed_at": "2026-01-22T18:07:36.595638Z",
+  "created_at": "2026-01-22T18:02:02.585217Z",
+  "depends_on": [],
+  "epic": "fn-24",
+  "evidence": {
+    "commits": [
+      "b63b00483c7d664738b6efd4e4b0d5837d83930a"
+    ],
+    "prs": [],
+    "tests": [
+      "uv run pytest python-packages/dataing/tests/unit/adapters/db/test_team_policy_repository.py"
+    ]
+  },
+  "id": "fn-24.1",
+  "priority": null,
+  "spec_path": ".flow/tasks/fn-24.1.md",
+  "status": "done",
+  "title": "Data model + migrations for team policies and overrides",
+  "updated_at": "2026-01-22T19:08:26.558940Z"
+}
@@ -0,0 +1,29 @@
+# fn-24.1 Data model + migrations for team policies and overrides
+
+## Description
+Add database tables and repository helpers for team policy rules, dataset overrides, and per-team queue limits. Support overrides by dataset_id and tag-based selectors.
+
+## Acceptance
+- [ ] Migrations add tables for team policies, overrides, and queue limits.
+- [ ] Repository methods exist for CRUD on policies and overrides.
+- [ ] Dataset/tag selectors are represented in the schema (dataset_id or tag_id).
+- [ ] Basic unit tests cover create/read/update for policies.
+
+## Done summary
+- Added migration 028_team_policies.sql with 4 tables: team_policies, team_policy_overrides, team_queue_limits, dataset_tags
+- Created TeamPolicyRepository with full CRUD operations for policies, overrides, queue limits, and dataset tags
+- Added PolicyAction enum and dataclasses for type-safe domain entities
+
+Why:
+- Foundation for policy-driven triage and investigation automation
+- Enables per-team configuration with dataset/tag-specific overrides
+
+Verification:
+- 28 unit tests passing
+- ruff check passing
+- mypy type check passing
+- just test-ce passing (1257 tests)
+## Evidence
+- Commits: b63b00483c7d664738b6efd4e4b0d5837d83930a
+- Tests: uv run pytest python-packages/dataing/tests/unit/adapters/db/test_team_policy_repository.py
+- PRs:
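The "dataset_id or tag_id" selector rule above could be enforced on the domain entity roughly as follows. This is a minimal sketch, not the code from migration 028 or `TeamPolicyRepository`: the `PolicyOverride` class, its field names, and the enum string values are assumptions made for illustration; only the `PolicyAction` name and its three actions come from the spec.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PolicyAction(Enum):
    # Member names follow the spec; the string values are assumed.
    AUTO = "auto"
    REVIEW = "review"
    ISSUE_ONLY = "issue_only"


@dataclass(frozen=True)
class PolicyOverride:
    """Hypothetical override entity: targets a dataset or a tag, never both."""

    team_id: str
    action: PolicyAction
    dataset_id: Optional[str] = None
    tag_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Exactly one selector must be set (both None or both set is invalid).
        if (self.dataset_id is None) == (self.tag_id is None):
            raise ValueError("set exactly one of dataset_id or tag_id")
```

Validating in `__post_init__` keeps invalid selector combinations out of the repository layer; a database CHECK constraint would be the natural counterpart on the schema side.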
@@ -0,0 +1,25 @@
+{
+  "assignee": "bordumbb@gmail.com",
+  "claim_note": "",
+  "claimed_at": "2026-01-22T19:09:11.846118Z",
+  "created_at": "2026-01-22T18:02:12.872541Z",
+  "depends_on": [
+    "fn-24.1"
+  ],
+  "epic": "fn-24",
+  "evidence": {
+    "commits": [
+      "5a240ac3e4f38322c857f7789c2f77edb75a629f"
+    ],
+    "prs": [],
+    "tests": [
+      "uv run pytest python-packages/dataing/tests/unit/services/test_policy.py"
+    ]
+  },
+  "id": "fn-24.2",
+  "priority": null,
+  "spec_path": ".flow/tasks/fn-24.2.md",
+  "status": "done",
+  "title": "Policy engine: evaluate team + dataset rules",
+  "updated_at": "2026-01-22T19:12:48.936850Z"
+}
@@ -0,0 +1,30 @@
+# fn-24.2 Policy engine: evaluate team + dataset rules
+
+## Description
+Implement a policy evaluation service that resolves the effective action for an issue using team rules and dataset/tag overrides. Output should include action (auto, review, issue-only) and queue/rate-limit settings.
+
+## Acceptance
+- [ ] Policy evaluator resolves precedence: dataset overrides > team default.
+- [ ] Action outputs include auto/review/issue-only and queue settings.
+- [ ] Evaluation is exercised via unit tests with team + dataset scenarios.
+- [ ] API layer can fetch evaluated policy results for an issue.
+
+## Done summary
+- Added PolicyService with precedence-based policy evaluation
+- Implemented resolution order: dataset overrides > tag overrides > team default > system defaults
+- Added severity-based action resolution (auto/review thresholds)
+- Added QueueConfig, IssueContext, PolicyResult dataclasses
+- Added evaluate_policy_for_issue convenience function for API layer
+
+Why:
+- Central policy engine needed by integrations to determine triage actions
+- Enables automatic investigation triggering based on configured rules
+
+Verification:
+- 19 unit tests passing (test_policy.py)
+- ruff check passing
+- mypy type check passing
+## Evidence
+- Commits: 5a240ac3e4f38322c857f7789c2f77edb75a629f
+- Tests: uv run pytest python-packages/dataing/tests/unit/services/test_policy.py
+- PRs:
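The resolution order in the done summary (dataset overrides > tag overrides > team default > system defaults) can be sketched as a simple fall-through lookup. This is an illustration under stated assumptions, not the actual `PolicyService`: `TeamRules`, `resolve_action`, and the choice of `"issue-only"` as the system default are all hypothetical, and severity thresholds are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence

# Assumed system default; the spec does not say which action it is.
SYSTEM_DEFAULT_ACTION = "issue-only"


@dataclass
class TeamRules:
    """Hypothetical container for one team's policy configuration."""

    default_action: Optional[str] = None
    dataset_overrides: Dict[str, str] = field(default_factory=dict)
    tag_overrides: Dict[str, str] = field(default_factory=dict)


def resolve_action(rules: TeamRules, dataset_id: str, tags: Sequence[str] = ()) -> str:
    """Return the effective action, checking the most specific rule first."""
    # 1. A dataset override wins over everything else.
    if dataset_id in rules.dataset_overrides:
        return rules.dataset_overrides[dataset_id]
    # 2. Otherwise the first tag with an override applies.
    for tag in tags:
        if tag in rules.tag_overrides:
            return rules.tag_overrides[tag]
    # 3. Then the team default, if one is configured.
    if rules.default_action is not None:
        return rules.default_action
    # 4. Finally the system default.
    return SYSTEM_DEFAULT_ACTION
```

For example, a team with default `review`, a dataset override of `auto` for `orders`, and a tag override of `issue-only` for `pii` resolves `orders` to `auto` even when the issue also carries the `pii` tag.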
@@ -0,0 +1,23 @@
+{
+  "assignee": "bordumbb@gmail.com",
+  "claim_note": "",
+  "claimed_at": "2026-01-22T19:14:17.576326Z",
+  "created_at": "2026-01-22T18:02:31.734575Z",
+  "depends_on": [
+    "fn-24.2"
+  ],
+  "epic": "fn-24",
+  "evidence": {
+    "commits": [],
+    "prs": [],
+    "tests": [
+      "uv run pytest python-packages/dataing/tests/unit/api/test_integrations_routes.py python-packages/dataing/tests/unit/services/test_policy.py python-packages/dataing/tests/unit/adapters/db/test_team_policy_repository.py"
+    ]
+  },
+  "id": "fn-24.3",
+  "priority": null,
+  "spec_path": ".flow/tasks/fn-24.3.md",
+  "status": "done",
+  "title": "Integrations to issues: policy-driven actions",
+  "updated_at": "2026-01-22T19:21:02.616765Z"
+}
@@ -0,0 +1,36 @@
+# fn-24.3 Integrations to issues: policy-driven actions
+
+## Description
+Route integration events through the policy engine to create issues and trigger the correct action: auto investigation, review required, or issue-only. Ensure notifications for review-required flows.
+
+## Acceptance
+- [ ] Integration ingestion calls policy evaluation and records the action taken.
+- [ ] Auto actions enqueue investigations; review actions create approval notifications.
+- [ ] Issue-only path creates issue without starting investigation.
+- [ ] Idempotency behavior remains intact for integration events.
+
+## Done summary
+- Integrated policy evaluation into webhook issue creation flow
+- Policy determines action: auto investigation, review required, or issue-only
+- AUTO: Starts Temporal investigation workflow (if configured)
+- REVIEW: Sends notification via NotificationService
+- ISSUE_ONLY: No additional action beyond issue creation
+- Added policy_action and investigation_id fields to WebhookIssueResponse
+- Added get_default_team_for_tenant helper to TeamPolicyRepository
+
+Why:
+- Integration events need policy-driven triage to determine appropriate action
+- Enables automatic investigation triggering based on configured rules
+- Maintains idempotency behavior for integration events
+
+Verification:
+- 21 unit tests for integration routes (6 new policy-related tests)
+- 19 unit tests for PolicyService
+- 28 unit tests for TeamPolicyRepository
+- All 68 tests passing
+- ruff check passing
+- mypy type check passing
+## Evidence
+- Commits:
+- Tests: uv run pytest python-packages/dataing/tests/unit/api/test_integrations_routes.py python-packages/dataing/tests/unit/services/test_policy.py python-packages/dataing/tests/unit/adapters/db/test_team_policy_repository.py
+- PRs:
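The three action paths described above (AUTO, REVIEW, ISSUE_ONLY) amount to a small dispatch step after the issue is created. The sketch below assumes hypothetical callables `start_investigation` and `notify_reviewers` in place of the Temporal client and `NotificationService`; the real webhook handler's signature is not shown in this diff.

```python
from enum import Enum
from typing import Callable, Optional


class PolicyAction(Enum):
    # Member names follow the summary; the string values are assumed.
    AUTO = "auto"
    REVIEW = "review"
    ISSUE_ONLY = "issue_only"


def apply_action(
    action: PolicyAction,
    issue_id: str,
    start_investigation: Callable[[str], str],
    notify_reviewers: Callable[[str], None],
) -> Optional[str]:
    """Run the side effect for a policy action; return an investigation id if one started."""
    if action is PolicyAction.AUTO:
        # Stand-in for starting the investigation workflow.
        return start_investigation(issue_id)
    if action is PolicyAction.REVIEW:
        # Stand-in for sending the review notification.
        notify_reviewers(issue_id)
    # REVIEW and ISSUE_ONLY start no investigation.
    return None
```

Returning `None` for the non-auto paths mirrors an optional `investigation_id` field on the response.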
@@ -0,0 +1,23 @@
+{
+  "assignee": "bordumbb@gmail.com",
+  "claim_note": "",
+  "claimed_at": "2026-01-22T19:21:55.582374Z",
+  "created_at": "2026-01-22T18:02:44.208277Z",
+  "depends_on": [
+    "fn-24.2"
+  ],
+  "epic": "fn-24",
+  "evidence": {
+    "commits": [],
+    "prs": [],
+    "tests": [
+      "uv run pytest python-packages/dataing/tests/unit/adapters/queue/ -v"
+    ]
+  },
+  "id": "fn-24.4",
+  "priority": null,
+  "spec_path": ".flow/tasks/fn-24.4.md",
+  "status": "done",
+  "title": "Investigation queue + per-team rate limits (Redis)",
+  "updated_at": "2026-01-22T19:30:31.922472Z"
+}
@@ -0,0 +1,38 @@
+# fn-24.4 Investigation queue + per-team rate limits (Redis)
+
+## Description
+Add a Redis-backed investigation queue with per-team rate limits and batch processing. Policies should control queue thresholds and rate limits.
+
+## Acceptance
+- [ ] Redis queue exists for investigation jobs with per-team routing.
+- [ ] Rate limiting is enforced per team with configurable limits.
+- [ ] Worker can batch-dequeue and start Temporal workflows.
+- [ ] Failures retry with backoff and do not block other teams.
+
+## Done summary
+- Added Redis-backed investigation queue with per-team routing
+- Implemented sliding window rate limiter using Redis Lua script
+- Created InvestigationWorker that processes jobs and starts Temporal workflows
+- Jobs support priority, retry with exponential backoff, and status tracking
+- Worker polls teams, respects rate limits, and processes in batches
+- Failures don't block other teams (isolated per-team queues)
+- Added redis>=5.0.0 dependency
+
+Why:
+- Per-team rate limiting prevents any single team from overwhelming the system
+- Batch processing improves throughput for investigation workflows
+- Retry with backoff handles transient failures gracefully
+
+Components:
+- InvestigationQueue: Per-team job queue with priority sorting
+- RedisRateLimiter: Sliding window rate limiter per team
+- InvestigationWorker: Background worker that processes queues
+
+Verification:
+- 25 unit tests for queue and rate limiter
+- mypy type check passing
+- ruff check passing
+## Evidence
+- Commits:
+- Tests: uv run pytest python-packages/dataing/tests/unit/adapters/queue/ -v
+- PRs:
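To make the per-team sliding window and the retry backoff concrete, here is an in-memory sketch of both ideas. It is not the `RedisRateLimiter` Lua script: it keeps timestamps in a local deque instead of Redis, and `SlidingWindowLimiter`, `backoff_delay`, and their parameters are hypothetical names chosen for this example.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """In-memory sliding-window log limiter, keyed by team."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, team_id: str) -> bool:
        now = self.clock()
        hits = self._hits[team_id]
        # Drop timestamps that have left the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base * 2^attempt seconds, capped at `cap`."""
    return min(cap, base * (2 ** attempt))
```

Because each team has its own deque, one team exhausting its limit does not affect another. A Redis version would typically do the trim, count, and append atomically in a single script so that multiple workers see a consistent window.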
@@ -0,0 +1,21 @@
+{
+  "assignee": "bordumbb@gmail.com",
+  "claim_note": "",
+  "claimed_at": "2026-01-22T19:32:07.904710Z",
+  "created_at": "2026-01-22T18:02:54.454759Z",
+  "depends_on": [],
+  "epic": "fn-24",
+  "evidence": {
+    "commits": [],
+    "prs": [],
+    "tests": [
+      "uv run pytest python-packages/dataing/tests/unit/adapters/sse/ python-packages/dataing/tests/unit/middleware/test_redis_rate_limit.py -v"
+    ]
+  },
+  "id": "fn-24.5",
+  "priority": null,
+  "spec_path": ".flow/tasks/fn-24.5.md",
+  "status": "done",
+  "title": "Redis-backed SSE event store + rate limiting",
+  "updated_at": "2026-01-22T19:39:04.950959Z"
+}
@@ -0,0 +1,39 @@
+# fn-24.5 Redis-backed SSE event store + rate limiting
+
+## Description
+Replace in-memory SSE event storage and API rate limiting with Redis-backed implementations.
+
+## Acceptance
+- [ ] SSE run events are persisted in Redis and survive process restart.
+- [ ] Replay window reads from Redis instead of in-memory dicts.
+- [ ] API rate limiting uses Redis with per-tenant identifiers.
+- [ ] Existing SSE API behavior remains backward compatible.
+
+## Done summary
+- Added Redis-backed SSE event store for run events persistence
+- SSE events now survive process restart with configurable TTL
+- Replay window reads from Redis instead of in-memory dicts
+- Added Redis-backed API rate limiting middleware with sliding window algorithm
+- Rate limiting uses per-tenant identifiers (tenant > API key > IP fallback)
+- Both components fail open on Redis errors, so a Redis outage degrades them instead of failing requests
+- Included in-memory fallback store for local development
+
+Why:
+- SSE events were lost on process restart, causing client reconnection issues
+- In-memory rate limiting didn't work in multi-instance deployments
+- Redis provides distributed state for horizontal scaling
+
+Components:
+- RedisSSEEventStore: Store/retrieve events with automatic sequencing and TTL
+- RunMetadata: Track run status and replay window expiration
+- InMemoryFallbackSSEEventStore: Local development fallback
+- RedisRateLimitMiddleware: Distributed rate limiting with Lua script
+
+Verification:
+- 41 unit tests for SSE event store and rate limit middleware
+- mypy type check passing
+- ruff check passing
+## Evidence
+- Commits:
+- Tests: uv run pytest python-packages/dataing/tests/unit/adapters/sse/ python-packages/dataing/tests/unit/middleware/test_redis_rate_limit.py -v
+- PRs:
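Two behaviours from the summary above lend themselves to a short illustration: the identifier fallback for rate limiting (tenant > API key > IP) and sequence-based replay of run events. The sketch is in-memory only and loosely modelled on the idea of a local fallback store; `rate_limit_key`, `InMemoryEventStore`, and the key prefixes are assumptions, not the actual `RedisRateLimitMiddleware` or `RedisSSEEventStore` APIs.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def rate_limit_key(tenant_id: Optional[str], api_key: Optional[str], client_ip: str) -> str:
    """Pick the most specific identifier available: tenant, then API key, then IP."""
    if tenant_id:
        return f"tenant:{tenant_id}"
    if api_key:
        return f"key:{api_key}"
    return f"ip:{client_ip}"


class InMemoryEventStore:
    """Stores run events with per-run sequence numbers for replay."""

    def __init__(self) -> None:
        self._events: Dict[str, List[Tuple[int, str]]] = defaultdict(list)

    def append(self, run_id: str, payload: str) -> int:
        # Sequence numbers start at 1 and increase per run.
        seq = len(self._events[run_id]) + 1
        self._events[run_id].append((seq, payload))
        return seq

    def replay(self, run_id: str, after_seq: int = 0) -> List[Tuple[int, str]]:
        # Return only events the client has not seen yet.
        return [event for event in self._events.get(run_id, []) if event[0] > after_seq]
```

A reconnecting client would pass the last sequence number it received as `after_seq`. This sketch has no TTL or persistence, which is the gap the Redis-backed store is described as closing.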
@@ -0,0 +1,24 @@
+{
+  "assignee": "bordumbb@gmail.com",
+  "claim_note": "",
+  "claimed_at": "2026-01-22T19:40:38.864418Z",
+  "created_at": "2026-01-22T18:03:03.205338Z",
+  "depends_on": [
+    "fn-24.2"
+  ],
+  "epic": "fn-24",
+  "evidence": {
+    "files_changed": [
+      "frontend/app/src/features/settings/teams/team-policy-editor.tsx",
+      "frontend/app/src/features/settings/teams/teams-settings.tsx"
+    ],
+    "lint_pass": true,
+    "tests_pass": true
+  },
+  "id": "fn-24.6",
+  "priority": null,
+  "spec_path": ".flow/tasks/fn-24.6.md",
+  "status": "done",
+  "title": "Policy editor UI in Settings > Teams",
+  "updated_at": "2026-01-22T19:46:27.896902Z"
+}
@@ -0,0 +1,17 @@
+# fn-24.6 Policy editor UI in Settings > Teams
+
+## Description
+Build a policy editor under Settings > Teams for managing per-team alert sources, auto-investigate thresholds, review requirements, and dataset/tag overrides.
+
+## Acceptance
+- [ ] UI lives under Settings > Teams and loads/saves policy via API.
+- [ ] Supports editing default team policy and dataset/tag overrides.
+- [ ] Displays queue/rate limit settings per team.
+- [ ] Error and empty states are handled.
+
+## Done summary
+Added a team policy editor UI with default policy settings, dataset/tag overrides, and queue limit management. Integrated it into the Settings > Teams page, with a settings button for each team.
+## Evidence
+- Commits:
+- Tests:
+- PRs: