Skip to content

Proposal: in-application event framework — design + Phase 1 plumbing#8096

Draft
adamsachs wants to merge 3 commits into
mainfrom
event-framework/plumbing
Draft

Proposal: in-application event framework — design + Phase 1 plumbing#8096
adamsachs wants to merge 3 commits into
mainfrom
event-framework/plumbing

Conversation

@adamsachs
Copy link
Copy Markdown
Contributor

@adamsachs adamsachs commented May 4, 2026

Proposal for design review. Primary goal is to gather feedback on the design doc and align on the framework's shape before any production code consumes it. The plumbing implementation is included so reviewers can see the design landed end-to-end, but it is gated behind a kill-switch (CONFIG.events.enabled) and ships with no production publishers or subscribers wired in.

Ticket: related to https://ethyca.atlassian.net/browse/ENG-3325 and https://ethyca.atlassian.net/browse/ENG-3510

What to review (in priority order)

  1. docs/fides/docs/event-framework/01-design.md — the framework's design, prescriptive when-to-use boundary, transport choice rationale (Celery + existing Redis broker; alternatives explicitly considered and rejected), failure modes, and open questions. This is the most important file in the PR.
  2. docs/fides/docs/event-framework/02-implementation-plan.md — phased rollout plan. Phase 1 is this PR; Phases 2 and 3 are the first two consumers (monitor stewardship inheritance + monitor stats refresh).
  3. docs/fides/docs/event-framework/03-phase-1-plumbing.md — concrete checklist for the framework PR: package layout, per-component acceptance bars, the canonical end-to-end ping validation, coverage check pattern.
  4. docs/fides/docs/event-framework/04-stewardship-implementation-plan.md — self-contained plan for the first feature consumer (monitor stewardship inheritance), designed so another agent can pick it up on a fresh branch once dependencies land.
  5. The plumbing codesrc/fides/common/events/ and the test suite. Implementation reflects the design but the design is the artifact under review.

Description Of Changes

A new in-application event framework for asynchronous reactive work. Publishers emit typed events when application state changes; the framework dispatches registered handlers via Celery to recompute / propagate / refresh derived state. Goals:

  • Decouple publishers from consumers via a registry, so adding a new subscriber doesn't require modifying every action site.
  • Standardize observability, retry, and reconciliation patterns across features.
  • Establish prescriptive boundaries (when to use, when not) so the framework doesn't get misused.

Key design decisions (covered in detail in §1–5 of the design doc):

  • Transport: Celery, with the existing Redis broker. No new infrastructure.
  • Programming model: typed events as frozen dataclasses + @subscribes_to decorator-registered handlers + autodiscovery. Two-line API for subscribers.
  • Operational home (default): fides Celery queue, picked up by the existing "Other" worker. Per-handler queue override available.
  • Delivery semantics: at-least-once, not guaranteed — explicit non-goal. Subscribers must be idempotent against current DB state. Where correctness matters, an independent reconciler is required.
  • Where the bones live: fides OSS. fidesplus subscribes from across the boundary.

Non-goals (the prescriptive part — see §2.1):

  • Does not guarantee delivery.
  • Does not provide ordering guarantees.
  • Does not detect missed publishes.
  • Is not a replacement for synchronous calls.
  • Is not pub/sub on the wire.

Code Changes

  • New package src/fides/common/events/ (10 files, ~700 lines):
    • base.py — frozen-dataclass + JSON-primitive validation; rejects UUID/datetime/Enum at decoration time.
    • registry.py — in-process registry of subscribers + @publishes markers.
    • decorators.py@subscribes_to(EventType, queue="fides") and @publishes(*EventType).
    • publisher.pypublish_after_commit(session, event) with after-commit / after-rollback session hooks.
    • dispatcher.py — fan-out via task.apply_async; works in eager mode + production; per-subscriber failure isolation; kill-switch via CONFIG.events.enabled.
    • celery_integration.py — task wrapper factory; event_id passed as explicit task kwarg (not headers, so it works in eager mode + production identically).
    • context.py — ContextVar exposing get_current_event_id() to subscribers.
    • autodiscover.pyregister_subscriber_module + import_subscriber_modules for cross-repo registration.
    • testing.pypublished_events, assert_event_published, clear_pending_events.
    • __init__.py — public API.
  • New config section: src/fides/config/event_settings.py (EventSettings.enabled: bool = True), wired into FidesConfig.
  • Edits to src/fides/api/tasks/__init__.py (+14 lines): import celery_integration; after celery_app is created, call import_subscriber_modules() so subscribers register at boot.
  • Tests under tests/common/events/ (11 files, ~800 lines):
    • _ping.py — canonical reference fixture (event + subscriber + repository) demonstrating the recommended publish-from-repository pattern.
    • test_ping.py — 8 end-to-end tests: commit, rollback, ordering, kill-switch, marker recording, event_id flow-through, no-subscriber path.
    • Per-module unit tests for validation, registry, decorators, publisher, dispatcher.
    • test_coverage_checks.py — hard-fail check that every production subscriber's event type has an in-tree @publishes marker, with a documented exclude-list pattern.
  • Four design docs under docs/fides/docs/event-framework/.

42 tests passing. No production publishers or subscribers consume the framework yet — that's intentional.

Steps to Confirm

The framework's correctness is demonstrated entirely via the test suite — no production code path consumes it yet, so there's nothing to manually click through.

  1. Run the test suite:
    pytest tests/common/events/ -v
    Expected: 42 passing.
  2. Verify the kill switch works: CONFIG.events.enabled = False should make publish_after_commit queue events on the session but skip dispatch on commit. Covered by test_ping.py::test_kill_switch_disables_dispatch.
  3. Verify rollback safety: events queued on a rolled-back session should never fire. Covered by test_ping.py::test_publish_then_rollback_does_not_dispatch.
  4. Read the design doc and tell us where it's wrong, vague, or missing constraints we'd regret leaving out.

Pre-Merge Checklist

  • Design doc reviewed and approved by the team
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry if your change includes a DB migration — N/A, no migrations
    • Add a high-risk label to the entry if your change includes a high-risk change — N/A, gated behind kill-switch, no production callers
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • No UX review needed (no UI changes)
  • Followup issues:
    • Followup issues created (Phase 2 stewardship + Phase 3 stats; ticket creation deferred until design lands)
    • No followup issues
  • Database migrations:
    • No migrations
  • Documentation:
    • Design doc + implementation plan + Phase 1 plumbing checklist + stewardship implementation plan all included in PR
    • Documentation issue created in fidesdocs
    • No new client scopes
    • No documentation updates required

Open Questions for Reviewers

These are surfaced explicitly in 01-design.md §10 — please weigh in:

  1. @publishes marker enforcement strictness: required (CI-enforced) or recommended? Currently recommended; v1 leans soft.
  2. Reconciler cadence for stewardship inheritance: starting at 15 minutes — too tight, too loose, or right?
  3. Stats event granularity: single MonitorOperationCompleted(monitor_config_key, operation_type) or one event class per operation? Leaning the simpler model.
  4. Event package layout: fides.api.events.<domain> for fides-emitted; fidesplus.api.events.<domain> for fidesplus-emitted. Confirm before Phase 2.
  5. event_id provenance: generated at publish time, shared across subscribers for the same publish. Confirmed in design — flag if a different model is preferred.

Generated with Claude Code

adamsachs and others added 3 commits May 3, 2026 11:04
…e 1 plumbing checklist

Three docs under docs/fides/docs/event-framework/ covering the in-application
event framework for asynchronous reactive work:

- 01-design.md: motivation, when-to-use boundary, transport choice (Celery
  + existing Redis broker), programming model, failure modes, open questions.
- 02-implementation-plan.md: phased rollout — framework plumbing, monitor
  stewardship inheritance subscriber, monitor stats refresh subscriber.
- 03-phase-1-plumbing.md: concrete checklist for the framework PR including
  package layout, per-component acceptance bars, the canonical e2e ping
  validation, and coverage check pattern with exclude lists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New fides.common.events package implements an in-application event
framework for asynchronous reactive work. Phase 1 plumbing only — no
production publishers or subscribers wired in.

Public API:
- publish_after_commit(session, event): queue an event for dispatch when
  the SQLAlchemy session next commits. Drops events on rollback.
- @subscribes_to(EventType, queue="fides"): register a handler. Creates
  a deterministically-named Celery task and adds it to the in-process
  registry.
- @publishes(EventType, ...): optional marker on publisher methods for
  test-time coverage introspection.
- assert_event_published / published_events / clear_pending_events:
  test helpers operating on session.info pending events.

Transport: Celery + existing Redis broker. Subscribers default to the
"fides" queue and are picked up by the existing "Other" worker, with
per-handler queue override available. event_id is generated at publish
time (UUID4 hex) and passed as an explicit task kwarg so propagation
works identically in eager mode (tests) and production.

Validation: events must be frozen dataclasses with JSON-primitive fields.
UUID/datetime/Enum are rejected at decoration time with a field-level
error message.

Kill switch: CONFIG.events.enabled (default True). When False the
dispatcher logs and returns; the session's after_commit drains pending
events without sending tasks.

Tests (42 passing):
- tests/common/events/_ping.py: canonical end-to-end fixture pair —
  PingEvent + ping_handler + PingRepository.emit demonstrating the
  recommended publish-from-repository pattern.
- test_ping.py: e2e validation (commit, rollback, ordering, kill switch,
  event_id flow-through, no-subscriber path, marker recording).
- test_base.py: validation acceptance/rejection per field type.
- test_decorators.py: @subscribes_to and @publishes behavior.
- test_publisher.py: session lifecycle, multi-session isolation, helpers.
- test_dispatcher.py: fan-out, kill switch, per-subscriber failure
  isolation.
- test_coverage_checks.py: hard-fail check that every production
  subscriber's event type has an in-tree @publishes marker, with an
  exclude list for documented exemptions.

Existing-file edits:
- src/fides/api/tasks/__init__.py: import celery_integration (defensive
  no-op today; reserved for future Celery hooks) and call
  import_subscriber_modules() after celery_app is created.
- src/fides/config/__init__.py: register EventSettings under the
  events: subsection.

See docs/fides/docs/event-framework/ for the design doc, implementation
plan, and phase 1 plumbing checklist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-contained plan for the first feature consumer of the event framework.
Designed so another agent can pick up the work on a fresh branch once
dependencies land.

Covers two trigger paths concretely (System.data_stewards change in fides;
inherit_system_stewards toggle in fidesplus), with the subscriber +
reconciler architected to handle the remaining triggers as follow-ups
without code changes.

Includes:
- Branch dependencies (event framework PR + fides #7888 model updates +
  fidesplus #3394 API surface) — all three must be on main before work
  starts.
- Event type definitions (SystemDataStewardsChanged,
  MonitorInheritanceFlagChanged) in fides so any repo can subscribe.
- Repository pattern application: new SystemStewardsRepository in fides
  for Trigger 1; new MonitorConfigRepository.upsert_with_inheritance_flag_diff
  wrapper in fidesplus for Trigger 2.
- Subscriber in fidesplus running inline on the existing "Other" worker.
- Periodic reconciler with the Redis last-run timestamp skip pattern for
  multi-pod cluster coordination.
- On-demand reconcile endpoint that bypasses the skip.
- Suggested PR split (fides events + repository, then fidesplus subscriber
  + reconciler).
- Open questions and acceptance criteria.

Debouncing is out of scope for this implementation per design discussion;
the inline subscriber + idempotent reconciler handle the day-one case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown
Contributor

vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Actions Updated (UTC)
fides-plus-nightly Ignored Ignored May 4, 2026 2:37pm
fides-privacy-center Ignored Ignored May 4, 2026 2:37pm

Request Review

adamsachs added a commit that referenced this pull request May 13, 2026
Temporary single-purpose callback registry under ``fides.api`` that lets
the fidesplus monitor-stewardship-inheritance feature react when a
user is added/removed as a data steward (system manager) of a system.

The three v1 system-manager mutation routes (``bulk_assign_steward``,
``update_managed_systems``, ``remove_user_as_system_manager``, and
the approver-demotion branch of ``update_user_permissions``) gain a
``BackgroundTasks`` injection and call
``notify_system_stewards_changed(background_tasks, system.id)`` after
each successful set/remove. The model methods themselves are
unchanged so callers from tests/conftest don't trigger propagation.

This is intentionally a tiny ~50-line shim. The module docstring
documents the cutover: when the event framework lands (fides
PR #8096), trigger sites swap to ``publish_after_commit(...)`` and
this file is deleted. Do not generalize this into a ``hooks`` package.

Pairs with fidesplus ``ENG-3324/monitor-steward-inheritance-propagation``,
which registers the inheritance-reconcile callback at app startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
adamsachs added a commit that referenced this pull request May 13, 2026
Temporary single-purpose callback registry under ``fides.api`` that lets
the fidesplus monitor-stewardship-inheritance feature react when a
user is added/removed as a data steward (system manager) of a system.

The three v1 system-manager mutation routes (``bulk_assign_steward``,
``update_managed_systems``, ``remove_user_as_system_manager``, and
the approver-demotion branch of ``update_user_permissions``) gain a
``BackgroundTasks`` injection and call
``notify_system_stewards_changed(background_tasks, system.id)`` after
each successful set/remove. The model methods themselves are
unchanged so callers from tests/conftest don't trigger propagation.

This is intentionally a tiny ~50-line shim. The module docstring
documents the cutover: when the event framework lands (fides
PR #8096), trigger sites swap to ``publish_after_commit(...)`` and
this file is deleted. Do not generalize this into a ``hooks`` package.

Pairs with fidesplus ``ENG-3324/monitor-steward-inheritance-propagation``,
which registers the inheritance-reconcile callback at app startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
adamsachs added a commit that referenced this pull request May 13, 2026
Temporary single-purpose callback registry under ``fides.api`` that lets
the fidesplus monitor-stewardship-inheritance feature react when a
user is added/removed as a data steward (system manager) of a system.

The three v1 system-manager mutation routes (``bulk_assign_steward``,
``update_managed_systems``, ``remove_user_as_system_manager``, and
the approver-demotion branch of ``update_user_permissions``) gain a
``BackgroundTasks`` injection and call
``notify_system_stewards_changed(background_tasks, system.id)`` after
each successful set/remove. The model methods themselves are
unchanged so callers from tests/conftest don't trigger propagation.

This is intentionally a tiny ~50-line shim. The module docstring
documents the cutover: when the event framework lands (fides
PR #8096), trigger sites swap to ``publish_after_commit(...)`` and
this file is deleted. Do not generalize this into a ``hooks`` package.

Pairs with fidesplus ``ENG-3324/monitor-steward-inheritance-propagation``,
which registers the inheritance-reconcile callback at app startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
adamsachs added a commit that referenced this pull request May 13, 2026
Temporary single-purpose callback registry under ``fides.api`` that lets
the fidesplus monitor-stewardship-inheritance feature react when a
user is added/removed as a data steward (system manager) of a system.

The three v1 system-manager mutation routes (``bulk_assign_steward``,
``update_managed_systems``, ``remove_user_as_system_manager``, and
the approver-demotion branch of ``update_user_permissions``) gain a
``BackgroundTasks`` injection and call
``notify_system_stewards_changed(background_tasks, system.id)`` after
each successful set/remove. The model methods themselves are
unchanged so callers from tests/conftest don't trigger propagation.

This is intentionally a tiny ~50-line shim. The module docstring
documents the cutover: when the event framework lands (fides
PR #8096), trigger sites swap to ``publish_after_commit(...)`` and
this file is deleted. Do not generalize this into a ``hooks`` package.

Pairs with fidesplus ``ENG-3324/monitor-steward-inheritance-propagation``,
which registers the inheritance-reconcile callback at app startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
adamsachs added a commit that referenced this pull request May 13, 2026
Temporary single-purpose callback registry under ``fides.api`` that lets
the fidesplus monitor-stewardship-inheritance feature react when a
user is added/removed as a data steward (system manager) of a system.

The three v1 system-manager mutation routes (``bulk_assign_steward``,
``update_managed_systems``, ``remove_user_as_system_manager``, and
the approver-demotion branch of ``update_user_permissions``) gain a
``BackgroundTasks`` injection and call
``notify_system_stewards_changed(background_tasks, system.id)`` after
each successful set/remove. The model methods themselves are
unchanged so callers from tests/conftest don't trigger propagation.

This is intentionally a tiny ~50-line shim. The module docstring
documents the cutover: when the event framework lands (fides
PR #8096), trigger sites swap to ``publish_after_commit(...)`` and
this file is deleted. Do not generalize this into a ``hooks`` package.

Pairs with fidesplus ``ENG-3324/monitor-steward-inheritance-propagation``,
which registers the inheritance-reconcile callback at app startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
adamsachs added a commit that referenced this pull request May 13, 2026
Temporary single-purpose callback registry under ``fides.api`` that lets
the fidesplus monitor-stewardship-inheritance feature react when a
user is added/removed as a data steward (system manager) of a system.

The three v1 system-manager mutation routes (``bulk_assign_steward``,
``update_managed_systems``, ``remove_user_as_system_manager``, and
the approver-demotion branch of ``update_user_permissions``) gain a
``BackgroundTasks`` injection and call
``notify_system_stewards_changed(background_tasks, system.id)`` after
each successful set/remove. The model methods themselves are
unchanged so callers from tests/conftest don't trigger propagation.

This is intentionally a tiny ~50-line shim. The module docstring
documents the cutover: when the event framework lands (fides
PR #8096), trigger sites swap to ``publish_after_commit(...)`` and
this file is deleted. Do not generalize this into a ``hooks`` package.

Pairs with fidesplus ``ENG-3324/monitor-steward-inheritance-propagation``,
which registers the inheritance-reconcile callback at app startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
adamsachs added a commit that referenced this pull request May 14, 2026
Temporary single-purpose callback registry under ``fides.api`` that lets
the fidesplus monitor-stewardship-inheritance feature react when a
user is added/removed as a data steward (system manager) of a system.

The three v1 system-manager mutation routes (``bulk_assign_steward``,
``update_managed_systems``, ``remove_user_as_system_manager``, and
the approver-demotion branch of ``update_user_permissions``) gain a
``BackgroundTasks`` injection and call
``notify_system_stewards_changed(background_tasks, system.id)`` after
each successful set/remove. The model methods themselves are
unchanged so callers from tests/conftest don't trigger propagation.

This is intentionally a tiny ~50-line shim. The module docstring
documents the cutover: when the event framework lands (fides
PR #8096), trigger sites swap to ``publish_after_commit(...)`` and
this file is deleted. Do not generalize this into a ``hooks`` package.

Pairs with fidesplus ``ENG-3324/monitor-steward-inheritance-propagation``,
which registers the inheritance-reconcile callback at app startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
adamsachs added a commit that referenced this pull request May 14, 2026
Adds a second temporary callback registry (sibling to
``system_steward_change_hooks``) so fidesplus's monitor-stewardship-
inheritance propagation can react when the system↔connection-config link
that backs a monitor changes. Inheritance walks
``monitor → connection_config → system → data_stewards``, so a link
add/remove changes the desired inherited set even when the system's own
steward roster is untouched.

Same shim posture as the steward hook: single-purpose registry, loud
docstring pointing at the event-framework cutover (#8096), explicit
"do not generalize this into a ``hooks`` package" note.

* ``src/fides/api/system_connection_config_link_change_hooks.py`` —
  registry with ``register_…`` + ``notify_system_connection_config_link_changed(background_tasks, connection_config_id)``.
  Failure-isolated dispatch (each hook wrapped in try/except).
* ``src/fides/system_integration_link/service.py`` — ``set_links`` and
  ``delete_link`` accept an optional ``background_tasks``; when provided
  and a write actually happened, they fire the link-change hooks.
* ``src/fides/system_integration_link/routes.py`` — ``PUT`` and
  ``DELETE`` system-links routes inject ``BackgroundTasks`` and pass it
  through to the service.
* Tests: 5 unit tests for the registry itself
  (``tests/api/test_system_connection_config_link_change_hooks.py``)
  plus 4 service-level tests confirming dispatch fires on writes when
  ``background_tasks`` is supplied and stays silent otherwise / on
  no-op deletes.

FK ``ondelete='CASCADE'`` on both join columns means deleting a
``System`` or ``ConnectionConfig`` outright bypasses Python and won't
fire the hook — documented in the module docstring; consumers must
rely on a periodic reconciler as the convergence safety net for those.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant