Add ProjectionCoordinator + distributor layer for hot-cold daemon coo…#96

Merged: jeremydmiller merged 1 commit into main from feature/projection-coordinator-83 on May 14, 2026

Conversation

@jeremydmiller
Member

…rdination

Closes #83.

Polecat could spin up the async projection daemon via PolecatDaemonHostedService (single-node, no coordination), but had no path to safely run multiple Polecat nodes against the same database — every node would try to drive every shard. This adds the multi-node-aware coordinator that mirrors Marten's hot-cold distributor design, adapted to SQL Server idioms.

What's added

  • SqlServerAppLock — Polecat's analog to Marten's Weasel.Postgresql.AdvisoryLock. Wraps sp_getapplock / sp_releaseapplock with @LockOwner = 'Session', holding a dedicated SqlConnection for the lifetime of the lock owner so the Session-scoped locks persist until we explicitly release or the connection drops. Stale-handle detection clears local tracking when the connection is found broken so the coordinator's HasLock() check stays honest.

  • IProjectionDistributor + ProjectionSet — the per-cycle "(database × shards) with a lock id" grouping the coordinator polls.

  • Three concrete distributors:

    • SoloProjectionDistributor — single-node fall-through, no locks (test / explicit-opt-out path; not auto-selected by the coordinator).
    • SingleTenantProjectionDistributor — one set per shard, deterministic lock id from db.Identifier:schema:shard.Identity so every node in the deployment races for the same sp_getapplock resource.
    • MultiTenantedProjectionDistributor — one set per database, all shards grouped behind one per-database lock so a tenant's projections never split across nodes.
  • ProjectionCoordinator — the concrete IProjectionCoordinator implementation. The ExecuteAsync loop is a slot-for-slot mirror of Marten's ProjectionCoordinator.executeAsync:

    • Build the current distribution
    • For each set: if we hold the lock, ensure agents are running; if not, try to attain it (start agents on success, stop them on failure to recover from a lost lock)
    • Sleep based on agent pause status

    The coordinator picks SingleTenantProjectionDistributor for DatabaseCardinality.Single and MultiTenantedProjectionDistributor for StaticMultiple (mirrors Marten's Tenancy is DefaultTenancy ? Single : MultiTenanted choice). There is also a ProjectionCoordinator<T> for ancillary multi-store DI registration.
  • DI wiring on PolecatConfigurationExpression: AddProjectionCoordinator(DaemonMode) and the typed AddProjectionCoordinator<T>(DaemonMode). Mutually exclusive with AddAsyncDaemon — pick one.
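
The difference between the two lock-based distributors can be sketched as follows. This is a minimal illustration, not Polecat's code: `Shard`, `ProjectionSetSketch`, and `DistributorSketch` are hypothetical stand-ins, and using the colon-joined string directly as the `sp_getapplock` resource name is an assumption (the real lock id may be derived differently).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for Polecat's shard and ProjectionSet types.
public sealed record Shard(string Database, string Schema, string Identity);

public sealed record ProjectionSetSketch(
    string LockResource,
    string Database,
    IReadOnlyList<Shard> Shards);

public static class DistributorSketch
{
    // Single-tenant style: one set per shard. The lock resource is built only
    // from stable names, so every node computes the same value for a shard.
    public static IReadOnlyList<ProjectionSetSketch> PerShard(IEnumerable<Shard> shards) =>
        shards
            .Select(s => new ProjectionSetSketch(
                $"{s.Database}:{s.Schema}:{s.Identity}",
                s.Database,
                new[] { s }))
            .ToList();

    // Multi-tenanted style: one set per database, so all of a tenant's shards
    // sit behind a single lock and always run on the same node.
    public static IReadOnlyList<ProjectionSetSketch> PerDatabase(IEnumerable<Shard> shards) =>
        shards
            .GroupBy(s => s.Database)
            .Select(g => new ProjectionSetSketch(g.Key, g.Key, g.ToList()))
            .ToList();
}
```

If an integer id were needed instead of a string, `string.GetHashCode()` would not be a safe source: .NET randomizes string hash codes per process, so two nodes would compute different ids for the same shard.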

Tests

src/Polecat.Tests/Daemon/sql_server_app_lock_tests.cs — 6 cases for the lock primitive, all passing against SQL Server 2025 docker on net9.0:

  • Only one of two SqlServerAppLock instances acquires a given id
  • Releasing lets a waiting instance acquire
  • Disposing one auto-releases its session locks (Session-scope)
  • Multiple distinct lock ids are independent
  • TryAttainLockAsync is idempotent when the id is already owned
  • Releasing an unknown id is a no-op
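
The behaviors listed above rest on how `sp_getapplock` reports its outcome. The following is a rough sketch of a non-blocking, session-owned acquire, written against the provider-neutral `System.Data.Common` types so it compiles without a SQL Server driver. `AppLockSketch` and its method names are hypothetical and are not the `SqlServerAppLock` API; in practice the `DbConnection` would be an open `SqlConnection` that stays open for as long as the lock should be held.

```csharp
using System;
using System.Data;
using System.Data.Common;
using System.Threading;
using System.Threading.Tasks;

public static class AppLockSketch
{
    // sp_getapplock returns 0 (granted) or 1 (granted after waiting) on success.
    // Negative values mean timeout, cancellation, deadlock victim, or a call error.
    public static bool IsGranted(int returnCode) => returnCode >= 0;

    // Tries to take an exclusive lock owned by the session (the connection),
    // returning immediately if another session already holds it.
    public static async Task<bool> TryAttainAsync(
        DbConnection openConnection, string resource, CancellationToken token)
    {
        await using var cmd = openConnection.CreateCommand();
        cmd.CommandText = "sp_getapplock";
        cmd.CommandType = CommandType.StoredProcedure;
        AddInput(cmd, "@Resource", resource);
        AddInput(cmd, "@LockMode", "Exclusive");
        AddInput(cmd, "@LockOwner", "Session");
        AddInput(cmd, "@LockTimeout", 0); // 0 = do not wait for the lock

        var result = cmd.CreateParameter();
        result.ParameterName = "@ReturnValue";
        result.DbType = DbType.Int32;
        result.Direction = ParameterDirection.ReturnValue;
        cmd.Parameters.Add(result);

        await cmd.ExecuteNonQueryAsync(token);
        return IsGranted(Convert.ToInt32(result.Value));
    }

    // Releases a session-owned lock on the same connection that acquired it.
    public static async Task ReleaseAsync(
        DbConnection openConnection, string resource, CancellationToken token)
    {
        await using var cmd = openConnection.CreateCommand();
        cmd.CommandText = "sp_releaseapplock";
        cmd.CommandType = CommandType.StoredProcedure;
        AddInput(cmd, "@Resource", resource);
        AddInput(cmd, "@LockOwner", "Session");
        await cmd.ExecuteNonQueryAsync(token);
    }

    private static void AddInput(DbCommand cmd, string name, object value)
    {
        var p = cmd.CreateParameter();
        p.ParameterName = name;
        p.Value = value;
        cmd.Parameters.Add(p);
    }
}
```

Because a session-owned lock belongs to the connection rather than a transaction, closing or losing the connection releases it, which is consistent with the dispose case above. A wrapper would presumably also keep a local set of held ids so that a repeated acquire or a release of an unknown id can be answered without a round trip; that part is not shown here.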

The full coordinator loop (leadership election + agent lifecycle) follows Marten's well-tested implementation slot-for-slot; the lock primitive is the genuinely Polecat-new code path and gets dedicated coverage.
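
For orientation, one pass of the per-set logic described under "What's added" might look like the sketch below. It is an illustration under assumptions, not the merged code: `ILockSketch`, `IAgentsSketch`, and the in-memory fakes are hypothetical, sets are reduced to their lock ids, and the sleep step between passes is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical seams standing in for the lock primitive and the agent lifecycle.
public interface ILockSketch
{
    bool HasLock(string lockId);
    Task<bool> TryAttainLockAsync(string lockId);
}

public interface IAgentsSketch
{
    Task EnsureRunningAsync(string lockId);
    Task StopAsync(string lockId);
}

public static class CoordinatorCycleSketch
{
    // One pass over the current distribution.
    public static async Task RunOnceAsync(
        IEnumerable<string> lockIds, ILockSketch locks, IAgentsSketch agents)
    {
        foreach (var id in lockIds)
        {
            if (locks.HasLock(id) || await locks.TryAttainLockAsync(id))
            {
                // We own this set, either already or as of now: keep its agents running.
                await agents.EnsureRunningAsync(id);
            }
            else
            {
                // Another node owns it: stop anything left over from a lost lock.
                await agents.StopAsync(id);
            }
        }
    }
}

// In-memory fakes for trying the cycle without a database.
public sealed class FakeLocks : ILockSketch
{
    private readonly HashSet<string> _held = new();
    private readonly HashSet<string> _takenElsewhere;

    public FakeLocks(params string[] takenElsewhere) =>
        _takenElsewhere = new HashSet<string>(takenElsewhere);

    public bool HasLock(string lockId) => _held.Contains(lockId);

    public Task<bool> TryAttainLockAsync(string lockId) =>
        Task.FromResult(!_takenElsewhere.Contains(lockId) && _held.Add(lockId));
}

public sealed class RecordingAgents : IAgentsSketch
{
    public HashSet<string> Running { get; } = new();

    public Task EnsureRunningAsync(string lockId)
    {
        Running.Add(lockId);
        return Task.CompletedTask;
    }

    public Task StopAsync(string lockId)
    {
        Running.Remove(lockId);
        return Task.CompletedTask;
    }
}
```

Running `RunOnceAsync` with ids `"a"` and `"b"` against `new FakeLocks("b")` starts agents for `"a"` and stops any that were still recorded for `"b"`, which is the lost-lock recovery path.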

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jeremydmiller jeremydmiller merged commit 15e4089 into main May 14, 2026
6 checks passed
@jeremydmiller jeremydmiller deleted the feature/projection-coordinator-83 branch May 14, 2026 15:25

Development

Successfully merging this pull request may close these issues.

Add projection distributor layer (Solo / SingleTenant / MultiTenanted) for multi-node coordinated daemons
