Add num_member_sups and member_shutdown pool config options #109

Merged
seriyps merged 3 commits into epgsql:master from seriyps:num-member-sups
May 29, 2026

seriyps commented May 27, 2026

Related to #104

Each pool's single pooler_pooled_worker_sup serialises all start_child / terminate_child calls from parallel starters, async stoppers, and cull events. Under load with a slow start_mfa this becomes the bottleneck even when workers themselves are healthy.

Mirrors the approach used by Ranch 2.0: split the supervisor into N parallel shards (num_member_sups, default 1). New starts are distributed round-robin; each worker records its shard index so terminations route in O(1). Shard 1 keeps the legacy supervisor name for backward compatibility.

Also adds member_shutdown to replace the hardcoded brutal_kill — allows graceful terminate/2 shutdown. With multiple shards, concurrent graceful stops proceed in parallel, reducing total teardown time from O(N × timeout) to O(N × timeout / num_member_sups).

Preparatory refactor converts all_members tuples to a #member{} record. Hot-upgrade compatible (code_change v4→v5 handles the full state migration).

Internal refactor: positional tuple destructuring (`{MRef, Status, Time, ExpTs}`)
is replaced by `#member{}` record fields throughout the pool gen_server. Each
call site only references the fields it cares about, and partial updates
preserve the rest of the record automatically, making future field additions
mechanical instead of a fan-out of pattern rewrites.

External API stability: `pool_stats/1` still returns the legacy 4-tuple shape
via the new `member_to_info_tuple/1` helper; the `member_info()` exported type
is unchanged. Tests that pattern-match `{_, free, _, _}` continue to work.

Hot upgrade: bump -vsn 4 → 5; add `do_upgrade_to_v5` that wraps existing
v4 4-tuples in `#member{}` records. The v3 → v4 step now chains into v4 → v5
so upgrades from 1.5.x / 1.6.0 / 1.7.0 all converge on the new record shape.
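
A minimal sketch of the record wrapping and the legacy-shape helper described above. Only `member_to_info_tuple/1` and the tuple shape `{MRef, Status, Time, ExpTs}` come from this commit message; the module name, the record field names, and the `from_v4_tuple/1` and `set_status/2` helpers are assumptions made for illustration and may not match the actual code.

```erlang
-module(member_record_sketch).
-export([from_v4_tuple/1, member_to_info_tuple/1, set_status/2]).

%% Field names are assumptions derived from the legacy tuple shape.
-record(member, {mref, status, time, exp_ts}).

%% Wrap a legacy v4 4-tuple in a record (the kind of step a v4 -> v5
%% state upgrade would apply to every entry of all_members).
from_v4_tuple({MRef, Status, Time, ExpTs}) ->
    #member{mref = MRef, status = Status, time = Time, exp_ts = ExpTs}.

%% Convert back to the legacy 4-tuple so external callers keep
%% seeing the old shape.
member_to_info_tuple(#member{mref = MRef, status = Status,
                             time = Time, exp_ts = ExpTs}) ->
    {MRef, Status, Time, ExpTs}.

%% Partial update: only the status field is named, every other
%% field is carried over unchanged.
set_status(Member = #member{}, Status) ->
    Member#member{status = Status}.
```

Because `set_status/2` names a single field, adding another field to the record later would not require touching it, which is the maintenance benefit the refactor is after.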

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

seriyps force-pushed the num-member-sups branch from 96bfd58 to 04a34b8 on May 28, 2026 08:55

seriyps and others added 2 commits on May 28, 2026 11:40

Each pool's single pooler_pooled_worker_sup serialises all start_child and
terminate_child calls from parallel starters, async stoppers and cull events.
With slow start_mfa (e.g. a monolithic third-party start_link that opens a
network connection) this becomes the bottleneck even when individual workers
are healthy.

Mirrors the approach used by Ranch 2.0 (num_conns_sups): split the single
supervisor into N parallel shards. New starts are distributed round-robin
across shards; each #member{} records its shard_idx so terminations route to
the correct supervisor in O(1). Defaults to 1 — identical to the legacy layout.

Shard 1 always keeps the legacy unsuffixed name (pooler_<pool>_member_sup)
for backward compatibility; additional shards use _2, _3, ... suffixes.
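
The round-robin distribution, O(1) routing, and shard naming can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the module and function names are hypothetical, the `#pool{}` record is trimmed to the two fields named in the change list, and the exact suffix format (`_2`, `_3`) is inferred from the paragraph above.

```erlang
-module(shard_sketch).
-export([new/2, shard_names/2, next_sup/1, sup_for/2]).

%% Trimmed, hypothetical version of the pool state: a tuple of
%% supervisor names plus the index of the shard to use next.
-record(pool, {member_sups, next_shard = 1}).

%% Shard 1 keeps the legacy unsuffixed name; shards 2..N get a suffix.
shard_names(PoolName, N) when is_atom(PoolName), is_integer(N), N >= 1 ->
    Base = "pooler_" ++ atom_to_list(PoolName) ++ "_member_sup",
    [shard_name(Base, I) || I <- lists:seq(1, N)].

shard_name(Base, 1) -> list_to_atom(Base);
shard_name(Base, I) -> list_to_atom(Base ++ "_" ++ integer_to_list(I)).

new(PoolName, N) ->
    #pool{member_sups = list_to_tuple(shard_names(PoolName, N))}.

%% Round-robin: pick the current shard, then advance the counter,
%% wrapping back to 1 after the last shard.
next_sup(#pool{member_sups = Sups, next_shard = Idx} = Pool) ->
    Next = (Idx rem tuple_size(Sups)) + 1,
    {Idx, element(Idx, Sups), Pool#pool{next_shard = Next}}.

%% Termination routing: element/2 on a tuple is a constant-time
%% lookup, so a stored shard index resolves to its supervisor in O(1).
sup_for(ShardIdx, #pool{member_sups = Sups}) ->
    element(ShardIdx, Sups).
```

Storing only the integer index on each member, rather than the supervisor name, keeps member records small; with a single shard the counter always wraps back to 1, which matches the legacy layout.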

Changes:
- pool_config() gains num_member_sups (pos_integer, default 1)
- #pool{} replaces member_sup with member_sups tuple + next_shard counter
- #member{} gains shard_idx for O(1) termination routing without leaking
  supervisor names across every member record
- pooler_pool_sup starts N member supervisors; add_member_sups/4 supports
  runtime increase via pool_reconfigure/2 (decrease refused)
- pooler_starter: tagged start/stop spec tuples; new '$pooler_member_sup' and
  '$pooler_pool' placeholders; '$pooler_pool_name' kept as deprecated alias
- code_change(4,...) extends do_upgrade_to_v5 to also migrate #pool{} shape
  (member_sup -> member_sups 1-tuple, next_shard=1, shard_idx=1 on all members)
- pooler.appup.src: note for future version bump to include
  {update, pooler_pool_sup, supervisor}
- Tests: 7 EUnit + 1 PropEr property covering distribution, routing, cull,
  and reconfigure scenarios

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

Replaces the hardcoded brutal_kill in pooler_pooled_worker_sup with a
configurable member_shutdown option. Defaults to brutal_kill (unchanged
behaviour); a pos_integer() value maps directly to the OTP supervisor child
spec Shutdown field, sending exit(Pid, shutdown) and waiting up to that many
milliseconds for the worker's terminate/2 to complete.

Effective only when the worker has called process_flag(trap_exit, true) in
its init/1 — exit(Pid, shutdown) kills a non-trapping process immediately.
Ignored by custom stop_mfa that bypasses supervisor:terminate_child/2.
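
For reference, a worker that can benefit from a timeout-based `member_shutdown` might look like the sketch below. The module name and the `Owner` argument (used only so that the effect of `terminate/2` is observable) are hypothetical; the `gen_server` and `process_flag/2` calls are standard OTP.

```erlang
-module(graceful_worker_sketch).
-behaviour(gen_server).
-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2]).

%% Owner is a hypothetical pid that gets notified on termination.
start_link(Owner) ->
    gen_server:start_link(?MODULE, Owner, []).

init(Owner) ->
    %% Without trapping exits, exit(Pid, shutdown) from the supervisor
    %% kills the process immediately and terminate/2 never runs.
    process_flag(trap_exit, true),
    {ok, Owner}.

handle_call(_Request, _From, Owner) ->
    {reply, ok, Owner}.

handle_cast(_Msg, Owner) ->
    {noreply, Owner}.

%% Runs when the parent sends the shutdown exit signal; cleanup has to
%% finish within the shutdown timeout or the supervisor kills the worker.
terminate(Reason, Owner) ->
    Owner ! {terminated, self(), Reason},
    ok.
```

A real worker would close sockets or flush buffers in `terminate/2` instead of sending a message.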

Pairs with num_member_sups: stops across different shards proceed in parallel,
reducing total teardown time from O(N x timeout) to O(N x timeout / num_member_sups).
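
As a rough illustration of how the two options combine, a pool config could look like the fragment below. The pool name, the counts, and the `my_worker` module are hypothetical, and the map form of the config is an assumption about the pooler version in use; only `num_member_sups` and `member_shutdown` are the options this PR adds.

```erlang
%% Hypothetical pool config; values are illustrative only.
PoolConfig = #{
    name => demo_pool,
    init_count => 4,
    max_count => 8,
    start_mfa => {my_worker, start_link, []},
    num_member_sups => 4,     %% default: 1 (legacy single supervisor)
    member_shutdown => 5000   %% default: brutal_kill
},
pooler:new_pool(PoolConfig).
```

With these numbers, and assuming all 8 members are spread evenly across the shards and each one uses its full 5000 ms, a single supervisor would need up to 8 × 5 s = 40 s to stop them one after another, while 4 shards each stop 2 members in sequence, for about 10 s in the worst case.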

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

seriyps force-pushed the num-member-sups branch from 04a34b8 to 611060a on May 28, 2026 09:46

seriyps merged commit 026fc09 into epgsql:master on May 29, 2026
7 checks passed