Labels: A-developer-xp (Area: developer experience), A-networking (Area: Networking, ring protocol, peer discovery), E-hard (Experience needed to fix/implement: Hard / a lot), S-needs-design (Status: Needs architectural design or RFC), T-enhancement (Type: Improvement to existing functionality)
Description
Problem
Transient handling spans connection_manager, p2p_protoc, and routing code. Budgeting, reservation, promotion, and cleanup are scattered across multiple maps and counters, making it hard to reason about correctness and concurrency. Nacho called out in Matrix that this muddiness makes concurrency issues hard to spot; AI reviewers also struggle to follow the lifecycle.
Specific pain points:
- Budget and lifecycle logic (reserve, promote, drop, expire) are spread between connection_manager and the bridge.
- Promotion requires touching several data structures manually (pending reservations, transient map, connections_by_location), which is error-prone.
- TTL enforcement is anchored in the bridge task rather than encapsulated.
- Admission idempotency/race handling is duplicated.
Proposal
Introduce a dedicated TransientConnectionManager that owns transient lifecycle and budget.
- API surface: `try_reserve(peer, location) -> bool`, `promote(peer, location) -> Result<(), PromoteError>`, `drop(peer)`, `expire_due()` (or an internal cleanup task), `is_transient(peer)`, `count()`, `budget()`.
- Encapsulate the map + counter and enforce atomicity inside the manager (no cross-module map/counter mutations).
- ConnectionManager holds a reference and delegates transient operations; routing/topology use a single source of truth for transient state.
- Promotion path: the bridge calls `drop_transient`/`promote` via the manager; `add_connection` only handles established peers and needs no transient bookkeeping knowledge.
- TTL: the transient manager owns timers/cleanup (a single task or timer wheel) instead of ad-hoc tasks in the bridge.
- Idempotency: manager handles duplicate admissions/promotions centrally to reduce TOCTOU races.
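The API surface above could be sketched roughly as follows. This is a minimal illustration, not the real implementation: `PeerId` and `Location` are placeholder types standing in for the crate's actual ones, the TTL uses a periodic-sweep strategy (one of the options listed under Questions below), and `drop` is spelled `drop_transient` here to avoid shadowing `std::mem::drop`.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Placeholder types; the real crate's peer/location types would replace these.
type PeerId = u64;
type Location = f64;

#[derive(Debug, PartialEq)]
enum PromoteError {
    NotReserved,
}

/// Owns all transient state: one map, one budget, no cross-module mutation.
struct TransientConnectionManager {
    budget: usize,
    ttl: Duration,
    // peer -> (location, reserved_at)
    transients: HashMap<PeerId, (Location, Instant)>,
}

impl TransientConnectionManager {
    fn new(budget: usize, ttl: Duration) -> Self {
        Self { budget, ttl, transients: HashMap::new() }
    }

    /// Reserve a transient slot; idempotent for an already-reserved peer.
    fn try_reserve(&mut self, peer: PeerId, location: Location) -> bool {
        if self.transients.contains_key(&peer) {
            return true; // duplicate admission is a no-op, still reserved
        }
        if self.transients.len() >= self.budget {
            return false; // budget exhausted
        }
        self.transients.insert(peer, (location, Instant::now()));
        true
    }

    /// Promote a reserved transient to established. The entry is removed from
    /// the transient map atomically, so callers never touch the map directly.
    fn promote(&mut self, peer: PeerId, _location: Location) -> Result<(), PromoteError> {
        if self.transients.remove(&peer).is_some() {
            Ok(())
        } else {
            Err(PromoteError::NotReserved)
        }
    }

    fn drop_transient(&mut self, peer: PeerId) {
        self.transients.remove(&peer);
    }

    /// Periodic sweep: evict reservations older than the TTL.
    /// Returns how many entries were expired.
    fn expire_due(&mut self) -> usize {
        let ttl = self.ttl;
        let before = self.transients.len();
        self.transients.retain(|_, (_, reserved_at)| reserved_at.elapsed() < ttl);
        before - self.transients.len()
    }

    fn is_transient(&self, peer: PeerId) -> bool {
        self.transients.contains_key(&peer)
    }
    fn count(&self) -> usize {
        self.transients.len()
    }
    fn budget(&self) -> usize {
        self.budget
    }
}
```

Because every mutation goes through `&mut self` on one owner, the reserve/promote/drop invariants hold per call without the caller coordinating several maps and counters.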
Rationale
- Clear separation between transient and established connections should reduce concurrency bugs and make reviews/maintenance easier.
- Limits the number of places that mutate transient state, making invariants testable.
- Makes it easier to add targeted tests (unit + integration) for transient lifecycle without threading through the bridge.
Isotonic / forwarding follow-up
- Keep the global isotonic regression (shared curve across peers) but bound/reset it so it doesn’t grow unbounded and can adapt. Per-peer offsets stay layered on the shared curve.
- Add a simple recency bias that’s easy to reason about (e.g., short cooldown or small recent-window weighting) to avoid overfitting to stale data; keep it minimal to avoid hard-to-debug behaviors.
- Move storage out of the ad-hoc global static into a controlled owner (even if process-global) with explicit reset/bounds.
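One way the "simple recency bias" above could look: a cooldown tracker that scales a candidate's forwarding score down right after use and decays the penalty back to neutral over a short window. Everything here is an assumption for illustration: `PeerId`, the linear decay, and the 0.5 floor are arbitrary choices, and the tracker would live in whatever controlled owner replaces the global static.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Placeholder peer identifier; the real crate's type would replace this.
type PeerId = u64;

/// Tracks when each peer was last forwarded to and applies a bounded,
/// short-lived penalty so a just-used neighbor is deprioritized, not excluded.
struct RecencyBias {
    cooldown: Duration,
    last_used: HashMap<PeerId, Instant>,
}

impl RecencyBias {
    fn new(cooldown: Duration) -> Self {
        Self { cooldown, last_used: HashMap::new() }
    }

    fn record_use(&mut self, peer: PeerId) {
        self.last_used.insert(peer, Instant::now());
    }

    /// Multiply the base score by a penalty that decays linearly back to 1.0
    /// over the cooldown window. The floor of 0.5 (an arbitrary choice) keeps
    /// the bias bounded and easy to reason about.
    fn weighted_score(&self, peer: PeerId, base_score: f64) -> f64 {
        match self.last_used.get(&peer) {
            Some(t) => {
                let elapsed = t.elapsed().as_secs_f64();
                let window = self.cooldown.as_secs_f64();
                let decay = (elapsed / window).min(1.0); // 0.0 just after use, 1.0 once cooled down
                base_score * (0.5 + 0.5 * decay)
            }
            None => base_score, // never used recently: no penalty
        }
    }
}
```

Keeping the penalty bounded and linear avoids the hard-to-debug behaviors a more elaborate decay curve could introduce, while still preventing repeated hammering of the same neighbor.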
Questions / Decisions
- Should promotion be synchronous (returns Result) or enqueue a promotion task? (lean: synchronous to keep invariants tight)
- Do we need metrics/hooks for transient events (reserve, promote, drop, expire) as part of the refactor?
- TTL enforcement strategy: per-connection timer vs. periodic sweep.
- Recency handling: what’s the minimal cooldown/decay that still prevents hammering the same neighbor?
Next Steps
- Sketch API in a small RFC/PRD comment thread on this issue.
- Draft the refactor in a follow-up PR (after the current transient stack) with focused tests:
  - Concurrent reserve/promote/drop stress
  - TTL expiry cleanup
  - Promotion idempotency and cap enforcement
  - Estimator bounds/reset + minimal recency bias
- Wire the bridge and connection_manager to use the new manager, minimizing surface changes elsewhere.
CC: @iduartgomez for architectural alignment.