bugfix/server cpu usage #527

Merged
zeroXbrock merged 10 commits into main from bugfix/server-cpu-usage
Apr 28, 2026

@zeroXbrock zeroXbrock commented Apr 24, 2026

Motivation

CPU usage from server sessions was far too high. According to htop, each added session cost about 2.6% CPU, even with no spammer running.

Solution

There was a hot loop in TxActor::run. When stopping a spammer, we shut down the scenario via scenario.ctx.cancel_token.cancel(). That dropped the owned mpsc sender handles. Once every sender is gone, the receiver stops waiting and returns None immediately on every call, so the tokio::select! inside the loop completed instantly on each iteration, and the loop spun continuously and burned CPU.
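To make the failure mode concrete, here is a minimal sketch of the same effect using `std::sync::mpsc` as a dependency-free stand-in for tokio's mpsc (an assumption for illustration; this is not the TxActor code). In std the closed-channel signal is `Err` rather than `None`, but the behavior is analogous: once all senders are dropped, `recv()` returns at once instead of waiting. The function names are hypothetical.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::time::{Duration, Instant};

/// Polls `rx` in a loop for `window` and counts how often `recv` returned
/// immediately because the channel was closed (all senders dropped).
/// Only call this with a closed channel; an open one would block.
fn count_closed_wakeups(rx: &Receiver<u32>, window: Duration) -> u64 {
    let deadline = Instant::now() + window;
    let mut wakeups = 0u64;
    while Instant::now() < deadline {
        // With every sender dropped, recv() does not block: it errors at once.
        if rx.recv().is_err() {
            wakeups += 1;
        }
    }
    wakeups
}

/// The guard such a loop needs: stop as soon as the channel reports closed.
fn drain_until_closed(rx: &Receiver<u32>) -> Vec<u32> {
    let mut seen = Vec::new();
    while let Ok(msg) = rx.recv() {
        seen.push(msg);
    }
    seen
}

fn main() {
    // Hot loop: nothing to receive, yet the loop never sleeps.
    let (tx, rx) = channel::<u32>();
    drop(tx);
    let spins = count_closed_wakeups(&rx, Duration::from_millis(20));
    println!("closed-channel wakeups in 20ms: {spins}");

    // Guarded loop: buffered messages are delivered, then the loop exits.
    let (tx, rx) = channel::<u32>();
    tx.send(1).unwrap();
    tx.send(2).unwrap();
    drop(tx);
    println!("drained: {:?}", drain_until_closed(&rx));
}
```

The first half shows why an unguarded loop burns CPU; the second shows the usual guard of treating the closed signal as a reason to exit or disable that branch.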

The solution I used here was to refresh the handles at the end of a spam run. If a user doesn't want to do another spam run, they should be able to just drop the entire Contender instance and all the handles will be dropped.
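One way to picture "refreshing the handles" is to swap a closed channel pair for a fresh one. The sketch below assumes that reading and uses `std::sync::mpsc` in place of tokio's mpsc; the `Handles` type and its methods are hypothetical and are not Contender's actual types.

```rust
use std::sync::mpsc::{channel, Receiver, Sender, TryRecvError};

/// Hypothetical holder for one channel pair (illustration only).
struct Handles {
    tx: Option<Sender<u64>>,
    rx: Receiver<u64>,
}

impl Handles {
    fn new() -> Self {
        let (tx, rx) = channel();
        Handles { tx: Some(tx), rx }
    }

    /// Simulates the end of a run: the only sender is dropped.
    fn close(&mut self) {
        self.tx = None;
    }

    /// Replaces both ends with a fresh, open channel.
    fn refresh(&mut self) {
        *self = Handles::new();
    }

    /// Non-blocking receive, used here to observe the channel state.
    fn poll(&self) -> Result<u64, TryRecvError> {
        self.rx.try_recv()
    }
}

fn main() {
    let mut handles = Handles::new();
    handles.close();
    // Closed: every poll returns immediately with Disconnected.
    println!("after close:   {:?}", handles.poll());

    handles.refresh();
    // Open but idle: an async receiver would wait here instead of spinning.
    println!("after refresh: {:?}", handles.poll());

    handles.tx.as_ref().unwrap().send(7).unwrap();
    println!("after send:    {:?}", handles.poll());
}
```

After the refresh the receiver reports "empty" rather than "closed", which is the state in which an awaiting receiver sleeps until a message arrives.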

We also had to refactor the CancellationToken handling for TestScenario (see breaking changes below). To remove a session mid-initialization, we need the cancellation token before or while we initialize Contender. We place this token in ContenderCtx and pass it to TestScenario, which creates a "child token" from it. The child lets TestScenario cancel its own tasks internally (which it does at the end of a spam run) without cancelling the parent context. With the "parent token" owned by ContenderCtx, we can propagate the cancel signal all the way down to TxActor (which keeps track of pending txs and monitors for receipts).
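The parent/child relationship can be sketched without any dependencies. The `Token` type below is a hypothetical, polling-only stand-in for `tokio_util::sync::CancellationToken` (which provides `child_token()`, `cancel()`, and `is_cancelled()`); it has no async wakeups and is not how tokio_util implements it. It only illustrates the direction in which cancellation flows.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Minimal hierarchical cancellation flag (illustration only).
#[derive(Clone)]
struct Token {
    own: Arc<AtomicBool>,
    // Flags of every ancestor; if any is set, this token counts as cancelled.
    ancestors: Vec<Arc<AtomicBool>>,
}

impl Token {
    fn new() -> Self {
        Token { own: Arc::new(AtomicBool::new(false)), ancestors: Vec::new() }
    }

    /// Creates a token that is cancelled when `self` (or any ancestor) is,
    /// but whose own cancellation never travels upward.
    fn child_token(&self) -> Self {
        let mut ancestors = self.ancestors.clone();
        ancestors.push(Arc::clone(&self.own));
        Token { own: Arc::new(AtomicBool::new(false)), ancestors }
    }

    fn cancel(&self) {
        self.own.store(true, Ordering::SeqCst);
    }

    fn is_cancelled(&self) -> bool {
        self.own.load(Ordering::SeqCst)
            || self.ancestors.iter().any(|flag| flag.load(Ordering::SeqCst))
    }
}

fn main() {
    let ctx_token = Token::new();                    // role of the token in ContenderCtx
    let scenario_token = ctx_token.child_token();    // role of TestScenario's child token
    let actor_token = scenario_token.child_token();  // something further down, e.g. an actor

    scenario_token.cancel(); // internal cancel, e.g. at the end of a run
    println!("parent cancelled: {}", ctx_token.is_cancelled());
    println!("actor cancelled:  {}", actor_token.is_cancelled());

    let next_child = ctx_token.child_token();
    ctx_token.cancel(); // e.g. the session is removed
    println!("next child cancelled: {}", next_child.is_cancelled());
}
```

Cancelling the child leaves the parent untouched, while cancelling the parent reaches every descendant, which matches the behavior the description relies on.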

Also added an optional tokio-metrics feature, and refactored flush_loop for better cancellation detection.

Breaking changes

  • TestScenario::new now requires an additional parameter: cancel_token: CancellationToken

PR Checklist

  • Added Tests
  • Added Documentation
  • Ran cargo +nightly clippy --workspace --lib --examples --tests --benches --all-features --locked --fix
  • Ran cargo fmt --all
  • Note breaking changes in PR description, if applicable
  • update changelogs
    • Update CHANGELOG.md in each affected crate
    • add a high-level description in the root changelog

  - add CancellationToken to orchestrator::ContenderCtx, to be given to TestScenario
  - add CancellationToken as param to TestScenario::new
    - parent has unfettered access
    - TestScenario creates a child token for internal use
  - add a dedicated stop channel to TxActor
    - ensures the 'stop' message always gets delivered
  - add special cases to long-running methods to quit in response to the cancel token
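The point of a dedicated stop channel can be sketched as follows, again with `std::sync::mpsc` standing in for tokio's channels. This is an assumed shape, not the TxActor implementation: `run_actor` and its parameters are hypothetical. If "stop" shared a bounded work queue, it could be rejected or stuck behind pending work; on its own channel it is always accepted and is checked first.

```rust
use std::sync::mpsc::{channel, sync_channel, Receiver, TryRecvError, TrySendError};
use std::thread;

/// Processes work until a stop signal arrives or the work channel closes.
/// Returns the number of work items handled.
fn run_actor(work_rx: Receiver<u64>, stop_rx: Receiver<()>) -> usize {
    let mut processed = 0;
    loop {
        // Check the stop channel first so queued work cannot delay it.
        // A dropped stop sender is also treated as a reason to quit.
        match stop_rx.try_recv() {
            Ok(()) | Err(TryRecvError::Disconnected) => return processed,
            Err(TryRecvError::Empty) => {}
        }
        match work_rx.try_recv() {
            Ok(_item) => processed += 1,
            // Nothing to do right now: yield instead of spinning hard.
            Err(TryRecvError::Empty) => thread::yield_now(),
            // Work channel closed: exit rather than loop forever.
            Err(TryRecvError::Disconnected) => return processed,
        }
    }
}

fn main() {
    let (work_tx, work_rx) = sync_channel::<u64>(2);
    let (stop_tx, stop_rx) = channel::<()>();

    work_tx.try_send(1).unwrap();
    work_tx.try_send(2).unwrap();
    // The bounded work queue is full, so a message sent through it would be rejected.
    let full = matches!(work_tx.try_send(3), Err(TrySendError::Full(_)));
    println!("work queue full: {full}");

    // The stop channel is separate, so the signal is still delivered.
    stop_tx.send(()).unwrap();
    println!("processed before stop: {}", run_actor(work_rx, stop_rx));
}
```

In this sketch the actor exits before touching the backlog, because the stop signal does not have to wait its turn in the work queue.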
@zeroXbrock zeroXbrock force-pushed the bugfix/server-cpu-usage branch from 4cac431 to ff17132 Compare April 28, 2026 00:07
@zeroXbrock zeroXbrock added the bug Something isn't working label Apr 28, 2026
@zeroXbrock zeroXbrock merged commit 835d293 into main Apr 28, 2026
7 checks passed
@zeroXbrock zeroXbrock deleted the bugfix/server-cpu-usage branch April 28, 2026 21:53