fix: force 4 tokio worker threads on node #46

Merged
mickvandijke merged 1 commit into WithAutonomi:main from jacderida:fix/force-4-worker-threads
Mar 30, 2026

@jacderida
Collaborator

Summary

  • Force worker_threads = 4 on the tokio runtime regardless of CPU count
  • Remove !.cargo/config.toml exception from .gitignore

On small VMs (1-2 vCPU), tokio's default of one worker thread per CPU (num_cpus) gives only 1-2 worker threads. The NAT traversal poll() function does synchronous work (parking_lot locks, DashMap iteration) that blocks its worker thread. With only 1 worker, this freezes the entire runtime: timers stop, keepalives can't fire, and connections die silently.

Test plan

  • Tested on 6-node testnet (5 cloud + 1 local behind NAT) for 60 minutes
  • 43 consecutive successful uploads, 0 failures
  • No deadlocks or process hangs
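
The starvation described in the summary can be illustrated without tokio. The sketch below is an analogy only, using a plain fixed-size thread pool from the standard library rather than tokio's scheduler; `keepalive_fires` and its parameters are hypothetical names. It queues one blocking job followed by a "keepalive" job and reports whether the keepalive ran before a deadline.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Returns true if a keepalive job queued behind a blocking job ran within
/// `deadline` on a pool of `workers` threads. Illustrative only.
fn keepalive_fires(workers: usize, block_for: Duration, deadline: Duration) -> bool {
    let (tx, rx) = mpsc::channel::<Job>();
    let rx = Arc::new(Mutex::new(rx));

    let mut handles = Vec::new();
    for _ in 0..workers {
        let rx = Arc::clone(&rx);
        handles.push(thread::spawn(move || loop {
            // The lock guard is dropped at the end of this statement,
            // so the job itself runs without holding the queue lock.
            let job = match rx.lock().unwrap().recv() {
                Ok(job) => job,
                Err(_) => break, // channel closed: shut the worker down
            };
            job();
        }));
    }

    // Job 1: synchronous work that occupies its worker thread.
    tx.send(Box::new(move || thread::sleep(block_for))).unwrap();

    // Job 2: the keepalive, which only needs a free worker to run.
    let fired = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&fired);
    tx.send(Box::new(move || flag.store(true, Ordering::SeqCst))).unwrap();

    thread::sleep(deadline);
    let result = fired.load(Ordering::SeqCst);

    // Close the channel and wait for the workers to exit.
    drop(tx);
    for handle in handles {
        handle.join().unwrap();
    }
    result
}
```

With a 1-second blocking job and a 200 ms deadline, a pool of 1 worker misses the keepalive while a pool of 4 delivers it, which mirrors the reasoning for forcing 4 workers on 1-2 vCPU hosts.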

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 26, 2026 20:55

Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.



The default #[tokio::main] uses num_cpus worker threads. On small
node VMs (1-2 vCPU), this gives only 1-2 worker threads. The NAT
traversal poll() function does synchronous work (parking_lot locks,
DashMap iteration) that blocks its worker thread. With only 1 worker,
this freezes the entire runtime — timers stop, keepalives can't fire,
and connections die silently.

Also removes the !.cargo/config.toml exception from .gitignore so
local-only patch overrides stay out of version control.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jacderida jacderida force-pushed the fix/force-4-worker-threads branch from f10ac95 to e66d372 Compare March 26, 2026 21:07
@mickvandijke mickvandijke merged commit 76c548d into WithAutonomi:main Mar 30, 2026
11 checks passed
mickvandijke added a commit that referenced this pull request Apr 1, 2026
Add unit and e2e tests covering the remaining Section 18 scenarios:

Unit tests (32 new):
- Quorum: #4 fail→abandoned, #16 timeout→inconclusive, #27 single-round
  dual-evidence, #28 dynamic threshold undersized, #33 batched per-key,
  #34 partial response unresolved, #42 quorum-derived paid-list auth
- Admission: #5 unauthorized peer, #7 out-of-range rejected
- Config: #18 invalid config rejected, #26 dynamic paid threshold
- Scheduling: #8 dedup safety, #8 replica/paid collapse
- Neighbor sync: #35 round-robin cooldown skip, #36 cycle completion,
  #38 snapshot stability mid-join, #39 unreachable removal + slot fill,
  #40 cooldown peer removed, #41 cycle termination guarantee,
  consecutive rounds, cycle preserves sync times
- Pruning: #50 hysteresis prevents premature delete, #51 timestamp reset
  on heal, #52 paid/record timestamps independent, #23 entry removal
- Audit: #19/#53 partial failure mixed responsibility, #54 all pass,
  #55 empty failure discard, #56 repair opportunity filter,
  response count validation, digest uses full record bytes
- Types: #13 bootstrap drain, repair opportunity edge cases,
  terminal state variants
- Bootstrap claims: #46 first-seen recorded, #49 cleared on normal

E2e tests (4 new):
- #2 fresh offer with empty PoP rejected
- #5/#37 neighbor sync request returns response
- #11 audit challenge multi-key (present + absent)
- Fetch not-found for non-existent key

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>