timeout on slow downloads and bandwidth tracking improvement #593

Merged
pefontana merged 5 commits into main from p2p-bandwidth-sort-improvements
Mar 6, 2026

Conversation

@IAvecilla
Contributor

@IAvecilla IAvecilla commented Feb 26, 2026

  • Introduce a PeerBandwidth enum to distinguish between peers with no download data yet (NotMeasured) and peers with actual measurements (Measured(f64)), replacing the raw f64 bandwidth field that defaulted to 0.0
  • Fix peer sorting for downloads. Peers are now sorted into three priority tiers: measured positive bandwidth first, then unmeasured peers, then peers with measured zero bandwidth (failed downloads). This keeps untested peers from being deprioritized just because they haven't been tried yet
  • Add 5-minute timeout for blob downloads to prevent indefinitely stuck downloads from blocking progress
  • Preserve peer bandwidth data on disconnect: instead of removing the connection entry entirely, only the path info is cleared, so bandwidth history survives reconnections
  • Clear bandwidth tracking after warmup to reset stale measurements when a new training run state begins
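
For readers skimming the description, the enum and the tiered ordering from the first two bullets can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only `PeerBandwidth`, `NotMeasured`, and `Measured(f64)` come from the description. The helper names (`tier`, `value`, `download_order`, `sort_peers`), the `(String, PeerBandwidth)` peer representation, and the choice to order peers within a tier by descending bandwidth are all assumptions made for the example.

```rust
use std::cmp::Ordering;

/// Enum named in the PR description; the derives are an assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PeerBandwidth {
    /// No download has been attempted from this peer yet.
    NotMeasured,
    /// Observed bandwidth; 0.0 is treated as a failed download.
    Measured(f64),
}

impl PeerBandwidth {
    /// Hypothetical helper: priority tier, where a lower value sorts first.
    fn tier(&self) -> u8 {
        match self {
            PeerBandwidth::Measured(bw) if *bw > 0.0 => 0, // known-good peers
            PeerBandwidth::NotMeasured => 1,               // untested peers
            PeerBandwidth::Measured(_) => 2,               // zero (or invalid) bandwidth
        }
    }

    /// Hypothetical helper: numeric value used to break ties inside a tier.
    fn value(&self) -> f64 {
        match self {
            PeerBandwidth::Measured(bw) => *bw,
            PeerBandwidth::NotMeasured => 0.0,
        }
    }
}

/// Compare by tier first, then by descending bandwidth within a tier
/// (the within-tier rule is an assumption, not stated in the PR).
fn download_order(a: &PeerBandwidth, b: &PeerBandwidth) -> Ordering {
    a.tier()
        .cmp(&b.tier())
        .then_with(|| b.value().total_cmp(&a.value()))
}

/// Sort peers in place so the most promising download sources come first.
fn sort_peers(peers: &mut [(String, PeerBandwidth)]) {
    peers.sort_by(|a, b| download_order(&a.1, &b.1));
}
```

With a raw `f64` defaulting to `0.0`, an untested peer and a failed peer would be indistinguishable; separating the two cases in the type is what lets the untested tier sit between the other two.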

@IAvecilla IAvecilla force-pushed the p2p-bandwidth-sort-improvements branch from c20d83b to 6eb805b Compare February 26, 2026 13:23
@IAvecilla IAvecilla self-assigned this Feb 27, 2026
@IAvecilla IAvecilla marked this pull request as ready for review March 5, 2026 21:39
@IAvecilla IAvecilla force-pushed the p2p-bandwidth-sort-improvements branch from 91576e4 to 0ba4031 Compare March 6, 2026 13:47
@pefontana pefontana added this pull request to the merge queue Mar 6, 2026
Merged via the queue into main with commit 7d5fa57 Mar 6, 2026
37 checks passed

2 participants