Description
When a peer has no other peers available to forward a GET request to, the operation fails immediately with "reached max retries" instead of actually retrying. This prevents content retrieval in sparse network conditions.
Confirmed Bug
The GET operation code has a logic flaw: when k_closest_potentially_caching() returns an empty list (no available peers), it immediately returns failure WITHOUT:
- Actually incrementing the retry counter
- Implementing any delay mechanism
- Making the configured number of attempts (MAX_RETRIES = 10)
Evidence from Logs
User's peer logs:
2025-09-26T02:48:53.726028Z INFO freenet::operations::connect: Immediately requesting more peer connections from gateway
[3.5 minutes pass - user attempts to join River room]
2025-09-26T02:52:23.003037Z INFO freenet::operations::get: Seek contract, tx: 01K61YZ7JT1PP4WGM8N4ZRDWG2, key: 9L1N9DyVwofcib7PpQqjsmdkcEWLBf8PUVGMLx9LDW1H, target: v6MWKgqHiBMNcGtG
2025-09-26T02:52:23.195452Z WARN freenet::operations::get: Neither contract or contract value for contract found at peer v6MWKgqHiBMNcGtG, retrying with other peers
2025-09-26T02:52:23.195511Z ERROR freenet::operations::get: Failed getting a value for contract 9L1N9DyVwofcib7PpQqjsmdkcEWLBf8PUVGMLx9LDW1H, reached max retries
Note: The entire operation failed in ~192ms (02:52:23.003 to 02:52:23.195), so it clearly did not attempt 10 retries.
Gateway logs (same transaction):
2025-09-26T02:52:23.092836Z WARN freenet::operations::get: No other peers found while trying to get the contract, tx: 01K61YZ7JT1PP4WGM8N4ZRDWG2, key: 9L1N9DyVwofcib7PpQqjsmdkcEWLBf8PUVGMLx9LDW1H, this_peer: v6MWKgqHiBMNcGtG
at crates/core/src/operations/get.rs:1135
Root Cause
In get.rs lines 695-744, when new_candidates.is_empty():
- If the request came from another peer (there is a requester): returns failure to it immediately
- If this peer is the original requester: logs "reached max retries" and fails
- In neither case is the retry counter incremented or any delay applied
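The control flow described above can be modeled in a few lines. This is a simplified sketch of the reported behavior, not the code in get.rs: the `Outcome` type, the `handle_not_found` function, and its signature are hypothetical names introduced for illustration. Only `MAX_RETRIES = 10`, `new_candidates`, and the "reached max retries" message come from the report.

```rust
/// Retry limit cited in this issue.
const MAX_RETRIES: usize = 10;

/// Hypothetical result type for this illustration.
#[derive(Debug, PartialEq)]
enum Outcome {
    Forwarded { retries: usize },
    Failed { retries: usize, reason: &'static str },
}

/// Simplified model of the reported behavior (hypothetical function,
/// not the actual get.rs implementation).
fn handle_not_found(new_candidates: &[&str], retries: usize) -> Outcome {
    if new_candidates.is_empty() {
        // Flaw: fails with "reached max retries" although `retries`
        // was never incremented and no delay was applied.
        return Outcome::Failed { retries, reason: "reached max retries" };
    }
    if retries >= MAX_RETRIES {
        // The only case in which the message would be accurate.
        return Outcome::Failed { retries, reason: "reached max retries" };
    }
    // Candidates exist: forward to one of them and count the attempt.
    Outcome::Forwarded { retries: retries + 1 }
}
```

In this model, calling `handle_not_found(&[], 0)` fails with a retry count of 0, which matches the ~192ms failure seen in the logs.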
Why the Gateway Had No Peers (Hypothesis)
Several possible explanations:
- Version incompatibility: The v0.1.27 release had just happened, possibly leaving the gateway isolated from peers still running older versions
- Network bootstrap issue: The gateway may have lost its connections and not yet re-established them
- Actual sparse network: Simply very few peers online
- Connection failures: Network issues preventing peer connections
We cannot determine the exact cause from the logs alone.
Impact
This bug affects any scenario where peers temporarily have no connections:
- Network bootstrap/startup
- Version transitions (if versions are incompatible)
- Network partitions
- Connection losses
- Small/test networks
Proposed Solution
The GET operation needs proper retry logic:
- Always increment retry counter, even when no peers available
- Implement exponential backoff delays (100ms, 200ms, 400ms, etc.)
- Actually attempt MAX_RETRIES times before failing
- Consider scheduling retries rather than blocking
This would allow time for:
- Peer connections to establish
- Network conditions to improve
- Bootstrap processes to complete
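A minimal sketch of the retry decision proposed above, assuming it can be expressed as a pure function that the operation calls each time a GET finds nothing at its target. `RetryDecision`, `decide`, `backoff_delay`, `BASE_DELAY`, and the 5-second `MAX_DELAY` cap are hypothetical names and values chosen for illustration. Only `MAX_RETRIES = 10` and the 100ms, 200ms, 400ms progression come from this issue.

```rust
use std::time::Duration;

/// Retry limit cited in this issue.
const MAX_RETRIES: usize = 10;
/// Hypothetical base delay matching the 100ms, 200ms, 400ms progression.
const BASE_DELAY: Duration = Duration::from_millis(100);
/// Hypothetical cap so later attempts do not wait unboundedly long.
const MAX_DELAY: Duration = Duration::from_secs(5);

/// What the GET operation should do next (hypothetical type).
#[derive(Debug, PartialEq)]
enum RetryDecision {
    /// Candidates exist: forward now and count the attempt.
    ForwardNow { next_retries: usize },
    /// No candidates: count the attempt and try again after `delay`.
    RetryAfter { next_retries: usize, delay: Duration },
    /// The retry budget is actually exhausted.
    GiveUp { retries: usize },
}

/// Exponential backoff: BASE_DELAY * 2^retries, capped at MAX_DELAY.
fn backoff_delay(retries: usize) -> Duration {
    // Clamp the exponent so the multiplication cannot overflow.
    let exponent = retries.min(16) as u32;
    (BASE_DELAY * 2u32.pow(exponent)).min(MAX_DELAY)
}

/// Decide the next step given the number of candidate peers and the
/// retries made so far. The counter advances on every path that retries.
fn decide(candidate_count: usize, retries: usize) -> RetryDecision {
    if retries >= MAX_RETRIES {
        return RetryDecision::GiveUp { retries };
    }
    if candidate_count > 0 {
        RetryDecision::ForwardNow { next_retries: retries + 1 }
    } else {
        RetryDecision::RetryAfter {
            next_retries: retries + 1,
            delay: backoff_delay(retries),
        }
    }
}
```

Because `decide` only returns a delay and never sleeps, the caller is free to schedule the retry on a timer instead of blocking, in line with the last bullet of the proposal. With these illustrative constants, ten attempts with no peers available would wait about 26 seconds in total before giving up.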
Additional Issues
- Misleading error message: "reached max retries" when no retries were attempted
- No grace period: The GET fails immediately even though the peer had already requested more connections from the gateway
- Test gap: Tests use pre-populated networks, missing this edge case
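One possible way to address the misleading message is to give the two failure modes distinct error values that carry the number of attempts actually made. The sketch below is an assumption about how that could look, not an existing Freenet type: `GetFailure`, its variants, and the message wording are all hypothetical.

```rust
use std::fmt;

/// Hypothetical error type separating "no peers" from "retries exhausted".
#[derive(Debug, PartialEq)]
enum GetFailure {
    /// No peer was available to forward the request to.
    NoPeersAvailable { attempts: usize },
    /// The configured retry budget was actually used up.
    RetriesExhausted { attempts: usize, max_retries: usize },
}

impl fmt::Display for GetFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GetFailure::NoPeersAvailable { attempts } => write!(
                f,
                "no peers available to forward GET after {} attempt(s)",
                attempts
            ),
            GetFailure::RetriesExhausted { attempts, max_retries } => write!(
                f,
                "reached max retries ({}/{})",
                attempts, max_retries
            ),
        }
    }
}
```

A regression test for the empty-network case could then assert on the variant and the attempt count rather than on a log string.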
Related Code References
- Bug location: get.rs:695-744
- MAX_RETRIES constant: get.rs:30
- Gateway's "no peers" warning: get.rs:1135
[AI-assisted debugging and comment]