-
Notifications
You must be signed in to change notification settings - Fork 135
chore(pegboard-gateway): add new message id format & add deprecated tunnel ack #3492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 11-19-chore_pegboard-gateway_remove_tunnelack_
Are you sure you want to change the base?
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
3 Skipped Deployments
|
c020fe1 to
e47241f
Compare
Code ReviewThis PR introduces a significant architectural improvement to the tunnel messaging system by implementing a more efficient message ID format and removing the need for explicit tunnel acknowledgments. Overall, the changes are well-structured and maintain backward compatibility. Here are my findings: ✅ Strengths
|
Code ReviewI've reviewed this PR which adds a new message ID format and deprecates the old tunnel ack mechanism. Here's my feedback: Positive Aspects✅ Well-structured message ID system: The new ✅ Backwards compatibility: The PR thoughtfully handles protocol versioning with ✅ Clean architecture: The new ✅ Improved state management: Removing the ✅ Request ID collision handling: Good defensive programming in Issues & Concerns🔴 Critical: Request ID Not Released After CompletionLocation: engine/packages/guard-core/src/proxy_service.rs:1889 The // At line 1889, request_id is generated
let request_id = state.generate_unique_request_id().await?;
// ... but I don't see state.release_request_id(request_id) anywhere in the taskRecommendation: Add a 🟡 Medium: Missing Request ID Release in HTTP PathLocation: engine/packages/guard-core/src/proxy_service.rs:1086 Similar to above, the HTTP request path generates a unique request ID but doesn't appear to release it when the request completes. The request ID should be released after the response is sent. Recommendation: Wrap the request handling in a scope guard or defer pattern to ensure 🟡 Medium: Serialization Size Not Verified at Compile TimeLocation: engine/packages/pegboard/src/tunnel/id.rs:34-39 The code uses a runtime assertion to verify the serialized size is exactly 12 bytes: ensure!(
written == 12,
"message id serialization produced wrong size: expected 12 bytes, got {}",
written
);While this is good defensive programming, if the BARE schema changes, this will only be caught at runtime. Recommendation: Consider adding a compile-time test or const assertion to verify the size. At minimum, add a unit test that exercises this code path. 🟡 Medium: Wrapping Arithmetic Without Overflow DocumentationLocation: engine/packages/pegboard-gateway/src/shared_state.rs:136 The message index uses wrapping arithmetic: req.message_index = req.message_index.wrapping_add(1);This is correct for a u16 counter, but there's no documentation about what happens when the index wraps from 65535 to 0, especially for WebSocket message acknowledgments. Recommendation: Add a comment explaining the wrapping behavior and confirming that the 🟢 Minor: Inconsistent Error Message FormattingLocation: engine/packages/pegboard/src/tunnel/id.rs:80-81 vs TypeScript implementation The Rust error messages use lowercase ("invalid request id length") while some TypeScript errors might use different casing. While minor, consistency helps with debugging. Recommendation: Standardize error message formats across Rust and TypeScript implementations. 🟢 Minor: Missing Documentation on MessageId FormatLocation: engine/sdks/schemas/runner-protocol/v3.bare:217 The BARE schema defines Recommendation: Add a comment in the BARE schema: Performance Considerations✅ The new format is actually more efficient - the old approach with UUID-based IDs used 16 bytes per message ID, while the new format uses 12 bytes. Over millions of messages, this saves bandwidth. ✅ Removing the
Security Considerations✅ Using random u32 for both gateway_id and request_id provides sufficient entropy for non-cryptographic uniqueness.
Test Coverage❌ Missing tests: I found no test files for the new tunnel ID functionality. Given this is a protocol-level change, unit tests are strongly recommended. Recommended tests:
SummaryThis is a solid refactoring that simplifies the protocol and improves efficiency. The main concern is the request ID memory leak which must be fixed before merging. The other issues are lower priority but should still be addressed. Required before merge:
Recommended before merge:
Nice to have:
Great work on the backwards compatibility handling! The deprecated tunnel ack approach is clean and will make rolling this out much safer. Review generated with Claude Code |
| route_cache: RouteCache, | ||
| rate_limiters: Cache<(Id, std::net::IpAddr), Arc<Mutex<RateLimiter>>>, | ||
| in_flight_counters: Cache<(Id, std::net::IpAddr), Arc<Mutex<InFlightCounter>>>, | ||
| inflight_requests: Arc<Mutex<HashSet<RequestId>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use scc HashSet
| @@ -0,0 +1,86 @@ | |||
| use anyhow::{Context, Result, ensure}; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be in runner protocol crate
| @@ -0,0 +1 @@ | |||
| pub mod id; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be in runner protocol crate
4adef6f to
759df3e
Compare
e47241f to
6521b98
Compare
PR Review: New Message ID Format & Deprecated Tunnel AckI have completed a thorough review of this PR. Overall, this is a solid refactoring that improves the message ID system and maintains backwards compatibility. Here are my findings: Strengths
Critical Issues1. Memory Leak: Request IDs Never Released The release_request_id() method is defined in proxy_service.rs:687-689 but never called. This means the inflight_requests HashSet will grow indefinitely, memory usage will increase over time, and eventually could lead to performance degradation. Fix needed: Add cleanup calls after custom serve request completes (both success and error paths) and after WebSocket task completion. 2. Unsafe .expect() Calls in Database Code In engine/packages/pegboard/src/keys/actor.rs there are .expect() calls that will panic if the database contains invalid data. Per CLAUDE.md error handling guidelines, these should return proper Result<> with context instead of panicking in production code paths. Moderate Issues3. Missing Test Coverage - No tests added for the new message ID format, collision detection logic, or backwards compatibility handling. 4. Message Index Overflow - proxy_service.rs uses wrapping_add(1) for message_index (u16). After 65,535 messages it wraps to 0. Consider adding a comment or warning log. 5. Import Organization - Some files have imports that should be moved to the top of the file per CLAUDE.md style. Performance & Security
SummaryThis is a strong refactoring with good architectural decisions, but has two critical issues that must be fixed before merge:
Once these are addressed, this PR will be ready to merge. The new message ID format is well-designed and the backwards compatibility approach is solid. |
PR Review: New Message ID Format & Deprecated Tunnel AckThis is a well-structured refactoring that modernizes the tunnel message ID system. The changes introduce a more efficient message identification scheme while maintaining backward compatibility. Here's my detailed review: ✅ Strengths1. Improved Message ID Design
2. Strong Backward Compatibility
3. Good Type Safety
4. Hibernation Request Improvements
|
6521b98 to
d6a2e79
Compare
759df3e to
4adef6f
Compare
PR Review: Add New Message ID Format & Deprecated Tunnel AckOverviewThis PR introduces a significant refactor to the message ID system in the Pegboard gateway, moving from UUID-based IDs to a more compact and structured format. The changes touch critical paths in the proxy service and gateway communication. Code Quality & Best Practices ✅Strengths:
Areas for Improvement: 1. Inconsistent Whitespace (engine/packages/guard-core/src/proxy_service.rs:641)// Generate unique request ID
let request_id = Some(self.generate_unique_request_id().await?);Mixed tabs and spaces. The leading whitespace appears to be spaces instead of tabs. Per CLAUDE.md, this project uses hard tabs for Rust formatting. 2. Missing Import Consolidation (engine/packages/guard-core/src/proxy_service.rs:16)The import for use pegboard::tunnel::id::{RequestId, generate_request_id};Consider grouping this with other pegboard imports if any exist nearby. Potential Bugs & Issues
|
Pull Request ReviewSummaryThis PR introduces a new message ID format for the tunnel protocol and adds backward compatibility for deprecated tunnel acknowledgments. The changes migrate from a 16-byte UUID-based message ID to a more efficient 12-byte structured format. Code Quality - Strengths
CRITICAL BUG - Double In-Flight AcquisitionLocation: engine/packages/guard-core/src/proxy_service.rs:1097-1114 The CustomServe path acquires in-flight limits TWICE:
Impact: CustomServe HTTP requests count twice against the in-flight limit, reducing effective capacity by 50%. Fix: Remove the second acquire_in_flight call in the CustomServe branch. Other Issues1. Request ID Collision Detection (proxy_service.rs:465-495)
2. Message Index Wrap-Around (pegboard-gateway/src/shared_state.rs:136-151)
3. Lock Contention (proxy_service.rs:381, 465)
4. Test Coverage
RecommendationsHigh Priority:
Medium Priority: Low Priority: ConclusionSolid refactoring with good protocol versioning. However, the double in-flight acquisition bug should be fixed before merging. The code follows Rivet conventions well. Great work on backward compatibility! |
d6a2e79 to
adeb05d
Compare
4adef6f to
759df3e
Compare
PR Review: Tunnel Message ID Format & Deprecated Tunnel AckI've reviewed this PR which introduces a new message ID format and adds deprecated tunnel acknowledgments. Overall, the implementation is solid with good architectural decisions. Here are my findings: ✅ Strengths
|
PR Review: Message ID Format & Deprecated Tunnel AckSummaryThis PR introduces a new compact message ID format and deprecates the TunnelAck mechanism. The changes are well-structured with good backward compatibility support through protocol versioning. ✅ Strengths
🔍 Issues Found1. Whitespace Formatting Issue (guard-core/src/proxy_service.rs:641)// Generate unique request ID
let request_id = Some(self.generate_unique_request_id().await?);Issue: Mixed tabs and spaces - line uses spaces instead of hard tabs 2. Request ID Lifecycle Bug (guard-core/src/proxy_service.rs:1094-1173)let request_id = match self.state.acquire_in_flight(client_ip, &actor_id, &req_headers).await? {
Some(id) => id,
None => { return Err(errors::RateLimit { /* ... */ }.build()); }
};
// ... later in the loop
self.state.release_in_flight(client_ip, &actor_id, request_id).await;Issue:
This causes:
Fix: Remove the duplicate 3. Collision Handling Concerns (guard-core/src/proxy_service.rs:667-694)const MAX_TRIES: u32 = 100;
for attempt in 0..MAX_TRIES {
let request_id = generate_request_id();
if !requests.contains(&request_id) {
requests.insert(request_id);
return Ok(request_id);
}
tracing::warn!(?request_id, attempt, "request id collision");
}Issue: With 4 bytes (4 billion possibilities), collisions should be extremely rare. If hitting 100 attempts, there is likely a deeper issue (not cleaning up IDs, memory leak, etc.) Suggestion:
4. Message Index Type Mismatch (pegboard-gateway/src/shared_state.rs:39)message_index: tunnel_id::MessageIndex, // u16But in v3.bare: And in shared_state.rs:316: fn wrapping_gt(msg.message_index, ack_index)Issue: The code uses wrapping arithmetic for u16 indices, which is correct, but there is inconsistency in how indices are used vs. how they are documented. Suggestion: Add a comment explaining the wrapping behavior for message indices (e.g., "Indices wrap at u16::MAX") 5. Missing Error Context (pegboard/src/tunnel/id.rs:31)serde_bare::to_writer(&mut cursor, &parts).context("failed to serialize message id parts")?;Suggestion: Add the actual values to the error context for debugging: .with_context(|| format!("failed to serialize message id parts: gateway_id={:?}, request_id={:?}, message_index={}",
gateway_id, request_id, message_index))?;
|
adeb05d to
d6a2e79
Compare
759df3e to
4adef6f
Compare
PR Review: Add new message ID format & deprecated tunnel ackI've reviewed this PR and have the following feedback: Code Quality & Architecture ✅The refactoring to use a structured message ID format is well-designed: Strengths:
Potential Issues
|
Code Review - PR 3492SummaryThis PR introduces a new message ID format for the pegboard-gateway tunnel protocol and adds support for deprecated tunnel acknowledgments. The changes involve significant refactoring of the message ID system from UUID-based identifiers to a structured format with gateway ID, request ID, and message index components. Strengths
Critical Issues1. Potential Race Condition in Request ID Management (engine/packages/guard-core/src/proxy_service.rs:656-692) The code acquires an in-flight slot and generates a request ID, but releases it in a spawned task. If the request fails before the spawned task runs, the request ID leaks from the inflight_requests HashSet. Recommendation: Use RAII pattern or ensure release happens in all code paths. Consider using a guard struct that releases on drop. 2. Missing Tests No tests found for the new tunnel::id module. Required coverage:
3. Commented-out Code (shared_state.rs:472-474) Malformed commented code should be removed or properly formatted. 4. Duplicate In-Flight Acquisition (proxy_service.rs:809, 1097-1113) In handle_http_request, acquire_in_flight is called twice - at top level and within CustomServe handler. This could cause double-counting or double-release. Needs verification. Other Concerns5. Memory Leak Potential If the release task panics, request ID remains in HashSet forever, leading to memory exhaustion. 6. Message Index Overflow (shared_state.rs:150) No documentation about wraparound behavior when message index wraps from u16::MAX to 0. 7. Hard-coded Magic Numbers Consider documenting or making configurable: HWS_MESSAGE_ACK_TIMEOUT (30s), HWS_MAX_PENDING_MSGS_SIZE_PER_REQ (1MB), MAX_TRIES (100). Security Considerations
Performance Considerations
Style & Best Practices
RecommendationsMust Fix:
Should Fix:
Overall: Solid refactoring improving message ID structure, but resource cleanup and test coverage issues need addressing before merge. Request changes. |

No description provided.