Skip to content

Conversation

@NathanFlurry
Copy link
Member

No description provided.

@vercel
Copy link

vercel bot commented Nov 5, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivetkit-serverless Ready Ready Preview Comment Nov 5, 2025 5:23am
3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Nov 5, 2025 5:23am
rivet-inspector Ignored Ignored Preview Nov 5, 2025 5:23am
rivet-site Ignored Ignored Preview Nov 5, 2025 5:23am

@NathanFlurry NathanFlurry merged commit d5ee786 into main Nov 5, 2025
8 of 11 checks passed
Copy link
Member Author


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@NathanFlurry NathanFlurry deleted the 11-05-chore_pegboard_allow_configuring_runner_eligible_threshold_and_runner_lost_threshold branch November 5, 2025 05:22
@claude
Copy link

claude bot commented Nov 5, 2025

PR Review: Allow Configuring Runner Thresholds

Summary

This PR makes the runner eligibility and lost thresholds configurable via the Pegboard config, converting them from hardcoded constants to configurable parameters. This improves flexibility for different deployment environments, particularly for development where faster iteration is beneficial.


Code Quality ✅

Strengths:

  • Clear separation of concerns with well-documented config getters
  • Consistent pattern follows existing config methods (e.g., actor_start_threshold, base_retry_timeout)
  • Good documentation with units clearly specified (milliseconds)
  • Properly marked as Experimental which is appropriate for infrastructure tuning parameters

Adherence to CLAUDE.md conventions:

  • ✅ Uses hard tabs for formatting (as per rustfmt.toml)
  • ✅ Follows existing patterns in neighboring code
  • ✅ Config accessors properly use unwrap_or() with sensible defaults
  • ✅ Uses util::duration::seconds() helper for readability

Implementation Review 📝

Config Structure (engine/packages/config/src/config/pegboard.rs:38-84)

  • ✅ Properly adds Option<i64> fields with clear documentation
  • ✅ Default values maintained (10s and 15s) preserve existing behavior
  • ✅ Accessor methods follow established patterns

Usage Updates:
All 5 usages of the constants have been correctly migrated:

  1. runner.rs:84 - Timeout for listen operation
  2. runner.rs:115 - Protocol metadata for client
  3. runner.rs:940 - Check expired activity
  4. runtime.rs:151 - Actor allocation ping threshold
  5. update_alloc_idx.rs:48 - Runner eligibility check

Development Configuration (rivetkit-typescript):

  • ✅ Sets 5s/7s thresholds for faster development iteration
  • ✅ Good comment explaining the rationale (default ping interval is 3s)
  • ⚠️ Minor concern: The 2s grace period (5s threshold - 3s ping) is quite tight. If a ping is delayed by network issues, runners might be marked ineligible prematurely.

Potential Issues & Concerns

1. ⚠️ Tight Development Thresholds

RIVET__PEGBOARD__RUNNER_ELIGIBLE_THRESHOLD: "5000",  // 5s
RIVET__PEGBOARD__RUNNER_LOST_THRESHOLD: "7000",      // 7s

With a 3s ping interval:

  • Eligible threshold allows only ~1.67 missed pings (5s / 3s)
  • Lost threshold allows only ~2.33 missed pings (7s / 3s)

Recommendation: Consider using 6s/9s (2x and 3x ping interval) for more robustness in development environments with variable network conditions.

2. ✅ No Configuration Validation

While the code doesn't validate that runner_lost_threshold > runner_eligible_threshold, this is acceptable because:

  • The system gracefully handles any ordering
  • These are experimental tuning parameters
  • Over-validation can be restrictive

3. ⚠️ Missing Test Coverage

  • No unit tests verify the new config methods
  • No integration tests validate behavior with different threshold values
  • Recommendation: Consider adding tests that verify:
    • Config getters return correct defaults
    • Runner expiration logic works with custom thresholds
    • Runner eligibility checks respect the configured threshold

Performance Considerations ✅

Positive impacts:

  • No performance regression - same computation paths, just parameterized
  • Config values are read from ctx.config() which is typically cached
  • Shorter development thresholds reduce wait times for local testing

No concerns about:

  • Additional memory overhead (negligible - two i64 values)
  • Runtime cost (config access is lightweight)

Security Concerns ✅

No security issues identified:

  • Threshold values don't expose sensitive information
  • No risk of injection attacks (values are integers)
  • Configuration is environment-based, not user-controlled
  • Worst case: misconfiguration leads to aggressive runner eviction (availability issue, not security)

Architecture & Design ✅

Well-aligned with system design:

  • Follows the established pattern for Pegboard configuration
  • Maintains backward compatibility via default values
  • Separates configuration from implementation logic
  • Properly threads config through all necessary callsites

Recommendations

High Priority

None - the PR is fundamentally sound.

Medium Priority

  1. Consider adjusting development thresholds to 6000ms/9000ms for better resilience
  2. Add basic unit tests for the config getters

Low Priority

  1. Add a comment in the config explaining the relationship between these thresholds and the default ping interval
  2. Consider adding a validation warning if runner_lost_threshold <= runner_eligible_threshold

Final Verdict ✅ APPROVE WITH MINOR SUGGESTIONS

This is a clean, well-implemented change that follows project conventions. The core functionality is solid, and the only concerns are around the aggressiveness of the development environment thresholds.

Strengths:

  • Clean refactoring from constants to config
  • Maintains backward compatibility
  • Good documentation
  • Enables faster development iteration

Minor improvements suggested:

  • Adjust development thresholds for more robustness
  • Add test coverage

Great work! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants