@MasterPtato
Contributor

No description provided.

@vercel

vercel bot commented Nov 18, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

4 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Nov 18, 2025 11:50pm
rivet-inspector Ignored Ignored Nov 18, 2025 11:50pm
rivet-site Ignored Ignored Nov 18, 2025 11:50pm
rivetkit-serverless Skipped Skipped Nov 18, 2025 11:50pm

Contributor Author

MasterPtato commented Nov 18, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@pkg-pr-new

pkg-pr-new bot commented Nov 18, 2025

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3483

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3483

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3483

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3483

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3483

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3483

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3483

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3483

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3483

commit: fa56065

@claude

claude bot commented Nov 18, 2025

Pull Request Review

Overview

This PR simplifies the runner workflow by removing its command/event handling logic and forwarding events directly to actor workflows. It is a significant refactor that removes roughly 500 lines of code by eliminating intermediate state management.

Key Changes

  1. Event Forwarding Architecture Change

    • Before: Events were sent to the runner workflow, stored in state, then forwarded to actor workflows
    • After: Events are forwarded directly from ws_to_tunnel_task.rs to actor workflows using the actor_id tag
    • This eliminates the need for the runner workflow to track last_event_idx, last_event_ack_idx, and event storage
  2. Removed State Management (runner2.rs:29-40)

    • Removed last_event_idx, last_command_idx, and commands vector from State
    • Simplified LifecycleState by removing event tracking fields
    • This reduces memory overhead and workflow complexity
  3. Removed Activities

    • InsertDb: Database initialization logic removed (~160 lines)
    • ProcessInit: Init packet processing removed (~80 lines)
    • InsertEvents: Event storage removed (~25 lines)
    • InsertCommands: Command storage and batching removed (~35 lines)
  4. Signal Changes

    • Removed Command signal entirely
    • Kept Forward, CheckQueue, and Stop signals
    • Forward now only handles ToServerStopping and rejects other message types
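The routing described in these key changes can be modeled in a few lines. This is a minimal sketch, not the PR's code: `Event`, `ToServer`, `Route`, and `route` are hypothetical stand-ins for the real protocol and signal types, and the model assumes that anything other than events and the stopping message is simply rejected at this point.

```rust
// Hypothetical, simplified stand-ins for the protocol types; not the real definitions.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    actor_id: String,
    payload: String,
}

#[derive(Debug, Clone, PartialEq)]
enum ToServer {
    Init,
    Events(Vec<Event>),
    Stopping,
}

#[derive(Debug, PartialEq)]
enum Route {
    /// One signal per event, tagged with the target actor's id.
    ActorWorkflows(Vec<(String, Event)>),
    /// Forwarded to the runner workflow.
    RunnerWorkflow,
}

/// Decides where a message goes under the new architecture (assumed behavior).
fn route(msg: ToServer) -> Result<Route, String> {
    match msg {
        // Events bypass the runner workflow and are tagged by actor id.
        ToServer::Events(events) => Ok(Route::ActorWorkflows(
            events.into_iter().map(|e| (e.actor_id.clone(), e)).collect(),
        )),
        // Only the stopping message still reaches the runner workflow.
        ToServer::Stopping => Ok(Route::RunnerWorkflow),
        // Everything else is rejected here.
        other => Err(format!("message {:?} is not forwarded", other)),
    }
}

fn main() {
    let msg = ToServer::Events(vec![Event {
        actor_id: "actor-a".to_string(),
        payload: "started".to_string(),
    }]);
    println!("{:?}", route(msg));
    println!("{:?}", route(ToServer::Stopping));
    println!("{:?}", route(ToServer::Init));
}
```

Keeping the decision in one pure function like this makes the accepted and rejected message types easy to unit test without a running workflow engine.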

Code Quality & Best Practices

Good:

  • Follows error handling conventions with anyhow and custom RivetError types
  • Proper use of structured logging (tracing::warn!(?actor_id, ...))
  • Consistent naming conventions (snake_case, past tense for timestamps)
  • Good use of workspace dependencies
  • Clear separation of concerns by moving event handling closer to where it's needed

⚠️ Concerns:

  1. Missing Actor ID in Event Forwarding (ws_to_tunnel_task.rs:341-356)

    protocol::ToServer::ToServerEvents(events) => {
        let res = ctx.signal(pegboard::workflows::runner2::Forward {
            inner: protocol::ToServer::try_from(msg)
                .context("failed to convert message for workflow forwarding")?,
        })
        .tag("actor_id", actor_id)  // ❌ ERROR: 'actor_id' not defined

    Issue: The variable actor_id is not defined in this scope. You need to extract the actor_id from the events first.

    Fix: Extract actor_id from the events before using it:

    protocol::ToServer::ToServerEvents(events) => {
        for event in &events.events {
            let actor_id = crate::utils::event_actor_id(&event.inner).to_string();
            let res = ctx.signal(pegboard::workflows::actor::Event {
                inner: event.inner.clone(),
            })
            .tag("actor_id", &actor_id)
            .graceful_not_found()
            .send()
            .await?;
            if res.is_none() {
                tracing::warn!(?actor_id, "failed to send signal to actor workflow, likely already stopped");
            }
        }
    }
  2. TODO Comment Without Implementation Plan (runner2.rs:79)

    // TODO: Ack events

    Question: How will events be acknowledged now that the runner workflow no longer tracks them? This could lead to unbounded event queues or lost events if not handled properly.

  3. Potential Lost Init Handling (runner2.rs:98-108)
    The workflow now rejects ToServerInit messages, but there's no clear indication of where init logic has moved to. The init activity only writes to RunnerByKeyKey, which is much simpler than before.

    Question: Was the metadata and prepopulated actor names handling removed intentionally, or does it need to be reimplemented elsewhere?

  4. Variable Naming (conn.rs:79,89)
    ✅ Not a concern: renaming packet to init_packet is a good change that improves clarity.
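On the acknowledgment question in concern 2, one possible direction is to keep the two counters per actor instead of per runner. The sketch below is an assumption, not the PR's design: `EventAckState`, `record`, and `next_ack` are hypothetical names, event indices are assumed to start at 1 and increase monotonically, and acks are assumed to be sent once a fixed number of events is outstanding.

```rust
/// Hypothetical per-actor acknowledgment state. The field names mirror the
/// ones removed from the runner workflow, but this type is illustrative only.
#[derive(Debug, Default)]
struct EventAckState {
    /// Highest event index received so far (0 means none yet).
    last_event_idx: i64,
    /// Highest event index already acknowledged to the runner.
    last_event_ack_idx: i64,
}

impl EventAckState {
    /// Records a received event index. Returns false for duplicates or
    /// stale resends so the caller can skip them.
    fn record(&mut self, idx: i64) -> bool {
        if idx <= self.last_event_idx {
            return false;
        }
        self.last_event_idx = idx;
        true
    }

    /// Returns the index to acknowledge once at least `batch` events are
    /// unacknowledged, and advances the ack cursor.
    fn next_ack(&mut self, batch: i64) -> Option<i64> {
        if self.last_event_idx - self.last_event_ack_idx >= batch {
            self.last_event_ack_idx = self.last_event_idx;
            Some(self.last_event_ack_idx)
        } else {
            None
        }
    }
}

fn main() {
    let mut state = EventAckState::default();
    for idx in 1..=5 {
        if state.record(idx) {
            if let Some(ack) = state.next_ack(2) {
                println!("ack up to event {}", ack);
            }
        }
    }
}
```

Acknowledging in batches bounds how many events the client has to keep for resending, which addresses the unbounded-queue worry without reintroducing event storage in the runner workflow.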

Potential Bugs

CRITICAL: Compilation Error

  • Line 346 in ws_to_tunnel_task.rs: actor_id is undefined

⚠️ Possible Logic Issues:

  1. Event Acknowledgment: Without event tracking in the runner workflow, how are events acknowledged back to the client? This could cause the client to resend events unnecessarily.

  2. Command Flow: The Command signal was removed entirely. Need to verify that actor start commands are still being sent through an alternative path.

  3. Race Condition Handling: The old code had logic to prevent scheduling actors to draining runners (removed around lines 110-140 of the diff). Verify that this is now handled elsewhere.
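The draining check in item 3 amounts to re-validating the runner's status at allocation time rather than only at selection time. The following is a minimal sketch of such a guard under assumed types: `Runner`, `RunnerStatus`, `AllocError`, and `try_allocate` are hypothetical and do not correspond to the removed code.

```rust
// Hypothetical, simplified model of a runner; not the real pegboard types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RunnerStatus {
    Active,
    Draining,
}

#[derive(Debug)]
struct Runner {
    status: RunnerStatus,
    remaining_slots: u32,
}

#[derive(Debug, PartialEq)]
enum AllocError {
    Draining,
    NoCapacity,
}

/// Reserves one slot on the runner and returns the slots left.
/// The drain flag is checked here, at allocation time, so a runner that
/// started draining after it was selected is still rejected.
fn try_allocate(runner: &mut Runner) -> Result<u32, AllocError> {
    if runner.status == RunnerStatus::Draining {
        return Err(AllocError::Draining);
    }
    if runner.remaining_slots == 0 {
        return Err(AllocError::NoCapacity);
    }
    runner.remaining_slots -= 1;
    Ok(runner.remaining_slots)
}

fn main() {
    let mut runner = Runner {
        status: RunnerStatus::Active,
        remaining_slots: 2,
    };
    println!("{:?}", try_allocate(&mut runner));

    // Once the runner starts draining, new allocations are refused.
    runner.status = RunnerStatus::Draining;
    println!("{:?}", try_allocate(&mut runner));
}
```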

Performance Considerations

Improvements:

  • Reduced workflow state size (no more storing commands/events in memory)
  • Eliminated intermediate event storage and forwarding overhead
  • Simpler signal processing loop should have better throughput

⚠️ Potential Concerns:

  • Events are now forwarded individually rather than batched, which could increase signal overhead if many events arrive simultaneously
  • Consider implementing batching in the new architecture if event volume is high
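If batching turns out to be needed, one option is to group incoming events by actor before signaling, so each actor workflow receives one signal per message instead of one per event. The sketch below only shows the grouping step. `Event` and `group_by_actor` are hypothetical, and whether the actor signal can carry a list of events is an assumption about the signal type, not something the PR confirms.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified event; the real type carries more fields.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    actor_id: String,
    index: i64,
}

/// Groups events by actor id, preserving arrival order within each group.
/// A BTreeMap keeps the iteration order over actors deterministic.
fn group_by_actor(events: Vec<Event>) -> BTreeMap<String, Vec<Event>> {
    let mut groups: BTreeMap<String, Vec<Event>> = BTreeMap::new();
    for event in events {
        groups
            .entry(event.actor_id.clone())
            .or_insert_with(Vec::new)
            .push(event);
    }
    groups
}

fn main() {
    let events = vec![
        Event { actor_id: "actor-a".to_string(), index: 1 },
        Event { actor_id: "actor-b".to_string(), index: 1 },
        Event { actor_id: "actor-a".to_string(), index: 2 },
    ];
    // One signal per actor would be sent here instead of one per event.
    for (actor_id, batch) in group_by_actor(events) {
        println!("{} receives {} event(s)", actor_id, batch.len());
    }
}
```

With this shape, the number of signals scales with the number of distinct actors in a message rather than with the number of events.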

Security Concerns

✅ No new security issues introduced

  • Existing validation logic for actor ownership is preserved in KV request handling
  • Error messages don't leak sensitive information

Test Coverage

⚠️ Recommendations:

  1. Add integration tests for the new event forwarding path to ensure events reach actor workflows correctly
  2. Test the timeout/expiry behavior to ensure actors are properly marked as lost when runners disconnect
  3. Verify the draining behavior works correctly without the intermediate command handling
  4. Test the init flow to ensure runners can still connect and initialize properly

Additional Questions

  1. Was the InsertDb activity's functionality moved elsewhere, or is runner initialization now handled differently?
  2. How are commands being sent to runners now that the Command signal and command batching logic are removed?
  3. What's the plan for event acknowledgment mentioned in the TODO comment?

Summary

This is a valuable simplification that removes significant complexity from the runner workflow. However, there's a critical compilation error that needs to be fixed before this can be merged. Additionally, the missing event acknowledgment logic needs to be clarified or implemented.

Recommendation: Request changes to fix the compilation error and clarify the event acknowledgment and command flow architecture.


Review generated following CLAUDE.md conventions for error handling, logging patterns, and code style.
