
Conversation

@MasterPtato
Contributor

No description provided.

@vercel

vercel bot commented Oct 30, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivetkit-serverless Error Error Oct 30, 2025 6:11pm
3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Oct 30, 2025 6:11pm
rivet-inspector Ignored Ignored Preview Oct 30, 2025 6:11pm
rivet-site Ignored Ignored Preview Oct 30, 2025 6:11pm

Contributor Author

MasterPtato commented Oct 30, 2025


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude

claude bot commented Oct 30, 2025

PR Review: Integrate Sentry with Tracing and Axum

I've reviewed this PR that upgrades Sentry integration and consolidates error reporting through the tracing layer. Overall, this is a solid improvement that modernizes the Sentry integration and reduces code duplication. Here are my findings:


Strengths

  1. Unified Error Reporting: Excellent consolidation - removing manual sentry::capture_message calls in favor of automatic capture via the tracing layer is the right approach. This reduces boilerplate and ensures consistent error reporting (a minimal setup sketch follows this list).

  2. OpenTelemetry Integration: The SentryOtelLayer (traces.rs:79-115) is well-implemented and solves the important problem of correlating Sentry errors with OpenTelemetry traces. This will significantly improve debugging.

  3. Dependency Cleanup: The Cargo.lock changes show good dependency tree consolidation (removal of duplicate nu-ansi-term, regex-automata, etc.).

  4. Axum Middleware: Adding NewSentryLayer to the Axum router (router.rs:72) properly integrates request context and will automatically capture HTTP errors.

  5. Version Upgrade: Sentry 0.37.0 → 0.45.0 brings important improvements including better tower/axum integration.
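
For reference, a minimal sketch of what the consolidated setup looks like (not taken from the PR; the DSN and layer composition here are placeholders). sentry::init returns a guard that must be kept alive, and the Sentry tracing layer turns ERROR-level tracing events into Sentry events automatically:

use tracing_subscriber::prelude::*;

fn init_telemetry() -> sentry::ClientInitGuard {
    // placeholder DSN; the real service reads this from configuration
    let guard = sentry::init((
        "https://public@sentry.example.com/1",
        sentry::ClientOptions::default(),
    ));

    tracing_subscriber::registry()
        .with(tracing_subscriber::fmt::layer())
        // requires the `tracing` feature on the sentry crate
        .with(sentry::integrations::tracing::layer())
        .init();

    guard
}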


🔍 Issues & Concerns

Critical: Verify the Sentry Dependency in the runtime package

Location: engine/packages/runtime/Cargo.toml

The diff adds sentry.workspace = true to runtime/Cargo.toml. Verify that the workspace-level Cargo.toml exposes the sentry dependency with the features this package needs and that the crate builds with the new dependency.

Action: Ensure cargo build -p rivet-runtime succeeds.


Medium: Layer Ordering in Tracing Subscriber

Location: engine/packages/runtime/src/traces.rs:40-45

let registry = registry
    .with(reload_layer)
    .with(otel_metric_layer)
    .with(otel_trace_layer)
    .with(sentry::integrations::tracing::layer())  // ← Added here
    .with(SentryOtelLayer);

Concern: The SentryOtelLayer is placed after the Sentry tracing layer. This means the OTel trace ID replacement happens after Sentry's layer processes events.

Question: Does the Sentry layer capture events in on_event, or does it just collect them? If it captures immediately, the trace context updates in SentryOtelLayer might arrive too late.

Recommendation: Consider reordering to:

.with(SentryOtelLayer)  // Set trace context first
.with(sentry::integrations::tracing::layer())  // Then let Sentry layer capture

Or verify through testing that the current order correctly propagates OTel trace IDs to Sentry.


Medium: Scope Mutation Thread Safety

Location: engine/packages/runtime/src/traces.rs:101-110

sentry::configure_scope(|scope| {
    scope.set_context(
        "trace",
        sentry::protocol::TraceContext { /* ... */ },
    );
});

Concern: configure_scope modifies the current thread's scope. In async Rust with tokio, tasks can migrate between threads. This could lead to:

  1. Trace context being set on the wrong scope
  2. Race conditions if multiple events fire concurrently

Recommendation: Use Hub::current() and verify the scope isolation, or consider using span-local context instead:

sentry::with_scope(|scope| {
    scope.set_context("trace", /* ... */);
    // Capture within this scope if needed
}, || {});

Actually, looking more closely, since this is in on_event, it should be fine as long as Sentry's Hub handles thread-local storage correctly. But worth testing under concurrent load.
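
If testing does show clobbering, one option (a sketch, not the PR's code) is to bind the scope mutation and the capture into a single sentry::with_scope call, so the temporary scope is popped as soon as the closure returns; note this only helps where the capture happens explicitly rather than through the tracing layer:

sentry::with_scope(
    |scope| {
        // a real implementation would fill trace_id/span_id from the OTel span here
        scope.set_context("trace", sentry::protocol::TraceContext::default());
    },
    || {
        sentry::capture_message("example error", sentry::Level::Error);
    },
);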


Low: Error Removal Without Replacement

Locations:

  • engine/packages/api-builder/src/middleware.rs:147-161 (removed)
  • engine/packages/gasoline/src/ctx/workflow.rs:228-236 (removed)
  • engine/packages/gasoline/src/worker.rs:266-277 (removed)

The manual sentry::capture_message calls were removed, which is correct. However:

Verification Needed: Ensure that the tracing layer automatically captures these errors. Based on the code:

  1. HTTP errors (middleware.rs:142-148): Logged with tracing::error! → ✅ Should be auto-captured
  2. Workflow errors (ctx/workflow.rs:220): Logged with tracing::error! → ✅ Should be auto-captured
  3. Unhandled workflow errors (worker.rs:266): Logged with tracing::error! → ✅ Should be auto-captured

But: The old code set explicit tags like error_code, workflow_id, etc. With structured logging, these should be available as fields, but verify they appear in Sentry.

Action: Test that error metadata (error codes, workflow IDs, HTTP status codes) appears in Sentry events after this change.
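
As a concrete check (hypothetical values standing in for what the real call sites log), the structured fields on the tracing call are what the Sentry layer has available to attach to the event:

let code = "internal_error"; // hypothetical stand-ins for the real fields
let workflow_id = "wf-1234";
tracing::error!(error_code = %code, workflow_id = %workflow_id, "unhandled workflow error");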


🎯 Recommendations

1. Add Integration Tests

Since there are no existing tests in runtime or api-builder packages, consider adding at least a smoke test:

#[test]
fn test_sentry_otel_layer_trace_context() {
    // Verify OTel trace IDs propagate to Sentry
}
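
A slightly fuller smoke test is possible with Sentry's built-in test transport (a sketch; it assumes the sentry crate's test feature is enabled and only exercises capture, not the OTel correlation):

#[test]
fn captures_error_event_through_test_transport() {
    // with_captured_events swaps in a test transport, runs the closure, and returns the captured events
    let events = sentry::test::with_captured_events(|| {
        sentry::capture_message("sentry smoke test", sentry::Level::Error);
    });
    assert_eq!(events.len(), 1);
}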

2. Document the Integration

Add a comment in traces.rs explaining the purpose of SentryOtelLayer:

/// Custom tracing layer that synchronizes OpenTelemetry trace IDs with Sentry.
/// This ensures that errors reported to Sentry are correlated with distributed
/// traces, enabling cross-system debugging.
struct SentryOtelLayer;

3. Verify Sentry Feature Flags

The new features added:

  • tracing: ✅ Correct
  • tower-axum-matched-path: ✅ Correct for Axum integration

Good choices here.

4. Consider Error Level Filtering

The Sentry tracing layer might capture all error-level logs. Consider adding a filter if you only want specific errors:

.with(
    sentry::integrations::tracing::layer()
        .event_filter(|metadata| match *metadata.level() {
            tracing::Level::ERROR => sentry::integrations::tracing::EventFilter::Event,
            _ => sentry::integrations::tracing::EventFilter::Ignore,
        }),
)

🔐 Security Considerations

No issues found. The Sentry DSN is already in the codebase (telemetry/src/lib.rs:3), so no new secrets exposed.


Performance Considerations

Minor: The SentryOtelLayer::on_event implementation (traces.rs:86-114) does several allocations:

  • Gets span extensions
  • Clones trace context
  • Calls configure_scope (acquires locks)

This happens on every trace event. Consider:

  1. Early return if Sentry is disabled
  2. Cache the hub lookup
  3. Only process ERROR level events

Suggestion:

fn on_event(&self, event: &tracing::Event<'_>, ctx: Context<'_, S>) {
    // Early return for non-error events
    if !matches!(event.metadata().level(), &tracing::Level::ERROR | &tracing::Level::WARN) {
        return;
    }

    // Rest of implementation...
}

📋 Test Coverage

Missing: No tests were added for the new SentryOtelLayer functionality.

Recommendation: Add tests or manually verify:

  1. OTel trace IDs appear in Sentry events
  2. Span IDs are correctly propagated
  3. Error metadata (structured fields) appears in Sentry
  4. HTTP errors include matched route paths (from tower-axum-matched-path)

Summary

This PR is a net positive with good architectural decisions. The main concerns are:

  1. ⚠️ Verify layer ordering doesn't cause trace ID issues
  2. ⚠️ Test that error metadata still appears in Sentry
  3. ⚠️ Consider performance optimization in SentryOtelLayer
  4. ✅ Code quality is good
  5. ✅ Follows repository conventions

Recommendation: Approve with minor changes - address the layer ordering question and add basic testing/verification.


Review performed following CLAUDE.md conventions. No formatting changes made per project guidelines.

@claude

claude bot commented Oct 30, 2025

PR Review: Sentry Integration with Tracing and Axum

Summary

This PR modernizes the Sentry integration by leveraging Sentry's built-in tracing support instead of manual sentry::capture_message calls. The changes remove redundant error reporting code and integrate Sentry with OpenTelemetry traces.


Positive Changes

1. Cleaner Architecture

  • ✅ Removed manual sentry::capture_message calls in favor of automatic capture via sentry::integrations::tracing::layer()
  • ✅ Reduced code duplication - removed ~35 lines of manual error reporting across multiple files
  • ✅ Better separation of concerns - tracing handles logging, Sentry layer automatically captures errors

2. Dependency Updates

  • ✅ Upgraded Sentry from 0.37.0 to 0.45.0 (8 minor versions)
  • ✅ Added tracing feature to Sentry workspace dependencies
  • ✅ Consolidated dependency versions (e.g., removed duplicate webpki-roots, regex-automata, nu-ansi-term)

3. OpenTelemetry Integration

  • ✅ Custom SentryOtelLayer bridges OTel trace IDs with Sentry trace context
  • ✅ Ensures Sentry errors are correlated with distributed traces
  • ✅ Correctly extracts span context from tracing_opentelemetry::OtelData

Issues & Concerns

1. Critical: Missing Error Capture ⚠️
The PR removes manual Sentry error reporting but relies entirely on the sentry::integrations::tracing::layer(). By default, this layer only captures events at ERROR level or higher. However:

  • In engine/packages/api-builder/src/middleware.rs:142-148, server errors are logged with tracing::error!
  • In engine/packages/gasoline/src/ctx/workflow.rs:220, workflow errors are logged with tracing::error!
  • In engine/packages/gasoline/src/worker.rs:266, unhandled workflow errors are logged with tracing::error!

This should work, but verify that the Sentry layer is capturing these errors in your testing environment.
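
If the defaults need to be made explicit or tightened, the layer accepts an event_filter; a hedged sketch (using the sentry tracing integration's EventFilter) that keeps ERROR as Sentry events and demotes WARN/INFO to breadcrumbs:

use sentry::integrations::tracing::EventFilter;

fn level_filter(metadata: &tracing::Metadata<'_>) -> EventFilter {
    match *metadata.level() {
        tracing::Level::ERROR => EventFilter::Event,
        tracing::Level::WARN | tracing::Level::INFO => EventFilter::Breadcrumb,
        _ => EventFilter::Ignore,
    }
}
// plugged in with: sentry::integrations::tracing::layer().event_filter(level_filter)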

2. Loss of Structured Context ⚠️
The old code explicitly set tags on Sentry events:

// OLD CODE (removed)
sentry::with_scope(|scope| {
    scope.set_tag("status", status_code);
    scope.set_tag("group", group);
    scope.set_tag("code", code);
    // ...
}, || sentry::capture_message(...));

The new approach relies on tracing's structured fields, but the Sentry tracing layer may not automatically convert all fields to tags. You might lose searchability in Sentry for fields like group, code, status, etc.

Recommendation: Consider using sentry::configure_scope before logging errors to explicitly set important tags:

// Set the tags before logging so the Sentry layer sees them on the scope
sentry::configure_scope(|scope| {
    scope.set_tag("error.group", &group);
    scope.set_tag("error.code", &code);
    scope.set_tag("http.status", status_code);
});

tracing::error!(
    status=?status_code,
    %group,
    %code,
    %meta,
    %internal,
    "http server error"
);
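
An alternative (a sketch using the same hypothetical variables) that keeps the tags from leaking onto unrelated events is to wrap the log call in sentry::with_scope, so the temporary scope is popped as soon as the closure returns:

sentry::with_scope(
    |scope| {
        scope.set_tag("error.group", &group);
        scope.set_tag("error.code", &code);
        scope.set_tag("http.status", status_code);
    },
    || {
        tracing::error!(status = ?status_code, %group, %code, "http server error");
    },
);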

3. Race Condition in SentryOtelLayer 🐛

In engine/packages/runtime/src/traces.rs:86-114, the on_event implementation has a subtle issue:

fn on_event(&self, event: &tracing::Event<'_>, ctx: tracing_subscriber::layer::Context<'_, S>) {
    if let Some(span) = ctx.event_span(event) {
        let extensions = span.extensions();
        // ...
        sentry::configure_scope(|scope| {
            scope.set_context("trace", ...);
        });
    }
}

Problem: sentry::configure_scope modifies global scope state. If multiple events fire concurrently from different spans, they could overwrite each other's trace contexts before Sentry captures them.

Recommendation: Use sentry::Hub::current() with thread-local hubs, or better yet, integrate the trace context into the Sentry event itself rather than relying on scope mutation. Consider this approach:

// Store trace context in span extensions instead of global scope
// Then extract it when the sentry layer processes the event

Alternatively, verify that Sentry's tracing layer already handles this correlation automatically (it might!).
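
A rough sketch of the span-extensions idea (hypothetical types and names, not the PR's implementation): compute the context once when the span is created, stash it in the span's extensions, and only touch the Sentry scope when an error-level event fires. Whether the Sentry TraceContext can be populated from tracing_opentelemetry::OtelData this way is an assumption to verify:

use tracing_subscriber::{layer::Context, registry::LookupSpan, Layer};

// hypothetical carrier type; a real version would fill this from tracing_opentelemetry::OtelData
struct TraceContextCarrier(sentry::protocol::TraceContext);

struct SpanLocalSentryLayer;

impl<S> Layer<S> for SpanLocalSentryLayer
where
    S: tracing::Subscriber + for<'a> LookupSpan<'a>,
{
    fn on_new_span(
        &self,
        _attrs: &tracing::span::Attributes<'_>,
        id: &tracing::span::Id,
        ctx: Context<'_, S>,
    ) {
        if let Some(span) = ctx.span(id) {
            // computed once per span instead of once per event (placeholder context here)
            span.extensions_mut()
                .insert(TraceContextCarrier(sentry::protocol::TraceContext::default()));
        }
    }

    fn on_event(&self, event: &tracing::Event<'_>, ctx: Context<'_, S>) {
        if *event.metadata().level() != tracing::Level::ERROR {
            return;
        }
        if let Some(span) = ctx.event_span(event) {
            if let Some(carrier) = span.extensions().get::<TraceContextCarrier>() {
                sentry::configure_scope(|scope| {
                    scope.set_context("trace", carrier.0.clone());
                });
            }
        }
    }
}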

4. Test Coverage
No test changes are included in this PR. Consider adding integration tests to verify:

  • Sentry events are captured correctly from tracing logs
  • OTel trace IDs properly correlate with Sentry events
  • Error metadata is preserved in Sentry reports

5. Migration Dependency 📦
The sentry dependency is now in runtime/Cargo.toml but removed from api-builder/Cargo.toml. Ensure all services that need Sentry error reporting initialize the runtime's tracing subscriber correctly.


Security Considerations

✅ No security issues identified. The changes maintain the same error handling boundaries.


Performance Considerations

✅ Improved: removing the manual scope manipulation and message capture reduces overhead
✅ The SentryOtelLayer::on_event only runs on events (not on every span), so the performance impact is minimal
⚠️ Ensure configure_scope calls don't cause contention under high load


Code Quality

Strengths:

  • Follows Rust best practices for tracing integration
  • Properly uses structured logging conventions
  • Clean removal of boilerplate code

Style Compliance:

  • ✅ Uses structured logging (tracing::error!(?err, "message")) per CLAUDE.md guidelines
  • ✅ Lowercase log messages
  • ✅ Maintains existing import patterns

Recommendations

  1. High Priority: Test end-to-end that errors are captured in Sentry with the new integration
  2. High Priority: Verify that critical error metadata (error codes, groups) are searchable in Sentry
  3. Medium Priority: Review the SentryOtelLayer race condition concern
  4. Medium Priority: Add integration tests for Sentry capture
  5. Low Priority: Document the new Sentry integration pattern for team members

Questions for Author

  1. Have you verified in a staging/test environment that errors are captured correctly in Sentry?
  2. Are the OTel trace IDs properly showing up in Sentry's distributed tracing view?
  3. Did you compare the Sentry event payloads before/after to ensure no critical metadata is lost?
  4. Is there a reason sentry was removed from api-builder dependencies but kept in runtime? Does this affect all API error reporting?

Verdict

Approve with minor concerns - This is a well-structured modernization of the Sentry integration. The architectural improvements are sound, but please address the structured context preservation and verify end-to-end functionality before merging.

The core logic is correct, and this reduces maintenance burden significantly. Just ensure the migration is verified in a real environment.

Base automatically changed from 10-29-chore_remove_headers_through_guard to main October 31, 2025 01:15
@NathanFlurry NathanFlurry merged commit 6cde37c into main Oct 31, 2025
7 of 12 checks passed
@NathanFlurry NathanFlurry deleted the 10-29-fix_integrate_sentry_with_tracing_and_axum branch October 31, 2025 01:16