fix: Emit error outcome on pre-envelope failure in playstation and minidump#5966
fix: Emit error outcome on pre-envelope failure in playstation and minidump#5966elramen wants to merge 9 commits into
Conversation
5e17b40 to
536fad8
Compare
| "timestamp": time_within_delta(), | ||
| "project_id": 42, | ||
| "outcome": 3, | ||
| "reason": "invalid_multipart", |
There was a problem hiding this comment.
This would ideally have reason item_too_large:attachment:minidump like the attachment outcomes. The cause of this diversion is that the minidump gets dropped in the multipart flow due to being too large, which leads to the attachment outcomes being emitted and then an error is returned to the endpoint where the error-outcome is emitted without knowing the reason for the multipart failure.
Perhaps in the future we can propagate emitted outcomes from the multipart flow so that managed instances at higher levels could use the same discard reason.
| "timestamp": time_within_delta(), | ||
| "project_id": 42, | ||
| "outcome": 3, | ||
| "reason": "missing_prosperodump", |
There was a problem hiding this comment.
Also not ideal. Cause: prosperodump gets dropped in the multipart flow due to being too large. This leads to attachment outcomes for the prosperodump with the correct reason, item_too_large:attachment:prosperodump. This in turn causes a MissingProsperodump error when the endpoint tries to find the prosperodump, which leads to the managed error instance emitting an outcome with reason missing_prosperodump.
I unfortunately did not find a simple fix for this.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 5dfa139. Configure here.
| let envelope = envelope(upload_context, &state, meta, content_type, request) | ||
| .await | ||
| .reject(&managed_err)?; | ||
|
|
||
| let envelope = if MINIDUMP_RAW_CONTENT_TYPES.contains(&content_type.as_ref()) { | ||
| raw_minidump_to_envelope(request, meta, &state, content_type, config, upload_context) | ||
| .await? | ||
| } else { | ||
| let multipart = utils::multipart_from_request(request)?; | ||
| multipart_to_envelope(multipart, meta, &state, config, upload_context).await? | ||
| }; | ||
| managed_err.accept(|_| ()); // Now there is an envelope with (DataCategory::Error, 1) |
There was a problem hiding this comment.
When envelope() early-returns with an error after the managed envelope has been created, we now reject both the envelope and the err, right?
Can we change the signature of envelope (and the functions it wraps) to accept a Managed<T> instead of meta and state? Then instead of calling Managed::with_meta_from_request_meta within those functions you could simply .map() on err, and there would never be two Managed at the same time.
There was a problem hiding this comment.
From what I can tell, rejecting both can't happen because as soon as the envelope is created, there are no results/errors that can be returned.
I tried your suggestion anyway to see how it looked like but I have a few concerns:
- We still need
metato create the envelope andstate/configto upload to objectstore so passing aroundmanaged_errwon't eliminate parameters. - Having multiple managed instances at the same time is actually intended: one for
DataCategory::Errorand the others for attachments. Hence, the only otherManaged::with_meta_from_request_meta-call could not be eliminated with amapas we would then lose theDataCategory::Erroroutcome. - Passing around
managed_errrequires us to dorejectin several places to get correct discard reason (like we do for items) which just seems unnecessarily messy.
As long as the envelope can't be dropped before the managed_err is accepted, we don't run the risk of emitting two error outcomes.
Please let me know if I misunderstood something and we can discuss further.
| Self::EmptyBody => DiscardReason::EmptyBody, | ||
| Self::InternalEnvelope => DiscardReason::InternalEnvelope, | ||
| Self::InvalidBody(_) => DiscardReason::InvalidBody, | ||
| Self::InvalidJson(_) => DiscardReason::InvalidJson, | ||
| Self::InvalidMsgpack(_) => DiscardReason::InvalidMsgpack, | ||
| Self::InvalidEnvelope(_) => DiscardReason::InvalidEnvelope, | ||
| Self::InvalidMultipart(_) => DiscardReason::InvalidMultipart, | ||
| Self::InvalidMinidump => DiscardReason::InvalidMinidump, | ||
| Self::MissingMinidump => DiscardReason::MissingMinidump, | ||
| Self::InvalidUnrealReport => DiscardReason::InvalidUnrealReport, | ||
| #[cfg(sentry)] | ||
| Self::InvalidProsperodump => DiscardReason::InvalidProsperodump, | ||
| Self::MissingMinidump => DiscardReason::MissingMinidumpUpload, | ||
| #[cfg(sentry)] | ||
| Self::MissingProsperodump => DiscardReason::MissingProsperodumpUpload, | ||
| Self::Overflow(item_type) => DiscardReason::TooLarge(*item_type), | ||
| _ => DiscardReason::Internal, | ||
| Self::MissingProsperodump => DiscardReason::MissingProsperodump, | ||
| Self::InvalidCompression(_) => DiscardReason::InvalidCompression, | ||
| Self::InvalidEventId => DiscardReason::InvalidEventId, | ||
| Self::QueueFailed(_) => DiscardReason::QueueFailed, | ||
| Self::ItemTooLarge(item_type) => DiscardReason::ItemTooLarge(*item_type), | ||
| Self::RequestTooLarge => DiscardReason::RequestTooLarge, | ||
| Self::RateLimited(_) => DiscardReason::RateLimited, | ||
| Self::EventRejected(discard_reason) => *discard_reason, | ||
| Self::ProjectUnavailable => DiscardReason::ProjectUnavailable, | ||
| Self::ObjectstoreUploadFailed => DiscardReason::ObjectstoreUploadFailed, |
There was a problem hiding this comment.
Adding more precise discard reasons seems like a win, but I really hope we can refactor this to reduce code duplication at some point.
There was a problem hiding this comment.
Yeah, seems way more reasonable to - in the future - have well-scoped errors and implement OutcomeError on these.
| Outcome::Invalid(DiscardReason::ItemTooLarge(too_large_reason)) => Some(Cow::Owned( | ||
| format!("item_too_large:{}", too_large_reason.as_str()), |
There was a problem hiding this comment.
Bug: The test test_minidump_large_attachment_skipped_when_no_project_fetching was not updated to expect the new outcome reason for oversized attachments, causing a test failure.
Severity: LOW
Suggested Fix
Update the assertion in tests/integration/test_minidump.py for the test test_minidump_large_attachment_skipped_when_no_project_fetching to expect the new outcome reason "item_too_large:attachment:attachment".
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: relay-server/src/services/outcome.rs#L213-L214
Potential issue: The change from `DiscardReason::TooLarge` to
`DiscardReason::ItemTooLarge` updated the corresponding outcome reason string from
`"too_large:{}"` to `"item_too_large:{}"`. While most tests were updated to reflect this
change, the integration test
`test_minidump_large_attachment_skipped_when_no_project_fetching` was missed. This test
still asserts the old outcome reason `"too_large:attachment:attachment"` and will fail
because the new code now generates `"item_too_large:attachment:attachment"`. This
represents an incomplete update of the test suite to match the intended production code
changes.
Also affects:
tests/integration/test_minidump.py:1352tests/integration/test_minidump.py:1359
Did we get this right? 👍 / 👎 to inform future reviews.

Emit a
(DataCategory::Error, 1)outcome if something goes wrong before the envelope has been created in the playstation and minidump endpoints. This ensures that the customer gets an outcome when we haven't even started parsing the multipart yet, for example, due to config-fetch failures. The additional outcomes are emitted usingreject, which is the reason why this PR homogenizesBadStoreRequestandDiscardReasonvariants. In addition to completeness, making these two enums more similar is also a step towards removingBadStoreRequestor inverting the conversion order between the two as discussed here.Fixes https://linear.app/getsentry/issue/INGEST-902/impl-outcomeerror-for-badstorerequest