feat(attachment): Add verification logic for attachment placeholders#5747
feat(attachment): Add verification logic for attachment placeholders#5747tobias-wilfert wants to merge 16 commits intomasterfrom
Conversation
| fn produce_attachment_ref( | ||
| &self, | ||
| event_id: EventId, | ||
| project_id: ProjectId, | ||
| item: &Item, | ||
| send_individual_attachments: bool, |
There was a problem hiding this comment.
Logic inspired by that of produce_attachment.
| // NOTE: Using the received timestamp here breaks tests without a pop-relay. | ||
| let location = signed_location | ||
| .verify(chrono::Utc::now(), config) | ||
| .map_err(|_| ProcessingError::InvalidAttachmentRef)?; |
There was a problem hiding this comment.
Potentially using chrono::Utc::now() might cause some issues if the palceholder gets spooled? Not sure if we worry about this.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Placeholder deserialization rejects non-enum content types
- Changed AttachmentPlaceholder content_type field from Option to Option to accept arbitrary MIME types like 'image/png' and 'application/pdf', consistent with regular attachments.
Or push these changes by commenting:
@cursor push 98ef05f070
Preview (98ef05f070)
diff --git a/relay-server/src/endpoints/playstation.rs b/relay-server/src/endpoints/playstation.rs
--- a/relay-server/src/endpoints/playstation.rs
+++ b/relay-server/src/endpoints/playstation.rs
@@ -2,7 +2,6 @@
//!
//! Crashes are received as multipart uploads in this [format](https://game.develop.playstation.net/resources/documents/SDK/12.000/Core_Dump_System-Overview/ps5-core-dump-file-set-sending-format.html).
use std::io;
-use std::str::FromStr;
use axum::extract::{DefaultBodyLimit, Request};
use axum::response::IntoResponse;
@@ -127,7 +126,7 @@
&& let Ok(location) = location.to_str()
&& let Ok(payload) = serde_json::to_vec(&AttachmentPlaceholder {
location,
- content_type: content_type.and_then(|ct| ContentType::from_str(ct.as_ref()).ok()),
+ content_type: content_type.map(|ct| ct.to_string()),
})
{
item.set_payload(ContentType::AttachmentRef, payload);
diff --git a/relay-server/src/envelope/attachment.rs b/relay-server/src/envelope/attachment.rs
--- a/relay-server/src/envelope/attachment.rs
+++ b/relay-server/src/envelope/attachment.rs
@@ -2,8 +2,6 @@
use serde::{Deserialize, Serialize};
-use crate::envelope::ContentType;
-
/// The type of an event attachment.
///
/// These item types must align with the Sentry processing pipeline.
@@ -83,7 +81,7 @@
#[serde(borrow)]
pub location: &'a str,
#[serde(skip_serializing_if = "Option::is_none")]
- pub content_type: Option<ContentType>,
+ pub content_type: Option<String>,
}
#[derive(Debug)]
@@ -113,3 +111,53 @@
AttachmentType,
"an attachment type (see sentry develop docs)"
);
+
+#[cfg(test)]
+mod tests {
+ use super::*;
+
+ #[test]
+ fn test_placeholder_with_arbitrary_content_type() {
+ // Test that placeholders can accept any content type string, including
+ // unrecognized MIME types like "image/png"
+ let json = r#"{"location":"https://example.com/file","content_type":"image/png"}"#;
+ let placeholder: AttachmentPlaceholder = serde_json::from_str(json).unwrap();
+ assert_eq!(placeholder.location, "https://example.com/file");
+ assert_eq!(placeholder.content_type, Some("image/png".to_string()));
+ }
+
+ #[test]
+ fn test_placeholder_with_various_content_types() {
+ // Test various content types including PDF, video, etc.
+ let test_cases = vec![
+ "application/pdf",
+ "video/mp4",
+ "audio/mpeg",
+ "image/jpeg",
+ "text/html",
+ ];
+
+ for content_type in test_cases {
+ let json = format!(
+ r#"{{"location":"https://example.com/file","content_type":"{}"}}"#,
+ content_type
+ );
+ let placeholder: AttachmentPlaceholder = serde_json::from_str(&json).unwrap();
+ assert_eq!(
+ placeholder.content_type,
+ Some(content_type.to_string()),
+ "Failed to deserialize content_type: {}",
+ content_type
+ );
+ }
+ }
+
+ #[test]
+ fn test_placeholder_without_content_type() {
+ // Test that content_type is optional
+ let json = r#"{"location":"https://example.com/file"}"#;
+ let placeholder: AttachmentPlaceholder = serde_json::from_str(json).unwrap();
+ assert_eq!(placeholder.location, "https://example.com/file");
+ assert_eq!(placeholder.content_type, None);
+ }
+}
diff --git a/relay-server/src/services/store.rs b/relay-server/src/services/store.rs
--- a/relay-server/src/services/store.rs
+++ b/relay-server/src/services/store.rs
@@ -987,7 +987,7 @@
id: Uuid::new_v4().to_string(),
name: item.filename().unwrap_or(UNNAMED_ATTACHMENT).to_owned(),
rate_limited: item.rate_limited(),
- content_type: placeholder.content_type.map(|c| c.as_str().to_owned()),
+ content_type: placeholder.content_type.map(|s| s.to_ascii_lowercase()),
attachment_type: item.attachment_type().unwrap_or_default(),
size: item.attachment_body_size(),
payload: AttachmentPayload::Stored(store_key),This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
| #[derive(Serialize)] | ||
| #[derive(Serialize, Deserialize)] | ||
| pub struct AttachmentPlaceholder<'a> { | ||
| #[serde(borrow)] |
There was a problem hiding this comment.
Placeholder deserialization rejects non-enum content types
Medium Severity
The content_type field in AttachmentPlaceholder is Option<ContentType>, but ContentType is an enum that only supports a small set of known MIME types (e.g., text/plain, application/json, application/octet-stream). When serde_json::from_slice encounters an unrecognized content type string like "image/png" or "application/pdf", deserialization of the entire placeholder fails, causing the attachment to be rejected as invalid. Regular attachments accept any content type via raw_content_type() as a free-form string. Using Option<String> here would be consistent with how regular attachments handle content types.
Additional Locations (2)
There was a problem hiding this comment.
This is true, but it's an issue that may have been introduced by #5671. So I wouldn't change it in this PR.
There was a problem hiding this comment.
produce_attachment and produce_attachment_ref split in the beginning depending on whether the item is an "actual" attachment or an attachment placeholder, but they end the same starting at if send_individual_attachments …. A different way to factor this might be like this:
fn produce_attachment(…) -> Result<Option<ChunkedAttachment>, StoreError> {
let chunked_attachment = if item.is_attachment_ref() {
chunked_attachment_from_placeholder(item)
} else {
chunked_attachment_from_attachment(item)
};
if send_individual_attachments {
let message = KafkaMessage::Attachment(AttachmentKafkaMessage {
event_id,
project_id,
chunked_attachment,
});
self.produce(KafkaTopic::Attachments, message)?;
Ok(None)
} else {
Ok(Some(chunked_attachment))
}
}where chunked_attachment_from_placeholder and chunked_attachment_from_attachment are two new functions that express the logical differences between the current produce_attachment and produce_attachment_ref. WDYT @tobias-wilfert and @jjbayer ?
There was a problem hiding this comment.
Sure we can do that.
jjbayer
left a comment
There was a problem hiding this comment.
Looks good, there's no rate limiting yet or did I miss that? Can be a separate PR.
| #[derive(Serialize)] | ||
| #[derive(Serialize, Deserialize)] | ||
| pub struct AttachmentPlaceholder<'a> { | ||
| #[serde(borrow)] |
There was a problem hiding this comment.
This is true, but it's an issue that may have been introduced by #5671. So I wouldn't change it in this PR.
| async fn process( | ||
| &self, | ||
| attachments: Managed<Self::Input>, | ||
| #[allow(unused_mut)] mut attachments: Managed<Self::Input>, |
There was a problem hiding this comment.
I would rather follow's @Dav1dde 's advice here and only use #[cfg(feature = "processing")] if absolutely necessary. I.e. remove it from validate_attachments and its calls. It is already guarded internally by ctx.is_processing().
There was a problem hiding this comment.
Ok so rather move the #[cfg(feature = "processing")] inside the internals that call the other function that is #[cfg(feature = "processing")] ?
There was a problem hiding this comment.
Yes, basically prevent it from spreading further out than necessary.
| pub fn validate_attachments(transaction: &mut Managed<Box<ExpandedTransaction>>, ctx: Context<'_>) { | ||
| if !ctx.is_processing() { | ||
| return; | ||
| } | ||
|
|
||
| transaction.modify(|transaction, records| { | ||
| transaction.attachments.retain_mut(|attachment| { | ||
| match utils::attachments::validate(attachment, ctx.config) { | ||
| Ok(()) => true, | ||
| Err(err) => { | ||
| records.reject_err(err, &*attachment); | ||
| false | ||
| } | ||
| } | ||
| }); | ||
| }); | ||
| } |
There was a problem hiding this comment.
I feel like it should be possible to generalize this function, something like
fn validate_attachments<T, F>(parent: T, getter: F, ctx: Context<'_>)
where
T: Managed,
F: FnOnce(&mut T) -> &mut impl RetainMut<Item>Co-authored-by: Sebastian Zivota <loewenheim@users.noreply.github.com>
Co-authored-by: Sebastian Zivota <loewenheim@users.noreply.github.com>
Co-authored-by: Joris Bayer <joris.bayer@sentry.io>
Co-authored-by: Joris Bayer <joris.bayer@sentry.io>
As far as I could tell the enforce quotas should work "out of the box" for attachment items (hence I left them as Items) but not sure we need more? |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| #[error("failed to store event because event id was missing")] | ||
| NoEventId, | ||
| #[error("invalid attachment reference")] | ||
| InvalidAttachmentRef, |
There was a problem hiding this comment.
Store error maps to wrong discard reason
Low Severity
StoreError::InvalidAttachmentRef is mapped to DiscardReason::Internal by the blanket OutcomeError impl for StoreError, rather than to the dedicated DiscardReason::InvalidAttachmentRef that was specifically created for this purpose. While the validation step typically catches errors first, if produce_attachment_ref fails independently (e.g., signature timeout), the outcome will be reported with the wrong discard reason.
Additional Locations (1)
You are right, I forgot that I already added an integration test for that part: #5653 |
1 similar comment
You are right, I forgot that I already added an integration test for that part: #5653 |



Adds logic to verify attachment placeholders (to ensure that the self reported size is indeed correct) and to correctly store them (send the correct kafka message for them and never upload to the object store).
ref: https://linear.app/getsentry/issue/INGEST-726/accept-envelope-placeholders-in-relay
Follow up: Clean up the validation functions once the
retain_mutis relaxed