Skip to content

feat(attachment): Add verification logic for attachment placeholders#5747

Open
tobias-wilfert wants to merge 16 commits intomasterfrom
tobias-wilfert/feat/envelope-placeholder
Open

feat(attachment): Add verification logic for attachment placeholders#5747
tobias-wilfert wants to merge 16 commits intomasterfrom
tobias-wilfert/feat/envelope-placeholder

Conversation

@tobias-wilfert
Copy link
Copy Markdown
Member

@tobias-wilfert tobias-wilfert commented Mar 19, 2026

Adds logic to verify attachment placeholders (to ensure that the self reported size is indeed correct) and to correctly store them (send the correct kafka message for them and never upload to the object store).

ref: https://linear.app/getsentry/issue/INGEST-726/accept-envelope-placeholders-in-relay

Follow up: Clean up the validation functions once the retain_mut is relaxed

@tobias-wilfert tobias-wilfert self-assigned this Mar 19, 2026
@linear-code
Copy link
Copy Markdown

linear-code bot commented Mar 19, 2026

Comment on lines +964 to +969
fn produce_attachment_ref(
&self,
event_id: EventId,
project_id: ProjectId,
item: &Item,
send_individual_attachments: bool,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic inspired by that of produce_attachment.

@tobias-wilfert tobias-wilfert changed the title WIP: Accept envelope placeholders in Relay feat(attachment): Add verification logic for attachment placeholders Mar 23, 2026
Comment on lines +28 to +31
// NOTE: Using the received timestamp here breaks tests without a pop-relay.
let location = signed_location
.verify(chrono::Utc::now(), config)
.map_err(|_| ProcessingError::InvalidAttachmentRef)?;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially using chrono::Utc::now() might cause some issues if the palceholder gets spooled? Not sure if we worry about this.

@tobias-wilfert tobias-wilfert marked this pull request as ready for review March 23, 2026 13:08
@tobias-wilfert tobias-wilfert requested a review from a team as a code owner March 23, 2026 13:08
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Placeholder deserialization rejects non-enum content types
    • Changed AttachmentPlaceholder content_type field from Option to Option to accept arbitrary MIME types like 'image/png' and 'application/pdf', consistent with regular attachments.

Create PR

Or push these changes by commenting:

@cursor push 98ef05f070
Preview (98ef05f070)
diff --git a/relay-server/src/endpoints/playstation.rs b/relay-server/src/endpoints/playstation.rs
--- a/relay-server/src/endpoints/playstation.rs
+++ b/relay-server/src/endpoints/playstation.rs
@@ -2,7 +2,6 @@
 //!
 //! Crashes are received as multipart uploads in this [format](https://game.develop.playstation.net/resources/documents/SDK/12.000/Core_Dump_System-Overview/ps5-core-dump-file-set-sending-format.html).
 use std::io;
-use std::str::FromStr;
 
 use axum::extract::{DefaultBodyLimit, Request};
 use axum::response::IntoResponse;
@@ -127,7 +126,7 @@
             && let Ok(location) = location.to_str()
             && let Ok(payload) = serde_json::to_vec(&AttachmentPlaceholder {
                 location,
-                content_type: content_type.and_then(|ct| ContentType::from_str(ct.as_ref()).ok()),
+                content_type: content_type.map(|ct| ct.to_string()),
             })
         {
             item.set_payload(ContentType::AttachmentRef, payload);

diff --git a/relay-server/src/envelope/attachment.rs b/relay-server/src/envelope/attachment.rs
--- a/relay-server/src/envelope/attachment.rs
+++ b/relay-server/src/envelope/attachment.rs
@@ -2,8 +2,6 @@
 
 use serde::{Deserialize, Serialize};
 
-use crate::envelope::ContentType;
-
 /// The type of an event attachment.
 ///
 /// These item types must align with the Sentry processing pipeline.
@@ -83,7 +81,7 @@
     #[serde(borrow)]
     pub location: &'a str,
     #[serde(skip_serializing_if = "Option::is_none")]
-    pub content_type: Option<ContentType>,
+    pub content_type: Option<String>,
 }
 
 #[derive(Debug)]
@@ -113,3 +111,53 @@
     AttachmentType,
     "an attachment type (see sentry develop docs)"
 );
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_placeholder_with_arbitrary_content_type() {
+        // Test that placeholders can accept any content type string, including
+        // unrecognized MIME types like "image/png"
+        let json = r#"{"location":"https://example.com/file","content_type":"image/png"}"#;
+        let placeholder: AttachmentPlaceholder = serde_json::from_str(json).unwrap();
+        assert_eq!(placeholder.location, "https://example.com/file");
+        assert_eq!(placeholder.content_type, Some("image/png".to_string()));
+    }
+
+    #[test]
+    fn test_placeholder_with_various_content_types() {
+        // Test various content types including PDF, video, etc.
+        let test_cases = vec![
+            "application/pdf",
+            "video/mp4",
+            "audio/mpeg",
+            "image/jpeg",
+            "text/html",
+        ];
+
+        for content_type in test_cases {
+            let json = format!(
+                r#"{{"location":"https://example.com/file","content_type":"{}"}}"#,
+                content_type
+            );
+            let placeholder: AttachmentPlaceholder = serde_json::from_str(&json).unwrap();
+            assert_eq!(
+                placeholder.content_type,
+                Some(content_type.to_string()),
+                "Failed to deserialize content_type: {}",
+                content_type
+            );
+        }
+    }
+
+    #[test]
+    fn test_placeholder_without_content_type() {
+        // Test that content_type is optional
+        let json = r#"{"location":"https://example.com/file"}"#;
+        let placeholder: AttachmentPlaceholder = serde_json::from_str(json).unwrap();
+        assert_eq!(placeholder.location, "https://example.com/file");
+        assert_eq!(placeholder.content_type, None);
+    }
+}

diff --git a/relay-server/src/services/store.rs b/relay-server/src/services/store.rs
--- a/relay-server/src/services/store.rs
+++ b/relay-server/src/services/store.rs
@@ -987,7 +987,7 @@
             id: Uuid::new_v4().to_string(),
             name: item.filename().unwrap_or(UNNAMED_ATTACHMENT).to_owned(),
             rate_limited: item.rate_limited(),
-            content_type: placeholder.content_type.map(|c| c.as_str().to_owned()),
+            content_type: placeholder.content_type.map(|s| s.to_ascii_lowercase()),
             attachment_type: item.attachment_type().unwrap_or_default(),
             size: item.attachment_body_size(),
             payload: AttachmentPayload::Stored(store_key),

This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.

#[derive(Serialize)]
#[derive(Serialize, Deserialize)]
pub struct AttachmentPlaceholder<'a> {
#[serde(borrow)]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Placeholder deserialization rejects non-enum content types

Medium Severity

The content_type field in AttachmentPlaceholder is Option<ContentType>, but ContentType is an enum that only supports a small set of known MIME types (e.g., text/plain, application/json, application/octet-stream). When serde_json::from_slice encounters an unrecognized content type string like "image/png" or "application/pdf", deserialization of the entire placeholder fails, causing the attachment to be rejected as invalid. Regular attachments accept any content type via raw_content_type() as a free-form string. Using Option<String> here would be consistent with how regular attachments handle content types.

Additional Locations (2)
Fix in Cursor Fix in Web

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true, but it's an issue that may have been introduced by #5671. So I wouldn't change it in this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

produce_attachment and produce_attachment_ref split in the beginning depending on whether the item is an "actual" attachment or an attachment placeholder, but they end the same starting at if send_individual_attachments …. A different way to factor this might be like this:

fn produce_attachment() -> Result<Option<ChunkedAttachment>, StoreError> {
    let chunked_attachment = if item.is_attachment_ref() {
        chunked_attachment_from_placeholder(item)
    } else {
        chunked_attachment_from_attachment(item)
    };

    if send_individual_attachments {
            let message = KafkaMessage::Attachment(AttachmentKafkaMessage {
                event_id,
                project_id,
                chunked_attachment,
            });
            self.produce(KafkaTopic::Attachments, message)?;
            Ok(None)
        } else {
            Ok(Some(chunked_attachment))
        }
}

where chunked_attachment_from_placeholder and chunked_attachment_from_attachment are two new functions that express the logical differences between the current produce_attachment and produce_attachment_ref. WDYT @tobias-wilfert and @jjbayer ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure we can do that.

Copy link
Copy Markdown
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, there's no rate limiting yet or did I miss that? Can be a separate PR.

#[derive(Serialize)]
#[derive(Serialize, Deserialize)]
pub struct AttachmentPlaceholder<'a> {
#[serde(borrow)]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true, but it's an issue that may have been introduced by #5671. So I wouldn't change it in this PR.

async fn process(
&self,
attachments: Managed<Self::Input>,
#[allow(unused_mut)] mut attachments: Managed<Self::Input>,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather follow's @Dav1dde 's advice here and only use #[cfg(feature = "processing")] if absolutely necessary. I.e. remove it from validate_attachments and its calls. It is already guarded internally by ctx.is_processing().

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so rather move the #[cfg(feature = "processing")] inside the internals that call the other function that is #[cfg(feature = "processing")] ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, basically prevent it from spreading further out than necessary.

Comment on lines +453 to +469
pub fn validate_attachments(transaction: &mut Managed<Box<ExpandedTransaction>>, ctx: Context<'_>) {
if !ctx.is_processing() {
return;
}

transaction.modify(|transaction, records| {
transaction.attachments.retain_mut(|attachment| {
match utils::attachments::validate(attachment, ctx.config) {
Ok(()) => true,
Err(err) => {
records.reject_err(err, &*attachment);
false
}
}
});
});
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it should be possible to generalize this function, something like

fn validate_attachments<T, F>(parent: T, getter: F, ctx: Context<'_>)
where
    T: Managed,
    F: FnOnce(&mut T) -> &mut impl RetainMut<Item>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it.

tobias-wilfert and others added 4 commits March 24, 2026 10:12
Co-authored-by: Sebastian Zivota <loewenheim@users.noreply.github.com>
Co-authored-by: Sebastian Zivota <loewenheim@users.noreply.github.com>
Co-authored-by: Joris Bayer <joris.bayer@sentry.io>
Co-authored-by: Joris Bayer <joris.bayer@sentry.io>
@tobias-wilfert
Copy link
Copy Markdown
Member Author

Looks good, there's no rate limiting yet or did I miss that? Can be a separate PR.

As far as I could tell the enforce quotas should work "out of the box" for attachment items (hence I left them as Items) but not sure we need more?

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

#[error("failed to store event because event id was missing")]
NoEventId,
#[error("invalid attachment reference")]
InvalidAttachmentRef,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Store error maps to wrong discard reason

Low Severity

StoreError::InvalidAttachmentRef is mapped to DiscardReason::Internal by the blanket OutcomeError impl for StoreError, rather than to the dedicated DiscardReason::InvalidAttachmentRef that was specifically created for this purpose. While the validation step typically catches errors first, if produce_attachment_ref fails independently (e.g., signature timeout), the outcome will be reported with the wrong discard reason.

Additional Locations (1)
Fix in Cursor Fix in Web

@jjbayer
Copy link
Copy Markdown
Member

jjbayer commented Mar 24, 2026

Looks good, there's no rate limiting yet or did I miss that? Can be a separate PR.

As far as I could tell the enforce quotas should work "out of the box" for attachment items (hence I left them as Items) but not sure we need more?

You are right, I forgot that I already added an integration test for that part: #5653

1 similar comment
@jjbayer
Copy link
Copy Markdown
Member

jjbayer commented Mar 24, 2026

Looks good, there's no rate limiting yet or did I miss that? Can be a separate PR.

As far as I could tell the enforce quotas should work "out of the box" for attachment items (hence I left them as Items) but not sure we need more?

You are right, I forgot that I already added an integration test for that part: #5653

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants