Skip to content

feat(pii): Base SpanData PII on relay-conventions#5997

Merged
loewenheim merged 10 commits into
masterfrom
sebastian/span-data-conventions
May 19, 2026
Merged

feat(pii): Base SpanData PII on relay-conventions#5997
loewenheim merged 10 commits into
masterfrom
sebastian/span-data-conventions

Conversation

@loewenheim
Copy link
Copy Markdown
Contributor

@loewenheim loewenheim commented May 13, 2026

This replaces the default PII value on SpanData with a function that looks up a field's name (as defined by the metastructure annotation) in sentry-conventions. If the name isn't found, the function returns Pii::True to be safe.

It also updates sentry-conventions for getsentry/sentry-conventions@523f1f0.

Additionally, it fixes a very annoying edge case bug in PII string processing: the logic assumes that a null byte can be freely inserted into strings to mark already removed chunks, but some PII regexes do match null bytes. I've corrected one such regex that otherwise would cause a test failure.

Closes INGEST-919.

This replaces the default PII value on `SpanData` with a function that
looks up a field's name (as defined by the `metastructure` annotation)
in `sentry-conventions`. If the name isn't found, the function returns
`Pii::True` to be safe.
@loewenheim loewenheim requested a review from a team as a code owner May 13, 2026 13:19
@loewenheim loewenheim self-assigned this May 13, 2026
@linear-code
Copy link
Copy Markdown

linear-code Bot commented May 13, 2026

INGEST-919

Comment thread relay-pii/src/regexes.rs
)
(
[^/\\\r\n]+
[^/\\\r\n\x00]+
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this regex can match a null byte it causes problems with redactions.

"url.full": {
"type": "string",
"value": "https://www.service.io/users/01234-qwerty/settings/98765-adfghj",
"value": "https://www.service.io/users/[user]/settings/98765-adfghj",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attribute has become pii: "true" in the current conventions version.

"sentry.description": {
"type": "string",
"value": "GET https://www.service.io/users/01234-qwerty/settings/98765-adfghj",
"value": "GET https://www.service.io/users/[user]/settings/98765-adfghj",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attribute is filled from url.full, hence inherits the redaction.

Copy link
Copy Markdown
Member

@Dav1dde Dav1dde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a change with potentially big impact, are you confident we have sufficient test coverage already?

Comment thread tests/integration/test_spansv2.py
Comment thread relay-event-schema/src/protocol/span.rs
Comment thread CHANGELOG.md Outdated
Comment thread relay-event-schema/src/protocol/span.rs
Comment on lines -597 to +601
match self.pii() {
Pii::True => Some(Cow::Borrowed(&PII_TRUE_FIELD_ATTRS)),
Pii::False => None,
Pii::Maybe => Some(Cow::Borrowed(&PII_MAYBE_FIELD_ATTRS)),
match self.attrs().pii {
PiiMode::Static(Pii::True) => Some(Cow::Borrowed(&PII_TRUE_FIELD_ATTRS)),
PiiMode::Static(Pii::False) => None,
PiiMode::Static(Pii::Maybe) => Some(Cow::Borrowed(&PII_MAYBE_FIELD_ATTRS)),
PiiMode::Dynamic(f) => Some(Cow::Owned(DEFAULT_FIELD_ATTRS.pii_dynamic(f))),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, when deriving the attributes for recursion, we would eagerly resolve a dynamic PII value function into a static value. Now we instead pass the dynamic function through.

Comment thread relay-event-schema/src/protocol/span.rs
Comment thread relay-event-schema/src/protocol/span.rs
@loewenheim
Copy link
Copy Markdown
Contributor Author

Oh, and to actually answer this:

This is a change with potentially big impact, are you confident we have sufficient test coverage already?

No, not necessarily 😞. I would probably extend the test_spandata_conventions test with more cases.

Comment thread relay-event-schema/src/protocol/span.rs
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 7fa8c6d. Configure here.

PiiMode::Static(Pii::True) => Some(Cow::Borrowed(&PII_TRUE_FIELD_ATTRS)),
PiiMode::Static(Pii::False) => None,
PiiMode::Static(Pii::Maybe) => Some(Cow::Borrowed(&PII_MAYBE_FIELD_ATTRS)),
PiiMode::Dynamic(f) => Some(Cow::Owned(DEFAULT_FIELD_ATTRS.pii_dynamic(f))),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dynamic PII in inner_attrs over-scrubs nested field values

Medium Severity

The PiiMode::Dynamic(f) arm in inner_attrs passes span_data_pii_from_conventions through to children of complex-typed fields (e.g., gen_ai.input.messages of type Annotated<Value>). When such a field contains a nested object, each sub-key (like "role" or "content") is looked up in conventions via state.keys().next(). Sub-keys not found in conventions get Pii::True, whereas previously (with pii = "maybe" on SpanData) they inherited Pii::Maybe. This upgrades scrubbing from "only specific path selectors" to "generic selectors like $string match," causing unintended data removal in nested structures.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 7fa8c6d. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True in principle, but not really a concern.

@loewenheim loewenheim requested a review from Dav1dde May 19, 2026 10:21
@loewenheim
Copy link
Copy Markdown
Contributor Author

I do believe this is sufficiently tested now. I've also looked over conventions again and found no egregious cases of PII being misconfigured.

Comment thread CHANGELOG.md Outdated
@loewenheim loewenheim enabled auto-merge May 19, 2026 11:24
@loewenheim loewenheim added this pull request to the merge queue May 19, 2026
Merged via the queue into master with commit 60b66c8 May 19, 2026
34 checks passed
@loewenheim loewenheim deleted the sebastian/span-data-conventions branch May 19, 2026 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants