feat(preprod): Add size status check rules API #114414
Expose the current Size Analysis status check rule configuration through a public preprod project endpoint. Share rule parsing with the runtime status check task so API consumers can evaluate the same thresholds and filters in external CI. Serialize filter queries into a machine-readable shape that preserves runtime grouping semantics, invalid-query behavior, IN and notIn filters, and escaped wildcard literals. Refs EME-1061 Co-Authored-By: OpenAI Codex <noreply@openai.com>
🚨 Warning: This pull request contains Frontend and Backend changes! It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently. Have questions? Please ask in the |
Return the enum value directly because RuleArtifactType already exposes the public artifact-type literal. This keeps backend typing from flagging the serializer cast as redundant. Refs EME-1061 Co-Authored-By: OpenAI Codex <noreply@openai.com>
Serialize only simple wildcard filters as startsWith, endsWith, or contains. Keep match-all and complex wildcard patterns as matches/notMatches values so the public API does not expose internal regex strings or mislabel wildcard behavior. Refs EME-1061 Co-Authored-By: OpenAI Codex <noreply@openai.com>
Move the detailed status check rules API contract into the endpoint docs source so the generated API reference owns response semantics, filter matching, and wildcard behavior. Refs EME-1061
if search_filter.value.is_wildcard():
    operator: SizeStatusCheckRuleFilterOperator = "matches"
    if search_filter.is_negation:
        operator = _negate_operator(operator)
    return {
        "operator": operator,
        "values": [str(raw_value) for raw_value in _raw_filter_values(search_filter)],
    }
Bug: In a mixed IN filter, the presence of a wildcard value prevents escape sequence translation for all other non-wildcard values in the same filter.
Severity: MEDIUM
Suggested Fix
Modify the logic for IN filters within _condition_from_search_filter. Instead of an all-or-nothing check for wildcards, iterate through the filter values. Apply _format_filter_value to each non-wildcard value individually, while leaving wildcard values as-is. This ensures correct escape sequence handling for mixed-content filters.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: src/sentry/preprod/api/models/public/size_status_check_rules.py#L145-L152
Potential issue: In the `_condition_from_search_filter` function, when processing an
`IN` filter that contains a mix of wildcard and non-wildcard values, the presence of a
single wildcard value causes the entire list to be treated as a wildcard match. This
bypasses the `_format_filter_value` function for all values in the list. As a result,
non-wildcard values that contain escaped characters (e.g., `bar\*baz`) are not
unescaped, and the API returns the raw string with escape characters (`bar\*baz`)
instead of the intended literal value (`bar*baz`), leading to incorrect filter matching.
Thanks for the check. I think this is intentional for the final API contract rather than a serialization bug.
matches / notMatches values are documented as Sentry wildcard patterns, not decoded literal strings and not Python regexes. Under that contract, \* is the escape sequence for a literal asterisk. So {"operator": "matches", "values": ["bar\\*baz", "*beta*"]} means app_id == "bar*baz" OR app_id matches "*beta*".
That mirrors runtime: wildcard IN values are translated with translate_wildcard per raw value. A raw bar\*baz value becomes an exact match for literal bar*baz, while *beta* becomes the wildcard pattern. If the API decoded bar\*baz to bar*baz while still returning operator: "matches", consumers would interpret it as a wildcard pattern and change the meaning.
So I’m leaving this shape unchanged: exact/simple operators return decoded literal values, while matches / notMatches preserve Sentry wildcard syntax.
runningcode
left a comment
Just some nits! Looks good but I'm not an expert in this part of the codebase.
):
    logger.warning(
        "preprod.status_checks.rules.invalid_rule",
        extra={"project_id": project_id, "rule_id": rule_id},
Warden comment seems legit but I wonder what is the best practice here.
)
def get(self, request: Request, project: Project) -> Response:
    r"""
    Retrieve the current Size Analysis status check rules configured for a project.
This is a long docstring. Is that typical for this part of the codebase? Would this be better put in docs somewhere?
This function was ported along with the comment, so I decided to keep it.
metric=metric,
measurement=measurement,
value=float(value),
filter_query=str(filter_query),
nit: isn't this already a str? Is this redundant?
Add a public preprod project endpoint for reading the current Size
Analysis status check rule configuration. External CI consumers can use
this as the same source of truth Sentry uses when evaluating size
thresholds, without duplicating rule config outside Sentry.
**Endpoint**
```
GET /api/0/projects/{organization_slug}/{project_slug}/preprod/size-analysis/status-check-rules/
```
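A minimal sketch of how an external CI job might build this request, using only the standard library. It assumes bearer-token authentication with a token that has project read access; the base URL, slugs, and the names `build_rules_request` and `fetch_rules` are illustrative and not part of this change.

```python
import json
import urllib.request


def build_rules_request(
    base_url: str, organization_slug: str, project_slug: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) the GET request for the status check rules."""
    url = (
        f"{base_url.rstrip('/')}/api/0/projects/"
        f"{organization_slug}/{project_slug}"
        "/preprod/size-analysis/status-check-rules/"
    )
    # Assumption: the caller authenticates with a bearer token.
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_rules(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Splitting request construction from the network call keeps the URL and header logic easy to check in a CI script without reaching the server.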
**Example**
```json
{
"enabled": true,
"rules": [
{
"id": "12733d00-c8b5-1834-9d42-7f140d5ae079",
"metric": "install_size",
"measurement": "relative_diff",
"value": "10",
"filterQuery": "app_id:com.example.app platform_name:apple",
"filters": [
{
"key": "app_id",
"conditions": [
{
"operator": "equals",
"values": [
"com.example.app"
]
}
]
},
{
"key": "platform_name",
"conditions": [
{
"operator": "equals",
"values": [
"apple"
]
}
]
}
],
"artifactType": "main_artifact"
},
{
"id": "19333983-0893-6abc-cc2a-f75d3df54af5",
"metric": "download_size",
"measurement": "absolute",
"value": "20000000",
"filterQuery": "app_id:com.example.app platform_name:apple",
"filters": [
{
"key": "app_id",
"conditions": [
{
"operator": "equals",
"values": [
"com.example.app"
]
}
]
},
{
"key": "platform_name",
"conditions": [
{
"operator": "equals",
"values": [
"apple"
]
}
]
}
],
"artifactType": "main_artifact"
}
]
}
```
**Filter Semantics**
The response includes both the original `filterQuery` and a
machine-readable `filters` form. Serialized filters preserve runtime
grouping behavior: separate filter objects are ANDed, conditions inside
one filter object are ORed, and same-key positive and negated groups
remain separate.
For example, this response shape:
```json
[
{
"key": "app_id",
"conditions": [
{"operator": "equals", "values": ["com.example.app"]},
{"operator": "startsWith", "values": ["com.example.beta"]}
]
},
{
"key": "platform_name",
"conditions": [{"operator": "equals", "values": ["apple"]}]
},
{
"key": "app_id",
"conditions": [{"operator": "notStartsWith", "values": ["internal"]}]
}
]
```
means:
```text
(app_id = com.example.app OR app_id STARTS WITH com.example.beta)
AND platform_name = apple
AND app_id NOT STARTS WITH internal
```
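The grouping rules above can be sketched as a small consumer-side evaluator. This is an illustration, not the runtime implementation: it covers only the three operators used in the example, and it assumes that multiple `values` in a positive condition mean "any of", that a negated condition means "none of", and that a missing attribute is treated as an empty string. The function names are hypothetical.

```python
from typing import Any


def condition_matches(condition: dict[str, Any], actual: str) -> bool:
    """Evaluate one condition against one attribute value."""
    operator = condition["operator"]
    values = condition["values"]
    if operator == "equals":
        return any(actual == value for value in values)
    if operator == "startsWith":
        return any(actual.startswith(value) for value in values)
    if operator == "notStartsWith":
        # Assumption: a negated condition passes when no value matches.
        return not any(actual.startswith(value) for value in values)
    raise ValueError(f"operator not covered by this sketch: {operator}")


def filters_match(filters: list[dict[str, Any]], attributes: dict[str, str]) -> bool:
    """Filter objects are ANDed; conditions inside one filter object are ORed."""
    return all(
        any(
            # Assumption: a missing attribute is compared as an empty string.
            condition_matches(condition, attributes.get(filter_["key"], ""))
            for condition in filter_["conditions"]
        )
        for filter_ in filters
    )


EXAMPLE_FILTERS = [
    {
        "key": "app_id",
        "conditions": [
            {"operator": "equals", "values": ["com.example.app"]},
            {"operator": "startsWith", "values": ["com.example.beta"]},
        ],
    },
    {"key": "platform_name", "conditions": [{"operator": "equals", "values": ["apple"]}]},
    {"key": "app_id", "conditions": [{"operator": "notStartsWith", "values": ["internal"]}]},
]
```

Because the positive and negated `app_id` groups arrive as separate filter objects, the evaluator never has to merge them; each object is checked on its own and the results are ANDed.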
**Wildcard Serialization**
The status-check runtime evaluates wildcard filters by translating them
to an internal Python regex, but the public API intentionally does not
expose that regex. `matches` and `notMatches` values use the same Sentry
wildcard syntax users configured in the rule: `*` matches zero or more
characters, and escaped `\*` is a literal asterisk.
Simple wildcard patterns are simplified to the most specific operator:
```text
app_id:foo* -> {"operator": "startsWith", "values": ["foo"]}
app_id:*foo -> {"operator": "endsWith", "values": ["foo"]}
app_id:*foo* -> {"operator": "contains", "values": ["foo"]}
```
Complex or match-all wildcard patterns stay as wildcard patterns:
```text
app_id:* -> {"operator": "matches", "values": ["*"]}
app_id:foo*bar -> {"operator": "matches", "values": ["foo*bar"]}
app_id:*foo*bar* -> {"operator": "matches", "values": ["*foo*bar*"]}
!app_id:*internal -> {"operator": "notMatches", "values": ["*internal"]}
```
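The positive-pattern mappings above can be reproduced with a short classifier. This is a sketch reconstructed from the examples in this description, not the serializer code from the change: `tokenize` and `simplify_wildcard` are hypothetical names, negated filters are not covered, and it assumes a backslash only escapes `*`.

```python
from typing import Optional

WILDCARD = None  # token marker for an unescaped "*"


def tokenize(pattern: str) -> list[Optional[str]]:
    """Split a pattern into literal characters and WILDCARD markers."""
    tokens: list[Optional[str]] = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("\\*", i):
            tokens.append("*")  # escaped asterisk becomes a literal "*"
            i += 2
        elif pattern[i] == "*":
            tokens.append(WILDCARD)
            i += 1
        else:
            tokens.append(pattern[i])
            i += 1
    return tokens


def simplify_wildcard(pattern: str) -> dict:
    """Pick the most specific operator for a positive wildcard pattern."""
    tokens = tokenize(pattern)
    stars = [i for i, token in enumerate(tokens) if token is WILDCARD]
    literal = "".join(token for token in tokens if token is not WILDCARD)
    last = len(tokens) - 1
    if not stars:
        return {"operator": "equals", "values": [literal]}
    if literal:
        if stars == [last]:
            return {"operator": "startsWith", "values": [literal]}
        if stars == [0]:
            return {"operator": "endsWith", "values": [literal]}
        if stars == [0, last]:
            return {"operator": "contains", "values": [literal]}
    # Match-all and complex patterns keep their original wildcard syntax.
    return {"operator": "matches", "values": [pattern]}
```

Simple operators carry the decoded literal, while `matches` carries the raw pattern, which mirrors the split between decoded values and preserved wildcard syntax described in this section.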
This is deliberately different from returning the runtime regex form,
for example:
```text
app_id:*foo*bar* -> not returned as {"operator": "matches", "values": ["^.*foo.*bar.*$"]}
```
Keeping wildcard syntax in the API avoids exposing Python regex
internals and gives external CI consumers a portable contract to
implement in their own language. A `matches` value without `*` is
equivalent to an exact match.
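
As an illustration of that portable contract, a consumer could implement `matches` roughly as follows. This is a sketch under stated assumptions: matching is case-sensitive, a backslash only escapes `*`, and the helper names are hypothetical. It builds a regex locally, which is a consumer-side implementation detail and not something the API returns.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate Sentry wildcard syntax into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("\\*", i):
            parts.append(re.escape("*"))  # escaped asterisk: literal "*"
            i += 2
        elif pattern[i] == "*":
            parts.append(".*")  # wildcard: zero or more characters
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # everything else is literal
            i += 1
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(pattern: str, value: str) -> bool:
    """Return True when the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


def not_wildcard_matches(pattern: str, value: str) -> bool:
    """Counterpart for a single `notMatches` value."""
    return not wildcard_matches(pattern, value)
```

Using `fullmatch` keeps a pattern without `*` equivalent to an exact comparison, as stated above.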
**Runtime Parity Edge Cases**
Invalid filter queries serialize as `filters: null`, while valid empty
filters serialize as `filters: []`. `in` and `notIn` stay atomic so
negated list filters keep the same boolean meaning as runtime
evaluation. Escaped wildcard values are decoded as literals in `values`,
so `app_id:\*com` becomes `operator: "equals"` with `values: ["*com"]`.
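
A consumer needs to keep the `null` and `[]` cases apart when decoding the response. The sketch below shows one way to do that; the labels and the `classify_filters` name are hypothetical, and what a consumer does with an invalid query is left to the caller because this description does not spell it out.

```python
import json
from typing import Any, Optional


def classify_filters(filters: Optional[list[dict[str, Any]]]) -> str:
    """Distinguish an invalid query (null) from a valid empty filter list."""
    if filters is None:
        return "invalid_query"  # JSON null: the filter query failed to parse
    if len(filters) == 0:
        return "no_filters"  # JSON []: the query is valid but has no filters
    return "filtered"


# JSON null decodes to None, so an identity check is enough to tell it from [].
invalid_rule = json.loads('{"filterQuery": "(", "filters": null}')
empty_rule = json.loads('{"filterQuery": "", "filters": []}')
```

Checking `is None` before checking for emptiness matters here, since both `None` and `[]` are falsy in Python.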
**Auth Model**
The endpoint requires project read permission and does not allow project
distribution tokens because it exposes status-check rule configuration
rather than build distribution data. Review note: confirm this is the
intended auth model for external CI consumers.
Refs EME-1061
---------
Co-authored-by: OpenAI Codex <noreply@openai.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>