Skip to content

Conversation

floels
Copy link
Contributor

@floels floels commented Aug 14, 2025

Goal

The goal of this PR is to fix the flakiness reported in GH-97781. The unit test test_uptime_root_tree_with_orphaned_spans in tests/snuba/api/endpoints/test_organization_trace.py is randomly failing in the CI with this error:

AssertionError: Span 0 differs (excluding children)
assert {'additional_...06400.15, ...} == {'additional_...906400.3, ...}
  
  Omitting 15 identical items, use -vv to show
  Differing items:
  {'duration': 150.0} != {'duration': 300.0}
  {'description': 'Uptime Check [success] - https://www.example.com/ (200)'} != {'description': 'Uptime Check [success] - https://example.com/ (301)'}
...

Explanation of the flakiness

The failing assertion is:

self.assert_expected_results(
data,
[redirect_result, final_result],
expected_children_ids=["root", "/transaction/orphan"],
)

assert_expected_results is a helper method defined in the test class which asserts that the API response matches expected results from input trace items. To do this, it sorts both the API response and the input trace item with a (guid, request_sequence) custom sort key:

def sort_key(item):
guid = (
item.attributes.get("guid", ProtoAttributeValue(val_str="")).string_value
if hasattr(item, "attributes")
else item.get("event_id", "")
)
seq = (
item.attributes.get("request_sequence", ProtoAttributeValue(val_int=0)).int_value
if hasattr(item, "attributes")
else item.get("additional_attributes", {}).get("request_sequence", 0)
)
return guid, seq

The issue is that the API response items don't have an attributes key but only an additional_attributes key. In the sort_key method, their guid will therefore default to their event_id, which is a randomly-generated UUID. While input trace items ([redirect_result, final_result]) will be ordered deterministically (based on their request_sequence, since they have the same guid of check-123), API response items will be ordered randomly based on their random UUID, making the assertion fail randomly (because the assertion doesn't compare related items).

You can easily reproduce the failure locally by running the test 10 times in a row:

for i in {1..10}; do pytest tests/snuba/api/endpoints/test_organization_trace.py::OrganizationEventsTraceEndpointTest::test_with_uptime_results ||
  exit 1; done

It should fail at least once.

Fix

The fix is simple: rely on additional_attributes["guid"] to sort the API response items. Incidentally, it's exactly what is done to get the sequence number:

seq = (
item.attributes.get("request_sequence", ProtoAttributeValue(val_int=0)).int_value
if hasattr(item, "attributes")
else item.get("additional_attributes", {}).get("request_sequence", 0)
)

We just need to use the same syntax.

Reproduction

Run the test 10 times in a row locally (same command as above) and see that it now passes consistently.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@floels floels requested a review from a team as a code owner August 14, 2025 08:46
@floels floels force-pushed the fix-flaky-uptime-trace-test branch from 3e4e0d7 to e1fd8e0 Compare August 14, 2025 09:32
The assert_expected_results method was using inconsistent sort keys for API
results (event_id) vs expected results (guid), causing random test failures.
Fix by using the same guid field from additional_attributes for both.

Fixes getsentryGH-97781
@floels floels force-pushed the fix-flaky-uptime-trace-test branch from e1fd8e0 to 38373e4 Compare August 14, 2025 11:37
@asottile-sentry asottile-sentry added the Trigger: getsentry tests Once code is reviewed: apply label to PR to trigger getsentry tests label Aug 19, 2025
Copy link
Contributor

@asottile-sentry asottile-sentry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will merge after hackweek

@asottile-sentry asottile-sentry enabled auto-merge (squash) August 19, 2025 13:37
@asottile-sentry asottile-sentry merged commit 8364faf into getsentry:master Aug 25, 2025
59 of 61 checks passed
lzhao-sentry pushed a commit that referenced this pull request Aug 25, 2025
# Goal

The goal of this PR is to fix the flakiness reported in GH-97781. The
unit test `test_uptime_root_tree_with_orphaned_spans` in
`tests/snuba/api/endpoints/test_organization_trace.py` is randomly
failing in the CI with this error:

```
AssertionError: Span 0 differs (excluding children)
assert {'additional_...06400.15, ...} == {'additional_...906400.3, ...}
  
  Omitting 15 identical items, use -vv to show
  Differing items:
  {'duration': 150.0} != {'duration': 300.0}
  {'description': 'Uptime Check [success] - https://www.example.com/ (200)'} != {'description': 'Uptime Check [success] - https://example.com/ (301)'}
...
```

# Explanation of the flakiness

The failing assertion is:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L669-L673

`assert_expected_results` is a helper method defined in the test class
which asserts that the API response matches expected results from input
trace items. To do this, it sorts both the API response and the input
trace item with a `(guid, request_sequence)` custom sort key:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L465-L476

The issue is that the API response items don't have an `attributes` key
but only an `additional_attributes` key. In the `sort_key` method, their
`guid` will therefore default to their `event_id`, which is a
randomly-generated UUID. While input trace items (`[redirect_result,
final_result]`) will be ordered deterministically (based on their
`request_sequence`, since they have the same `guid` of `check-123`), API
response items will be ordered randomly based on their random UUID,
making the assertion fail randomly (because the assertion doesn't
compare related items).

You can easily reproduce the failure locally by running the test 10
times in a row:

```
for i in {1..10}; do pytest tests/snuba/api/endpoints/test_organization_trace.py::OrganizationEventsTraceEndpointTest::test_with_uptime_results ||
  exit 1; done
```

It should fail at least once.

# Fix

The fix is simple: rely on `additional_attributes["guid"]` to sort the
API response items. Incidentally, it's exactly what is done to get the
sequence number:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L471-L475

We just need to use the same syntax.

# Reproduction

Run the test 10 times in a row locally (same command as above) and see
that it now passes consistently.

# Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated
in the State of Delaware in 2015 as Functional Software, Inc. and is
gonna need some rights from me in order to utilize my contributions in
this here PR. So here's the deal: I retain all rights, title and
interest in and to my contributions, and by keeping this boilerplate
intact I confirm that Sentry can use, modify, copy, and redistribute my
contributions, under Sentry's choice of terms.
andrewshie-sentry pushed a commit that referenced this pull request Aug 26, 2025
# Goal

The goal of this PR is to fix the flakiness reported in GH-97781. The
unit test `test_uptime_root_tree_with_orphaned_spans` in
`tests/snuba/api/endpoints/test_organization_trace.py` is randomly
failing in the CI with this error:

```
AssertionError: Span 0 differs (excluding children)
assert {'additional_...06400.15, ...} == {'additional_...906400.3, ...}
  
  Omitting 15 identical items, use -vv to show
  Differing items:
  {'duration': 150.0} != {'duration': 300.0}
  {'description': 'Uptime Check [success] - https://www.example.com/ (200)'} != {'description': 'Uptime Check [success] - https://example.com/ (301)'}
...
```

# Explanation of the flakiness

The failing assertion is:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L669-L673

`assert_expected_results` is a helper method defined in the test class
which asserts that the API response matches expected results from input
trace items. To do this, it sorts both the API response and the input
trace item with a `(guid, request_sequence)` custom sort key:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L465-L476

The issue is that the API response items don't have an `attributes` key
but only an `additional_attributes` key. In the `sort_key` method, their
`guid` will therefore default to their `event_id`, which is a
randomly-generated UUID. While input trace items (`[redirect_result,
final_result]`) will be ordered deterministically (based on their
`request_sequence`, since they have the same `guid` of `check-123`), API
response items will be ordered randomly based on their random UUID,
making the assertion fail randomly (because the assertion doesn't
compare related items).

You can easily reproduce the failure locally by running the test 10
times in a row:

```
for i in {1..10}; do pytest tests/snuba/api/endpoints/test_organization_trace.py::OrganizationEventsTraceEndpointTest::test_with_uptime_results ||
  exit 1; done
```

It should fail at least once.

# Fix

The fix is simple: rely on `additional_attributes["guid"]` to sort the
API response items. Incidentally, it's exactly what is done to get the
sequence number:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L471-L475

We just need to use the same syntax.

# Reproduction

Run the test 10 times in a row locally (same command as above) and see
that it now passes consistently.

# Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated
in the State of Delaware in 2015 as Functional Software, Inc. and is
gonna need some rights from me in order to utilize my contributions in
this here PR. So here's the deal: I retain all rights, title and
interest in and to my contributions, and by keeping this boilerplate
intact I confirm that Sentry can use, modify, copy, and redistribute my
contributions, under Sentry's choice of terms.
constantinius pushed a commit that referenced this pull request Sep 1, 2025
# Goal

The goal of this PR is to fix the flakiness reported in GH-97781. The
unit test `test_uptime_root_tree_with_orphaned_spans` in
`tests/snuba/api/endpoints/test_organization_trace.py` is randomly
failing in the CI with this error:

```
AssertionError: Span 0 differs (excluding children)
assert {'additional_...06400.15, ...} == {'additional_...906400.3, ...}
  
  Omitting 15 identical items, use -vv to show
  Differing items:
  {'duration': 150.0} != {'duration': 300.0}
  {'description': 'Uptime Check [success] - https://www.example.com/ (200)'} != {'description': 'Uptime Check [success] - https://example.com/ (301)'}
...
```

# Explanation of the flakiness

The failing assertion is:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L669-L673

`assert_expected_results` is a helper method defined in the test class
which asserts that the API response matches expected results from input
trace items. To do this, it sorts both the API response and the input
trace item with a `(guid, request_sequence)` custom sort key:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L465-L476

The issue is that the API response items don't have an `attributes` key
but only an `additional_attributes` key. In the `sort_key` method, their
`guid` will therefore default to their `event_id`, which is a
randomly-generated UUID. While input trace items (`[redirect_result,
final_result]`) will be ordered deterministically (based on their
`request_sequence`, since they have the same `guid` of `check-123`), API
response items will be ordered randomly based on their random UUID,
making the assertion fail randomly (because the assertion doesn't
compare related items).

You can easily reproduce the failure locally by running the test 10
times in a row:

```
for i in {1..10}; do pytest tests/snuba/api/endpoints/test_organization_trace.py::OrganizationEventsTraceEndpointTest::test_with_uptime_results ||
  exit 1; done
```

It should fail at least once.

# Fix

The fix is simple: rely on `additional_attributes["guid"]` to sort the
API response items. Incidentally, it's exactly what is done to get the
sequence number:


https://github.com/getsentry/sentry/blob/3fa39f0740f17dab4c934b352525dbb5978e8f6e/tests/snuba/api/endpoints/test_organization_trace.py#L471-L475

We just need to use the same syntax.

# Reproduction

Run the test 10 times in a row locally (same command as above) and see
that it now passes consistently.

# Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated
in the State of Delaware in 2015 as Functional Software, Inc. and is
gonna need some rights from me in order to utilize my contributions in
this here PR. So here's the deal: I retain all rights, title and
interest in and to my contributions, and by keeping this boilerplate
intact I confirm that Sentry can use, modify, copy, and redistribute my
contributions, under Sentry's choice of terms.
@github-actions github-actions bot locked and limited conversation to collaborators Sep 9, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Trigger: getsentry tests Once code is reviewed: apply label to PR to trigger getsentry tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants