fix(releases): Add build number and code to semver index (DB migration) #102747

srest2021 · 2025-11-05T00:48:28Z

relates to REPLAY-803

linear · 2025-11-05T00:48:31Z

REPLAY-803 Build code considered when detecting regressions but not when finding the current release version

github-actions · 2025-11-05T00:55:54Z

This PR has a migration; here is the generated SQL for src/sentry/migrations/1003_rebuild_semver_index_with_build_fields.py

for 1003_rebuild_semver_index_with_build_fields in sentry

--
-- Remove index sentry_release_semver_idx from release
--
DROP INDEX CONCURRENTLY IF EXISTS "sentry_release_semver_idx";
--
-- Create index sentry_release_semver_idx on F(organization), OrderBy(F(major), descending=True), OrderBy(F(minor), descending=True), OrderBy(F(patch), descending=True), OrderBy(F(revision), descending=True), OrderBy(CASE WHEN <Q: (AND: ('prerelease', ''))> THEN Value(1), ELSE Value(0), descending=True), OrderBy(F(prerelease), descending=True), OrderBy(F(build_number), descending=True), OrderBy(F(build_code), descending=True) on model release
--
CREATE INDEX CONCURRENTLY "sentry_release_semver_idx" ON "sentry_release" ("organization_id", "major" DESC, "minor" DESC, "patch" DESC, "revision" DESC, (CASE WHEN "prerelease" = '' THEN 1 ELSE 0 END) DESC, "prerelease" DESC, "build_number" DESC, "build_code" DESC);

github-actions · 2025-11-05T19:38:32Z

This PR has a migration; here is the generated SQL for src/sentry/migrations/1004_rebuild_semver_index_with_build_fields.py

for 1004_rebuild_semver_index_with_build_fields in sentry

--
-- Remove index sentry_release_semver_idx from release
--
DROP INDEX CONCURRENTLY IF EXISTS "sentry_release_semver_idx";
--
-- Create index sentry_release_semver_idx on F(organization), OrderBy(F(major), descending=True), OrderBy(F(minor), descending=True), OrderBy(F(patch), descending=True), OrderBy(F(revision), descending=True), OrderBy(CASE WHEN <Q: (AND: ('prerelease', ''))> THEN Value(1), ELSE Value(0), descending=True), OrderBy(F(prerelease), descending=True), OrderBy(F(build_number), descending=True), OrderBy(F(build_code), descending=True) on model release
--
CREATE INDEX CONCURRENTLY "sentry_release_semver_idx" ON "sentry_release" ("organization_id", "major" DESC, "minor" DESC, "patch" DESC, "revision" DESC, (CASE WHEN "prerelease" = '' THEN 1 ELSE 0 END) DESC, "prerelease" DESC, "build_number" DESC, "build_code" DESC);

JoshFerge · 2025-11-05T19:44:22Z

src/sentry/migrations/1004_add_semver_with_build_code_index.py

+        ("sentry", "1003_group_history_prev_history_safe_removal"),
+    ]
+
+    operations = [


I'll let the folks in the DB migration reviewers group chime in, but I think we want to create the new index first, then drop the old one afterward in a separate migration.
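
A minimal sketch of the sequencing suggested here, written as plain Python data rather than as a real Django migration. The migration names and the `create_precedes_drop` helper are hypothetical and exist only for illustration; the SQL strings are copied from the generated SQL posted in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    migration: str  # hypothetical migration name, for illustration only
    sql: str


# Step 1 ships first: build the new index alongside the old one.
CREATE_NEW = Step(
    migration="add_semver_with_build_code_index",
    sql=(
        'CREATE INDEX CONCURRENTLY "sentry_release_semver_new_idx" '
        'ON "sentry_release" ("organization_id", "major" DESC, "minor" DESC, '
        '"patch" DESC, "revision" DESC, '
        "(CASE WHEN \"prerelease\" = '' THEN 1 ELSE 0 END) DESC, "
        '"prerelease" DESC, '
        '(CASE WHEN ("build_code" IS NOT NULL AND "build_number" IS NULL) THEN 2 '
        'WHEN "build_number" IS NOT NULL THEN 1 ELSE 0 END) DESC, '
        '"build_number" DESC, "build_code" DESC);'
    ),
)

# Step 2 ships in a later, separate migration: drop the old index
# only once the new one exists and queries can fall back on it.
DROP_OLD = Step(
    migration="drop_old_semver_index",
    sql='DROP INDEX CONCURRENTLY IF EXISTS "sentry_release_semver_idx";',
)

ROLLOUT = [CREATE_NEW, DROP_OLD]


def create_precedes_drop(steps):
    """Return True if an index is created before any index is dropped."""
    creates = [i for i, s in enumerate(steps) if s.sql.startswith("CREATE INDEX")]
    drops = [i for i, s in enumerate(steps) if s.sql.startswith("DROP INDEX")]
    return bool(creates) and all(min(creates) < d for d in drops)
```

With this ordering there is never a window in which neither index exists, which is the point of splitting the change into two migrations.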

wedamija

Could you describe the queries you want to make with this?

My vague memory is that build number/code didn't make sense to sort by. Do you have some example rows here, just so I can make sure I'm following this correctly?

wedamija · 2025-11-05T19:44:52Z

src/sentry/migrations/1004_add_semver_with_build_code_index.py

+        ("sentry", "1003_group_history_prev_history_safe_removal"),
+    ]
+
+    operations = [


Yeah, we should switch these if we go ahead with this

srest2021 · 2025-11-05T21:32:07Z

@wedamija Yes, definitely. According to the semver spec, build code should not be included in the semver ordering. However, it turns out that when detecting regressions we have been including the build_number and build_code columns in the ordering (see these test cases in my followup PR, which demonstrate the existing behavior for regression detection and act as our "target" behavior for semver ordering). We also have a case here (thread here) for the resolve-in-next-release feature that points out this inconsistency. By adding the build code and number to the semver ordering, we can make sure that the "next" release picked for resolution follows the same ordering. Here is the test I modified in the followup to confirm and demonstrate the new ordering.

I'll make sure to create a new index for this. The changes in the followup PR will also be gated under a new feature flag (WIP) so we can roll this out safely.

github-actions · 2025-11-05T22:40:37Z

This PR has a migration; here is the generated SQL for src/sentry/migrations/1004_add_semver_with_build_code_index.py

for 1004_add_semver_with_build_code_index in sentry

--
-- Create index sentry_release_semver_new_idx on F(organization), OrderBy(F(major), descending=True), OrderBy(F(minor), descending=True), OrderBy(F(patch), descending=True), OrderBy(F(revision), descending=True), OrderBy(CASE WHEN <Q: (AND: ('prerelease', ''))> THEN Value(1), ELSE Value(0), descending=True), OrderBy(F(prerelease), descending=True), OrderBy(CASE WHEN <Q: (AND: ('build_code__isnull', False), ('build_number__isnull', True))> THEN Value(2), WHEN <Q: (AND: ('build_number__isnull', False))> THEN Value(1), ELSE Value(0), descending=True), OrderBy(F(build_number), descending=True), OrderBy(F(build_code), descending=True) on model release
--
CREATE INDEX CONCURRENTLY "sentry_release_semver_new_idx" ON "sentry_release" ("organization_id", "major" DESC, "minor" DESC, "patch" DESC, "revision" DESC, (CASE WHEN "prerelease" = '' THEN 1 ELSE 0 END) DESC, "prerelease" DESC, (CASE WHEN ("build_code" IS NOT NULL AND "build_number" IS NULL) THEN 2 WHEN "build_number" IS NOT NULL THEN 1 ELSE 0 END) DESC, "build_number" DESC, "build_code" DESC);
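
To make the column order of this index concrete, here is a minimal Python sketch of an equivalent sort key. The `Release` tuple, `build_code_case`, `semver_sort_key`, and `greatest_release` are hypothetical names for illustration and are not Sentry's implementation. The placeholder values for NULL columns and Python's code-point string comparison are assumptions that may differ from Postgres NULL handling and collation.

```python
from typing import NamedTuple, Optional


class Release(NamedTuple):
    major: int
    minor: int
    patch: int
    revision: int
    prerelease: str              # '' means a final (non-prerelease) version
    build_number: Optional[int]
    build_code: Optional[str]


def build_code_case(r: Release) -> int:
    """Mirror the CASE expression in the index definition."""
    if r.build_code is not None and r.build_number is None:
        return 2
    if r.build_number is not None:
        return 1
    return 0


def semver_sort_key(r: Release) -> tuple:
    """Key whose descending order follows the index column order."""
    return (
        r.major, r.minor, r.patch, r.revision,
        1 if r.prerelease == "" else 0,   # final releases sort above prereleases
        r.prerelease,
        build_code_case(r),
        # Assumption: placeholders stand in for NULL. Within one case bucket
        # build_number is either always set or always NULL, so the -1 is never
        # compared against a real value. A NULL build_code next to a set
        # build_number would sort differently from Postgres NULLS FIRST.
        r.build_number if r.build_number is not None else -1,
        r.build_code if r.build_code is not None else "",
    )


def greatest_release(releases):
    return max(releases, key=semver_sort_key)


releases = [
    Release(1, 2, 3, 0, "", None, None),    # 1.2.3
    Release(1, 2, 3, 0, "", 5, "5"),        # 1.2.3+5
    Release(1, 2, 3, 0, "", None, "abc"),   # 1.2.3+abc
    Release(1, 2, 3, 0, "rc1", 9, "9"),     # 1.2.3-rc1+9
]
ordered = sorted(releases, key=semver_sort_key, reverse=True)
```

In this sample, `ordered` runs 1.2.3+abc, 1.2.3+5, 1.2.3, 1.2.3-rc1+9: the CASE bucket puts a non-numeric build code above a numeric build number, and both above a release with no build information.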

wedamija · 2025-11-05T23:04:55Z

Ok, this all sounds reasonable. One thing I want to check is whether the index will actually be used here. It might be fine to leave the existing index as is, since maybe there aren't that many builds per release.

Are you able to print out an actual sql query that uses this sort so that we can verify whether the new columns in this index will get used?

srest2021 · 2025-11-06T18:18:16Z

@wedamija The use case I'm interested in is the greatest_semver_release function, which with the changes would generate a SQL query like this. It would of course be easier to skip creating the new index if possible; let me know what you think.

SELECT
  "sentry_release"."id",
  "sentry_release"."organization_id",
  "sentry_release"."status",
  "sentry_release"."version",
  "sentry_release"."ref",
  "sentry_release"."url",
  "sentry_release"."date_added",
  "sentry_release"."date_started",
  "sentry_release"."date_released",
  "sentry_release"."data",
  "sentry_release"."owner_id",
  "sentry_release"."commit_count",
  "sentry_release"."last_commit_id",
  "sentry_release"."authors",
  "sentry_release"."total_deploys",
  "sentry_release"."last_deploy_id",
  "sentry_release"."package",
  "sentry_release"."major",
  "sentry_release"."minor",
  "sentry_release"."patch",
  "sentry_release"."revision",
  "sentry_release"."prerelease",
  "sentry_release"."build_code",
  "sentry_release"."build_number",
  "sentry_release"."user_agent",
  CASE WHEN "sentry_release"."prerelease" = '' THEN 1 ELSE 0 END AS "prerelease_case",
  CASE WHEN ("sentry_release"."build_code" IS NOT NULL
  AND "sentry_release"."build_number" IS NULL) THEN 2 WHEN "sentry_release"."build_number" IS NOT NULL THEN 1 ELSE 0 END AS "build_code_case"
FROM
  "sentry_release"
INNER JOIN
  "sentry_release_project"
  ON ("sentry_release"."id" = "sentry_release_project"."release_id")
WHERE
  ("sentry_release"."organization_id" = 123
  AND "sentry_release_project"."project_id" = 123
  AND "sentry_release"."major" IS NOT NULL)
ORDER BY
  "sentry_release"."major" DESC,
  "sentry_release"."minor" DESC,
  "sentry_release"."patch" DESC,
  "sentry_release"."revision" DESC,
  26 DESC, --prerelease_case column
  "sentry_release"."prerelease" DESC,
  27 DESC, --build_code_case column
  "sentry_release"."build_number" DESC,
  "sentry_release"."build_code" DESC

wedamija · 2025-11-06T18:29:40Z

As is, this index won't be used for the build columns, because we're sorting on the build_code_case column and Postgres doesn't know how to use the index to evaluate that CASE expression. So we have two options here:

1. Add an expression like prerelease_case, but for build_code_case, to the index. If you take a look at the generated index above, you can see it being used there. So you'd need <build_code_case_col>, build_number, build_code at the end of your index.
2. Try running this query as-is in Redash against some orgs you want to test it on, and see whether it's already fast enough. I'd recommend trying this first instead of making the index more complicated, because I think it might be hard for Postgres to even decide to use it. When testing in Postgres, be aware that repeated runs of the same query end up cached, so you need to test on different org/project IDs, or wait long enough for the queries to be evicted from memory.
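
One way to prepare the second option is to generate an EXPLAIN statement per distinct org/project pair, so that no two runs hit the same cached data. The sketch below only builds SQL strings; it does not connect to a database. `QUERY_TEMPLATE` is a trimmed, hypothetical version of the query from this thread (the select list is shortened and the CASE expressions are written out in the ORDER BY instead of using ordinals), and `explain_statements` is an illustrative helper.

```python
# Trimmed version of the query from the thread; {org_id} and {project_id}
# are filled in per run. The shortened select list is an assumption.
QUERY_TEMPLATE = """
SELECT "sentry_release"."id"
FROM "sentry_release"
INNER JOIN "sentry_release_project"
  ON ("sentry_release"."id" = "sentry_release_project"."release_id")
WHERE "sentry_release"."organization_id" = {org_id}
  AND "sentry_release_project"."project_id" = {project_id}
  AND "sentry_release"."major" IS NOT NULL
ORDER BY
  "sentry_release"."major" DESC,
  "sentry_release"."minor" DESC,
  "sentry_release"."patch" DESC,
  "sentry_release"."revision" DESC,
  CASE WHEN "sentry_release"."prerelease" = '' THEN 1 ELSE 0 END DESC,
  "sentry_release"."prerelease" DESC,
  CASE WHEN ("sentry_release"."build_code" IS NOT NULL
    AND "sentry_release"."build_number" IS NULL) THEN 2
    WHEN "sentry_release"."build_number" IS NOT NULL THEN 1 ELSE 0 END DESC,
  "sentry_release"."build_number" DESC,
  "sentry_release"."build_code" DESC
"""


def explain_statements(id_pairs):
    """Build one EXPLAIN (ANALYZE, BUFFERS) statement per distinct pair."""
    seen = set()
    statements = []
    for org_id, project_id in id_pairs:
        # int() keeps anything other than numeric IDs out of the SQL text.
        pair = (int(org_id), int(project_id))
        if pair in seen:
            continue  # a repeated pair would measure a warm cache
        seen.add(pair)
        statements.append(
            "EXPLAIN (ANALYZE, BUFFERS)"
            + QUERY_TEMPLATE.format(org_id=pair[0], project_id=pair[1])
        )
    return statements


statements = explain_statements([(1, 10), (1, 10), (2, 20)])
```

Note that EXPLAIN with ANALYZE actually executes the query. In the resulting plan, an index scan node naming the semver index suggests the index is serving the ordering, while a separate Sort node suggests Postgres is sorting the rows itself.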

srest2021 · 2025-11-07T19:49:46Z

Update: We think it's ok to skip rebuilding the index when adding the build code columns to the semver ordering. We'll monitor the latency of the query on Datadog when changes are deployed.

srest2021 mentioned this pull request Nov 5, 2025

fix(releases): Consistently sort by build number & code in semver #102711

Draft

github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Nov 5, 2025

vercel bot deployed to Preview November 5, 2025 00:50 View deployment

srest2021 changed the title ~~fix(releases): Add build number and code to semver index~~ fix(releases): Add build number and code to semver index (DB migration) Nov 5, 2025

vercel bot deployed to Preview November 5, 2025 19:33 View deployment

srest2021 marked this pull request as ready for review November 5, 2025 19:39

srest2021 requested review from a team as code owners November 5, 2025 19:39

JoshFerge reviewed Nov 5, 2025

wedamija reviewed Nov 5, 2025

make separate index

35fe4a5

srest2021 force-pushed the srest2021/REPLAY-803-migration branch from 4a89b6c to 35fe4a5 Compare November 5, 2025 22:33

vercel bot deployed to Preview November 5, 2025 22:35 View deployment

srest2021 closed this Nov 7, 2025
