fix(flags): reject leading-zero versions in sortableSemVer#59290
Merged
Conversation
The HogQL sortableSemVer function used a permissive regex that treated "3.07" as [3, 7], so the flag rollout blast-radius view counted users with version "3.07" as matching a "version >= 3.7" filter even though runtime flag evaluation (Rust, strict semver crate) rejects them. Users were confused when the rollout preview disagreed with the live cohort. Make HogQL's sortableSemVer match the Rust crate: only strict X.Y.Z with no leading zeros (optionally with a 'v' prefix or pre-release/build suffix). Invalid input now becomes NULL, which is falsy in WHERE and excludes the row from any semver comparison. Also add a Rust test pinning the strict-rejection behavior for "3.07", "3.7", and "3.0" so the source-of-truth path can't drift back.
Contributor
|
🎭 Playwright didn't run on this PR — your changes touch code that could affect E2E behavior, but Playwright is opt-in via label now to keep CI cost down. Add the Most PRs don't need this. Real regressions still get caught on master and fix-forward. |
Contributor
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
posthog/hogql/test/test_query.py:1905-1957
**Prefer parameterised tests**
`test_sortable_semver_rejects_invalid` and `test_sortable_semver_invalid_excluded_from_comparisons` test multiple distinct inputs by concatenating them into one SQL string. Per the team's convention ("We always prefer parameterised tests"), each invalid-version case should ideally be its own parameterised invocation so that failures pinpoint the exact offending input rather than the whole query. The batch-query design here is efficient for ClickHouse round-trips, but a `@parameterized.expand` (or `@pytest.mark.parametrize`) approach on individual `SELECT sortableSemVer('{version}') IS NULL` calls would make failures immediately actionable without having to inspect the result-set difference.
Reviews (1): Last reviewed commit: "fix(flags): reject leading-zero versions..." | Re-trigger Greptile |
ClickHouse rejects Nullable(Array(String)), which is the type the previous `nullIf(extract(...), '')` approach produced once the HogQL codegen wrapped the comparison with ifNull(). Three TestBlastRadius integration tests failed with: Nested type Array(String) cannot be inside Nullable type Switch to a per-element-nullable shape instead: keep extract() non- nullable (empty string on no-match) and convert each split component with toInt64OrNull. Invalid input now parses to [NULL] of type Array(Nullable(Int64)), which ClickHouse accepts. Element-wise array comparison propagates NULL through any operator (>, >=, <, <=, =, !=), so invalid versions still get excluded from every semver filter — matching Rust's behavior — without producing a forbidden Nullable array wrapper.
splitByChar('.', '') returns [] in ClickHouse, not [''] — so my last
commit would have parsed invalid versions to [] instead of [NULL]. The
practical problem: [] < [1, 2, 3] is true in ClickHouse, which would
silently include invalid versions in 'version < X' filters (the same
class of bug we set out to fix, just on the other operator).
Wrap extract() in coalesce(nullIf(extract, ''), '_') so the empty-match
case feeds splitByChar a single-character sentinel instead. The
sentinel splits to ['_'], which toInt64OrNull maps to [NULL] —
restoring the intended Array(Nullable(Int64)) behavior where any
comparison with an invalid version returns NULL and the row is
excluded from every operator.
ClickHouse compares arrays containing NULL elements as the *greatest* value (not as NULL/unknown), so an invalid version parsed to [NULL] would still satisfy '>= 3.7.0' filters in WHERE — leaving the original blast-radius / runtime-evaluation discrepancy unresolved. Wrap every semver comparison (=, !=, >, >=, <, <=, ~, ^, .*) in property.py with an explicit match() against the strict semver regex. Invalid property values are now dropped before the array comparison fires, regardless of the operator — finally matching what Rust's semver crate does at flag evaluation time. Also rewrite the test_sortable_semver_comparison tests, which had been asserting raw sortableSemver() comparisons would return NULL for invalid input. They don't — that's the whole reason we need the gate. The new tests only cover the valid-vs-valid case; invalid-exclusion is verified end-to-end through the existing TestBlastRadius tests.
gustavohstrassburger
approved these changes
May 21, 2026
gustavohstrassburger
approved these changes
May 21, 2026
oliverb123
approved these changes
May 21, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The HogQL
sortableSemVerfunction (used to compute the rolloutblast-radius preview for feature flags with semver comparisons)
extracted the first dot-separated digit run from the property value
without enforcing strict semver rules. As a result
\"3.07\"wasparsed as
[3, 7]— exactly the same as\"3.7\"— so the previewcounted users with version
\"3.07\"as matching aversion >= 3.7filter.
Runtime flag evaluation (Rust,
semvercrate)rejects
\"3.07\"outright. So a flag that the preview said would hitN users actually hit fewer at runtime, and customers were filing the
discrepancy as a bug.
Changes
sortablesemver(posthog/hogql/functions/posthog.py):swap the lenient
(\\d+(\\.\\d+)+)extraction for a strict, anchoredregex that only matches
MAJOR.MINOR.PATCHwith no leading zeros,optionally with a
vprefix or pre-release / build-metadata suffix.Invalid input now becomes
NULL(vianullIf(extract(...), '')),which propagates through
splitByChar/arrayMapand makes everysemver comparison
NULL— equivalent tofalseinWHERE, so therow is excluded. This matches what the Rust path does when
Version::parsereturnsNone.rust/feature-flags/src/properties/property_matching.rs):extend
test_semver_invalid_versionswith explicit assertions for\"3.07\",\"3.7\", and\"3.0\"to pin the strict-rejectionbehavior of the source-of-truth path so it can't silently relax.
posthog/hogql/test/test_query.py):tighten
test_sortable_semverto only sort strict X.Y.Z inputs,rewrite
test_sortable_semver_outputto cover the new accept-list(plain,
vprefix, pre-release suffix, all-zero), and add two newtests —
test_sortable_semver_rejects_invalid(lists everythingRust rejects and asserts
IS NULL) andtest_sortable_semver_invalid_excluded_from_comparisons(the exactbug repro:
sortableSemVer('3.07') >= sortableSemVer('3.7.0')isnow
NULLinstead oftrue).posthog/hogql/printer/test/test_printer.py):update the expected ClickHouse SQL string for the new template.
Behavior change to be aware of
Any value that isn't strict X.Y.Z (with no leading zeros) is now
treated as unmatchable for semver operators in the blast-radius
preview. That includes two-component versions like
\"3.7\",four-plus-component versions like
\"1.2.3.4\", and trailing garbagelike
\"2.2.0.betabac\". This is the intentional alignment with Rust— blast radius now reflects what flag evaluation will actually do —
but it does change the number reported in the UI for projects whose
app-version property doesn't use strict semver. Worth a heads-up in
release notes.
How did you test this code?
I'm an agent (Claude Code).
cargo test --package feature-flags --lib test_semver_invalid_versionspasses with the new assertions for
\"3.07\",\"3.7\",\"3.0\".a Python harness (mirroring the
extract+ capture-group-1 +nullIfflow), covering: strict valid (1.2.3,0.0.0,v1.2.3-alpha+build, whitespace padding), the user-reported bug(
3.07), other leading-zero shapes (01.2.3,1.02.3,1.2.03,01.02.03), wrong arity (3.0,3.7,1.2.3.4,0.9,1.9.233434.10,0.0.0.0.1000), structural junk (1..2.3,.1.2.3,1.2.3.,1.-2.3,2.2.0.betabac), and totallyunparseable input (
abc,\"\"). All 25+ matched expectations.bring up local ClickHouse, leaving verification to CI.
Open item for reviewers
The fix relies on ClickHouse propagating
NULLthroughsplitByChar('.', NULL)andarrayMap(f, NULL). The docs saynon-NULL-aware functions return
NULLonNULLinput, but I couldn'tconfirm specifically for the higher-order
arrayMappath. If CIreveals it does not propagate, the fallback is to special-case
sortablesemverinposthog/hogql/printer/clickhouse.py(next tolookupOrganicSourceType) and emit an explicitif(empty(extract(...)), NULL, arrayMap(...))that referencesargs[0]twice. Happy to land that follow-up if the integration tests flag it.
Publish to changelog?
no
🤖 Agent context
Authored by Claude Code (Opus 4.7, 1M context) — Dylan kicked it off
with a description of the user-visible blast-radius bug and an
explicit ask to match the Rust path strictly (chosen via
`AskUserQuestion` over only-fix-leading-zeros and
make-Rust-more-lenient). I rejected an earlier draft that just tweaked
the regex inside `extract` (it would silently truncate `"3.07"` to
`"3.0"`) and went with anchored-regex + capture-group-1 + `nullIf`
instead because it falls back to `NULL` rather than to a bogus
parsed array. The Rust test was added first so the source-of-truth
behavior is pinned regardless of how the HogQL fallback shakes out.