Skip to content

[BugFix] Fixed bad path regex to work with legal simple json paths#67988

Merged
stdpain merged 2 commits intoStarRocks:mainfrom
brent-statsig:fix_bad_json_path_regex
Jan 21, 2026
Merged

[BugFix] Fixed bad path regex to work with legal simple json paths#67988
stdpain merged 2 commits intoStarRocks:mainfrom
brent-statsig:fix_bad_json_path_regex

Conversation

@brent-statsig
Copy link
Contributor

@brent-statsig brent-statsig commented Jan 15, 2026

Why I'm doing:

I have JSON keys that contain the characters -. While looking at profiles, I noticed that queries for those keys aren't properly pushing down their filter predicates. This regex prevents the FE from considering this a valid column access expression, which then causes the significantly less performant jsonMerge to run, causing a pretty significant performance regression to all queries accessing these keys.

What I'm doing:

Added the - character I use on a regular basis to the regex.

I noticed in the be code paths, this is the regex that is used, plus one that was commented out? It is unclear to me if there is a better option, but this is what I saw: link to code

// static const re2::RE2 JSON_PATTERN("^([a-zA-Z0-9_\\-\\:\\s#\\|\\.]*)(?:\\[([0-9]+)\\])?");
// json path cannot contains: ", [, ]
const re2::RE2 SIMPLE_JSONPATH_PATTERN(R"(^([^\"\[\]]*)(?:\[([0-9]+|\*)\])?)", re2::RE2::Quiet);

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.1
    • 4.0
    • 3.5
    • 3.4

Note

Updates JsonPathRewriteRule to broaden simple JSON path validation from ^[a-zA-Z0-9_]+$ to ^[a-zA-Z0-9\-_]+$.

  • Enables rewrite of JSON functions when field names include hyphens, improving predicate/projection pushdown on JSON columns

Written by Cursor Bugbot for commit d1e6084. This will update automatically on new commits. Configure here.

@brent-statsig brent-statsig requested a review from a team as a code owner January 15, 2026 11:32
@CLAassistant
Copy link

CLAassistant commented Jan 15, 2026

CLA assistant check
All committers have signed the CLA.

@alvin-celerdata
Copy link
Contributor

@cursor review

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no bugs!

@alvin-celerdata alvin-celerdata changed the title Fixed bad path regex to work with legal simple json paths [BugFix] Fixed bad path regex to work with legal simple json paths Jan 15, 2026
@sonarqubecloud
Copy link

@brent-statsig
Copy link
Contributor Author

Actually - I think I need to include more things. I noticed I query the json path $.custom."k8s.container.status.last_terminated_reason". According to spec, basically anything in quotes is ~legal, and the current code doesn't really validate the difference between quotes and unquoted paths.

I don't know nearly enough about what is supported for JSON path though, so I'm a bit apprehensive to keep just adding characters and hope something doesn't break. Can someone with more knowledge take a look and make a recommendation?

@github-actions
Copy link
Contributor

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions
Copy link
Contributor

[FE Incremental Coverage Report]

pass : 1 / 1 (100.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/sql/optimizer/rule/tree/JsonPathRewriteRule.java 1 1 100.00% []

@github-actions
Copy link
Contributor

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@stdpain stdpain self-assigned this Jan 21, 2026
@stdpain stdpain merged commit 62ce106 into StarRocks:main Jan 21, 2026
60 of 62 checks passed
@github-actions
Copy link
Contributor

@Mergifyio backport branch-4.1

@github-actions github-actions bot removed the 4.1 label Jan 21, 2026
@github-actions
Copy link
Contributor

@Mergifyio backport branch-4.0

@github-actions
Copy link
Contributor

@Mergifyio backport branch-3.5

@mergify
Copy link
Contributor

mergify bot commented Jan 21, 2026

backport branch-4.1

✅ Backports have been created

Details

@mergify
Copy link
Contributor

mergify bot commented Jan 21, 2026

backport branch-4.0

✅ Backports have been created

Details

@mergify
Copy link
Contributor

mergify bot commented Jan 21, 2026

backport branch-3.5

✅ Backports have been created

Details

Cherry-pick of 62ce106 has failed:

On branch mergify/bp/branch-3.5/pr-67988
Your branch is up to date with 'origin/branch-3.5'.

You are currently cherry-picking commit 62ce106137.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Unmerged paths:
  (use "git add/rm <file>..." as appropriate to mark resolution)
	deleted by us:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/tree/JsonPathRewriteRule.java

no changes added to commit (use "git add" and/or "git commit -a")

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

mergify bot pushed a commit that referenced this pull request Jan 21, 2026
mergify bot pushed a commit that referenced this pull request Jan 21, 2026
mergify bot pushed a commit that referenced this pull request Jan 21, 2026
…67988)

(cherry picked from commit 62ce106)

# Conflicts:
#	fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/tree/JsonPathRewriteRule.java
wanpengfei-git pushed a commit that referenced this pull request Jan 28, 2026
…ackport #67988) (#68218)

Co-authored-by: brent-statsig <126510059+brent-statsig@users.noreply.github.com>
wanpengfei-git pushed a commit that referenced this pull request Jan 28, 2026
…ackport #67988) (#68219)

Co-authored-by: brent-statsig <126510059+brent-statsig@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants