Make the number pattern regular expression more efficient #1213
Merged
akx merged 1 commit into python-babel:master on Jul 3, 2025
Pull Request Overview
This PR makes the regular expression used for parsing number patterns more efficient while preserving equivalent parsing behavior for CLDR data.
- Consolidates multiple regex components into a single, more efficient pattern.
- Updates the matching in parse_pattern to use the new regex.
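As a rough illustration of what a single consolidated pattern with named groups could look like, here is a minimal sketch. The pattern text and the `split_pattern` helper are assumptions made for this example; they are not copied from `babel/numbers.py`, and the actual `_number_pattern_re` may differ.

```python
import re

# Hypothetical pattern for illustration only; not necessarily the one
# defined in babel/numbers.py.
NUMBER_PATTERN_RE = re.compile(
    r"(?P<prefix>(?:[^'0-9@#.,]|'[^']*')*)"  # literal characters or quoted runs
    r"(?P<number>[0-9@#.,E+]*)"              # digits, placeholders, separators
    r"(?P<suffix>.*)"                        # whatever follows the number part
)


def split_pattern(pattern: str) -> tuple[str, str, str]:
    """Split a number pattern into (prefix, number, suffix).

    Hypothetical helper; it only demonstrates calling .search() on one
    precompiled regex, as parse_pattern is described as doing.
    """
    match = NUMBER_PATTERN_RE.search(pattern)
    if match is None:
        # Every group may be empty, so this is not expected to trigger.
        raise ValueError(f"Invalid number pattern {pattern!r}")
    return match.group("prefix"), match.group("number"), match.group("suffix")


print(split_pattern("¤#,##0.00"))
print(split_pattern("'kr' #,##0 %"))
```

Because the quote character is excluded from the unquoted character class in this sketch, each position in the prefix can be consumed by only one of the two alternatives, which is one common way to reduce backtracking.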
Comments suppressed due to low confidence (2)
babel/numbers.py:1204
- Consider adding a comment explaining the rationale behind the new regex pattern to clarify its intended improvements in efficiency and maintainability.
_number_pattern_re = re.compile(
babel/numbers.py:1239
- Ensure that the change from number_re to _number_pattern_re is intentional and thoroughly tested, as per the pull request description.
rv = _number_pattern_re.search(pattern)
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1213      +/-   ##
==========================================
- Coverage   91.98%   91.97%   -0.01%
==========================================
  Files          27       27
  Lines        4693     4688       -5
==========================================
- Hits         4317     4312       -5
  Misses        376      376
I verified that all patterns parsed for importing CLDR data are parsed equivalently using the new regular expression.

The inefficient regular expression was brought to our attention by GitHub user s-sanskar – thanks!

Co-authored-by: s-sanskar <sanskarpok11@gmail.com>
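For readers who want to run a similar equivalence check, the following is a minimal sketch and not the script used for this PR. Both regular expressions and the sample patterns are assumptions chosen for illustration; a real check would iterate over every pattern encountered while importing CLDR data.

```python
import re

# Both patterns are hypothetical stand-ins for an "old" and a "new" regex.
OLD_RE = re.compile(
    r"(?P<prefix>(?:'[^']*'|[^0-9@#.,])*)"
    r"(?P<number>[0-9@#.,E+]*)"
    r"(?P<suffix>.*)"
)
NEW_RE = re.compile(
    r"(?P<prefix>(?:[^'0-9@#.,]|'[^']*')*)"
    r"(?P<number>[0-9@#.,E+]*)"
    r"(?P<suffix>.*)"
)

# A handful of sample number patterns (assumed inputs, not the CLDR corpus).
SAMPLE_PATTERNS = [
    "#,##0.###",
    "¤#,##0.00",
    "#,##0%",
    "#E0",
    "'kr' #,##0.00",
]


def find_mismatches(patterns):
    """Return (pattern, old_groups, new_groups) wherever the two regexes disagree."""
    mismatches = []
    for pattern in patterns:
        old = OLD_RE.search(pattern).groupdict()
        new = NEW_RE.search(pattern).groupdict()
        if old != new:
            mismatches.append((pattern, old, new))
    return mismatches


print(find_mismatches(SAMPLE_PATTERNS))
```

Note that two such regexes can agree on every well-formed input and still differ on malformed ones (for example, a pattern with an unbalanced quote), so a check like this only supports equivalence over the inputs it was actually given.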
c0f4caa to 019ba6c
tomasr8 (Member) reviewed on Jul 1, 2025 and left a comment:
IIUC the two subpatterns in PREFIX_PATTERN are swapped? Really hard to visually compare two long regexes 😄