gh-152847: Reject POSIX TZ rule with non-digit day-of-year in _zoneinfo.py#152848

Merged
StanFromIreland merged 3 commits into python:main from tonghuaroot:fix-gh-152847-zoneinfo-doy
Jul 2, 2026

Conversation

@tonghuaroot tonghuaroot commented Jul 2, 2026

Contributor

The pure-Python zoneinfo parser validated the Mm.w.d transition rule
strictly, but the Jn (Julian) and n (0-based) day-of-year branches of
_parse_dst_start_end() fell through to a bare int(date) with no format
check. int() accepts input the C accelerator rejects, so the two
implementations disagreed on the same POSIX TZ string.
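
To make the divergence concrete, here is a minimal sketch of what a bare int() call tolerates. The sample strings are illustrative picks based on the cases named in this description, not taken from the actual test suite.

```python
# Each string below is converted by int(), although none of them is a
# plain run of one to three ASCII digits.
samples = ["1_0", "+1", " 1", "0001"]
accepted = {s: int(s) for s in samples}

for text, value in accepted.items():
    print(f"int({text!r}) = {value}")
```

All four conversions succeed (10, 1, 1, 1), which is why a parser that relies on int() alone cannot reject these fields.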

The C accelerator reads the day-of-year field with
parse_digits(&ptr, 1, 3, &day) in Modules/_zoneinfo.c, consuming 1 to 3
ASCII digits (Py_ISDIGIT) and nothing else. This PR adds the matching guard

if re.fullmatch(r"\d{1,3}", date, re.ASCII) is None:
    raise ValueError(f"Invalid dst start/end date: {dststr}")

before int(), so the pure parser now matches the C accelerator exactly for
this field — not stricter, not looser.
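
As a rough illustration, the guard can be exercised in isolation. The helper name `parse_day_of_year` and its error message are hypothetical and exist only for this sketch; in the real module the check sits inside _parse_dst_start_end().

```python
import re


def parse_day_of_year(date: str) -> int:
    """Hypothetical helper: accept 1 to 3 ASCII digits, otherwise raise."""
    # re.ASCII restricts \d to [0-9]; fullmatch() requires the whole
    # string to match, so signs, spaces and underscores are rejected.
    if re.fullmatch(r"\d{1,3}", date, re.ASCII) is None:
        raise ValueError(f"Invalid day-of-year field: {date!r}")
    return int(date)


print(parse_day_of_year("001"))  # leading zeros within three digits pass

for bad in ["1_0", "+1", " 1", "0001"]:
    try:
        parse_day_of_year(bad)
    except ValueError as exc:
        print("rejected:", exc)
```

The format check runs before int(), so int() only ever sees strings that are already known to be plain ASCII digits.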

The most notable case is a silent misparse rather than a crash: int('1_0')
is 10, so AAA4BBB,J1_0,J300/2 previously built a valid but different zone
(DST starting on day 10) in pure Python, while the C accelerator raised
ValueError. The fix also aligns the J+1, leading-space, over-wide (four or
more digits, e.g. J0001), and non-ASCII-digit cases. The 1-to-3-digit bound
is deliberate: the C parser consumes at most three digits, so \d+ would make
the pure parser accept J0001, which C rejects. Leading zeros within three
digits (J01, J001) are still accepted by both. The existing _DayOffset range
check ([julian, 365], that is, 1 to 365 for Jn and 0 to 365 for n) is
untouched, so no numeric-range behaviour changes.
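
The effect of the width bound, and its separation from the range check, can be sketched as follows. `LOOSE`, `BOUNDED`, and `in_range` are names made up for this example; `in_range` is only a stand-in for the [julian, 365] check described above, not the real _DayOffset code.

```python
import re

LOOSE = re.compile(r"\d+", re.ASCII)         # unbounded width, for comparison only
BOUNDED = re.compile(r"\d{1,3}", re.ASCII)   # the width used by the guard


def in_range(day: int, julian: bool) -> bool:
    """Stand-in for the range check: the lower bound is 1 for Jn, 0 for n."""
    return int(julian) <= day <= 365


fields = ["1", "01", "001", "0001", "365", "999"]
results = {
    f: (LOOSE.fullmatch(f) is not None, BOUNDED.fullmatch(f) is not None)
    for f in fields
}

for field, (loose_ok, bounded_ok) in results.items():
    print(f"{field!r}: loose={loose_ok} bounded={bounded_ok}")

# "999" passes the format guard; rejecting it is left to the range check.
print(in_range(999, julian=True))
```

Only "0001" separates the two patterns: the unbounded one accepts it and the bounded one does not. A value such as 999 is well-formed, so it is still the range check that rejects it.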

Verified with a C-vs-pure differential (10 divergent inputs before, 0 after),
a zero-regression pass over all 499 bundled IANA zones (byte-identical through
both implementations), and the full test_zoneinfo suite. Added invalid-TZ
cases (J1_0, 1_0, J+1, J 1, 1, J0001, 0001) and leading-zero
valid controls (J001, 001); these run against both TZStrTest and
CTZStrTest.
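
A differential check of this kind can be sketched without either real implementation. Both functions below are simplified models written for this example: `legacy_accepts` imitates the old int()-based branch and `reference_accepts` imitates the digit rule attributed to the C parser. The case list is a subset of the inputs listed above, so the count it produces is not the figure reported for the real differential.

```python
import re


def legacy_accepts(field: str) -> bool:
    """Model of the old pure-Python branch: strip an optional J, then int()."""
    date = field[1:] if field.startswith("J") else field
    try:
        int(date)
    except ValueError:
        return False
    return True


def reference_accepts(field: str) -> bool:
    """Model of the described C rule: 1 to 3 ASCII digits after an optional J."""
    date = field[1:] if field.startswith("J") else field
    return re.fullmatch(r"\d{1,3}", date, re.ASCII) is not None


cases = ["J1_0", "1_0", "J+1", "J 1", "J0001", "0001", "J001", "001"]
divergent = [c for c in cases if legacy_accepts(c) != reference_accepts(c)]
print(divergent)
# ['J1_0', '1_0', 'J+1', 'J 1', 'J0001', '0001']
```

The two leading-zero controls (J001, 001) are accepted by both models, so they never show up as divergent.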

This covers a distinct field in the same POSIX TZ parity audit as gh-152212
(std offset), gh-152246 (Mm.w.d separator), and gh-152248 (abbreviation
regex).

… POSIX TZ rules

The J and n day-of-year branches of _parse_dst_start_end() fell through to a
bare int(date), accepting input the C accelerator rejects (for example J1_0,
which int() reads as day 10, silently building a different zone). Guard the
branch with an re.ASCII digit match mirroring the C parser's
parse_digits(1, 3), so both implementations agree.
@StanFromIreland

StanFromIreland commented Jul 2, 2026

Member

Just for the record, quoting RFC 9636:

the TZ environment variable [...] MUST encode the POSIX portable character set as ASCII

This is what the C accelerator currently complies with.
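
For illustration, the ASCII requirement is the reason the Python-side guard needs re.ASCII: without the flag, \d in a str pattern also matches non-ASCII decimal digits. The sample character is an arbitrary choice for this sketch.

```python
import re

arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE, a non-ASCII decimal digit

unicode_match = re.fullmatch(r"\d", arabic_three) is not None
ascii_match = re.fullmatch(r"\d", arabic_three, re.ASCII) is not None

print(arabic_three.isdigit(), arabic_three.isascii(), unicode_match, ascii_match)
# True False True False
print(int(arabic_three))  # int() converts it as well
```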

@StanFromIreland StanFromIreland left a comment

Member

LGTM, one little nit.

Comment thread Lib/test/test_zoneinfo/test_zoneinfo.py Outdated
@StanFromIreland StanFromIreland changed the title gh-152847: Reject non-digit day-of-year in pure-Python zoneinfo POSIX TZ rules gh-152847: Reject POSIX TZ rule non-digit day-of-year in pure-Python zoneinfo POSIX TZ rules Jul 2, 2026
@StanFromIreland StanFromIreland changed the title gh-152847: Reject POSIX TZ rule non-digit day-of-year in pure-Python zoneinfo POSIX TZ rules gh-152847: Reject POSIX TZ rule with non-digit day-of-year in _zoneinfo.py Jul 2, 2026
@StanFromIreland StanFromIreland merged commit 31864bd into python:main Jul 2, 2026
54 checks passed
@StanFromIreland StanFromIreland added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes needs backport to 3.15 pre-release feature fixes, bugs and security fixes labels Jul 2, 2026
@miss-islington-app

Thanks @tonghuaroot for the PR, and @StanFromIreland for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

@miss-islington-app

Thanks @tonghuaroot for the PR, and @StanFromIreland for merging it 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

@miss-islington-app

Thanks @tonghuaroot for the PR, and @StanFromIreland for merging it 🌮🎉.. I'm working now to backport this PR to: 3.15.
🐍🍒⛏🤖

@bedevere-app

bedevere-app Bot commented Jul 2, 2026

GH-152908 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.13 bugs and security fixes label Jul 2, 2026
@bedevere-app

bedevere-app Bot commented Jul 2, 2026

GH-152909 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.14 bugs and security fixes label Jul 2, 2026
@bedevere-app

bedevere-app Bot commented Jul 2, 2026

GH-152910 is a backport of this pull request to the 3.15 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.15 pre-release feature fixes, bugs and security fixes label Jul 2, 2026
@StanFromIreland

Member

Merged, thanks.

StanFromIreland added a commit that referenced this pull request Jul 2, 2026
…`_zoneinfo.py` (GH-152848) (#152908)

(cherry picked from commit 31864bd)

Co-authored-by: tonghuaroot (童话) <tonghuaroot@gmail.com>
Co-authored-by: Stan Ulbrych <stan@python.org>
StanFromIreland added a commit that referenced this pull request Jul 2, 2026
…`_zoneinfo.py` (GH-152848) (#152910)

(cherry picked from commit 31864bd)

Co-authored-by: tonghuaroot (童话) <tonghuaroot@gmail.com>
Co-authored-by: Stan Ulbrych <stan@python.org>
StanFromIreland added a commit that referenced this pull request Jul 2, 2026
…`_zoneinfo.py` (GH-152848) (#152909)

(cherry picked from commit 31864bd)

Co-authored-by: tonghuaroot (童话) <tonghuaroot@gmail.com>
Co-authored-by: Stan Ulbrych <stan@python.org>