Closed
Bug report
Bug description:
strptime() fails to parse month names containing "İ" (U+0130, LATIN CAPITAL LETTER I WITH DOT ABOVE), such as "İyun" or "İyul". This affects the locales 'az_AZ', 'ber_DZ', 'ber_MA' and 'crh_UA'.
This happens because 'İ'.lower() == 'i\u0307' (two code points: 'i' followed by U+0307, COMBINING DOT ABOVE), but the re module only supports 1-to-1 character matching in case-insensitive mode.
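A minimal sketch of the mismatch, using only str.lower() and re. The helper name matches_ignoring_case and the sample strings are illustrative; this is not the code path _strptime actually uses:

```python
import re

DOTTED_CAPITAL_I = '\u0130'  # 'İ', LATIN CAPITAL LETTER I WITH DOT ABOVE

# Lower-casing yields two code points: 'i' + U+0307 COMBINING DOT ABOVE.
lowered = DOTTED_CAPITAL_I.lower()


def matches_ignoring_case(pattern_text, candidate):
    """Return True if candidate fully matches pattern_text, ignoring case."""
    pattern = re.escape(pattern_text)
    return re.fullmatch(pattern, candidate, re.IGNORECASE) is not None


# The lower-cased name is one character longer than the original, so a
# character-by-character case-insensitive match cannot line the two up.
lowered_name = '\u0130yun'.lower()   # 'i\u0307yun', 5 code points
original_name = '\u0130yun'          # 'İyun', 4 code points
```

With these definitions, matches_ignoring_case(lowered_name, original_name) is False, while an ordinary name such as 'may' still matches 'MAY'.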
There are several ways to fix this:
- Do not convert month names (and any other names) to lower case in _strptime.LocaleTime. This is a large change, and it would make converting names to indices more complex and/or slower. It is the universal approach, which would also work for any other letters that are converted to multiple characters in lower case (but currently 'İ' is the only such case in Linux locales).
- Just fix the regular expressions created from lower-cased month names. This only works for 'İ' in month names, but that is all that is needed for now.
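The second option could look roughly like the sketch below. It is not the actual patch: build_month_regex and month_index are hypothetical helpers, the three-name list is illustrative rather than real locale data, and the group name B is borrowed from the %B directive only for readability. The idea is to keep the lower-cased names as lookup keys, but let the regex accept either 'i\u0307' or 'İ' wherever the lower-cased name contains 'i\u0307'.

```python
import re

DOTTED_LOWER_I = 'i\u0307'                  # result of 'İ'.lower()
DOTTED_I_ALTERNATION = '(?:i\u0307|\u0130)'  # accept either spelling


def build_month_regex(lowered_names):
    """Hypothetical helper: compile one case-insensitive alternation."""
    alternatives = []
    # Longest names first, so a shorter name cannot shadow a longer one.
    for name in sorted(lowered_names, key=len, reverse=True):
        # Escape the pieces around 'i\u0307' and rejoin them with the
        # alternation, so the regex syntax itself is never escaped.
        pieces = [re.escape(piece) for piece in name.split(DOTTED_LOWER_I)]
        alternatives.append(DOTTED_I_ALTERNATION.join(pieces))
    return re.compile('(?P<B>' + '|'.join(alternatives) + ')', re.IGNORECASE)


def month_index(lowered_names, text):
    """Hypothetical helper: index of the matched name, or None."""
    match = build_month_regex(lowered_names).fullmatch(text)
    if match is None:
        return None
    # Lower-casing the matched text recovers the stored lookup key.
    key = match.group('B').lower()
    return lowered_names.index(key) if key in lowered_names else None


# Illustrative lower-cased names, as they would be stored after .lower().
names = ['may', 'i\u0307yun', 'i\u0307yul']
```

Here month_index(names, '\u0130yun') returns 1 and month_index(names, 'MAY') returns 0, so the name-to-index lookup keeps working on lower-cased keys.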
The third way -- converting the input string to lower case before parsing -- does not work, because Z for timezone is case-sensitive.
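A small sketch of why lower-casing the input is not an option, assuming a Python version where %z accepts a literal Z for UTC and treats it case-sensitively, as described above. The helper name parses and the sample timestamp are illustrative:

```python
from datetime import datetime, timedelta

FORMAT = '%Y-%m-%d %H:%M:%S %z'
SAMPLE = '2024-06-01 12:00:00 Z'


def parses(text, fmt=FORMAT):
    """Return True if strptime() accepts text, False if it rejects it."""
    try:
        datetime.strptime(text, fmt)
    except ValueError:
        return False
    return True


# The upper-case designator is accepted and means UTC.
parsed = datetime.strptime(SAMPLE, FORMAT)

# Lower-casing the whole input turns 'Z' into 'z', which %z is expected
# to reject under the assumption stated above.
lowered_ok = parses(SAMPLE.lower())
```

Under that assumption, parsed.utcoffset() is timedelta(0) and lowered_ok is False.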
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response