imaplib: Time2Internaldate() returns localized strings #55233

Closed
spaetz mannequin opened this issue Jan 27, 2011 · 22 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@spaetz
Mannequin

spaetz mannequin commented Jan 27, 2011

BPO 11024
Nosy @abalkin, @bitdancer, @floriankisser
Files
  • imaplib_Time2Internaldate_locale_fix.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2015-01-23.14:41:22.081>
    created_at = <Date 2011-01-27.13:35:00.586>
    labels = ['type-bug', 'library']
    title = 'imaplib: Time2Internaldate() returns localized strings'
    updated_at = <Date 2018-10-23.13:50:16.910>
    user = 'https://bugs.python.org/spaetz'

    bugs.python.org fields:

    activity = <Date 2018-10-23.13:50:16.910>
    actor = 'floriankisser'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-01-23.14:41:22.081>
    closer = 'r.david.murray'
    components = ['Library (Lib)']
    creation = <Date 2011-01-27.13:35:00.586>
    creator = 'spaetz'
    dependencies = []
    files = ['20591']
    hgrepos = []
    issue_num = 11024
    keywords = ['patch']
    message_count = 22.0
    messages = ['127186', '127187', '127189', '127195', '127199', '127200', '127225', '127257', '127328', '127330', '127339', '127341', '127342', '127356', '127361', '127363', '130850', '163513', '234507', '234554', '234556', '328317']
    nosy_count = 7.0
    nosy_names = ['belopolsky', 'r.david.murray', 'lavajoe', 'spaetz', 'python-dev', 'Pilessio', 'floriankisser']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue11024'
    versions = ['Python 3.1', 'Python 2.7', 'Python 3.2']

    @spaetz
    Mannequin Author

    spaetz mannequin commented Jan 27, 2011

    imaplib's Time2Internaldate returns invalid (as localized) INTERNALDATE strings. Appending a message with such a time string leads to a:
    19 BAD Command Argument Error. 11 (for MS Exchange IMAP servers)

    it returned "26-led-2011 18:23:44 +0100", however:

    http://tools.ietf.org/html/rfc2060.html defines:
    date_month ::= "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun" /
    "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec"

    so it expects an English date format.

    imaplib's Time2Internaldate uses time.strftime() to create the final string which uses the current locale, returning things such as:

    "26-led-2011 18:23:44 +0100" rather than "26-Jan-2011 18:23:44 +0100".

    To do the right thing, we would either need to call locale.setlocale(locale.LC_TIME, 'C') to get English formatting, or use some home-grown formatter that hardcodes the proper terms.

    @spaetz spaetz mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 27, 2011
    @spaetz
    Mannequin Author

    spaetz mannequin commented Jan 27, 2011

    P.S. To replicate this in ipython:

    import locale, imaplib
    
    locale.setlocale(locale.LC_ALL,'de_CH.utf8')
    imaplib.Time2Internaldate(220254431)
    Out[1]: '"24-Dez-1976 06:47:11 +0100"'

    (Note the German 'Dez' rather than 'Dec')

    @spaetz
    Mannequin Author

    spaetz mannequin commented Jan 27, 2011

    CC'ing lavajoe as he seemed to be busy with some of imaplib's Date stuff the last couple of days.

    @lavajoe
    Mannequin

    lavajoe mannequin commented Jan 27, 2011

    Sebastian,

    Yes, in fact Alexander Belopolsky (belopolsky) brought up the locale issue for this very function in one of the other issue comments.

    The inverse function, Internaldate2tuple(), actually does its own parsing using a regex match (and so does not have the problem), but you are right, Time2Internaldate() has this issue.
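For illustration, regex-based parsing of this kind can be sketched as follows. The pattern, MONTH_TO_NUM, and parse_internaldate are hypothetical names and not imaplib's own code; the point is only that the month is looked up in a fixed English table, so the locale never comes into play.

```python
import re

# Hypothetical names for illustration; this is not the pattern imaplib uses.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
MONTH_TO_NUM = {name: num for num, name in enumerate(MONTHS, start=1)}

INTERNALDATE_RE = re.compile(
    r'"(?P<day>[ 0-9][0-9])-(?P<mon>[A-Z][a-z]{2})-(?P<year>[0-9]{4})'
    r' (?P<hour>[0-9]{2}):(?P<min>[0-9]{2}):(?P<sec>[0-9]{2})'
    r' (?P<zsign>[-+])(?P<zhour>[0-9]{2})(?P<zmin>[0-9]{2})"'
)


def parse_internaldate(text):
    """Return (year, month, day, hour, minute, second, utc_offset_seconds),
    or None if text is not a well-formed quoted INTERNALDATE string."""
    m = INTERNALDATE_RE.fullmatch(text)
    if m is None or m.group("mon") not in MONTH_TO_NUM:
        return None
    offset = int(m.group("zhour")) * 3600 + int(m.group("zmin")) * 60
    if m.group("zsign") == "-":
        offset = -offset
    return (int(m.group("year")), MONTH_TO_NUM[m.group("mon")],
            int(m.group("day")),  # int() tolerates the leading space
            int(m.group("hour")), int(m.group("min")), int(m.group("sec")),
            offset)
```

With this sketch, a localized string such as '"24-Dez-1976 06:47:11 +0100"' is rejected rather than misread, because 'Dez' is not in the month table.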

    @spaetz
    Mannequin Author

    spaetz mannequin commented Jan 27, 2011

    I think I found the issue he mentioned, however it was about the functions taking the local time (rather than UTC), which is fine.

    The problem is that Time2Internaldate is used for every .append() operation internally, producing invalid dates which are handed to the IMAP server. So in most cases, the IMAP server will silently ignore the time and use the current time (as per IMAP RFC) or it will complain and barf out (as the MS Exchange server rightly does).

    So this is more than just an inconvenience, it outright prevents international users from APPENDing new messages to a server (or silently bodges the message date) as there is no way around using that function...

    Sorry if this sounds like whining :-) I don't even have a patch handy...

    @lavajoe
    Mannequin

    lavajoe mannequin commented Jan 27, 2011

    Yes, that's serious, certainly.

    A patch should be fairly straightforward, given that part of the formatting logic is already there (for the TZ offset at the end). You just need to format the 6 values, and do a lookup for the month name.

    If you want to try to work up one, I can take a look, or maybe, if I have some time today, I'll try to do one quickly...

    @lavajoe
    Mannequin

    lavajoe mannequin commented Jan 27, 2011

    OK, I attached a patch that should work. Note that this patch works for Python 2 and Python 3.

    As an aside, the str type is still returned as before (even in Python 3), and the _month_names list uses str. As has been discussed, it may be more proper to return a bytes array and be consistent throughout imaplib, but this is not addressed here.

    Also, I return a leading zero on the day instead of a leading space, since this appears to be what is returned by two IMAP servers I have just tested (gmail's and dovecot).

    @spaetz
    Mannequin Author

    spaetz mannequin commented Jan 28, 2011

    Added file: imaplib_Time2Internaldate_locale_fix.patch

    The patch looks very good to me and works. I agree that we should be
    returning a bytearray but this should not be part of this issue.

    For all that it's worth:
    Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>

    @abalkin
    Member

    abalkin commented Jan 28, 2011

    Two nitpicks:

    1. To avoid repetition, I would now define Mon2num as
       Mon2num = dict(zip(_month_names, range(1, 13)))
    2. Please keep lines under 79 characters long.

    This does not seem important enough to push to RC2, but if you think otherwise please get RM approval.

    @abalkin
    Member

    abalkin commented Jan 28, 2011

    Also, isn't day supposed to be space- rather than 0- padded?

    @abalkin
    Member

    abalkin commented Jan 28, 2011

    On Fri, Jan 28, 2011 at 2:44 PM, Alexander Belopolsky
    <report@bugs.python.org> wrote:
    ..

    Also, isn't day supposed to be space- rather than 0- padded?

    To the best of my understanding, rfc 2060 requires space-padded day
    (strftime code %e):

    """
    date_day_fixed ::= (SPACE digit) / 2digit
    ;; Fixed-format version of date_day
    ...
    date_time ::= <"> date_day_fixed "-" date_month "-" date_year
    SPACE time SPACE zone <">
    ...
    msg_att ::= ...
    "INTERNALDATE" SPACE date_time /
    ...
    """

    See http://tools.ietf.org/html/rfc2060.html
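As a rough aid for reading the grammar quoted above, the date_day_fixed rule can be transcribed into a regular expression. This is one literal reading of the ABNF, not an authoritative interpretation of the RFC, and is_date_day_fixed is a hypothetical helper name.

```python
import re

# date_day_fixed ::= (SPACE digit) / 2digit
# Transcribed literally: either a space followed by one digit, or two digits.
DATE_DAY_FIXED = re.compile(r"(?: [0-9]|[0-9]{2})")


def is_date_day_fixed(text):
    """True if text matches the date_day_fixed rule as transcribed above."""
    return DATE_DAY_FIXED.fullmatch(text) is not None
```

Under this reading a bare single digit such as "1" is rejected, while the two-digit alternative accepts any pair of digits, so " 1" and "01" both match.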

    @lavajoe
    Mannequin

    lavajoe mannequin commented Jan 28, 2011

    Also, isn't day supposed to be space- rather than 0- padded?

    This is not clear to me. RFC2822 (referenced from RFC3501 for internal date) discusses date formats, but as used in the header. In this case, day is specified as "([FWS] 1*2DIGIT)", which implies optional space and 1 or 2 digit day. I am not sure this disallows leading-zero format. But this date spec also says dates should be space-separated (like "12 Jan 2011"), and clearly INTERNALDATE needs "-" (like "12-Jan-2011"). Therefore, I cannot see this date format as being authoritative for INTERNALDATE.

    Also, RFC3501, in change #71, is extra confusing in that it puts the 3-letter month in all-caps. Python's Internaldate2tuple(), e.g., cannot handle this currently (nor can it handle a single-digit day with no space or 0, but its regex does handle a leading zero, which led me to think 0 is OK).

    Also, it seems that gmail's imap server and Dovecot imap server return leading zero, not leading space, when you fetch INTERNALDATE. So I concluded from all this that 0 might actually be preferred. If this is true, leading zero is better also in that it is less error-prone (e.g., strip can remove the leading space, which will cause problems).
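The strip() hazard mentioned above is easy to demonstrate. The sample strings below are made up for illustration.

```python
# Two renderings of the same date, differing only in how the day is padded.
space_padded = " 1-Jan-2011 18:23:44 +0100"
zero_padded = "01-Jan-2011 18:23:44 +0100"

# strip() silently removes the padding space, shortening the fixed-width field.
stripped_space = space_padded.strip()
stripped_zero = zero_padded.strip()

print(repr(stripped_space))
print(repr(stripped_zero))
```

After stripping, the space-padded form starts with a one-character day field, which no longer fits the fixed-width layout, while the zero-padded form is unchanged.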

    I'll keep looking for definitive info, but if you know of some I missed, please let me know.

    @lavajoe
    Mannequin

    lavajoe mannequin commented Jan 28, 2011

    Our messages crossed... :)

    Hm, I see that in RFC 3501, as well (which obsoletes 2060).

    But... I wonder: does "(SP DIGIT) / 2DIGIT" mean that " 1" and "01" are both OK? It seems ambiguous to me.

    I still don't see why major IMAP servers are returning leading zeros if not allowed...

    @lavajoe
    Mannequin

    lavajoe mannequin commented Jan 28, 2011

    Here's a new patch. I would still like to discuss the leading space vs. leading zero issue, but I have reverted to using a leading space in this patch - fewer changes that way.

    The long line is also fixed (sorry about that - yes, long lines are ugly). And I have used your suggested Mon2num dict creation. Note that I do an encode() in there for compatibility with Python 3, which throws an exception if the keys are not bytes arrays (consistent with the fix in bpo-10939).

    @abalkin
    Member

    abalkin commented Jan 28, 2011

    I would write the formatting code as follows:

    ('"%2d-%s-%04d %02d:%02d:%02d %+03d%02d"' %
     ((tt[2], _month_names[tt[1]], tt[0]) +
      tt[3:6] + divmod(zone//60, 60)))

    The above also assumes that month names are stored in a 1-based array:

    _month_names = [None, 'Jan', ...]

    Note that %2d format code takes care of space-padding.

    If you think the expression that I conjured is too cryptic, get the temporal data from timetuple first with say

    y, m, d, H, M, S = tt[:6]

    and use named variables in the formatting expression.
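Put together with named variables, the suggestion might look like the sketch below. format_internaldate is a hypothetical name, and zone is assumed to be the UTC offset in seconds, positive east of Greenwich. The zone is split on its absolute value, a small departure from the expression above, so that negative offsets with a minutes part (such as -0330) come out right.

```python
# 1-based table, so that a month number indexes it directly.
_month_names = [None, "Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_internaldate(tt, zone):
    """Format a time tuple as a quoted INTERNALDATE string.

    tt   -- time tuple or struct_time (only the first six fields are used)
    zone -- assumed UTC offset in seconds, positive east of Greenwich
    """
    y, m, d, H, M, S = tt[:6]
    sign = "-" if zone < 0 else "+"
    zone_hours, zone_minutes = divmod(abs(zone) // 60, 60)
    # %2d space-pads the day; the month comes from the fixed English table,
    # so the current locale has no influence on the result.
    return '"%2d-%s-%04d %02d:%02d:%02d %s%02d%02d"' % (
        d, _month_names[m], y, H, M, S, sign, zone_hours, zone_minutes)
```

For example, format_internaldate((2011, 1, 26, 18, 23, 44), 3600) gives '"26-Jan-2011 18:23:44 +0100"' regardless of the locale in effect.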

    @lavajoe
    Mannequin

    lavajoe mannequin commented Jan 29, 2011

    Not cryptic at all - looks great! New patch attached with associated tweaks.

    @bitdancer
    Member

    The tests for this function are...not sufficient. I don't think I'm comfortable committing a patch without improving the tests. Ideally there would also be a test that the locale does not affect the result, which would need to be skipped if the chosen test locale was not available on the machine running the tests.
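A locale test along those lines could be sketched as follows. This is not the test that was eventually committed; the class name and TEST_LOCALE are assumptions chosen for illustration, and the timestamp is the one from the reproduction earlier in this issue, which falls in December in every time zone.

```python
import imaplib
import locale
import unittest

# Assumption: any installed non-English locale would serve as well.
TEST_LOCALE = "de_DE.UTF-8"


class Time2InternaldateLocaleTest(unittest.TestCase):

    def test_month_name_ignores_locale(self):
        # Remember the current LC_TIME setting and restore it afterwards.
        saved = locale.setlocale(locale.LC_TIME)
        self.addCleanup(locale.setlocale, locale.LC_TIME, saved)
        try:
            locale.setlocale(locale.LC_TIME, TEST_LOCALE)
        except locale.Error:
            self.skipTest("locale %s not available" % TEST_LOCALE)
        # 220254431 is 24-Dec-1976 05:47:11 UTC.
        result = imaplib.Time2Internaldate(220254431)
        self.assertIn("-Dec-1976 ", result)
```

On a machine without the chosen locale the test is reported as skipped rather than failed.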

    @python-dev
    Mannequin

    python-dev mannequin commented Jun 23, 2012

    New changeset 42b9d9d795f7 by Alexander Belopolsky in branch 'default':
    Issues bpo-11024: Fixes and additional tests for Time2Internaldate.
    http://hg.python.org/cpython/rev/42b9d9d795f7

    @Pilessio
    Mannequin

    Pilessio mannequin commented Jan 22, 2015

    The patch is not working: if I use this method on append, all my messages end up with the year 1970.

    @Pilessio
    Mannequin

    Pilessio mannequin commented Jan 23, 2015

    Is anybody working on this case?

    @bitdancer
    Member

    I'm not sure why this issue is still open. It looks like Alexander committed the fix.

    If you are seeing a problem, I think that would be a new bug, and you should open a new issue giving details on how to reproduce the problem you are seeing.

    @floriankisser
    Mannequin

    floriankisser mannequin commented Oct 23, 2018

    imaplib_Time2Internaldate_locale_fix.patch was never applied to Python 2.7. I found no reason for this in the comments, so this is still an issue, I guess.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022