Add timespec optional flag to datetime isoformat() to choose the precision #63674

Closed
smontanaro opened this issue Nov 1, 2013 · 94 comments
Labels
easy extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@smontanaro
Contributor

smontanaro commented Nov 1, 2013

BPO 19475
Nosy @malemburg, @gvanrossum, @tim-one, @terryjreedy, @abalkin, @vstinner, @ezio-melotti, @berkerpeksag, @vadmium, @matrixise, @alessandrocucci
Files
  • issue19475.patch: Patch
  • issue19475_v2.patch: Patch
  • issue19475_v3.patch: Patch
  • issue19475_v4.patch: Patch v4
  • issue19475_v5.patch: Patch v5
  • issue19475_v6.patch: Patch v6
  • issue19475_v7.patch: Patch v7
  • issue19475_v8.patch: Patch v8
  • issue19475_v9.patch: Patch v9
  • issue19475_v10_datetime_time.patch: added timespec to time.isoformat
  • issue19475_v11.patch: Patch v11
  • issue19475_v12.patch
  • issue19475_v13.patch: added milliseconds format to timespec
  • issue19475_v14.patch: added milliseconds to documentation
  • issue19475_v15.patch
  • issue19475_v16.patch
  • issue19475_v17.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/abalkin'
    closed_at = <Date 2016-03-06.19:58:58.041>
    created_at = <Date 2013-11-01.17:05:21.528>
    labels = ['extension-modules', 'easy', 'type-feature']
    title = 'Add timespec optional flag to datetime isoformat() to choose the precision'
    updated_at = <Date 2016-03-06.19:58:58.039>
    user = 'https://github.com/smontanaro'

    bugs.python.org fields:

    activity = <Date 2016-03-06.19:58:58.039>
    actor = 'python-dev'
    assignee = 'belopolsky'
    closed = True
    closed_date = <Date 2016-03-06.19:58:58.041>
    closer = 'python-dev'
    components = ['Extension Modules']
    creation = <Date 2013-11-01.17:05:21.528>
    creator = 'skip.montanaro'
    dependencies = []
    files = ['40018', '40024', '40117', '41329', '41371', '41381', '41387', '41388', '41391', '41420', '41448', '41464', '41988', '41994', '42007', '42031', '42063']
    hgrepos = []
    issue_num = 19475
    keywords = ['patch', 'easy']
    message_count = 94.0
    messages = ['201917', '201918', '201922', '201924', '201925', '201926', '201929', '201931', '201932', '201934', '201935', '201943', '202220', '202238', '202239', '202242', '202243', '202270', '202274', '202276', '202282', '221958', '221959', '247265', '247401', '247402', '247439', '247723', '247948', '256458', '256470', '256475', '256476', '256477', '256478', '256479', '256480', '256481', '256483', '256486', '256513', '256536', '256552', '256556', '256560', '256561', '256591', '256764', '256813', '256855', '256858', '256866', '256967', '256991', '257196', '257202', '257203', '257284', '257288', '257534', '258289', '258473', '258477', '258479', '258480', '258481', '258482', '258483', '258485', '258493', '260629', '260645', '260695', '260725', '260875', '260876', '260878', '261060', '261066', '261068', '261073', '261074', '261077', '261078', '261080', '261081', '261082', '261083', '261084', '261085', '261086', '261133', '261135', '261269']
    nosy_count = 14.0
    nosy_names = ['lemburg', 'gvanrossum', 'tim.peters', 'terry.reedy', 'belopolsky', 'vstinner', 'ezio.melotti', 'cvrebert', 'python-dev', 'berker.peksag', 'martin.panter', 'matrixise', 'jerry.elmore', 'acucci']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue19475'
    versions = ['Python 3.6']

    @smontanaro
    Contributor Author

    smontanaro commented Nov 1, 2013

    I have a CSV file. Here are a few rows:

    "2013-10-30 14:26:46.000528","1.36097023829"
    "2013-10-30 14:26:46.999755","1.36097023829"
    "2013-10-30 14:26:47.999308","1.36097023829"
    "2013-10-30 14:26:49.002472","1.36097023829"
    "2013-10-30 14:26:50","1.36097023829"
    "2013-10-30 14:26:51.000549","1.36097023829"
    "2013-10-30 14:26:51.999315","1.36097023829"
    "2013-10-30 14:26:52.999703","1.36097023829"
    "2013-10-30 14:26:53.999640","1.36097023829"
    "2013-10-30 14:26:54.999139","1.36097023829"

    I want to parse the strings in the first column as timestamps. I can, and often do, use dateutil.parser.parse(), but in situations like this where all the timestamps are of the same format, it can be incredibly slow. OTOH, there is no single format I can pass to datetime.datetime.strptime() that will parse all the above timestamps. Using "%Y-%m-%d %H:%M:%S" I get errors about the leftover microseconds. Using "%Y-%m-%d %H:%M:%S.%f" I get errors when I try to parse a timestamp which doesn't have microseconds.

    Alas, it is datetime itself which is to blame for this problem. The above timestamps were all printed from an earlier Python program which just dumps the str() of a datetime object to its output CSV file. Consider:

    >>> dt = dateutil.parser.parse("2013-10-30 14:26:50")
    >>> print dt
    2013-10-30 14:26:50
    >>> dt2 = dateutil.parser.parse("2013-10-30 14:26:51.000549")
    >>> print dt2
    2013-10-30 14:26:51.000549

    The same holds for isoformat():

    >>> print dt.isoformat()
    2013-10-30T14:26:50
    >>> print dt2.isoformat()
    2013-10-30T14:26:51.000549

    Whatever happened to "be strict in what you send, but generous in what you receive"? If strptime() is going to complain the way it does, then str() should always generate a full timestamp, including microseconds. The above is from a Python 2.7 session, but I also confirmed that Python 3.3 behaves the same.

    I've checked 2.7 and 3.3 in the Versions list, but I don't think it can be fixed there. Can the __str__ and isoformat methods of datetime (and time) objects be modified for 3.4 to always include the microseconds? Alternatively, can the %S format character be modified to consume optional decimal point and microseconds? I rate this as "easy" considering the easiest fix is to modify __str__ and isoformat, which seems unchallenging.
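The mismatch described above can be reproduced in a few lines. This is a minimal sketch for Python 3; the sample values and variable names are illustrative and are not taken from the original report.

```python
from datetime import datetime

# Two sample timestamps; only the first has non-zero microseconds.
with_us = datetime(2013, 10, 30, 14, 26, 51, 549)
without_us = datetime(2013, 10, 30, 14, 26, 50)

# str() drops the fractional part when microsecond == 0.
print(str(with_us))     # 2013-10-30 14:26:51.000549
print(str(without_us))  # 2013-10-30 14:26:50

fmt = "%Y-%m-%d %H:%M:%S.%f"
parsed = datetime.strptime(str(with_us), fmt)  # parses fine

try:
    datetime.strptime(str(without_us), fmt)
    second_parse_failed = False
except ValueError:
    # The string has no ".ffffff" part, so the format does not match.
    second_parse_failed = True
```

A single strptime() format therefore cannot read back everything str() writes, which is the inconsistency this issue is about.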

    @smontanaro smontanaro added type-bug An unexpected behavior, bug, or error extension-modules C modules in the Modules dir easy labels Nov 1, 2013
    @ezio-melotti
    Member

    ezio-melotti commented Nov 1, 2013

    See bpo-7342.

    @bitdancer
    Member

    bitdancer commented Nov 1, 2013

    It may be simple but as Ezio has pointed out, it has already been rejected :)

    The problem with being generous in what you accept in this context is that the parsing is using a specific format string, and the semantics of that format string are based on external "standards" and are pretty inflexible.

    The pythonic solution, IMO, is to have datetime's constructor accept what its str produces. And indeed, exactly this has been suggested by Alexander Belopolsky in bpo-15873. So I'm going to close this one as a duplicate of that one.

    @smontanaro
    Contributor Author

    smontanaro commented Nov 1, 2013

    I don't accept your conclusion. I understand that making %S consume microseconds or ".%f" be "optional" would be a load. What's the problem with forcing __str__ and isoformat to emit microseconds in all cases though? That would allow you to parse what they produce using existing code. No new constructor needed.

    The issue of sometimes emitting microseconds, sometimes not, is annoying, even beyond this issue. I think for consistency's sake it makes sense for the string version of datetime and time objects to always be the same length.

    @bitdancer
    Member

    bitdancer commented Nov 1, 2013

    It's not my conclusion. It's Guido's and the other developers who designed datetime. Argue with them. (I'd guess it would be better argued on python-ideas rather than python-dev, but use your own judgement.)

    @tim-one
    Member

    tim-one commented Nov 1, 2013

    The decision to omit microseconds when 0 was a Guido pronouncement, back when datetime was first written. The idea is that str() is supposed to be friendly, and for the vast number of applications that don't use microseconds at all, it's unfriendly to shove ".000000" in their face all the time. Much the same reason is behind why, e.g., str(2.0) doesn't produce "2.0000000000000000".

    I doubt this will change. If you want to use a single format, you could massage the data first, like

    if '.' not in dt:
        dt += ".000000"
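A slightly fuller sketch of the massage-then-parse idea above, assuming the timestamps are strings in the format shown in the CSV sample. The helper name `parse_timestamp` is hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

def parse_timestamp(text):
    """Parse a str(datetime) value whether or not it has microseconds."""
    # Pad the fractional part that str() omitted, then use one fixed format.
    if "." not in text:
        text += ".000000"
    return datetime.strptime(text, FMT)

print(parse_timestamp("2013-10-30 14:26:50"))
print(parse_timestamp("2013-10-30 14:26:46.000528"))
```

This keeps a single strptime() call on the hot path, which matters when the file is large.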

    @smontanaro
    Contributor Author

    smontanaro commented Nov 1, 2013

    Okay, so no to __str__. What about isoformat?

    @tim-one
    Member

    tim-one commented Nov 1, 2013

    I don't know, Skip. Since .isoformat() and str() have always worked this way, and that was intentional, it's probably going to take a strong argument to change either.

    @gvanrossum
    Member

    gvanrossum commented Nov 1, 2013

    Well, I don't know if this sways anything, but I was probably responsible, and I think my argument was something about not all timestamp sources having microseconds, and not wanting to emit the ".000000" in that case. If I could go back I'd probably do something else; after all str(1.0) doesn't return '1' either. But that's water under the bridge; "fixing" this is undoubtedly going to break a lot of code.

    Maybe we can give isoformat() a flag parameter to force the inclusion or exclusion of the microseconds (with a default of None meaning the current behavior)?

    @gvanrossum gvanrossum reopened this Nov 1, 2013
    @smontanaro
    Contributor Author

    smontanaro commented Nov 1, 2013

    The ultimate culprit here is actually the csv module. :-) It calls str() on every element it's about to write. In my applications which write to CSV files I can special case datetime objects.

    I will stop swimming upstream.
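One way to special-case datetime objects before they reach the csv module might look like the following. This is only a sketch: the helper `format_cell` and the sample rows are hypothetical, and the format string is one reasonable fixed-width choice.

```python
import csv
import io
from datetime import datetime

def format_cell(value):
    """Render datetimes at fixed width; leave everything else to csv."""
    if isinstance(value, datetime):
        # %f always emits six digits, so every timestamp has the same length.
        return value.strftime("%Y-%m-%dT%H:%M:%S.%f")
    return value

rows = [
    (datetime(2013, 10, 30, 14, 26, 50), "1.36097023829"),
    (datetime(2013, 10, 30, 14, 26, 51, 549), "1.36097023829"),
]

buf = io.StringIO()
writer = csv.writer(buf)
for row in rows:
    writer.writerow([format_cell(v) for v in row])

lines = buf.getvalue().splitlines()
print(lines)
```

Because the conversion happens before csv calls str(), the output no longer depends on whether the microsecond field happens to be zero.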

    @bitdancer
    Member

    bitdancer commented Nov 1, 2013

    I suppose in an ideal world the csv module would have some sort of hookable serialization protocol, like the database modules do :)

    @terryjreedy
    Member

    terryjreedy commented Nov 1, 2013

    As I understand Guido's message, he reopened this to consider adding a new parameter.

    Given an existing csv file like that given, either Tim's solution or
    try: parse with microseconds
    except ValueError: parse without
    should work.
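The try/except outline above could be written roughly as follows. The function name `parse_with_fallback` is hypothetical, and the two format strings assume the space-separated layout from the CSV sample.

```python
from datetime import datetime

def parse_with_fallback(text):
    """Try the microsecond format first, then fall back to whole seconds."""
    try:
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    except ValueError:
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

print(parse_with_fallback("2013-10-30 14:26:49.002472"))
print(parse_with_fallback("2013-10-30 14:26:50"))
```

Trying the microsecond format first suits data like the sample, where most rows carry a fractional part and the exception path is rare.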

    @terryjreedy terryjreedy changed the title from "Inconsistency between datetime's str()/isoformat() and its strptime() method" to "Add microsecond flag to datetime isoformat()" Nov 1, 2013
    @terryjreedy terryjreedy added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Nov 1, 2013
    @abalkin
    Member

    abalkin commented Nov 5, 2013

    +1 on adding an option to isoformat(). We already have an optional <sep> argument, so the symmetry with __str__ is not complete. To make this option more useful, rather than implementing an always_emit_microseconds=False flag, I would add a keyword argument 'precision' that would take a ('hour'|'minute'|'second'|'millisecond'|'microsecond') value.

    @elixir
    Mannequin

    elixir mannequin commented Nov 5, 2013

    I would like to implement this feature. I already wrote the Python part. Is there anything else to decide?

    @vstinner
    Member

    vstinner commented Nov 5, 2013

    2013/11/5 Alexander Belopolsky <report@bugs.python.org>:

    > +1 on adding an option to isoformat(). We already have an optional <sep> argument, so the symmetry with __str__ is not complete. To make this option more useful, rather than implementing always_emit_microseconds=False flag, I would add a keyword argument 'precision' that would take ('hour'|'minute'|'second'|'millisecond'|'microsecond') value.

    Hour precision is not part of the ISO 8601 standard.

    "resolution" is maybe a better name for the new parameter than "precision":
    http://www.python.org/dev/peps/pep-0418/#glossary

    The new parameter should be added to datetime.datetime.isoformat() but
    also datetime.time.isoformat().

    @abalkin
    Member

    abalkin commented Nov 5, 2013

    +1 on all Victor's points.

    I like 'resolution' because this is the term that datetime module uses already:

    >>> from datetime import *
    >>> datetime.resolution
    datetime.timedelta(0, 0, 1)

    There is a slight chance of confusion stemming from the fact that datetime.resolution is a timedelta, but the proposed parameter is a string.

    I believe ISO 8601 uses the word "accuracy" to describe this kind of format variation. I am leaning towards "resolution", but would like to hear from others. Here are the candidates:

    1. resolution
    2. accuracy
    3. precision

    (Note that "accuracy" is the shortest but "resolution" is the most correct.)

    @malemburg
    Member

    malemburg commented Nov 5, 2013

    On 05.11.2013 21:31, STINNER Victor wrote:

    > 2013/11/5 Alexander Belopolsky <report@bugs.python.org>:
    > > +1 on adding an option to isoformat(). We already have an optional <sep> argument, so the symmetry with __str__ is not complete. To make this option more useful, rather than implementing always_emit_microseconds=False flag, I would add a keyword argument 'precision' that would take ('hour'|'minute'|'second'|'millisecond'|'microsecond') value.
    >
    > Hour precision is not part of the ISO 8601 standard.
    >
    > "resolution" is maybe a better name for the new parameter than "precision":
    > http://www.python.org/dev/peps/pep-0418/#glossary
    >
    > The new parameter should be added to datetime.datetime.isoformat() but
    > also datetime.time.isoformat().

    Since this ticket is about being able to remove the seconds fraction
    part, I think it's better to use a name that is not already overloaded
    with other meanings, e.g. show_us=False or show_microseconds=False.

    BTW: Have you thought about the rounding/truncation issues
    associated with not showing microseconds ?

    A safe bet is truncation, but this can lead to inaccuracies of
    up to a second. Rounding is difficult, since it can lead to
    a "60" second value showing up for e.g. 11:00:59.95 seconds,
    or the need to return "12:00:00" for 11:59:59.95.

    @abalkin
    Member

    abalkin commented Nov 6, 2013

    MAL: Have you thought about the rounding/truncation issues
    associated with not showing microseconds ?

    I believe it has to be truncation. Rounding is better left to the user code, where it can be done either using timedelta arithmetic or at the time source. I would expect that in the majority of cases where lower resolution printing is desired the times will already be at lower resolution at the source.

    @malemburg
    Member

    malemburg commented Nov 6, 2013

    On 06.11.2013 16:51, Alexander Belopolsky wrote:

    > MAL: Have you thought about the rounding/truncation issues
    > associated with not showing microseconds ?

    Sure, otherwise I wouldn't have mentioned it :-)

    mxDateTime always uses 2 digit fractions when displaying date/time values.
    This has turned out to be a good compromise between accuracy and
    usability. In early versions, I used truncation, but that caused
    (too many) roundtrip problems, so I started using careful rounding
    in later versions:

    /* Fix a second value for display as string.

    Seconds are rounded to the nearest microsecond in order to avoid
    cases where e.g. 3.42 gets displayed as 03.41 or 3.425 is displayed
    as 03.42.

    Special care is taken for second values which would cause rounding
    to 60.00 -- these values are truncated to 59.99 to avoid the value
    of 60.00 due to rounding to show up even when the indicated time
    does not point to a leap second. The same is applied for rounding
    towards 61.00 (leap seconds).

    The second value returned by this function should be formatted
    using '%05.2f' (which rounds to 2 decimal places).

    */

    This approach has worked out well, though YMMV.

    > I believe it has to be the truncation. Rounding is better left to the user code where it can be done either using timedelta arithmetics or at the time source. I would expect that in the majority of cases where lower resolution printing is desired the times will be already at lower resolution at the source.

    In practice you often don't know the resolution of
    the timing source. Nowadays, the reverse of what you said
    is usually true: the source resolution is higher than the
    precision you use to print it.

    MS SQL Server datetime is the exception to that rule, with a
    resolution of about 3.33 ms (1/300 of a second) and weird input "rounding":

    http://msdn.microsoft.com/en-us/library/ms187819.aspx

    For full seconds, truncation will add an error of +/- 1 second,
    whereas rounding only adds +/- 0.5 seconds. This is what convinced
    me to use rounding instead of truncation.

    @abalkin
    Member

    abalkin commented Nov 6, 2013

    I am afraid that the rounding issues may kill this proposal. Can we start with something simple? For example, we can start with a show=None keyword argument and allow a single value 'microseconds' (or 'us'). This will solve the issue at hand with a reasonable syntax: t.isoformat(show='us'). If other resolutions are required, we can later add more values and may even allow t.isoformat(show=2) to show 2 decimal digits.

    @smontanaro
    Contributor Author

    smontanaro commented Nov 6, 2013

    > I am afraid that the rounding issues may kill this proposal. Can we start with something simple? For example, we can start with show=None keyword argument and allow a single value 'microseconds' (or 'us'). This will solve the issue at hand with a reasonable syntax: t.isoformat(show='us'). If other resolutions will be required, we can later add more values and may even allow t.isoformat(show=2) to show 2 decimal digits.

    I don't think the meaning of this proposed show keyword argument
    should be overloaded as you suggest. If you show microseconds, just
    show all of them.

    Furthermore...

    If we go far enough back, my original problem was really that the
    inclusion of microseconds in csv module output was inconsistent,
    making it impossible for me to later parse those values in another
    script using a fixed strptime format. Since the csv module uses str()
    to convert input values for output, nothing you do to isoformat() will
    have any effect on my original problem.

    In my own code (where I first noticed the problem) I acquiesced, and
    changed this

    d["time"] = now

    to this:

    d["time"] = now.strftime("%Y-%m-%dT%H:%M:%S.%f")

    where "now" is a datetime object. I thus guarantee that I can parse
    these timestamps later using the same format. I realize the inclusion
    of "T" means my fields changed in other ways, but that was
    intentional, and not germane to this discussion.

    So, fiddle all you want with isoformat(), but do it right. I vote that
    if you want to add a show parameter it should simply include all
    fields down to that level, omitting any lower down. If people want to
    round or truncate things you can give them that option, returning a
    suitably adjusted, new datetime object. I don't think rounding,
    truncation, or other numeric operations should be an element of
    conversion to string form. This does not happen today:

    >>> import datetime
    >>> x = datetime.datetime.now()
    >>> x
    datetime.datetime(2013, 11, 6, 12, 19, 5, 759020)
    >>> x.strftime("%Y-%m-%d %H:%M:%S")
    '2013-11-06 12:19:05'

    (%S doesn't produce "06")

    Skip

    @abalkin
    Member

    abalkin commented Jun 30, 2014

    Here is some "prior art": GNU date utility has an --iso-8601[=timespec] option defined as

    ‘-I[timespec]’
    ‘--iso-8601[=timespec]’
    Display the date using the ISO 8601 format, ‘%Y-%m-%d’.
    The argument timespec specifies the number of additional terms of the time to include. It can be one of the following:

    ‘auto’
    Print just the date. This is the default if timespec is omitted.
    ‘hours’
    Append the hour of the day to the date.
    ‘minutes’
    Append the hours and minutes.
    ‘seconds’
    Append the hours, minutes and seconds.
    ‘ns’
    Append the hours, minutes, seconds and nanoseconds.
    If showing any time terms, then include the time zone using the format ‘%z’.

    https://www.gnu.org/software/coreutils/manual/html_node/Options-for-date.html

    @abalkin
    Member

    abalkin commented Jan 17, 2016

    I left some comments on Rietveld.

    @abalkin
    Member

    abalkin commented Jan 17, 2016

    > I don't really think nanoseconds belong here.

    What about milliseconds? I'll leave it for Guido to make a call on nanoseconds. My vote is +0.5.

    > If they don't
    > exist anywhere else in the module, why should they be suddenly
    > introduced here?

    The timespec feature is modeled after GNU date --iso-8601[=timespec] option which does support nanoseconds. It is fairly common to support nanoseconds these days and it does not cost much to implement.

    @SilentGhost
    Mannequin

    SilentGhost mannequin commented Jan 17, 2016

    > What about milliseconds? I'll leave it for Guido to make a call on nanoseconds. My vote is +0.5.

    The only reason I didn't mention milliseconds is that they exist in timedelta instantiation. And really, being the only place in the whole module, they're as confusing there as nanoseconds would be.

    > > If they don't
    > > exist anywhere else in the module, why should they be suddenly
    > > introduced here?
    >
    > The timespec feature is modeled after GNU date --iso-8601[=timespec] option which does support nanoseconds. It is fairly common to support nanoseconds these days and it does not cost much to implement.

    Yes, but the module does not support nanoseconds. And putting any such options would require a huge banner saying that the nanosecond option will just always result in three zeros at the end. My suggestion is not to pretend that we suddenly "support" nanoseconds, but rather to follow the actual implementation of the module and add the support for nanoseconds timespec when the module actually adds support for them.

    @gvanrossum
    Member

    gvanrossum commented Jan 18, 2016

    You can leave out the nanoseconds but please do add the milliseconds. I'm sure they would find more use than the option to show only the hours.

    @alessandrocucci
    Mannequin

    alessandrocucci mannequin commented Feb 21, 2016

    New patch

    @vadmium
    Member

    vadmium commented Feb 21, 2016

    Left some review suggestions

    @alessandrocucci
    Mannequin

    alessandrocucci mannequin commented Feb 22, 2016

    New patch after @martin.panter comments on Rietveld. I left only this:

    • 'milliseconds': Append the hours, minutes, seconds and milliseconds.

    vadmium 2016/02/21 23:30:20
    > I think this should explain that fractions are truncated to zero, never rounded up. At least for fractions of milliseconds, although this could apply to the other options as well.

    I think it is quite obvious that a datetime.now() can't be rounded to the future if microseconds are 999500.

    @vadmium
    Member

    vadmium commented Feb 23, 2016

    About rounding: I’m not too sure what people would expect. Obviously it is much easier to implement truncating to zero. But it is different to many other rounding cases in Python; that is why I thought to make it explicit.

    >>> datetime.fromtimestamp(59.9999999).isoformat(timespec="microseconds")
    '1970-01-01T00:01:00.000000'
    >>> datetime.fromtimestamp(59.999999).isoformat(timespec="milliseconds")
    '1970-01-01T00:00:59.999'
    >>> format(59.999999, ".3f")
    '60.000'

    @alessandrocucci
    Mannequin

    alessandrocucci mannequin commented Feb 25, 2016

    Oh, now I see your point.

    I've uploaded a new patch with a note for that.

    @gvanrossum
    Member

    gvanrossum commented Feb 25, 2016

    Out of context here, but regarding round vs. truncate, IIUC for time
    truncating down is the norm. My digital clock shows "12:00" for the
    duration of the minute starting at noon. People look for clocks to
    flip to know when it is exactly a given time (if the clock is accurate
    enough).

    @abalkin
    Member

    abalkin commented Feb 25, 2016

    We discussed truncation vs. rounding some time ago. See msg202270 and the posts around it. The consensus was the same as Guido's current advice: do the truncation.

    @alessandrocucci
    Mannequin

    alessandrocucci mannequin commented Mar 1, 2016

    @belopolsky could you please review one of the latest two patches submitted? I think I've done everything required. I'll wait to hear from you whether I have to do more.

    @abalkin
    Member

    abalkin commented Mar 1, 2016

    Guido,

    Did you consider MAL's msg202274? I am still in favor of truncation, but would like to make sure we are not missing something that MAL knows from experience.

    @abalkin
    Member

    abalkin commented Mar 1, 2016

    Another argument for truncation is that this is what GNU date does:

    $ date --iso-8601=seconds --date="2016-03-01 15:00:00.999"
    2016-03-01T15:00:00-0500

    @gvanrossum
    Member

    gvanrossum commented Mar 1, 2016

    Given that we're talking about what to do when we're suppressing the usecs I don't think roundtripping matters. :-)

    @vstinner
    Member

    vstinner commented Mar 1, 2016

    > Given that we're talking about what to do when we're suppressing the usecs I don't think roundtripping matters. :-)

    I changed many times how Python rounds nanoseconds in the private PyTime API, and I got a bug report because of that! => issue bpo-23517.

    By the way, I wrote an article to explain the history of the private PyTime API, especially changes on rounding ;-) https://haypo.github.io/pytime.html

    @gvanrossum
    Member

    gvanrossum commented Mar 1, 2016

    But what should we do in your opinion?

    @abalkin
    Member

    abalkin commented Mar 1, 2016

    I hope my prediction "I am afraid that the rounding issues may kill this proposal" (see msg202276) will not come true.

    I think the correct way to view "timespec" is a way to suppress/enforce printing of trailing digits.

    Users that choose printing less than full usec format should make sure that their datetime instances are properly rounded before printing.

    Unfortunately, it does not look like the datetime module makes rounding easy. The best I can think of is something like

    def round_datetime(dt, delta):
        dt0 = datetime.combine(dt.date(), time(0))
        return dt0 + round((dt - dt0) / delta) * delta

    Maybe a datetime.round() method along these lines will be a worthwhile addition?
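For illustration, here is the helper above made self-contained and combined with the timespec keyword that was eventually committed (Python 3.6 and later). It is a sketch that assumes naive datetimes; the sample value is made up.

```python
from datetime import datetime, time, timedelta

def round_datetime(dt, delta):
    """Round dt to the nearest multiple of delta, counted from midnight."""
    dt0 = datetime.combine(dt.date(), time(0))
    # timedelta / timedelta is a float; round() turns it into a whole
    # number of delta steps (ties go to the even step).
    return dt0 + round((dt - dt0) / delta) * delta

dt = datetime(2016, 3, 1, 15, 0, 0, 999000)

# isoformat() itself truncates the suppressed digits.
truncated = dt.isoformat(timespec="seconds")

# Rounding first is an explicit, separate step in user code.
rounded = round_datetime(dt, timedelta(seconds=1)).isoformat(timespec="seconds")

print(truncated)  # 2016-03-01T15:00:00
print(rounded)    # 2016-03-01T15:00:01
```

Keeping rounding outside isoformat() means the carry into the next minute, hour, or day is handled by ordinary datetime arithmetic rather than by the formatter.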

    @vstinner
    Member

    vstinner commented Mar 1, 2016

    > But what should we do in your opinion?

    Use ROUND_FLOOR rounding method.

    time.time(), datetime.datetime.now(), etc. round the current time using the ROUND_FLOOR rounding method.

    Only datetime.datetime.fromtimestamp() uses ROUND_HALF_EVEN, but it's more an exception than the rule: this function uses a float as input. To be consistent, we must use the same rounding method as other Python functions taking a float as a parameter, like round(), so use ROUND_HALF_EVEN.

    So I suggest also using ROUND_FLOOR for .isoformat().

    Hopefully, we don't have to discuss about the exact rounding method for negative numbers, since the minimum datetime object is datetime.datetime(1, 1, 1) which is "positive" ;-)

    You have a similar rounding question for file timestamps. Depending on the file system, you may have a resolution of 2 seconds (FAT), 1 second (ext3) or 1 nanosecond (ext4). But Linux syscalls accept subsecond resolution. The Linux kernel uses the ROUND_FLOOR rounding method if I recall correctly. I guess that it's a requirement for makefiles. If you have ever experienced a system clock slew, you may understand me :-)

    > For full seconds, truncation will add an error of +/- 1 second,
    > whereas rounding only adds +/- 0.5 seconds. This is what convinced
    > me to use rounding instead of truncation.

    What is truncation? Is it the ROUND_FLOOR (towards -inf) rounding method? Like math.floor(float).

    Python int(float) uses ROUND_DOWN (towards zero) which is different than ROUND_FLOOR, but only different for negative numbers. int(-0.9) returns 0, whereas math.floor(-0.9) returns -1.

    I guess that "rounding" means ROUND_HALF_EVEN here? The funny "Round to nearest with ties going to nearest even integer" rounding method. Like round(float).
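The three rounding behaviours named above can be compared directly with built-ins. This is a small sketch; the ROUND_* names in the comments are the labels used in this discussion, not identifiers in the code.

```python
import math

# Truncation toward zero ("ROUND_DOWN"): what int() does.
toward_zero = int(-0.9)

# Floor toward negative infinity ("ROUND_FLOOR").
toward_neg_inf = math.floor(-0.9)

# Round half to even ("ROUND_HALF_EVEN"): what round() does on ties.
half_even = [round(x) for x in (0.5, 1.5, 2.5)]

# For non-negative values, truncation and floor agree.
same_for_positive = int(59.9) == math.floor(59.9)

print(toward_zero, toward_neg_inf, half_even, same_for_positive)
# 0 -1 [0, 2, 2] True
```

Since datetime values never have negative fields, the distinction between truncation and floor does not arise for isoformat().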

    @gvanrossum
    Member

    gvanrossum commented Mar 1, 2016

    Except for the case where you're closer than half a usec from the next value, IMO rounding makes no sense when suppressing digits. I most definitely would never want 9:59:59 to be rounded to 10:00 when suppressing seconds. If you really think there are use cases for that you could add a 'round=True' flag (as long as it defaults to False). That seems better than supporting rounding on datetime objects themselves. But I think you're just speculating.

    @gvanrossum
    Member

    gvanrossum commented Mar 1, 2016

    IIUC truncation traditionally means "towards zero" -- that's why we have separate "floor" and "ceiling" operations meaning "towards [negative] infinity". Fortunately we shouldn't have to deal with negative values here so floor and truncate mean the same thing. Agreed that isoformat() should also truncate.

    @vstinner
    Member

    vstinner commented Mar 1, 2016

    > Maybe a datetime.round() method along these lines will be a worthwhile addition?

    Sorry, what is the use case of this method?

    @abalkin
    Member

    abalkin commented Mar 1, 2016

    Personally, I don't think rounding is that useful. My working assumption is that users will select, say, timespec='milliseconds' only when they know that their time source produces datetime instances with millisecond precision and they don't want to kill more trees by printing redundant 0's.

    MAL's objection to this line of argument was that some time sources have odd resolution (he cited MS SQL Server's datetime type, which has a resolution of about 3.33 ms) and a user may want to have perfect round-tripping when using a sub-usec timespec and such an odd time source or destination.

    @vstinner
    Member

    vstinner commented Mar 1, 2016

    > Personally, I don't think rounding is that useful.

    Nice, it looks like I agree with you on using ROUND_FLOOR :-)

    I don't think that we should be prepared for theoretical user requests, but rather focus on the concrete and well defined current existing user request: "Add timespec optional flag to datetime isoformat() to choose the precision".

    Let's wait until users request a datetime.round() method to understand better concrete issues.

    @abalkin
    Member

    abalkin commented Mar 1, 2016

    I feel odd trying to advocate a POV that I disagree with, so let me just quote MAL:

    """
    In practice you often don't know the resolution of
    the timing source. Nowadays, the reverse of what you said
    is usually true: the source resolution is higher than the
    precision you use to print it.
    ..

    For full seconds, truncation will add an error of +/- 1 second,
    whereas rounding only adds +/- 0.5 seconds. This is what convinced
    me to use rounding instead of truncation.
    """

    I somehow missed this argument when Marc-Andre made it, so I want to make sure that it is properly considered before we finalize this issue.

    @alessandrocucci
    Mannequin

    alessandrocucci mannequin commented Mar 2, 2016

    Meanwhile I made corrections after @belopolsky's latest review

    @abalkin
    Member

    abalkin commented Mar 2, 2016

    Alessandro, thank you very much for your work and perseverance. I will do my best to commit this next weekend.

    @python-dev
    Mannequin

    python-dev mannequin commented Mar 6, 2016

    New changeset eb120f50df4a by Alexander Belopolsky in branch 'default':
    Closes bpo-19475: Added timespec to the datetime.isoformat() method.
    https://hg.python.org/cpython/rev/eb120f50df4a
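A short sketch of how the committed keyword can be used, based on the documented behaviour in Python 3.6 and later (accepted values include 'auto', 'hours', 'minutes', 'seconds', 'milliseconds' and 'microseconds'). The sample values and variable names are illustrative.

```python
from datetime import datetime

dt = datetime(2013, 10, 30, 14, 26, 50)

default = dt.isoformat()                       # microseconds omitted when zero
micro = dt.isoformat(timespec="microseconds")  # always six fractional digits
milli = dt.isoformat(timespec="milliseconds")  # always three fractional digits
minutes = dt.isoformat(timespec="minutes")     # seconds suppressed

# Fixed-width output can be read back with one strptime() format,
# which addresses the problem that opened this issue.
spaced = dt.isoformat(sep=" ", timespec="microseconds")
restored = datetime.strptime(spaced, "%Y-%m-%d %H:%M:%S.%f")

# Suppressed digits are truncated, not rounded.
truncated = datetime(2013, 10, 30, 14, 26, 46, 999755).isoformat(timespec="milliseconds")

print(default, micro, milli, minutes, spaced, truncated, sep="\n")
```

Note that str() is unchanged, so code that relies on it (such as the csv module) still needs an explicit isoformat() or strftime() call to get fixed-width output.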

    @python-dev python-dev mannequin closed this as completed Mar 6, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022