Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DateFormatter shows microseconds instead of %f for years <= 1900 #3242

Merged
merged 1 commit into from Jun 18, 2015

Conversation

tacaswell
Copy link
Member

closes #3179

@tacaswell tacaswell added this to the v1.4.0 milestone Jul 12, 2014
def strftime(self, dt, fmt):
fmt = self.illegal_s.sub(r"\1", fmt)
fmt = fmt.replace("%s", "s")
if dt.year > 1900:
return cbook.unicode_safe(dt.strftime(fmt))

# Dalke: I hope I did this math right. Every 28 years the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is scary…

@NelleV
Copy link
Member

NelleV commented Jul 12, 2014

That looks good.
Note: I think we should try to have this patch ported back to the standard library.

@tacaswell tacaswell modified the milestones: v1.4.x, v1.4.0 Jul 13, 2014
@WeatherGod
Copy link
Member

Travis failure is memory-related on py2.7 only.

I would like to see a test added for this.

@tacaswell
Copy link
Member Author

@azjps Do you have any interest in adding a test for this?

@azjps
Copy link
Contributor

azjps commented Oct 20, 2014

I wasn't able to build matplotlib on both my home (windows) and work (linux) environments causing me to forget about this commit. I might be able to get it to build now; if so I'll add tests soon. 👍

@tacaswell
Copy link
Member Author

For windows builds the docs now point to a set of scripts that reportedly reliably build mpl on windows.

@azjps
Copy link
Contributor

azjps commented Nov 12, 2014

Sorry for delay :(, I was having trouble getting tests to work. Seems that there are a few issues:

  • mdates.DateFormatter("%y") doesn't work for years before 1901. ("%Y" implementation is something of a hack)
  • mdates.DateFormatter("%w") doesn't work for years before 1899 (although my understanding is that this is the main point of this strftime 28-year cycling implementation?). The implementation now tries to (1) account for non leap-years on the centuries and then (2) shift by a multiple of 28 years to preserve the correct calendar, although (1) already has caused the calendar to be mismatched. For example, this procedure maps the calendar of the non-leap-year 1898 to the leap-year 1984. If so then possible solutions are to (1) instead of shifting by a multiple of 28 years, shift by a multiple of 2800 years (2) drop support before 1900. Maybe someone who worked on the implementation of DateFormatter.strftime can elaborate?
  • dt.strftime("%W") could be compared with dt.isocalendar()[1] but they don't necessarily match for the first week of a year. However, it shouldn't be a difficult test to add.
def test_date_formatter_strftime():
    """
    Tests that DateFormatter matches datetime.strftime,
    check microseconds for years before 1900 for bug #3179.
    """
    def assert_equal(a, b): 
        """Re-defined here for debugging ease"""
        if a != b:
            print a
            print b
            print "-----"
        else:
            print "Passed", a

    def test_strftime_fields(dt):
        """For datetime object dt, check DateFormatter fields"""
        formatter = mdates.DateFormatter("%w %d %m %y %Y %H %I %M %S %f")
        # Compute date fields without using datetime.strftime,
        # since datetime.strftime does not work before year 1900
        formatted_date_str = (
            "{weekday} {day:02d} {month:02d} {year:02d} {full_year:4d} "
            "{hour24:02d} {hour12:02d} {minute:02d} {second:02d} "
            "{microsecond:06d}".format(  # {weeknum:02d}
            # %w Sunday=0, weekday() Monday=0
            weekday = str((dt.weekday() + 1) % 7),
            day = dt.day,
            month = dt.month,
            year = dt.year % 100,
            full_year = dt.year,
            hour24 = dt.hour,
            hour12 = ((dt.hour-1) % 12) + 1,
            minute = dt.minute,
            second = dt.second,
            microsecond = dt.microsecond))
            # weeknum = dt.isocalendar()[1]))
        assert_equal(formatter.strftime(dt, formatter.fmt), formatted_date_str)

    for dt in [datetime.datetime(year, 1, 1) for year in range(1898, 1902)]:
        test_strftime_fields(dt)

Here is the output of this test:

In [2]: from .. import test_date_formatter_strftime

In [3]: test_date_formatter_strftime()
5 01 01 88 1898 00 12 00 00 000000
6 01 01 98 1898 00 12 00 00 000000
-----
0 01 01 89 1899 00 12 00 00 000000
0 01 01 99 1899 00 12 00 00 000000
-----
1 01 01 90 1900 00 12 00 00 000000
1 01 01 00 1900 00 12 00 00 000000
-----
Passed 2 01 01 01 1901 00 12 00 00 000000

Here's a thread on python datetime strftime support pre-1900.

@azjps
Copy link
Contributor

azjps commented Nov 17, 2014

Should I raise these as a new issue?

@tacaswell
Copy link
Member Author

I am a bit confused on how these are different from the original issue you
raised.

Could you add a minimal test to this pr (presumably you fixed something).
If there tightly connected issues please fix them here. If the other issues
are related, but can be treated independently, fix them in a new pr.

On Mon, Nov 17, 2014, 12:59 azjps notifications@github.com wrote:

Should I raise these as a new issue?


Reply to this email directly or view it on GitHub
#3242 (comment)
.

@azjps
Copy link
Contributor

azjps commented Nov 19, 2014

Its strictly speaking unrelated, what I'd noticed originally was that %f wasn't working (because author of mdates.DateFormatter.strftime used time module instead of datetime module), which is easy to fix. What I notice now is that %y and %w/%W/%U/etc also don't work, which isn't as easy to fix, as its effectively an issue that the authors of the python datetime library themselves decided was too tricky and declined to implement.

I will change to a simpler test for this pr and raise what I wrote above as another issue.

@tacaswell tacaswell modified the milestones: v1.4.x, 1.5.0 Feb 7, 2015
@azjps azjps force-pushed the microsecondformatter branch 2 times, most recently from b208e4a to e8fb275 Compare February 25, 2015 16:43
@azjps
Copy link
Contributor

azjps commented Feb 25, 2015

Came back to this since there's actually a pretty good reason for wanting to use datetimes before 1900. Everything I wrote above about %w was wrong, it was my own strftime implementation that was incorrect. (In my defense the strftime implementation is something of a hack with regards to assumptions about a time object's internals.)

However, %y %f %x were indeed not handled correctly, and unfortunately I don't think there's a way to handle %f correctly just using time/datetime's strftime. Instead I added a flag with which DateFormatter will first use some regular expressions to try and replace %y %f %x with the correct values:

In [1]: import datetime; import matplotlib.dates as mdates

In [2]: df = mdates.DateFormatter("%%%y %%%f %%%x")

In [3]: df.strftime(datetime.datetime(111, 2, 3, 4, 5, 6, 123456), df.fmt)
Out[3]: u'%79 %%f %02/03/79'

In [4]: df.replace_directives_before_1900 = True

In [5]: df.strftime(datetime.datetime(111, 2, 3, 4, 5, 6, 123456), df.fmt)
Out[5]: u'%11 %123456 %02/03/11'

I'm not confident that this regular expression handling works in all cases so I didn't set it as the default.

For convenience, here's a longer test:

  def test_date_formatter_strftime():
      """
      Tests that DateFormatter matches datetime.strftime,
      check microseconds for years before 1900 for bug #3179.
      """
      import datetime as datetime
      import matplotlib.dates as mdates
      def assert_equal(a, b):
          """Re-defined here for debugging ease"""
          if a != b:
              print(a)
              print(b)
              print("-----")
          else:
              print("Passed", a)

      def test_strftime_fields(dt):
          """For datetime object dt, check DateFormatter fields"""
          formatter = mdates.DateFormatter("%w %d %m %Y %H %I %M %S")
          # Compute date fields without using datetime.strftime,
          # since datetime.strftime does not work before year 1900
          formatted_date_str = (
              "{weekday} {day:02d} {month:02d} {full_year:4d} "
              "{hour24:02d} {hour12:02d} {minute:02d} {second:02d}"
              .format(
              # weeknum=dt.isocalendar()[1],  # %U/%W {weeknum:02d}
              # %w Sunday=0, weekday() Monday=0
              weekday=str((dt.weekday() + 1) % 7),
              day=dt.day,
              month=dt.month,
              full_year=dt.year,
              hour24=dt.hour,                            
              hour12=((dt.hour-1) % 12) + 1,
              minute=dt.minute,           
              second=dt.second))
          assert_equal(formatter.strftime(dt, formatter.fmt), formatted_date_str)

          formatter2 = mdates.DateFormatter("%y %f %x")
          formatter2.replace_directives_before_1900 = True
          formatted_date_str2 = (
              "{year:02d} {microsecond:06d} {month:02d}/{day:02d}/{year:02d}".format(
              day=dt.day, month=dt.month, year=dt.year % 100, microsecond=dt.microsecond))
          assert_equal(formatter2.strftime(dt, formatter2.fmt), formatted_date_str2)

      for year in range(1, 3000, 71):
          # Iterate through random set of years
          test_strftime_fields(datetime.datetime(year, 1, 1))
          test_strftime_fields(datetime.datetime(year, 2, 3, 4, 5, 6, 123456))

@azjps
Copy link
Contributor

azjps commented Feb 25, 2015

Looks as if Travis doesn't like my test, does it display a stack trace somewhere? It seems to pass tests when I build and run tests locally (python setup.py install && python tests.py).

@tacaswell
Copy link
Member Author

@azjps


======================================================================
FAIL: Tests that DateFormatter matches datetime.strftime,
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/matplotlib-1.5.dev1-py2.7-linux-x86_64.egg/matplotlib/tests/test_dates.py", line 197, in test_date_formatter_strftime
    test_strftime_fields(datetime.datetime(year, 1, 1))
  File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/matplotlib-1.5.dev1-py2.7-linux-x86_64.egg/matplotlib/tests/test_dates.py", line 193, in test_strftime_fields
    assert_equal(formatter2.strftime(dt, formatter2.fmt), formatted_date_str2)
AssertionError: u'18 000000 01/01/1918' != u'18 000000 01/01/18'
- 18 000000 01/01/1918
?                  --
+ 18 000000 01/01/18

It looks like the problem is that you are doing year % 100 in generating the expected string.

@tacaswell
Copy link
Member Author

and if click on the red 'x' it will take you to travis and you can eventually get down to a page with the log from running the tests.

@tacaswell
Copy link
Member Author

ping @azjps

@azjps
Copy link
Contributor

azjps commented May 21, 2015

Ah, %x returns the current locale's representation of the current date, and the year can be displayed in different ways (but usually with either the 2-digit or 4-digit year). And the set of locales seems to be platform-specific, so writing a test for this might be a bit tricky. Maybe we should just let %x display the wrong year before 1900?

replace %y %x %f with the appropriate values.
"""
if replace_directives:
replace_map = \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use 4 spaces for indents to match the rest of the code base?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we try to avoid the trailing backslash continuation. How about using

replace_map = dict(f='{:06d}'.format(dt.microsecond),
                   x='{:02d}/{:02d}/{:02d}'.format(
                               dt.month, dt.day, dt.year % 100),
                   y='{:02d}'.format(dt.year % 100))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support py2.6, so you would have to number the formatters.

On Sun, May 24, 2015 at 3:22 PM, Eric Firing notifications@github.com
wrote:

In lib/matplotlib/dates.py
#3242 (comment):

  • def strftime(self, dt, fmt):
  •    fmt = self.illegal_s.sub(r"\1", fmt)
    
  •    fmt = fmt.replace("%s", "s")
    
  •    if dt.year > 1900:
    
  •        return cbook.unicode_safe(dt.strftime(fmt))
    
  •    Dalke: I hope I did this math right.  Every 28 years the
    
  •    calendar repeats, except through century leap years excepting
    
  •    the 400 year leap years.  But only if you're using the Gregorian
    
  •    calendar.
    
  •    This implementation uses time's strftime and only handles 4-digit
    
  •    years.  If replace_directive is set to True, as a hack we try and
    
  •    replace %y %x %f with the appropriate values.
    
  •    """
    
  •    if replace_directives:
    
  •      replace_map = \
    

Also, we try to avoid the trailing backslash continuation. How about using

replace_map = dict(f='{:06d}'.format(dt.microsecond),
x='{:02d}/{:02d}/{:02d}'.format(
dt.month, dt.day, dt.year % 100),
y='{:02d}'.format(dt.year % 100))


Reply to this email directly or view it on GitHub
https://github.com/matplotlib/matplotlib/pull/3242/files#r30954958.

@tacaswell
Copy link
Member Author

so to get this right we would have to inspect the locale and match that?

I think a better work around is to drop all of the % 100 to match the locale on travis and to break the ambiguity 1890/1990 and 1910/2010.

@azjps
Copy link
Contributor

azjps commented May 25, 2015

Actually, I think I've managed to sort out the issue (passed test with my current locale .. crossing my fingers for travis 😛). Now I just replace all 4-digit years (i.e. % 1000), then replace all 2-digit years (i.e. % 100). Behaviorally this should give sane behavior for all cases I can think of.

Thanks for comments, I think I accounted for style concerns too.

@tacaswell
Copy link
Member Author

Can you add a note to https://github.com/matplotlib/matplotlib/tree/master/doc/users/whats_new so the new feature (the default value of fmt + strftime as a useful part of the public API) gets advertised?

This good other wise, I think the use of regex is fine.

Thanks for your work on this!

It also fails to replace %y or %x correctly, since
its strftime implementation replaces only 4-digit years.

Instead, we now use a regular expression to replace %f
with the microsecond value. (We could also be tricky and
call strftime again with the original datetime object ..)
We also make substitutions for both 2-digit and 4-digit
years, which for example are displayed by %y, %Y, and %x.

Minor point: in time.h, strftime will not use padding for
%y and %Y but will use zero-padding for %x. Since this is
unlikely to ever be a cause of concern and isn't documented
anywhere afaik, I've used zero-padding for all three.
(Certainly it's preferable to just printing the wrong year!)

Add tests and (maybe excessively long?) comments.

closes matplotlib#3179

Change-Id: Idff9d06cbc6dc00a3cb8dcf113983d82dbdd3fde
@azjps
Copy link
Contributor

azjps commented May 27, 2015

Yay! Aside, is git commit --amend considered the usual workflow for incrementally updating these github pull requests? Its a habit I have from using gerrit.

@tacaswell
Copy link
Member Author

We tend to mostly just add new commits. We probably should be better about squashing, but keeping the history of a feature around is also useful and Commits are cheap.

tacaswell added a commit that referenced this pull request Jun 18, 2015
ENH: DateFormatter for years <= 1900
@tacaswell tacaswell merged commit 8609886 into matplotlib:master Jun 18, 2015
@tacaswell tacaswell mentioned this pull request Jul 22, 2015
tacaswell added a commit that referenced this pull request Jul 22, 2015
Merged locally to deal with conflicts arising from #3242 (merged as
8609886) which also touched the
pre-1900 date formatting code.

Resolved conflicts in favor of #3242 which refactored the pre-1900 code
into a helper function.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug : (minor) time axis labels show "%f" instead of microseconds for years up to 1900
5 participants