Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes to resample over DST boundaries #9623

Closed
wants to merge 1 commit into from
Closed

Conversation

rockg
Copy link
Contributor

@rockg rockg commented Mar 10, 2015

closes #5172
closes #8744
closes #8653
closes #9173
closes #9468
closes #8794

fixes all issues but one attached to #5172. There were several issues as mentioned by my comments in the master list. They are:

  1. Offset classes were not working properly over DST boundaries. I added flags on the class to handle specially.
  2. normalize_date does not work with DST change dates as the tz offset is not adjusted. normalize is added to Timestamp as discussed in ENH: add .normalize() to Timestamp? #8794 (I don't like how this was done, but it seemed the easiest rather than duplicating lots of code in tslib.date_normalize.
  3. _adjust_dates_anchored was not localizing properly once times were adjusted.

@rockg
Copy link
Contributor Author

rockg commented Mar 10, 2015

The outstanding issue is #9119 which is timezone related, but not DST related.

@jreback
Copy link
Contributor

jreback commented Mar 10, 2015

@rockg awesome!

I listed all of the issues, pls give a check.

pls give a release note (listing all of the issues) as well.

@jreback jreback added Bug Timezones Timezone data dtype Resample resample method labels Mar 10, 2015
@jreback jreback added this to the 0.16.0 milestone Mar 10, 2015
@jreback
Copy link
Contributor

jreback commented Mar 10, 2015

@rockg can you run a perf test (looks like it should be fine, but pls check)

@jreback
Copy link
Contributor

jreback commented Mar 10, 2015

@rockg I'd like to get this in the 0.16.0 rc, which I am going to do prob tomorrow (0.16. prob 1 week later), can you update?

@rockg
Copy link
Contributor Author

rockg commented Mar 11, 2015

Yes, fixing everything up. Should have something in an hour.

@rockg
Copy link
Contributor Author

rockg commented Mar 11, 2015

Wow that vbench takes over an hour (and it only works for python 2 which was a nice surprise). There are a few date things in here which surprised me.

reshape_stack_simple                         |   3.8433 |   3.5164 |   1.0930 |
frame_ctor_dtindex_BusinessDayx1             |   2.2870 |   2.0920 |   1.0932 |
dataframe_resample_min_numpy                 |   3.2500 |   2.9697 |   1.0944 |
frame_ctor_dtindex_CustomBusinessDayx1       |   2.3030 |   2.1026 |   1.0953 |
frame_ctor_dtindex_BDayx2                    |   2.3340 |   2.1287 |   1.0965 |
frame_ctor_dtindex_YearBeginx1               |   2.2504 |   2.0523 |   1.0965 |
frame_ctor_dtindex_CBMonthEndx1              |   6.5024 |   5.8760 |   1.1066 |
groupby_ngroups_100_sum                      |   0.7041 |   0.6344 |   1.1099 |
groupby_ngroups_100_count                    |   0.5260 |   0.4703 |   1.1183 |
reshape_unstack_simple                       |   5.0033 |   4.4693 |   1.1195 |
concat_empty_frames2                         |   1.4807 |   1.3180 |   1.1234 |
timeseries_custom_bmonthend_incr             |   0.3620 |   0.3190 |   1.1348 |
panel_shift_minor                            |   0.1633 |   0.1436 |   1.1372 |
replace_replacena                            |   2.7843 |   2.4397 |   1.1413 |
timeseries_custom_bmonthbegin_decr_n         |   0.4679 |   0.4090 |   1.1442 |
replace_fillna                               |   3.4404 |   3.0044 |   1.1451 |
timeseries_iter_periodindex_preexit          |  14.1080 |  12.3093 |   1.1461 |
timeseries_custom_bmonthbegin_incr_n         |   0.4480 |   0.3854 |   1.1625 |
strings_lower                                |   5.3800 |   4.4383 |   1.2122 |
timeseries_custom_bday_cal_decr              |   0.0687 |   0.0520 |   1.3211 |
timeseries_custom_bday_decr                  |   0.0679 |   0.0513 |   1.3235 |
timeseries_custom_bday_cal_incr_neg_n        |   0.0690 |   0.0517 |   1.3354 |
timeseries_custom_bday_cal_incr              |   0.0617 |   0.0447 |   1.3808 |
timeseries_custom_bday_cal_incr_n            |   0.0620 |   0.0447 |   1.3879 |
timeseries_year_apply                        |   0.0453 |   0.0319 |   1.4179 |
timeseries_year_incr                         |   0.0490 |   0.0337 |   1.4528 |
timeseries_custom_bday_apply_dt64            |   0.0520 |   0.0356 |   1.4598 |
timeseries_custom_bday_incr                  |   0.0477 |   0.0320 |   1.4888 |
timeseries_custom_bday_apply                 |   0.0456 |   0.0306 |   1.4909 |
frame_object_equal                           |  21.1930 |  11.3687 |   1.8642 |
frame_nonunique_equal                        |  21.2193 |  11.3026 |   1.8774 |

@jreback
Copy link
Contributor

jreback commented Mar 11, 2015

vbench looks fine, the things that take < 1ms are really hard to time correctly.
usually I pass -r regex which will only do those tests that are most interesting.

@rockg
Copy link
Contributor Author

rockg commented Mar 11, 2015

Great news.

@jreback
Copy link
Contributor

jreback commented Mar 11, 2015

I think this is ok, however, need to check one thing. Can you make a pickle from 0.15.2 and add to the tests, like this: https://github.com/pydata/pandas/blob/master/pandas/tests/test_categorical.py#L2640 (need to test that the DateOffsets survive this change). Basically save say a dict of a couple of examples, pickle it to a file (in 0.15.2), then read in (in the tests) and make sure they compare equal (I think they should), but since adding this attribute, need to make sure.

@rockg
Copy link
Contributor Author

rockg commented Mar 11, 2015

The pickle test worked fine. Think we should be good on that front.

@jreback
Copy link
Contributor

jreback commented Mar 11, 2015

ok looks good

will merge in the morning

if u have s chance see if anything needs to be updated in the docs about tz (eg in resample section)

changes to offset classes that weren't working over
such boundaries as well as adding normalize() on
Timestamp.
@jreback
Copy link
Contributor

jreback commented Mar 11, 2015

merged via dcc68d7

thanks!

@rockg
Copy link
Contributor Author

rockg commented Mar 11, 2015

@jreback...thanks. There are a couple travis errors on two of 2.7 jobs with the pickle test (however one of them succeeded). Any thoughts?

======================================================================
ERROR: test_pickle_v0_15_2 (pandas.tseries.tests.test_offsets.TestCommon)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/pydata/pandas/pandas/tseries/tests/test_offsets.py", line 424, in test_pickle_v0_15_2
    self.assertEqual(offsets, read_pickle(pickle_path))
  File "/home/travis/build/pydata/pandas/pandas/io/pickle.py", line 60, in read_pickle
    return try_read(path)
  File "/home/travis/build/pydata/pandas/pandas/io/pickle.py", line 57, in try_read
    return pc.load(fh, encoding=encoding, compat=True)
  File "/home/travis/build/pydata/pandas/pandas/compat/pickle_compat.py", line 116, in load
    return up.load()
  File "/home/travis/miniconda/envs/pandas/lib/python2.6/pickle.py", line 858, in load
    dispatch[key](self)
  File "/home/travis/build/pydata/pandas/pandas/compat/pickle_compat.py", line 20, in load_reduce
    stack[-1] = func(*args)
  File "/home/travis/miniconda/envs/pandas/lib/python2.6/copy_reg.py", line 48, in _reconstructor
    obj = object.__new__(cls)
TypeError: object.__new__(X): X is not a type object (classobj)

@jreback
Copy link
Contributor

jreback commented Mar 11, 2015

I rebuilt the pick on 0.15.2 and it works fine. How did you create the pickle?

I have only seen this error IIRC if you pickle I think in text mode.

@rockg
Copy link
Contributor Author

rockg commented Mar 11, 2015 via email

@jreback
Copy link
Contributor

jreback commented Mar 11, 2015

Python 2?

@rockg
Copy link
Contributor Author

rockg commented Mar 11, 2015

Almost positive (I can confirm tonight). I am positive that I ran the test locally with Python 2.7. Anyways, I'm not sure it's worth more thought right now. Thanks for taking care of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment