merge_asof() must be able to operate with timezone-aware DatetimeIndex #14844

Closed
chrisaycock opened this Issue Dec 9, 2016 · 0 comments

@chrisaycock
Contributor

chrisaycock commented Dec 9, 2016

I can perform the following merge just fine:

import numpy as np
import pandas as pd

left = pd.DataFrame({'date': pd.DatetimeIndex(start=pd.to_datetime('2016-01-02'),
                                              freq='D', periods=5),
                     'value1':np.arange(5)})
right = pd.DataFrame({'date': pd.DatetimeIndex(start=pd.to_datetime('2016-01-01'),
                                               freq='D', periods=5),
                      'value2':list("ABCDE")})
pd.merge_asof(left, right, on='date', tolerance=pd.Timedelta('1 day'))

However, adding a timezone to the DatetimeIndex doesn't work:

import pytz
left = pd.DataFrame({'date': pd.DatetimeIndex(start=pd.to_datetime('2016-01-02'),
                                              freq='D', periods=5, tz=pytz.timezone('UTC')),
                     'value1':np.arange(5)})
right = pd.DataFrame({'date': pd.DatetimeIndex(start=pd.to_datetime('2016-01-01'),
                                               freq='D', periods=5, tz=pytz.timezone('UTC')),
                      'value2':list("ABCDE")})
pd.merge_asof(left, right, on='date', tolerance=pd.Timedelta('1 day'))

I get this oddly worded error:

MergeError: incompatible tolerance, must be compat with type <class 'pandas.tseries.index.DatetimeIndex'>

The solution is actually very simple. _AsOfMerge._get_merge_keys() needs to check for is_datetime64tz_dtype() in addition to is_datetime64_dtype() when there is a tolerance. I should also fix the error message to be clearer.
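The check described above can be sketched roughly as follows. This is only an illustration, not the actual body of `_AsOfMerge._get_merge_keys()`: the helper name `tolerance_matches_key` is hypothetical, and it uses the public `is_datetime64_any_dtype()`, which covers both naive and timezone-aware keys, as a stand-in for checking `is_datetime64_dtype()` and `is_datetime64tz_dtype()` separately (the latter is deprecated in recent pandas releases). It also assumes a current pandas where `pd.date_range` accepts `tz`.

```python
import datetime

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def tolerance_matches_key(key: pd.Series, tolerance) -> bool:
    """Hypothetical helper (not pandas internals): is `tolerance` a
    timedelta-like value that makes sense for a datetime merge key?"""
    # is_datetime64_any_dtype is True for naive datetime64 AND tz-aware
    # keys, which is the combined check the issue asks for.
    if is_datetime64_any_dtype(key.dtype):
        # pd.Timedelta subclasses datetime.timedelta, so both are accepted.
        return isinstance(tolerance, datetime.timedelta)
    # Simplification: other key types (e.g. integers) are not handled here.
    return False


naive = pd.Series(pd.date_range("2016-01-02", periods=5, freq="D"))
aware = pd.Series(pd.date_range("2016-01-02", periods=5, freq="D", tz="UTC"))
one_day = pd.Timedelta("1 day")

naive_ok = tolerance_matches_key(naive, one_day)
aware_ok = tolerance_matches_key(aware, one_day)
int_tolerance_ok = tolerance_matches_key(aware, 1)
print(naive_ok, aware_ok, int_tolerance_ok)
```

With a check of this shape, a `pd.Timedelta` tolerance is accepted for the timezone-aware key just as it is for the naive one, while a non-timedelta tolerance is still rejected.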

@jreback jreback added this to the 0.19.2 milestone Dec 9, 2016

@jreback jreback closed this in e991141 Dec 10, 2016

yarikoptic added a commit to neurodebian/pandas that referenced this issue Dec 12, 2016

Merge remote-tracking branch 'origin/master' into bf-cython
* origin/master: (22 commits)
  BUG: astype falsely converts inf to integer (GH14265) (#14343)
  BUG: Apply min_itemsize to index even when not appending
  DOC: warning section on memory overflow when joining/merging dataframes on index with duplicate keys (#14788)
  BLD: missing - on secure
  BLD: new access token on pandas-dev
  TST: Test DatetimeIndex weekend offset (#14853)
  BLD: escape GH_TOKEN in build_docs
  TST: Correct results with np.size and crosstab (#4003) (#14755)
  Frame benchmarking sum instead of mean (#14824)
  CLN: lint of test_base.py
  BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
  BUG: GH11847 Unstack with mixed dtypes coerces everything to object
  TST: skip testing on windows for specific formatting which sometimes hangs (#14851)
  BLD: try new gh token for pandas-docs
  CLN/PERF: clean-up of the benchmarks (#14099)
  ENH: add timedelta as valid type for interpolate with method='time' (#14799)
  DOC: add section on groupby().rolling/expanding/resample (#14801)
  TST: add test to confirm GH14606 (specify category dtype for empty) (#14752)
  BLD: use org name in build-docs.sh
  BF(TST): use = (native) instead of < (little endian) for target data types (#14832)
  ...

yarikoptic added a commit to neurodebian/pandas that referenced this issue Dec 12, 2016

Merge commit 'v0.19.0-174-g81a2f79' into releases
* commit 'v0.19.0-174-g81a2f79': (156 commits)
  BLD: escape GH_TOKEN in build_docs
  TST: Correct results with np.size and crosstab (#4003) (#14755)
  Frame benchmarking sum instead of mean (#14824)
  CLN: lint of test_base.py
  BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
  BUG: GH11847 Unstack with mixed dtypes coerces everything to object
  TST: skip testing on windows for specific formatting which sometimes hangs (#14851)
  BLD: try new gh token for pandas-docs
  CLN/PERF: clean-up of the benchmarks (#14099)
  ENH: add timedelta as valid type for interpolate with method='time' (#14799)
  DOC: add section on groupby().rolling/expanding/resample (#14801)
  TST: add test to confirm GH14606 (specify category dtype for empty) (#14752)
  BLD: use org name in build-docs.sh
  BF(TST): use = (native) instead of < (little endian) for target data types (#14832)
  ENH: Introduce UnsortedIndexError  GH11897 (#14762)
  ENH: Add the ability to have a separate title for each subplot when plotting (#14753)
  DOC: Fix grammar and formatting typos (#14803)
  BLD: try new build credentials for pandas-docs
  TST: Test pivot with categorical data
  MAINT: Cleanup pandas/src/parser (#14740)
  ...

yarikoptic added a commit to neurodebian/pandas that referenced this issue Dec 12, 2016

Merge branch 'releases' (as of v0.19.0-174-g81a2f79) into debian
release 0.19.1 was from release branch

* releases: (156 commits)
  BLD: escape GH_TOKEN in build_docs
  TST: Correct results with np.size and crosstab (#4003) (#14755)
  Frame benchmarking sum instead of mean (#14824)
  CLN: lint of test_base.py
  BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
  BUG: GH11847 Unstack with mixed dtypes coerces everything to object
  TST: skip testing on windows for specific formatting which sometimes hangs (#14851)
  BLD: try new gh token for pandas-docs
  CLN/PERF: clean-up of the benchmarks (#14099)
  ENH: add timedelta as valid type for interpolate with method='time' (#14799)
  DOC: add section on groupby().rolling/expanding/resample (#14801)
  TST: add test to confirm GH14606 (specify category dtype for empty) (#14752)
  BLD: use org name in build-docs.sh
  BF(TST): use = (native) instead of < (little endian) for target data types (#14832)
  ENH: Introduce UnsortedIndexError  GH11897 (#14762)
  ENH: Add the ability to have a separate title for each subplot when plotting (#14753)
  DOC: Fix grammar and formatting typos (#14803)
  BLD: try new build credentials for pandas-docs
  TST: Test pivot with categorical data
  MAINT: Cleanup pandas/src/parser (#14740)
  ...

jorisvandenbossche added a commit that referenced this issue Dec 15, 2016

BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
closes #14844

Author: Christopher C. Aycock <christopher.aycock@twosigma.com>

Closes #14845 from chrisaycock/GH14844 and squashes the following commits:

97b73a8 [Christopher C. Aycock] BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)

(cherry picked from commit e991141)

ischurov added a commit to ischurov/pandas that referenced this issue Dec 19, 2016

BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
closes #14844

Author: Christopher C. Aycock <christopher.aycock@twosigma.com>

Closes #14845 from chrisaycock/GH14844 and squashes the following commits:

97b73a8 [Christopher C. Aycock] BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)