merge_asof() must be able to operate with timezone-aware DatetimeIndex #14844

Closed
chrisaycock opened this Issue Dec 9, 2016 · 0 comments

chrisaycock commented Dec 9, 2016

I can perform the following merge just fine:

import numpy as np
import pandas as pd

# Two frames with tz-naive daily timestamps, offset by one day
left = pd.DataFrame({'date': pd.date_range(start='2016-01-02', freq='D', periods=5),
                     'value1': np.arange(5)})
right = pd.DataFrame({'date': pd.date_range(start='2016-01-01', freq='D', periods=5),
                      'value2': list("ABCDE")})
# Backward as-of join on 'date', allowing matches up to one day old
pd.merge_asof(left, right, on='date', tolerance=pd.Timedelta('1 day'))

However, adding a timezone to the DatetimeIndex doesn't work:

import pytz

# Same frames as before, but the 'date' columns are now tz-aware (UTC)
utc = pytz.timezone('UTC')
left = pd.DataFrame({'date': pd.date_range(start='2016-01-02', freq='D', periods=5, tz=utc),
                     'value1': np.arange(5)})
right = pd.DataFrame({'date': pd.date_range(start='2016-01-01', freq='D', periods=5, tz=utc),
                      'value2': list("ABCDE")})
pd.merge_asof(left, right, on='date', tolerance=pd.Timedelta('1 day'))

I get this oddly worded error:

MergeError: incompatible tolerance, must be compat with type <class 'pandas.tseries.index.DatetimeIndex'>

The solution is actually very simple. _AsOfMerge._get_merge_keys() needs to check for is_datetime64tz_dtype() in addition to is_datetime64_dtype() when there is a tolerance. I should also fix the error message to be clearer.
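
To illustrate the kind of check described above, here is a minimal sketch. It is not the actual pandas internals: tolerance_is_compatible is a hypothetical helper, and it uses the public pandas.api.types.is_datetime64_any_dtype (which covers both tz-naive and tz-aware keys) as a stand-in for checking is_datetime64_dtype() and is_datetime64tz_dtype() separately.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_datetime64_dtype


def tolerance_is_compatible(key, tolerance):
    """Hypothetical helper (not pandas internals): check a merge_asof
    tolerance against the dtype of the 'on' key."""
    # is_datetime64_any_dtype is True for both tz-naive and tz-aware keys
    if is_datetime64_any_dtype(key):
        return isinstance(tolerance, pd.Timedelta)
    # Other key dtypes are outside the scope of this sketch
    return True


naive = pd.Series(pd.date_range('2016-01-01', periods=5, freq='D'))
aware = naive.dt.tz_localize('UTC')

# The naive-only check does not recognise the tz-aware column,
# which is why the tolerance was rejected for tz-aware keys
print(is_datetime64_dtype(naive), is_datetime64_dtype(aware))
print(tolerance_is_compatible(aware, pd.Timedelta('1 day')))
```

The point of the sketch is only that a tz-aware key fails the tz-naive dtype test, so a Timedelta tolerance has to be accepted for both kinds of datetime key.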

jreback added this to the 0.19.2 milestone Dec 9, 2016

jreback closed this in e991141 Dec 10, 2016

@yarikoptic yarikoptic added a commit to neurodebian/pandas that referenced this issue Dec 12, 2016

@yarikoptic yarikoptic Merge remote-tracking branch 'origin/master' into bf-cython
* origin/master: (22 commits)
  BUG: astype falsely converts inf to integer (GH14265) (#14343)
  BUG: Apply min_itemsize to index even when not appending
  DOC: warning section on memory overflow when joining/merging dataframes on index with duplicate keys (#14788)
  BLD: missing - on secure
  BLD: new access token on pandas-dev
  TST: Test DatetimeIndex weekend offset (#14853)
  BLD: escape GH_TOKEN in build_docs
  TST: Correct results with np.size and crosstab (#4003) (#14755)
  Frame benchmarking sum instead of mean (#14824)
  CLN: lint of test_base.py
  BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
  BUG: GH11847 Unstack with mixed dtypes coerces everything to object
  TST: skip testing on windows for specific formatting which sometimes hangs (#14851)
  BLD: try new gh token for pandas-docs
  CLN/PERF: clean-up of the benchmarks (#14099)
  ENH: add timedelta as valid type for interpolate with method='time' (#14799)
  DOC: add section on groupby().rolling/expanding/resample (#14801)
  TST: add test to confirm GH14606 (specify category dtype for empty) (#14752)
  BLD: use org name in build-docs.sh
  BF(TST): use = (native) instead of < (little endian) for target data types (#14832)
  ...
e796e8b

@yarikoptic yarikoptic added a commit to neurodebian/pandas that referenced this issue Dec 12, 2016

@yarikoptic yarikoptic Merge commit 'v0.19.0-174-g81a2f79' into releases
* commit 'v0.19.0-174-g81a2f79': (156 commits)
  BLD: escape GH_TOKEN in build_docs
  TST: Correct results with np.size and crosstab (#4003) (#14755)
  Frame benchmarking sum instead of mean (#14824)
  CLN: lint of test_base.py
  BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
  BUG: GH11847 Unstack with mixed dtypes coerces everything to object
  TST: skip testing on windows for specific formatting which sometimes hangs (#14851)
  BLD: try new gh token for pandas-docs
  CLN/PERF: clean-up of the benchmarks (#14099)
  ENH: add timedelta as valid type for interpolate with method='time' (#14799)
  DOC: add section on groupby().rolling/expanding/resample (#14801)
  TST: add test to confirm GH14606 (specify category dtype for empty) (#14752)
  BLD: use org name in build-docs.sh
  BF(TST): use = (native) instead of < (little endian) for target data types (#14832)
  ENH: Introduce UnsortedIndexError  GH11897 (#14762)
  ENH: Add the ability to have a separate title for each subplot when plotting (#14753)
  DOC: Fix grammar and formatting typos (#14803)
  BLD: try new build credentials for pandas-docs
  TST: Test pivot with categorical data
  MAINT: Cleanup pandas/src/parser (#14740)
  ...
6c87601

@yarikoptic yarikoptic added a commit to neurodebian/pandas that referenced this issue Dec 12, 2016

@yarikoptic yarikoptic Merge branch 'releases' (as of v0.19.0-174-g81a2f79) into debian
release 0.19.1 was from release branch

* releases: (156 commits)
  BLD: escape GH_TOKEN in build_docs
  TST: Correct results with np.size and crosstab (#4003) (#14755)
  Frame benchmarking sum instead of mean (#14824)
  CLN: lint of test_base.py
  BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
  BUG: GH11847 Unstack with mixed dtypes coerces everything to object
  TST: skip testing on windows for specific formatting which sometimes hangs (#14851)
  BLD: try new gh token for pandas-docs
  CLN/PERF: clean-up of the benchmarks (#14099)
  ENH: add timedelta as valid type for interpolate with method='time' (#14799)
  DOC: add section on groupby().rolling/expanding/resample (#14801)
  TST: add test to confirm GH14606 (specify category dtype for empty) (#14752)
  BLD: use org name in build-docs.sh
  BF(TST): use = (native) instead of < (little endian) for target data types (#14832)
  ENH: Introduce UnsortedIndexError  GH11897 (#14762)
  ENH: Add the ability to have a separate title for each subplot when plotting (#14753)
  DOC: Fix grammar and formatting typos (#14803)
  BLD: try new build credentials for pandas-docs
  TST: Test pivot with categorical data
  MAINT: Cleanup pandas/src/parser (#14740)
  ...
dd7e977

@jorisvandenbossche jorisvandenbossche added a commit that referenced this issue Dec 15, 2016

@jorisvandenbossche Christopher C. Aycock + jorisvandenbossche BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
closes #14844

Author: Christopher C. Aycock <christopher.aycock@twosigma.com>

Closes #14845 from chrisaycock/GH14844 and squashes the following commits:

97b73a8 [Christopher C. Aycock] BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)

(cherry picked from commit e991141)
7f53ea8

@ischurov ischurov added a commit to ischurov/pandas that referenced this issue Dec 19, 2016

@ischurov Christopher C. Aycock + ischurov BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
closes #14844

Author: Christopher C. Aycock <christopher.aycock@twosigma.com>

Closes #14845 from chrisaycock/GH14844 and squashes the following commits:

97b73a8 [Christopher C. Aycock] BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
38b52fd