BUG: Inconsistent datetime comparison with Tz #12601

Closed
sinhrks opened this issue Mar 12, 2016 · 10 comments

@sinhrks
Member

commented Mar 12, 2016

Related to #8306. On current master, comparing a tz-aware Timestamp with a tz-naive one raises TypeError. However, Index and Series implicitly convert the tz to GMT before comparing:

pd.Timestamp('2016-01-01 12:00', tz='US/Eastern') > pd.Timestamp('2016-01-01 08:00')
# TypeError: Cannot compare tz-naive and tz-aware timestamps

# same result as idx.tz_convert(None) > pd.Timestamp('2016-01-01 08:00')
idx = pd.date_range('2016-01-01 12:00', periods=10, freq='H', tz='Asia/Tokyo')
idx > pd.Timestamp('2016-01-01 08:00')
# array([False, False, False, False, False, False,  True,  True,  True,  True], dtype=bool)

Numeric ops raise TypeError as expected.

idx - pd.Timestamp('2016-01-01 08:00')
# TypeError: Timestamp subtraction must have the same timezones or no timezones
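
One way to avoid relying on the implicit conversion described above is to state which timezone the naive timestamp is in before comparing. The following is a minimal sketch, not the fix adopted in pandas; the variable names are illustrative, and `freq` is passed as a `Timedelta` only to avoid depending on a frequency alias.

```python
import pandas as pd

# Hourly tz-aware index, as in the report above.
idx = pd.date_range('2016-01-01 12:00', periods=10,
                    freq=pd.Timedelta(hours=1), tz='Asia/Tokyo')
naive = pd.Timestamp('2016-01-01 08:00')

# Option 1: declare the naive timestamp to be UTC, then compare aware with aware.
as_utc = idx > naive.tz_localize('UTC')

# Option 2: convert the index to UTC and drop the tz, then compare naive with naive.
as_naive = idx.tz_convert(None) > naive

print(as_utc.tolist())
```

Both options compare the same instants, so they agree with each other and with the boolean array shown in the report.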
@gliptak

Contributor

commented Mar 13, 2016

I opened numpy/numpy#7390. Does this belong to pandas instead? Thanks

@jreback

Contributor

commented Mar 13, 2016

This is an invalid dtype for numpy; it is not defined there.
Further, what you are doing doesn't make any sense.

@gliptak

Contributor

commented Mar 13, 2016

@jreback I'm validating that the ts column is datetime64 with a timezone (just comparing it to datetime64 fails ...). How should this be coded?

@jreback

Contributor

commented Mar 13, 2016

Use .select_dtypes, com.is_datetimelike, or com.is_datetime64tz_dtype.

numpy doesn't know about/respect this (it's really a bug in the dtype definition, and I don't know when, if ever, it will be fixed/allowed).
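
As a rough illustration of the dtype checks suggested above, the sketch below uses only public pandas entry points: `DataFrame.select_dtypes` with the `'datetimetz'` selector, and an `isinstance` check against `pd.DatetimeTZDtype`. The `isinstance` check is swapped in for `com.is_datetime64tz_dtype` because the `com` helpers are internal; the frame and column names are made up for the example.

```python
import pandas as pd

# Hypothetical frame with one float column and one tz-aware datetime column.
df = pd.DataFrame({
    'price': [1.5],
    'ts': pd.to_datetime(['2016-03-10 23:20']).tz_localize('US/Eastern'),
})

# Check a single column: is its dtype a tz-aware datetime dtype?
is_tz_aware = isinstance(df['ts'].dtype, pd.DatetimeTZDtype)

# Sub-select every tz-aware datetime column in the frame.
tz_columns = list(df.select_dtypes(include=['datetimetz']).columns)

print(is_tz_aware, tz_columns)
```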

@jreback

Contributor

commented Mar 13, 2016

Here is also the method to coerce. EDT is not a timezone, and what dateutil is doing is wrong and doesn't give you anything useful.

In [34]: df = pd.DataFrame(["Mar 10, 2016 11:20 PM EDT"], columns=['ts'])

In [35]: pd.to_datetime(df['ts']).astype('datetime64[us, US/Eastern]')
Out[35]: 
0   2016-03-10 23:20:00-05:00
Name: ts, dtype: datetime64[ns, US/Eastern]
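
The same coercion can also be spelled with `tz_localize` instead of `astype`. This is a sketch under two assumptions that are not in the session above: the trailing "EDT" label is stripped by hand before parsing, and the wall-clock time is taken to be US/Eastern. The explicit `format` string is likewise chosen for this one input.

```python
import pandas as pd

raw = pd.Series(["Mar 10, 2016 11:20 PM EDT"])

# Drop the abbreviation, which does not identify a timezone, and parse the rest.
naive = pd.to_datetime(raw.str.replace(" EDT", "", regex=False),
                       format="%b %d, %Y %I:%M %p")

# Attach the real timezone; pandas picks the UTC offset in force on that date.
aware = naive.dt.tz_localize("US/Eastern")

print(aware.iloc[0])
```

The localized value carries a -05:00 offset, the same as in the output above, because daylight saving time had not yet started in US/Eastern on that date.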
@gliptak

Contributor

commented Mar 13, 2016

Thank you for the pointers.

In [4]: df = pd.DataFrame([parse("Mar 10, 2016 11:20 PM EDT")], columns=['ts'])
In [16]: df['ts'] = pd.to_datetime(df['ts']).astype('datetime64[us, US/Eastern]')
In [19]: df.dtypes['ts'] == np.dtype('datetime64[ns]')
Out[19]: False

So how am I to compare? Thanks

@jreback

Contributor

commented Mar 13, 2016

What are you trying to do? Why do you need to compare, and what are you comparing? Most ops will simply work; you rarely actually need to compare things. If you need to sub-select, use .select_dtypes(...) as I indicated.

@gliptak

Contributor

commented Mar 13, 2016

Sorry, I didn't offer context. I came across this while working on unit tests for pydata/pandas-datareader#188:

dtypes = [np.dtype(x) for x in ['float64', 'float64', 'datetime64[ns]']]
tm.assert_series_equal(df.dtypes, pd.Series(dtypes, index=exp_columns))

I had to force no timezone for the comparison above to succeed.
Could you show how to rewrite df.dtypes['ts'] == np.dtype('datetime64[ns]') with .select_dtypes(...)?
Thanks
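
For a dtype assertion like the one in the test above, one possible approach is to build the expected dtype as a `pd.DatetimeTZDtype` rather than a numpy dtype, since numpy has no tz-aware datetime dtype. The sketch below is an assumption about what such a test could look like, not the code used in pandas-datareader; the column names and values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one float column and one tz-aware datetime column.
ts = pd.DatetimeIndex(['2016-03-10 23:20'], dtype='datetime64[ns]')
df = pd.DataFrame({'price': [1.5], 'ts': ts.tz_localize('US/Eastern')})

# Expected dtype for the tz-aware column, built from pandas rather than numpy.
expected_ts = pd.DatetimeTZDtype(unit='ns', tz='US/Eastern')

matches_tz_dtype = df.dtypes['ts'] == expected_ts
matches_numpy_dtype = df.dtypes['ts'] == np.dtype('datetime64[ns]')

# Whole-frame check in the style of the test above.
expected = pd.Series([np.dtype('float64'), expected_ts],
                     index=['price', 'ts'], dtype=object)
pd.testing.assert_series_equal(df.dtypes, expected)
```

With this, the timezone does not have to be stripped for the assertion to pass; the comparison against the plain numpy dtype still evaluates to False, as in the session above.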

@sinhrks

Member Author

commented Mar 14, 2016

@gliptak I don't quite understand either. pydata/pandas-datareader#188 is merged and its tests have passed. Please update pydata/pandas-datareader#188 if there is any problem. I assume this issue is unrelated to yours.

@jreback modified the milestones: 0.18.2, 0.18.1 (Apr 26, 2016)

@jreback referenced this issue May 10, 2016: COMPAT: comparisons master issue #13129 (Open, 1 of 7 tasks complete)

@jorisvandenbossche modified the milestones: 0.20.0, 0.19.0 (Aug 21, 2016)

@jreback modified the milestones: 0.20.0, Next Major Release (Mar 23, 2017)

@mroeschke referenced this issue Jun 23, 2018: TST: Clean old timezone issues PT2 #21612 (Merged, 10 of 10 tasks complete)

@jreback modified the milestones: Next Major Release, 0.24.0 (Jun 25, 2018)
