
BUG: scalar assignment of a tz-aware is object dtype #19843

Closed
jreback opened this issue Feb 22, 2018 · 5 comments

@jreback (Contributor) commented Feb 22, 2018

In [3] below should result in a datetime64[ns, UTC] column:

In [1]: df = pd.DataFrame({'A': [0, 1]})

In [3]: df['now'] = pd.Timestamp('20130101', tz='UTC')

In [4]: df
Out[4]: 
   A                        now
0  0  2013-01-01 00:00:00+00:00
1  1  2013-01-01 00:00:00+00:00

In [5]: df.dtypes
Out[5]: 
A       int64
now    object
dtype: object

In [6]: df['now2'] = pd.DatetimeIndex([pd.Timestamp('20130101', tz='UTC')]).repeat(len(df))

In [7]: df.dtypes
Out[7]: 
A                     int64
now                  object
now2    datetime64[ns, UTC]
dtype: object
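At the time of this report, a workaround that preserves the tz-aware dtype is to broadcast the scalar through a Series rather than assigning it directly (a sketch, using only public API):

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1]})
# Broadcasting the scalar through a Series keeps the pandas extension
# dtype, sidestepping the object-dtype fallback shown above
df['now'] = pd.Series(pd.Timestamp('20130101', tz='UTC'), index=df.index)
print(df.dtypes['now'])  # datetime64[ns, UTC]
```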
@DylanDmitri (Contributor) commented Feb 22, 2018

I will try and fix this.

@jreback (Contributor, Author) commented Feb 23, 2018

great!

@DylanDmitri (Contributor) commented Feb 23, 2018

Currently, infer_dtype_from_scalar (on datetime/timestamp-like objects) returns np.datetime64 if no timezone is given, and falls back to np.object_ for objects with timezones. Fixing this problem means returning something other than np.object_.

Ideally it would return DatetimeTZDtypeType. However, that crashes at np.empty(shape, dtype=dtype) in cast_scalar_to_array, because numpy cannot allocate an array with a pandas extension dtype. It seems like this should work, but it doesn't.

A quick fix is to return np.datetime64 rather than np.object_. You lose the timezone name, but numpy applies the correct UTC offset before storing, so the underlying values are correct. This change doesn't break any tests, and results in the following behavior:

In [1]: df = pd.DataFrame({'A': [0, 1]})

In [3]: df['now'] = pd.Timestamp('20130101', tz='UTC')

In [5]: df.dtypes
Out[5]: 
A               int64
now    datetime64[ns]
dtype: object

In [6]: df['now2'] = pd.DatetimeIndex([pd.Timestamp('20130101', tz='UTC')]).repeat(len(df))

In [7]: df.dtypes
Out[7]: 
A                     int64
now          datetime64[ns]
now2    datetime64[ns, UTC]
dtype: object

This raises some inconsistencies, and potentially causes problems when mixing in timezone-naive datetimes. Is the quick fix good enough?
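The timezone-name loss the quick fix implies can be seen directly: converting a tz-aware Timestamp to a numpy datetime64 normalizes the value to UTC and drops the zone (a small demonstration, public API only):

```python
import numpy as np
import pandas as pd

ts = pd.Timestamp('2013-01-01 00:00', tz='US/Eastern')
# numpy stores the UTC-normalized instant; the zone name is gone,
# which is why the column above comes back as plain datetime64[ns]
print(ts.to_datetime64())  # 2013-01-01T05:00:00.000000000
```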

@jreback (Contributor, Author) commented Feb 23, 2018

@DylanDmitri you never want numpy to deal with timezones; it gets them completely wrong. infer_dtype_from_scalar has a pandas_dtype parameter that will make this work. We should actually just change this to be the default (though that might break other things).
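The pandas-native tz-aware dtype that infer_dtype_from_scalar should produce here is an extension dtype, not a numpy dtype; this is visible through the public dtype machinery (a sketch, avoiding the internal function itself since its signature has changed across versions):

```python
from pandas.api.types import pandas_dtype

# 'datetime64[ns, UTC]' resolves to a pandas extension dtype,
# which numpy's own dtype system cannot represent
dt = pandas_dtype('datetime64[ns, UTC]')
print(type(dt).__name__)  # DatetimeTZDtype
print(dt.tz)              # UTC
```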

@DylanDmitri (Contributor) commented Mar 2, 2018

Been busy the last week, sorry. Here's the problem code (from line 2874 of frame.py):

# BEFORE
value = cast_scalar_to_array(len(self.index), value)
value = maybe_cast_to_datetime(value, value.dtype)

The main issue: cast_scalar_to_array defaults to dtype np.object_, which is then ignored by maybe_cast_to_datetime. We want to capture the real pandas dtype up front and pass it into maybe_cast_to_datetime, which then works properly.

# AFTER
from pandas.core.dtypes.cast import infer_dtype_from_scalar
pandas_dtype, _ = infer_dtype_from_scalar(value, pandas_dtype=True)

value = cast_scalar_to_array(len(self.index), value)
value = maybe_cast_to_datetime(value, pandas_dtype)

This fixes the problem. I'll check the tests and have a PR up soon.
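The end-to-end behavior the patch targets can be checked with public API alone; on a pandas build that includes the fix, scalar assignment preserves the tz-aware dtype:

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1]})
df['now'] = pd.Timestamp('20130101', tz='UTC')
# With the fix, the column is tz-aware instead of object dtype
print(df.dtypes['now'])  # datetime64[ns, UTC]
```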

DylanDmitri added a commit to DylanDmitri/pandas that referenced this issue Mar 2, 2018

@DylanDmitri DylanDmitri referenced this issue Mar 2, 2018

Merged

fix: scalar timestamp assignment (#19843) #19973


@jreback jreback modified the milestones: 0.23.0, Next Major Release Apr 14, 2018

jreback added a commit that referenced this issue Aug 2, 2018

minggli added a commit to minggli/pandas that referenced this issue Aug 5, 2018

merge master
* master: (47 commits)
  Run tests in conda build [ci skip] (pandas-dev#22190)
  TST: Check DatetimeIndex.drop on DST boundary (pandas-dev#22165)
  CI: Fix Travis failures due to lint.sh on pandas/core/strings.py (pandas-dev#22184)
  Documentation: typo fixes in MultiIndex / Advanced Indexing (pandas-dev#22179)
  DOC: added .join to 'see also' in Series.str.cat (pandas-dev#22175)
  DOC: updated Series.str.contains see also section (pandas-dev#22176)
  0.23.4 whatsnew (pandas-dev#22177)
  fix: scalar timestamp assignment (pandas-dev#19843) (pandas-dev#19973)
  BUG: Fix get dummies unicode error (pandas-dev#22131)
  Fixed py36-only syntax [ci skip] (pandas-dev#22167)
  DEPR: pd.read_table (pandas-dev#21954)
  DEPR: Removing previously deprecated datetools module (pandas-dev#6581) (pandas-dev#19119)
  BUG: Matplotlib scatter datetime (pandas-dev#22039)
  CLN: Use public method to capture UTC offsets (pandas-dev#22164)
  implement tslibs/src to make tslibs self-contained (pandas-dev#22152)
  Fix categorical from codes nan 21767 (pandas-dev#21775)
  BUG: Better handling of invalid na_option argument for groupby.rank(pandas-dev#22124) (pandas-dev#22125)
  use memoryviews instead of ndarrays (pandas-dev#22147)
  Remove depr. warning in SeriesGroupBy.count (pandas-dev#22155)
  API: Default to_* methods to compression='infer' (pandas-dev#22011)
  ...