When assigning a timedelta64 array to a subset of a new column of a DataFrame, missing data is not filled with NaT as expected; rather, the new column is cast to float64 and NaN is used instead. This cast does not usually occur when all values are present, except when there are already float64 columns but no timedelta64 columns in the DataFrame and indexing is done through .ix or .loc.
It's possible these should be two separate issues.
There are a lot of issues involving NaT in the issue tracker; I'm not 100% sure that this isn't a duplicate. (Nor am I 100% sure this isn't intended behavior, but if it is I'd expect it to be documented more prominently.)
import numpy as np
import pandas as pd

one_hour = 60*60*10**9

temp = pd.DataFrame({}, index=pd.date_range('2014-1-1', periods=4))
temp['A'] = np.array([1*one_hour]*4, dtype='m8[ns]')
temp.loc[:,'B'] = np.array([2*one_hour]*4, dtype='m8[ns]')
temp.loc[:3,'C'] = np.array([3*one_hour]*3, dtype='m8[ns]')
temp.ix[:,'D'] = np.array([4*one_hour]*4, dtype='m8[ns]')
temp.ix[:3,'E'] = np.array([5*one_hour]*3, dtype='m8[ns]')
temp['F'] = np.timedelta64('NaT')
temp.ix[:-1,'F'] = np.array([6*one_hour]*3, dtype='m8[ns]')
temp
#                  A        B             C        D             E        F
#2014-01-01 01:00:00 02:00:00  1.080000e+13 04:00:00  1.800000e+13 06:00:00
#2014-01-02 01:00:00 02:00:00  1.080000e+13 04:00:00  1.800000e+13 06:00:00
#2014-01-03 01:00:00 02:00:00  1.080000e+13 04:00:00  1.800000e+13 06:00:00
#2014-01-04 01:00:00 02:00:00           NaN 04:00:00           NaN      NaT
#
# [4 rows x 6 columns]

temp = pd.DataFrame({}, index=pd.date_range('2014-1-1', periods=4))
# Partial assignment converts
temp.ix[:-1,'A'] = np.array([1*one_hour]*3, dtype='m8[ns]')
# DataFrame is all floats; converts
temp.ix[:,'B'] = np.array([2*one_hour]*4, dtype='m8[ns]')
# .ix and .loc behave the same
temp.loc[:,'C'] = np.array([3*one_hour]*4, dtype='m8[ns]')
# straight column assignment doesn't convert
temp['D'] = np.array([4*one_hour]*4, dtype='m8[ns]')
# Now there are timedeltas; doesn't convert
temp.ix[:,'E'] = np.array([5*one_hour]*4, dtype='m8[ns]')
# .ix and .loc still behave the same
temp.loc[:,'F'] = np.array([6*one_hour]*4, dtype='m8[ns]')
temp
#                       A             B             C        D        E  \
#2014-01-01  3.600000e+12  7.200000e+12  1.080000e+13 04:00:00 05:00:00
#2014-01-02  3.600000e+12  7.200000e+12  1.080000e+13 04:00:00 05:00:00
#2014-01-03  3.600000e+12  7.200000e+12  1.080000e+13 04:00:00 05:00:00
#2014-01-04           NaN  7.200000e+12  1.080000e+13 04:00:00 05:00:00
#
#                  F
#2014-01-01 06:00:00
#2014-01-02 06:00:00
#2014-01-03 06:00:00
#2014-01-04 06:00:00
#
# [4 rows x 6 columns]

temp = pd.DataFrame({}, index=pd.date_range('2014-1-1', periods=4))
# No columns yet, no conversion
temp.ix[:,'A'] = np.array([2*one_hour]*4, dtype='m8[ns]')
#                  A
#2014-01-01 02:00:00
#2014-01-02 02:00:00
#2014-01-03 02:00:00
#2014-01-04 02:00:00
#
# [4 rows x 1 columns]
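
Column F in the first example suggests a possible workaround until this is resolved: create the column as timedelta64 filled with NaT first, then do the partial assignment. The sketch below is only an illustration of that idea, not a fix. It assumes a pandas version where `.ix` is unavailable, so it selects rows by label through `df.index[:3]`; the names `df`, `as_float` and `recovered` are made up for the example. The second half shows one way a column that has already been cast to float64 might be converted back with `pd.to_timedelta`.

```python
import numpy as np
import pandas as pd

one_hour = 60 * 60 * 10**9  # nanoseconds in one hour

df = pd.DataFrame(index=pd.date_range('2014-1-1', periods=4))

# Pre-create the column with a timedelta64[ns] dtype, filled with NaT,
# so the later partial assignment has no reason to fall back to float64.
df['F'] = pd.Series(pd.NaT, index=df.index, dtype='timedelta64[ns]')

# Partial assignment by label: only the first three rows get a value.
df.loc[df.index[:3], 'F'] = np.array([6 * one_hour] * 3, dtype='m8[ns]')

# If a column has already ended up as float64 nanoseconds with NaN,
# pd.to_timedelta can convert it back; NaN becomes NaT.
as_float = pd.Series([3.0 * one_hour] * 3 + [np.nan], index=df.index)
recovered = pd.to_timedelta(as_float, unit='ns')

print(df.dtypes)
print(recovered)
```

Whether pre-initialising is acceptable depends on the use case; it only sidesteps the cast and says nothing about whether the cast itself is intended.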