
Unwanted conversion of Timestamp to long #3593

Closed
eike-welk opened this issue May 13, 2013 · 5 comments · Fixed by #3595
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions

Comments

@eike-welk

The combine_first method performs an unwanted conversion of Timestamp to long. The pandas version is 0.11.0. Here is an IPython session demonstrating the behavior, which is IMHO a bug:

In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: '0.11.0'

In [4]: from datetime import datetime

In [5]: df0 = pd.DataFrame({"a":[datetime(2000, 1, 1), datetime(2000, 1, 2), datetime(2000, 1, 3)]})

In [6]: df0
Out[6]: 
                    a
0 2000-01-01 00:00:00
1 2000-01-02 00:00:00
2 2000-01-03 00:00:00

In [7]: df1 = pd.DataFrame({"a":[None, None, None]})

In [8]: df1
Out[8]: 
      a
0  None
1  None
2  None

In [9]: df2 = df1.combine_first(df0)

In [10]: df2
Out[10]: 
                    a
0  946684800000000000
1  946771200000000000
2  946857600000000000

In [11]: type(df2["a"][0])
Out[11]: long

In [12]: type(df0["a"][0])
Out[12]: pandas.tslib.Timestamp

I think df2 should contain time stamps like df0.
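A quick way to make the unwanted conversion visible, beyond eyeballing the printed frames, is to compare the column dtypes before and after combine_first. The following is a minimal sketch (not from the original report) assuming the same df0 and df1 as above; on 0.11.0 the combined column comes back as integers rather than datetime64[ns].

import pandas as pd
from datetime import datetime

df0 = pd.DataFrame({"a": [datetime(2000, 1, 1), datetime(2000, 1, 2), datetime(2000, 1, 3)]})
df1 = pd.DataFrame({"a": [None, None, None]})

# df0["a"] is datetime64[ns]; after combine_first the column should keep that
# dtype, but on 0.11.0 it comes back as plain integers (nanoseconds since the
# epoch), which is the conversion reported above.
print(df0.dtypes)
print(df1.combine_first(df0).dtypes)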

@jreback
Contributor

jreback commented May 13, 2013

this is a bug, fixed in #3595 (merging soon)

@jreback
Contributor

jreback commented May 13, 2013

FYI, using None can sometimes give weird results; prefer np.nan. None is normally converted to np.nan, but if you have a series of all None this conversion does not happen and you end up with object dtype. You can solve this by specifying dtype=float64, or just use np.nan.
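A minimal sketch (not part of the original comment) of the dtype difference described here: a series built from all None stays object dtype, while np.nan or an explicit dtype gives a proper float64 column.

import numpy as np
import pandas as pd

# All-None input is left as object dtype; the Nones are not converted to NaN.
s_none = pd.Series([None, None, None])
print(s_none.dtype)   # object

# Using np.nan (or passing dtype explicitly) yields a float64 column instead.
s_nan = pd.Series([np.nan, np.nan, np.nan])
print(s_nan.dtype)    # float64

s_float = pd.Series([None, None, None], dtype="float64")
print(s_float.dtype)  # float64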

@jreback
Contributor

jreback commented May 13, 2013

merged - thanks for the report

@jreback jreback closed this as completed May 13, 2013
@eike-welk
Author

Thanks for fixing this bug so fast!

And thanks for mentioning None vs. nan in a column that contains timestamps. These Nones are leftovers from when I used pandas 0.7.x (a few days ago), which would not convert datetime objects into Timestamp objects (and further into int64, I suppose).

@jreback
Contributor

jreback commented May 14, 2013

The None issue was referring to how the series are actually created. For datetime64[ns], missing values are represented by NaT (setting an element with np.nan is converted to NaT). A bit confusing, but unfortunately this is how numpy deals with types. See here: http://pandas.pydata.org/pandas-docs/dev/missing_data.html#datetimes
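A small sketch (illustrative, not from the thread) of the NaT behavior described above: assigning np.nan into a datetime64[ns] series stores NaT rather than a float NaN, and the dtype is unchanged.

import numpy as np
import pandas as pd
from datetime import datetime

s = pd.Series([datetime(2000, 1, 1), datetime(2000, 1, 2), datetime(2000, 1, 3)])
print(s.dtype)   # datetime64[ns]

# np.nan assigned into a datetime64[ns] series is stored as NaT,
# the datetime-specific missing-value marker.
s[1] = np.nan
print(s[1])      # NaT
print(s.dtype)   # datetime64[ns]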
