DataFrame.from_records doesn't handle missing dates (None) #6140

Closed
dbew opened this issue Jan 28, 2014 · 3 comments · Fixed by #6142
Labels: Bug, Dtype Conversions (unexpected or buggy dtype conversions)

@dbew

dbew commented Jan 28, 2014

When you construct a DataFrame from a numpy recarray with datetime data and None for missing dates, you get an error:

import datetime
import numpy as np
import pandas as pd

arrdata = [np.array([datetime.datetime(2005, 3, 1, 0, 0), None])]
dtypes = [('EXPIRY', '<M8[m]')]
recarray = np.core.records.fromarrays(arrdata, dtype=dtypes)

df = pd.DataFrame.from_records(recarray)
Traceback (most recent call last):
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/ipython-1.1.0_1_ahl1-py2.7.egg/IPython/core/interactiveshell.py", line 2830, in run_code
    exec code_obj in self.user_global_ns, self.user_ns
  File "<ipython-input-33-ae01f48c3b82>", line 1, in <module>
    df = pd.DataFrame.from_records(recarray)
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_408_g464c1f9-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 841, in from_records
    columns)
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_408_g464c1f9-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 4473, in _arrays_to_mgr
    return create_block_manager_from_arrays(arrays, arr_names, axes)
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_408_g464c1f9-py2.7-linux-x86_64.egg/pandas/core/internals.py", line 3748, in create_block_manager_from_arrays
    construction_error(len(arrays), arrays[0].shape[1:], axes, e)
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_408_g464c1f9-py2.7-linux-x86_64.egg/pandas/core/internals.py", line 3720, in construction_error
    passed,implied))
ValueError: Shape of passed values is (1,), indices imply (1, 2)

This is a regression from 0.11.0. Stepping through the code, it looks like the initial error is raised in tslib.cast_to_nanoseconds and is then caught and re-raised in create_block_manager_from_arrays.

Incidentally, construction does work from a simple array instead of a recarray:

pd.DataFrame(np.array([datetime.datetime(2005, 3, 1, 0, 0), None]))
Out[36]: 
           0
0 2005-03-01
1        NaT

[2 rows x 1 columns]
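
One visible difference between the two inputs is the dtype each one carries. The sketch below only illustrates that difference and is not a diagnosis of the bug. It assumes the minute-resolution column can be built directly with 'NaT' (rather than from None), it uses np.rec.fromarrays, and the variable names are made up for the example.

```python
import datetime
import numpy as np

# Plain array of a datetime and None: NumPy stores these as generic Python objects.
plain = np.array([datetime.datetime(2005, 3, 1, 0, 0), None])

# Minute-resolution column with a missing value, built directly for illustration.
minutes = np.array(['2005-03-01T00:00', 'NaT'], dtype='datetime64[m]')
rec = np.rec.fromarrays([minutes], dtype=[('EXPIRY', '<M8[m]')])

print(plain.dtype)          # object
print(rec.dtype['EXPIRY'])  # datetime64[m]
```

So the plain-array case hands pandas object data, while the recarray field is already datetime64 at a non-nanosecond resolution.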
@jreback

jreback commented Jan 28, 2014

Hmm, we're probably not testing this case.
Does this happen on master too?

@dbew

dbew commented Jan 28, 2014

Yes, this is on master.

@jreback

jreback commented Jan 28, 2014

This was a bug: the datetime64[m] data wasn't being converted. FYI, pandas keeps datetimes internally as datetime64[ns].

thanks!
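
For readers unfamiliar with datetime64 units, the conversion mentioned above can be sketched in plain NumPy. This is a minimal illustration of casting minute-resolution values to nanoseconds while keeping missing values as NaT. It is not the code from the fix, and the variable names are assumptions made for the example.

```python
import numpy as np

# Minute-resolution datetimes with one missing value (NaT).
arr_m = np.array(['2005-03-01T00:00', 'NaT'], dtype='datetime64[m]')

# Cast to nanosecond resolution, the unit pandas uses internally.
arr_ns = arr_m.astype('datetime64[ns]')

print(arr_ns.dtype)      # datetime64[ns]
print(np.isnat(arr_ns))  # marks which entries are missing
```

The cast changes only the unit: the first value still refers to the same instant, and the missing entry stays NaT.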
