better error message for to_datetime #4928

Closed
cpcloud opened this issue Sep 22, 2013 · 6 comments · Fixed by #5157

Comments

@cpcloud
Member

commented Sep 22, 2013

cc @danbirken

In [2]: to_datetime([1,'1'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-1d4cd9e078aa> in <module>()
----> 1 to_datetime([1,'1'])

/home/phillip/Documents/code/py/pandas/pandas/tseries/tools.pyc in to_datetime(arg, errors, dayfirst, utc, box, format, coerce, unit)
    137         return Series(values, index=arg.index, name=arg.name)
    138     elif com.is_list_like(arg):
--> 139         return _convert_listlike(arg, box=box)
    140
    141     return _convert_listlike(np.array([ arg ]), box=box)[0]

/home/phillip/Documents/code/py/pandas/pandas/tseries/tools.pyc in _convert_listlike(arg, box)
    117                 result = tslib.array_to_datetime(arg, raise_=errors == 'raise',
    118                                                  utc=utc, dayfirst=dayfirst,
--> 119                                                  coerce=coerce, unit=unit)
    120             if com.is_datetime64_dtype(result) and box:
    121                 result = DatetimeIndex(result, tz='utc' if utc else None)

/home/phillip/Documents/code/py/pandas/pandas/tslib.so in pandas.tslib.array_to_datetime (pandas/tslib.c:16487)()

TypeError: object of type 'int' has no len()
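
The last line of the traceback is the message Python itself emits when `len()` is applied to an integer. A minimal sketch in plain Python, assuming (the traceback does not confirm this) that the parsing path treats every element as a string and ends up taking the length of the integer `1`. `naive_lengths` is a hypothetical stand-in, not pandas code:

```python
def naive_lengths(values):
    """Hypothetical stand-in that treats every element as string-like."""
    return [len(v) for v in values]


try:
    naive_lengths([1, '1'])
except TypeError as exc:
    # The message names the offending type but says nothing about dates,
    # which is why the error is confusing when it surfaces from to_datetime().
    message = str(exc)
    print(message)
```

Nothing in that message points the caller at the mixed-type list, which is the motivation for asking for a better error.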
@danbirken

Contributor

commented Sep 22, 2013

Yeah this is a little strange. I don't know enough about the history of these functions, but in the documentation of to_datetime() it says:

arg : string, datetime, array of strings (with possible NAs)

However, array_to_datetime(), and by proxy to_datetime(), support certain cases of being passed arrays of ints or floats:

In [2]: pd.to_datetime([1, 2])
Out[2]:
<class 'pandas.tseries.index.DatetimeIndex'>
[1970-01-01 00:00:00.000000001, 1970-01-01 00:00:00.000000002]
Length: 2, Freq: None, Timezone: None

In [3]: pd.to_datetime([1.5, 2.5])
Out[3]:
<class 'pandas.tseries.index.DatetimeIndex'>
[1970-01-01 00:00:00.000000001, 1970-01-01 00:00:00.000000002]
Length: 2, Freq: None, Timezone: None
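
The two outputs above are consistent with numbers being read as nanoseconds since 1970-01-01, with the fractional part of a float dropped. That reading is inferred only from the output shown here, and the sketch below uses the standard library rather than pandas. `ns_to_text` is a hypothetical helper:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def ns_to_text(ns):
    """Format a count of nanoseconds since the epoch (hypothetical helper)."""
    # int() truncates a float such as 2.5 down to 2, matching the output above.
    seconds, remainder = divmod(int(ns), 1_000_000_000)
    stamp = EPOCH + timedelta(seconds=seconds)
    # datetime only resolves microseconds, so the nanosecond part is
    # formatted separately as nine zero-padded digits.
    return "{}.{:09d}".format(stamp.strftime("%Y-%m-%d %H:%M:%S"), remainder)


print(ns_to_text(1))
print(ns_to_text(2.5))
```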

So I can:

a) Make it so you must pass an array of strings to to_datetime(), otherwise throw a nice exception that says you passed in a list that wasn't strings

b) Make it so array_to_datetime() (and by proxy to_datetime()) returns the same array back in case it cannot be processed. So like:

# Current behavior for two un-processable strings
In [4]: pd.to_datetime(['1', '2'])
Out[4]: array(['1', '2'], dtype=object)

# Potential behavior for your input.  Un-processable, just return it back
In [4]: pd.to_datetime([1, '1'])
Out[4]: array([1, '1'], dtype=object)

c) some other option I didn't think of
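
For comparison, option a) could look roughly like the sketch below: a check that runs before any parsing and names the offending element. `check_all_strings` and the wording of its message are hypothetical, not part of pandas:

```python
def check_all_strings(values):
    """Option a) sketch: refuse any list that is not made up only of strings."""
    for position, value in enumerate(values):
        if not isinstance(value, str):
            raise TypeError(
                "expected only strings, but element {} is {!r} of type {}".format(
                    position, value, type(value).__name__
                )
            )
    return values


try:
    check_all_strings([1, '1'])
except TypeError as exc:
    print(exc)
```

The drawback, given the integer and float cases shown earlier, is that a check like this would also reject inputs such as `[1, 2]` that currently convert.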

@jreback

Contributor

commented Sep 22, 2013

There are a bunch of test cases, I think in tseries/test/test_timeseries, IIRC.

On errors it will just return the input, unless errors='raise'; if coerce=True, it will turn the errors into NaT.

@jreback

Contributor

commented Oct 7, 2013

@danbirken can you address this one? I think that b) is the correct response here (it will just return it to you), unless errors='raise' is passed, in which case it will raise.

Although, since the input in this case is completely bogus, maybe you could raise...
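
The three outcomes discussed in this thread (return the input untouched, raise when errors='raise', or turn failures into NaT when coerce=True) can be sketched in plain Python as follows. This is not the pandas implementation: `convert_all`, its `fmt` parameter, the default `errors="ignore"`, the use of `None` in place of NaT, and the choice to let errors='raise' win over coerce=True are all assumptions made for illustration:

```python
from datetime import datetime


def convert_all(values, errors="ignore", coerce=False, fmt="%Y-%m-%d"):
    """Hypothetical sketch of the behaviors discussed; not pandas code."""
    converted = []
    failed = False
    for value in values:
        try:
            converted.append(datetime.strptime(value, fmt))
        except (TypeError, ValueError):
            # TypeError covers non-string elements, ValueError unparseable text.
            if errors == "raise":
                raise ValueError(
                    "cannot convert {!r} to a datetime".format(value)
                )
            failed = True
            converted.append(None)  # None stands in for NaT here
    if failed and not coerce:
        # Option b): hand the caller back exactly what was passed in.
        return list(values)
    return converted


print(convert_all([1, '1']))
print(convert_all(['2013-09-22', 'bogus'], coerce=True))
```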

danbirken added a commit to danbirken/pandas that referenced this issue Oct 8, 2013
@danbirken

Contributor

commented Oct 8, 2013

Sorry for the delay in responding -- I'm still on a very long vacation :)

I think this change should do it.

@jreback

Contributor

commented Oct 8, 2013

Great... just roll that into a PR and add a release notes entry. Thanks!

@danbirken

Contributor

commented Oct 8, 2013

Put into PR, release notes should be in there.

@jreback jreback closed this in #5157 Oct 8, 2013

jreback added a commit that referenced this issue Oct 8, 2013
Merge pull request #5157 from danbirken/gh4928
BUG: Fix to_datetime() uncaught error with unparseable inputs #4928