REGR? no error anymore when converting out of bounds datetime64[non-ns] data #26206

Open
jorisvandenbossche opened this issue Apr 24, 2019 · 5 comments

@jorisvandenbossche
Member

commented Apr 24, 2019

I didn't find a directly related issue, but on master / 0.24 / 0.23 we see:

In [1]: pd.Series(np.array(['2262-04-12'], dtype='datetime64[D]'))
Out[1]: 
0   1677-09-21 00:25:26.290448384
dtype: datetime64[ns]

while on pandas 0.22.0:

In [1]: pd.Series(np.array(['2262-04-12'], dtype='datetime64[D]'))
---------------------------------------------------------------------------
OutOfBoundsDatetime                       Traceback (most recent call last)
<ipython-input-1-b3f7cbbf1054> in <module>()
----> 1 pd.Series(np.array(['2262-04-12'], dtype='datetime64[D]'))

~/miniconda3/envs/pandas022/lib/python3.6/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    264                                        raise_cast_failure=True)
    265 
--> 266                 data = SingleBlockManager(data, index, fastpath=True)
    267 
    268         generic.NDFrame.__init__(self, data, fastpath=True)

~/miniconda3/envs/pandas022/lib/python3.6/site-packages/pandas/core/internals.py in __init__(self, block, axis, do_integrity_check, fastpath)
   4400         if not isinstance(block, Block):
   4401             block = make_block(block, placement=slice(0, len(axis)), ndim=1,
-> 4402                                fastpath=True)
   4403 
   4404         self.blocks = [block]

~/miniconda3/envs/pandas022/lib/python3.6/site-packages/pandas/core/internals.py in make_block(values, placement, klass, ndim, dtype, fastpath)
   2955                      placement=placement, dtype=dtype)
   2956 
-> 2957     return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
   2958 
   2959 # TODO: flexible with index=None and/or items=None

~/miniconda3/envs/pandas022/lib/python3.6/site-packages/pandas/core/internals.py in __init__(self, values, placement, fastpath, **kwargs)
   2468     def __init__(self, values, placement, fastpath=False, **kwargs):
   2469         if values.dtype != _NS_DTYPE:
-> 2470             values = tslib.cast_to_nanoseconds(values)
   2471 
   2472         super(DatetimeBlock, self).__init__(values, fastpath=True,

pandas/_libs/tslib.pyx in pandas._libs.tslib.cast_to_nanoseconds()

pandas/_libs/tslib.pyx in pandas._libs.tslib._check_dts_bounds()

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2262-04-12 00:00:00

In [2]: pd.__version__
Out[2]: '0.22.0'
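
The wrapped value shown on master is consistent with a plain int64 overflow when the day count is multiplied out to nanoseconds. A minimal standard-library sketch of that arithmetic follows. It is an illustration only, not the pandas code path, and `wrap_int64` is a hypothetical helper:

```python
from datetime import date, datetime, timedelta

NS_PER_DAY = 86_400 * 10**9
INT64_MAX = 2**63 - 1


def wrap_int64(n):
    """Reduce an arbitrary Python int to signed 64-bit two's-complement range."""
    return (n + 2**63) % 2**64 - 2**63


# Days since the Unix epoch for the date in the report.
days = (date(2262, 4, 12) - date(1970, 1, 1)).days
ns = days * NS_PER_DAY

# The exact nanosecond count does not fit in int64 ...
overflowed = ns > INT64_MAX

# ... so an unchecked cast wraps around to a large negative value.
wrapped = wrap_int64(ns)

# datetime only has microsecond resolution, so keep the last 3 digits apart.
whole_us, rem_ns = divmod(wrapped, 1000)
wrapped_dt = datetime(1970, 1, 1) + timedelta(microseconds=whole_us)

print(overflowed, wrapped_dt, rem_ns)
```

The wrapped timestamp plus the leftover nanoseconds lines up with the `1677-09-21 00:25:26.290448384` in the output above, which suggests the conversion itself still happens but the bounds check is skipped.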

cc @jbrockmendel any idea whether this was changed on purpose, or which refactoring could have caused it?

@jorisvandenbossche

Member Author

commented Apr 24, 2019

Additional observation: only the Series and DataFrame constructors seem to have this issue. Others, like pd.array, pd.to_datetime, and pd.Index, all still raise the OutOfBoundsDatetime error.

It might be that #18231 is the cause (it touches maybe_castable, which led to sanitize_array no longer returning the original datetime64[D] data). If that is the case, it was an unintentional side effect and should be fixed differently now, since the change to maybe_castable in that PR seems logical.

@jbrockmendel

Member

commented Apr 24, 2019

Off the top of my head I don't know where in the DTA refactor process this would have been changed. maybe_castable seems like a reasonable guess for a place to look.

@jorisvandenbossche

Member Author

commented Apr 24, 2019

Yeah, if my observation above is correct, this has nothing to do with the DTA refactoring. In fact, Series/DataFrame construction should probably make more use of the "array creation from different kinds of data" functionality gathered in arrays/datetimes.py, as that handles this case correctly.
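
To make the idea concrete, here is a rough standard-library sketch of a single checked conversion that every constructor could route through. All names are hypothetical (`OutOfBoundsError` is only a stand-in for OutOfBoundsDatetime), and this is not how pandas implements it:

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1
NS_PER_DAY = 86_400 * 10**9


class OutOfBoundsError(ValueError):
    """Hypothetical stand-in for pandas' OutOfBoundsDatetime."""


def days_to_ns_checked(day_counts):
    """Convert day counts since the epoch to nanoseconds, refusing to wrap."""
    out = []
    for d in day_counts:
        ns = d * NS_PER_DAY  # exact, Python ints do not overflow
        # INT64_MIN is excluded because numpy/pandas use it as the NaT sentinel.
        if not INT64_MIN < ns <= INT64_MAX:
            raise OutOfBoundsError(f"day count {d} does not fit in int64 nanoseconds")
        out.append(ns)
    return out


# Day 106751 (2262-04-11) still fits; day 106752 (2262-04-12) does not.
in_range = days_to_ns_checked([0, 106751])
try:
    days_to_ns_checked([106752])
    raised = False
except OutOfBoundsError:
    raised = True
```

The point of the sketch is only that the bounds check lives next to the unit conversion, so a caller cannot get the converted values without also getting the check.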

@gfyoung gfyoung added the Timeseries label Apr 26, 2019

@gfyoung

Member

commented Apr 26, 2019

jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue Jun 17, 2019

@jreback jreback added this to the 0.25.0 milestone Jun 17, 2019

jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue Jun 18, 2019

jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue Jun 21, 2019

@jreback jreback modified the milestones: 0.25.0, 1.0 Jun 28, 2019

@Vinci08

commented Jul 22, 2019

I am having an out of bounds error, which forces me to downgrade pandas to 0.24.2. Here's my code:

 df['DAT'] = pd.Series(df['DAT'].values, dtype='datetime64[ns]')
 df['DAT'] = pd.to_datetime(df['DAT'], errors='coerce').dt.strftime('%m-%d-%Y')

It gave me this error:

Traceback (most recent call last):
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\arrays\datetimes.py", line 1979, in objects_to_datetime64ns
    values, tz_parsed = conversion.datetime_to_datetime64(data)
  File "pandas\_libs\tslibs\conversion.pyx", line 200, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'datetime.date'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "I:\spreadsheet\run_dat.py", line 196, in <module>
    run_and_pull()
  File "I:\spreadsheet\run_dat.py", line 93, in run_and_pull
    df[col] = pd.Series(df[col].values, dtype='datetime64[ns]')
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\series.py", line 311, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\internals\construction.py", line 664, in sanitize_array
    subarr = _try_cast(data, dtype, copy, raise_cast_failure)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\internals\construction.py", line 784, in _try_cast
    subarr = maybe_cast_to_datetime(arr, dtype)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\dtypes\cast.py", line 1052, in maybe_cast_to_datetime
    value = to_datetime(value, errors=errors)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\util\_decorators.py", line 208, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\tools\datetimes.py", line 787, in to_datetime
    cache_array = _maybe_cache(arg, format, cache, convert_listlike)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\tools\datetimes.py", line 156, in _maybe_cache
    cache_dates = convert_listlike(unique_dates, True, format)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\tools\datetimes.py", line 460, in _convert_listlike_datetimes
    allow_object=True,
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\arrays\datetimes.py", line 1984, in objects_to_datetime64ns
    raise e
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\arrays\datetimes.py", line 1975, in objects_to_datetime64ns
    require_iso8601=require_iso8601,
  File "pandas\_libs\tslib.pyx", line 465, in pandas._libs.tslib.array_to_datetime
  File "pandas\_libs\tslib.pyx", line 683, in pandas._libs.tslib.array_to_datetime
  File "pandas\_libs\tslib.pyx", line 679, in pandas._libs.tslib.array_to_datetime
  File "pandas\_libs\tslib.pyx", line 555, in pandas._libs.tslib.array_to_datetime
  File "pandas\_libs\tslibs\np_datetime.pyx", line 118, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00

I am by no means an expert, so if something on my end caused this issue, please point it out. However, I ran the file with pandas 0.24.2 without any issue.
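
A possible workaround, sketched under the assumption that the column holds plain datetime.date objects (as the TypeError in the traceback suggests) and that values such as 0001-01-01 are placeholders that can be dropped. `mask_out_of_bounds` is a hypothetical helper, not a pandas function:

```python
from datetime import date

# Calendar days that fit entirely inside the datetime64[ns] range.
NS_MIN_DATE = date(1677, 9, 22)
NS_MAX_DATE = date(2262, 4, 11)


def mask_out_of_bounds(values):
    """Replace anything that is not an in-range datetime.date with None."""
    return [
        v if type(v) is date and NS_MIN_DATE <= v <= NS_MAX_DATE else None
        for v in values
    ]


cleaned = mask_out_of_bounds([date(1, 1, 1), date(2019, 7, 22), date(2262, 4, 12)])
print(cleaned)
```

The cleaned list can then be handed to pd.to_datetime, which turns None into NaT. Calling pd.to_datetime(..., errors='coerce') directly on the raw values, without the preceding pd.Series(..., dtype='datetime64[ns]') step, may achieve the same result.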
