BUG: where behaves badly when dtype of self is datetime or timedelta, and dtype of other is not #9804

Closed

@evanpw evanpw commented Apr 3, 2015

There are a few weird behaviors that this fixes:

  1. If other is an int or float, then where raises an exception while trying to call other.view
  2. If other is an np.int64, it works fine
  3. If other is an np.float64, there is no error, but the result is bizarre (it reinterprets the bits of the float as an integer, rather than casting it)
  4. If other is list-like and has a numerical dtype, then where raises an exception while trying to call all on the value False (for some reason, comparing an ndarray of dtype datetime64 with an ndarray of integer dtype just returns a scalar False rather than a boolean ndarray)
>>> s = Series(date_range('20130102', periods=2))
>>> s.where(s.isnull(), 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/evanpw/Workspace/pandas/pandas/core/generic.py", line 3399, in where
    try_cast=try_cast)
  File "/home/evanpw/Workspace/pandas/pandas/core/internals.py", line 2469, in where
    return self.apply('where', **kwargs)
  File "/home/evanpw/Workspace/pandas/pandas/core/internals.py", line 2451, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/evanpw/Workspace/pandas/pandas/core/internals.py", line 1080, in where
    result = func(cond, values, other)
  File "/home/evanpw/Workspace/pandas/pandas/core/internals.py", line 1063, in func
    v, o = self._try_coerce_args(v, o)
  File "/home/evanpw/Workspace/pandas/pandas/core/internals.py", line 1819, in _try_coerce_args
    other = other.view('i8')
AttributeError: 'int' object has no attribute 'view'
>>> s.where(s.isnull(), np.int64(10))
0   1970-01-01 00:00:00.000000010
1   1970-01-01 00:00:00.000000010
dtype: datetime64[ns]
>>> s.where(s.isnull(), np.float64(10))
0   2116-06-17 06:38:37.588971520
1   2116-06-17 06:38:37.588971520
dtype: datetime64[ns]
>>> s.where(s.isnull(), [0, 1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/evanpw/Workspace/pandas/pandas/core/generic.py", line 3326, in where
    if not (new_other == np.array(other)).all():
AttributeError: 'bool' object has no attribute 'all'
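
The np.float64 result above is consistent with a bitwise reinterpretation rather than a value cast. The following is a minimal sketch of that arithmetic, using only the standard library; struct is a stand-in here for what other.view('i8') does to the eight bytes of a float, and the helper name is hypothetical, not part of pandas or NumPy.

```python
import struct
from datetime import datetime, timedelta


def reinterpret_float_bits_as_int64(x: float) -> int:
    """Return the int64 whose 8 bytes equal the IEEE 754 encoding of x.

    Hypothetical helper for illustration only.
    """
    return struct.unpack("<q", struct.pack("<d", x))[0]


# What a bitwise view of the float 10.0 yields.
bits = reinterpret_float_bits_as_int64(10.0)

# What a value cast of the float 10.0 yields.
cast = int(10.0)

# Interpret the reinterpreted integer as nanoseconds since the Unix epoch.
epoch = datetime(1970, 1, 1)
seconds, nanos = divmod(bits, 10**9)
reinterpreted = epoch + timedelta(seconds=seconds)

print(bits)                  # 4621819117588971520
print(reinterpreted, nanos)  # 2116-06-17 06:38:37 588971520
print(cast)                  # 10
```

The reinterpreted value lands on the same timestamp as the np.float64(10) output, while the value cast gives 10, which matches the np.int64(10) case.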

@jreback added the Bug and Dtype Conversions (Unexpected or buggy dtype conversions) labels Apr 4, 2015
@jreback added this to the 0.16.1 milestone Apr 4, 2015
values = values.view('i8')

Contributor

I would like to see if we can consolidate some of this logic into com._infer_fill_value (which may need a bit more logic), but I think it should work.

Contributor Author

I'm not sure I follow. We already know the source and destination dtypes: datetime64 -> int64 or timedelta64 -> float64. What do we need to infer?

Contributor

It's not that we need to infer; my point is that a very similar code block exists in several places in the code base. It needs to be consolidated to avoid repeating it again. Please look through and see where it is duplicated (e.g. in internals.py as well).

Contributor Author

I'm sorry, I still don't see the common functionality. You're talking about this function:

def _try_coerce_args(self, values, other):
    """ Coerce values and other to dtype 'i8'. NaN and NaT convert to
        the smallest i8, and will correctly round-trip to NaT if converted
        back in _try_coerce_result. values is always ndarray-like, other
        may not be """
    values = values.view('i8')

    if is_null_datelike_scalar(other):
        other = tslib.iNaT
    elif isinstance(other, datetime):
        other = lib.Timestamp(other).asm8.view('i8')
    elif hasattr(other, 'dtype') and com.is_integer_dtype(other):
        other = other.view('i8')
    else:
        other = np.array(other, dtype='i8')

    return values, other

and this one?

def _infer_fill_value(val):
    """
    infer the fill value for the nan/NaT from the provided scalar/ndarray/list-like
    if we are a NaT, return the correct dtyped element to provide proper block construction

    """

    if not is_list_like(val):
        val = [val]
    val = np.array(val,copy=False)
    if is_datetimelike(val):
        return np.array('NaT',dtype=val.dtype)
    elif is_object_dtype(val.dtype):
        dtype = lib.infer_dtype(_ensure_object(val))
        if dtype in ['datetime','datetime64']:
            return np.array('NaT',dtype=_NS_DTYPE)
        elif dtype in ['timedelta','timedelta64']:
            return np.array('NaT',dtype=_TD_DTYPE)
    return np.nan

I haven't seen any code block that looks similar to _try_coerce_args elsewhere either. Could you give an example?

jreback commented Apr 14, 2015

merged via 1f9b699

thanks!

@jreback closed this Apr 14, 2015
@evanpw deleted the datetime_where branch June 10, 2015 13:41
@evanpw restored the datetime_where branch September 19, 2015 00:34
@evanpw deleted the datetime_where branch September 19, 2015 00:38