Fix arithmetic errors with timedelta64 dtypes #22390

Merged
merged 14 commits into from
Aug 23, 2018

@jbrockmendel jbrockmendel commented Aug 16, 2018

Kind of surprising, but I don't see any Issues for these bugs. Might be because many of them were found after parametrizing arithmetic tests, so instead of Issues we had xfails.

xref #20088
xref #22163

  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

codecov bot commented Aug 16, 2018

Codecov Report

Merging #22390 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #22390      +/-   ##
==========================================
- Coverage   92.04%   92.04%   -0.01%     
==========================================
  Files         169      169              
  Lines       50740    50759      +19     
==========================================
+ Hits        46705    46722      +17     
- Misses       4035     4037       +2

Flag         Coverage Δ
#multiple    90.45% <100%> (-0.01%) ⬇️
#single      42.24% <36.84%> (-0.01%) ⬇️

Impacted Files                      Coverage Δ
pandas/core/indexes/base.py         96.45% <100%> (+0.01%) ⬆️
pandas/core/indexes/range.py        95.73% <100%> (+0.02%) ⬆️
pandas/core/ops.py                  96.76% <100%> (+0.04%) ⬆️
pandas/core/arrays/timedeltas.py    91.54% <0%> (-1%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 25e6a21...cdfb3bf.

@@ -576,7 +576,10 @@ Datetimelike
- Bug in :class:`DataFrame` with mixed dtypes including ``datetime64[ns]`` incorrectly raising ``TypeError`` on equality comparisons (:issue:`13128`,:issue:`22163`)
- Bug in :meth:`DataFrame.eq` comparison against ``NaT`` incorrectly returning ``True`` or ``NaN`` (:issue:`15697`,:issue:`22163`)
- Bug in :class:`DataFrame` with ``timedelta64[ns]`` dtype division by ``Timedelta``-like scalar incorrectly returning ``timedelta64[ns]`` dtype instead of ``float64`` dtype (:issue:`20088`,:issue:`22163`)
Contributor

can you split out to a timedelta bug section (as this is getting kind of long)?

can be in a future PR

Member Author

Sounds good

Contributor

I think we now have a separate section?
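
For context on the third whatsnew entry in the hunk above: dividing a timedelta by a Timedelta-like scalar is a ratio, so the result should be a plain float rather than another timedelta. The standard library's datetime.timedelta follows the same rule, which gives a dependency-free way to see the expected semantics. This is a stdlib analogy only, not pandas code, and the variable names are illustrative.

```python
from datetime import timedelta

# Illustrative durations and a Timedelta-like scalar to divide by.
durations = [timedelta(hours=1), timedelta(minutes=90), timedelta(days=1)]
unit = timedelta(hours=1)

# timedelta / timedelta is a unitless ratio, so each result is a float.
ratios = [d / unit for d in durations]
print(ratios)  # [1.0, 1.5, 24.0]

# By contrast, timedelta / number stays a timedelta.
half = timedelta(hours=1) / 2
print(half)  # 0:30:00
```

The float64 result named in the whatsnew entry is the array-level counterpart of the float ratios shown here.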

@@ -596,6 +597,9 @@ def _evaluate_numeric_binop(self, other):
# GH#19333 is_integer evaluated True on timedelta64,
# so we need to catch these explicitly
return op(self._int64index, other)
elif is_timedelta64_dtype(other):
Contributor

doesn't the fall thru raise TypeError?

Member Author

I don’t understand the question

Contributor

your comment is at odds with the checking i think. pls revise comments and/or checks.

Member Author

The comment below this line ("# must be an np.ndarray; GH#22390"), along with the check above ("elif is_timedelta64_dtype(other)"), is just saying that this is an ndarray["timedelta64['ns']"]. I don't think these are at odds.

# timedelta64 dtypes because numpy casts integer dtypes to
# timedelta64 when operating with timedelta64
if isinstance(right, np.ndarray):
# upcast to TimedeltaIndex before dispatching
Contributor

uhh this seems like a lot of special casing, any way to make this simpler

Member Author

The only simplification I considered but didn't use was something like:

def maybe_upcast_for_op(obj):
    if type(obj) is timedelta:
        return pd.Timedelta(obj)
    if isinstance(obj, np.ndarray) and is_timedelta64_dtype(obj):
        return pd.TimedeltaIndex(obj)
    return obj

which would go right after get_op_result_name.
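
A self-contained version of the helper sketched above, with the imports it needs and a short usage check, might look like the following. This is only a sketch of the idea from the comment, not the code that was merged. As an assumption for portability, the dtype test uses obj.dtype.kind == "m" as a stand-in for is_timedelta64_dtype, and the names scalar, array, and other are illustrative.

```python
from datetime import timedelta

import numpy as np
import pandas as pd


def maybe_upcast_for_op(obj):
    """Wrap raw timedelta-likes in pandas types; pass anything else through."""
    # Exact type check on purpose: pd.Timedelta subclasses
    # datetime.timedelta and does not need to be re-wrapped.
    if type(obj) is timedelta:
        return pd.Timedelta(obj)
    # dtype.kind == "m" marks a timedelta64 array (assumed stand-in for
    # the is_timedelta64_dtype check used in the comment above).
    if isinstance(obj, np.ndarray) and obj.dtype.kind == "m":
        return pd.TimedeltaIndex(obj)
    return obj


scalar = maybe_upcast_for_op(timedelta(hours=1))
array = maybe_upcast_for_op(np.array([1, 2], dtype="m8[ns]"))
other = maybe_upcast_for_op(5)
print(type(scalar).__name__, type(array).__name__, other)
```

Doing the upcast once, before dispatching, means the arithmetic code downstream only has to handle the pandas types, which is the simplification the review thread is asking for.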

Contributor

maybe re-route the scalars to a separate path. This is getting way special casey here.

Member Author

I'll implement the maybe_upcast idea and see if that is prettier

@jreback jreback added the Bug and Timedelta (Timedelta data type) labels Aug 17, 2018
@jbrockmendel
Member Author

This will probably need rebasing after #22350 goes through

index=left.index, name=res_name,
dtype=result.dtype)

elif type(right) is datetime.timedelta:
Contributor

this case you can include above (before what you added), no?

if isinstance(right, (datetime.timedelta, Timedelta)):
  right = Timedelta(right)
....

Member Author

The code above casts to TimedeltaIndex, not Timedelta.

@jbrockmendel
Member Author

Implemented maybe_upcast_for_op. There may be other places where this can be used; I'll take a look in a follow-up.

Contributor

@jreback jreback left a comment

minor doc-comment. rebase and ping on green.

@jreback jreback added this to the 0.24.0 milestone Aug 22, 2018
@jbrockmendel
Member Author

ping

@jreback jreback merged commit dd24e76 into pandas-dev:master Aug 23, 2018
jreback commented Aug 23, 2018

thanks! nice cleaning

@jbrockmendel jbrockmendel deleted the rbug branch August 23, 2018 15:46
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018