New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Index __mul__-like ops with timedelta scalars #19333

Merged
merged 29 commits into from Feb 22, 2018

Conversation

Projects
None yet
3 participants
@jbrockmendel
Member

jbrockmendel commented Jan 21, 2018

Fixes the following current behavior for each of the numeric index classes and each of the basic timedelta-like scalars:

idx = pd.Index(range(3))
td = pd.Timedelta(days=1)

>>> idx * td
TypeError: can only perform ops with timedelta like values
>>> td * idx
TypeError: can only perform ops with timedelta like values
>>> td / idx
TypeError: can only perform ops with timedelta like values
>>> td // idx
TypeError: can only perform ops with timedelta like values
  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
@@ -428,7 +428,9 @@ Conversion
- Bug in localization of a naive, datetime string in a ``Series`` constructor with a ``datetime64[ns, tz]`` dtype (:issue:`174151`)
- :func:`Timestamp.replace` will now handle Daylight Savings transitions gracefully (:issue:`18319`)
- Bug in :class:`Index` multiplication and division methods where operating with a ``Series`` would return an ``Index`` object instead of a ``Series`` object (:issue:`19042`)
- Multiplication of :class:`TimedeltaIndex` by ``TimedeltaIndex`` will now raise ``TypeError`` instead of raising ``ValueError`` in cases of length mis-match (:issue:`????`)

This comment has been minimized.

@jreback

jreback Jan 21, 2018

Contributor

when you don't have an issue number, use the PR number

@@ -3864,7 +3864,21 @@ def dropna(self, how='any'):
return self._shallow_copy()
def _evaluate_with_timedelta_like(self, other, op, opstr, reversed=False):
raise TypeError("can only perform ops with timedelta like values")
# Timedelta knows how to operate with np.array, so dispatch to that

This comment has been minimized.

@jreback

jreback Jan 21, 2018

Contributor

shouldn't this be left alone, and rather define this in TimedeltaIndex?

This comment has been minimized.

@jbrockmendel

jbrockmendel Jan 21, 2018

Member

This for for numeric-dtyped indexes. e.g. pd.Index(range(3)) * pd.Timedelta(days=1) goes through this path.

This comment has been minimized.

@jreback

jreback Jan 21, 2018

Contributor

why doesn't this just raise NotImplementedError? which then Timedelta would handle? or is that too recursive a path?

This comment has been minimized.

@jbrockmendel

jbrockmendel Jan 21, 2018

Member

For one thing b/c there’s pytimedelta and timedelta64 that need to be handled.

@jreback

This comment has been minimized.

Contributor

jreback commented Jan 21, 2018

I just updated the whatsnew, to be more organized, so rebase again.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Jan 21, 2018

OK. Should I put the multiplication/division stuff in Numeric instead of Conversion? A few days back you suggested an Ops section, but I wasn't clear on what the scope of that would be.

@jreback

This comment has been minimized.

Contributor

jreback commented Jan 21, 2018

I would put this in datetimelike, numeric is for well, numeric types! this is sort of an arbitrary split but makes each section shorter

@@ -446,6 +447,9 @@ Numeric
- Bug in :class:`Index` multiplication and division methods where operating with a ``Series`` would return an ``Index`` object instead of a ``Series`` object (:issue:`19042`)
- Bug in the :class:`DataFrame` constructor in which data containing very large positive or very large negative numbers was causing ``OverflowError`` (:issue:`18584`)
- Bug in :class:`Index` constructor with ``dtype='uint64'`` where int-like floats were not coerced to :class:`UInt64Index` (:issue:`18400`)
- Bug in :class:`Index` multiplication and division methods where operating with a ``Series`` would return an ``Index`` object instead of a ``Series`` object (:issue:`19042`)
- Multiplication of :class:`TimedeltaIndex` by ``TimedeltaIndex`` will now raise ``TypeError`` instead of raising ``ValueError`` in cases of length mis-match (:issue`19333`)

This comment has been minimized.

@jreback

jreback Jan 21, 2018

Contributor

1st one ok, other 2 in datetimelike section

@@ -26,6 +26,36 @@ def full_like(array, value):
return ret
class TestIndexArithmetic(object):
@pytest.mark.parametrize('index', [pd.Int64Index(range(1, 11)),

This comment has been minimized.

@jreback

jreback Jan 22, 2018

Contributor

can you add the issue number as a comment (or PR number if no issue)

class TestIndexArithmetic(object):
@pytest.mark.parametrize('index', [pd.Int64Index(range(1, 11)),
pd.UInt64Index(range(1, 11)),
pd.Float64Index(range(1, 11)),

This comment has been minimized.

@jreback

jreback Jan 22, 2018

Contributor

is this how we are importing in this module, e.g. pd.*? usually we don't do this (its ok whichever, just want to be consistent)

This comment has been minimized.

@jbrockmendel

jbrockmendel Jan 22, 2018

Member

Looks like I'm being inconsistent here, will change.

expected = pd.TimedeltaIndex(['1 Day', '12 Hours'])
result = scalar_td / index
tm.assert_index_equal(result, expected)

This comment has been minimized.

@jreback

jreback Jan 22, 2018

Contributor

can you also test that the commute raises here

@@ -297,7 +297,7 @@ def test_dti_mul_dti_raises(self):
def test_dti_mul_too_short_raises(self):
idx = self._holder(np.arange(5, dtype='int64'))
with pytest.raises(ValueError):

This comment has been minimized.

@jreback

jreback Jan 22, 2018

Contributor

why did this change?

This comment has been minimized.

@jbrockmendel

jbrockmendel Jan 22, 2018

Member

Because the problem is now caught in TimedeltaIndex when checking what it is operating against, whereas before the problem was caught in numpy when trying to operate on differently-sized arrays.

pd.UInt64Index(range(1, 3)),
pd.Float64Index(range(1, 3)),
pd.RangeIndex(1, 3)])
@pytest.mark.parametrize('scalar_td', [Timedelta(days=1),

This comment has been minimized.

@jreback

jreback Jan 22, 2018

Contributor

these tests really belong with the scalar timedelta tests I think

This comment has been minimized.

@jbrockmendel

jbrockmendel Jan 22, 2018

Member

Yah I thought about that and eventually decided that since the affected code was in the Index class, the test belonged in the Index tests. But either way works for me.

@@ -26,6 +26,36 @@ def full_like(array, value):
return ret
class TestIndexArithmetic(object):

This comment has been minimized.

@jreback

jreback Jan 22, 2018

Contributor

rename this class its index with timedelta scalar

This comment has been minimized.

@jreback

jreback Jan 22, 2018

Contributor

add a comment. future readers are going to have no idea of where to find tests

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Jan 22, 2018

Any idea what's special about the circleCI setup? I can't replicate the error there.

@codecov

This comment has been minimized.

codecov bot commented Jan 22, 2018

Codecov Report

Merging #19333 into master will decrease coverage by 0.02%.
The diff coverage is 95.23%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19333      +/-   ##
==========================================
- Coverage   91.61%   91.59%   -0.03%     
==========================================
  Files         150      150              
  Lines       48892    48911      +19     
==========================================
+ Hits        44792    44798       +6     
- Misses       4100     4113      +13
Flag Coverage Δ
#multiple 89.96% <95.23%> (-0.03%) ⬇️
#single 41.77% <14.28%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/range.py 95.74% <100%> (+0.09%) ⬆️
pandas/core/indexes/base.py 96.41% <92.3%> (-0.04%) ⬇️
pandas/plotting/_converter.py 65.22% <0%> (-1.74%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aa59954...9e3dec4. Read the comment docs.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Jan 24, 2018

@gfyoung any wisdom to offer about what makes circleci config special?

@gfyoung

This comment has been minimized.

Member

gfyoung commented Jan 24, 2018

any wisdom to offer about what makes circleci config special?

@jbrockmendel : I wish 😉 I've generally found Circle and AppVeyor to be more PITA than Travis due to weird failures like these. AFAIK, Travis and Circle are quite similar, so unless you have some kind of flakiness in your tests, I'm not 100% sure why you are getting these results.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Jan 24, 2018

@gfyoung OK thanks. I'd rebase and re-push and hope that it sorts itself out, but don't want to put the strain on Travis.

@gfyoung

This comment has been minimized.

Member

gfyoung commented Jan 24, 2018

but don't want to put the strain on Travis.

@jbrockmendel : No worries. I'm still kind of perplexed why Circle gets the completely wrong result when doing such a simple operation.

@jreback

This comment has been minimized.

Contributor

jreback commented Jan 25, 2018

looks fine. ping when all green.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Feb 10, 2018

https://circleci.com/docs/2.0/ssh-access-jobs/

Looks like I need to send you a public key and have you authorize it for me to debug.

@jreback

This comment has been minimized.

Contributor

jreback commented Feb 10, 2018

you simply need to sign up on CircleCI, then you can run personal builds.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Feb 14, 2018

@gfyoung I made it onto CircleCI but couldn't replicate the test failure. Can I impose on you to give it a try? There's a beer and a high-five in it for you if that helps.

@gfyoung

This comment has been minimized.

Member

gfyoung commented Feb 14, 2018

@jbrockmendel : Can you push those to me on GitHub? 😄 I restarted both AppVeyor and Circle for you. That being said, not sure yet what's causing the discrepancy. Might be something architecturally-related (e.g. int32 vs. int64).

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Feb 14, 2018

Oh is it a 32 bit build thats broken? I hadn't noticed that.

@jreback jreback added this to the 0.23.0 milestone Feb 15, 2018

values, other = other, values
with np.errstate(all='ignore'):
result = op(values, other)

This comment has been minimized.

@jreback

jreback Feb 15, 2018

Contributor

can you add some comments (or expand the top ones), this code does a lot

@jreback

This comment has been minimized.

Contributor

jreback commented Feb 15, 2018

small comments, is there an issue about this (or just PR)? rebase and ping on green.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Feb 15, 2018

small comments, is there an issue about this (or just PR)? rebase and ping on green.

Just the PR I think.

My attempt to SSH into CircleCI to debug the weird failing test failed. I'll try again at some point, hoping a fresh pair of eyeballs takes a crack at it in the interim.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Feb 21, 2018

I think I got this fixed, reopening.

@jbrockmendel jbrockmendel reopened this Feb 21, 2018

@jreback

This comment has been minimized.

Contributor

jreback commented Feb 21, 2018

what did it turn out to be?

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Feb 21, 2018

What did it turn out to be?

I'll open an issue for the underlying cause because it might cause problems elsewhere: in sufficiently old numpy (the build that caught it used 1.9.2) we see the following:

value = np.timedelta64(86400000000000,'ns')
new_value = int(value)
assert new_value == value  # <-- in older numpy, this does not raise

As a result RangeIndex(np.timedelta64(1, 'ns'), np.timedelta64(10, 'ns'), np.timedelta64(1, 'ns')) doesn't raise and instead returns RangeIndex(1, 10, 1).

@jreback

This comment has been minimized.

Contributor

jreback commented Feb 21, 2018

we do't need a new issue for this. see my comments there. rebase and I'll have a look.

@jreback

This comment has been minimized.

Contributor

jreback commented Feb 21, 2018

value = np.timedelta64(86400000000000,'ns')
new_value = int(value)
assert new_value == value  # <-- in older numpy, this does not raise

there are many places in the code base where this is handled. This is the point of needing to check for timedeta before its checked for integer.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Feb 21, 2018

Rebase accomplished.

@jreback jreback merged commit 3ab8623 into pandas-dev:master Feb 22, 2018

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@jreback

This comment has been minimized.

Contributor

jreback commented Feb 22, 2018

thanks! hey sometimes my stubborness pay off ..... thanks for sticking with it!

@jbrockmendel jbrockmendel deleted the jbrockmendel:idx_mul_tdi branch Feb 22, 2018

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment