TST: Fix dtype mismatch on 32bit in IntervalTree get_indexer test #23468

jschendel · 2018-11-02T23:14:04Z

To address @jreback's comment in the xref:

.get_indexer should be returning platform int - but it’s not ? (maybe just this case)

It looks like IntervalTree.get_indexer and the other similar IntervalTree methods (get_loc, get_indexer_nonunique, etc.) will always return int64, because the data comes from an Int64Vector.to_array().

If returning platform int is expected, I'm not sure that making this change now is the best use of time: I'm hitting these methods as part of the new IntervalIndex behavior specs, so it's probably best to enforce platform int as part of that implementation.
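For context on what "platform int" means in this discussion, here is a minimal numpy-only sketch. The `result` array is just a stand-in for the output of Int64Vector.to_array(); it is an assumption for illustration, not the pandas internals.

```python
import numpy as np

# np.intp is the pointer-sized integer of the running interpreter:
# 4 bytes on a 32-bit build, 8 bytes on a 64-bit build.
platform_int = np.dtype(np.intp)
fixed_int64 = np.dtype(np.int64)

# Stand-in for an array produced by Int64Vector.to_array(): always int64.
result = np.array([0, 1, -1], dtype=np.int64)

# True on a 64-bit build, False on a 32-bit build.
same_dtype = result.dtype == platform_int
```

On a 64-bit build the two dtypes coincide, so a dtype-strict comparison passes; on a 32-bit build they differ, which is the kind of mismatch this PR title describes.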

pep8speaks · 2018-11-02T23:14:06Z

Hello @jschendel! Thanks for submitting the PR.

There are no PEP8 issues in the file pandas/tests/indexes/interval/test_interval_tree.py !

jreback

this just needs a small change in .get_indexer i think

this is inconsistent as II is the only index returning the wrong type

jreback · 2018-11-03T13:40:10Z

This is called in 1 case in get_indexer. I think this needs wrapping with ensure_platform_int (could be in get_indexer)

def _find_non_overlapping_monotonic_bounds(self, key):
    if isinstance(key, IntervalMixin):
        start = self._searchsorted_monotonic(
            key.left, 'left', exclude_label=key.open_left)
        stop = self._searchsorted_monotonic(
            key.right, 'right', exclude_label=key.open_right)
    elif isinstance(key, slice):
        # slice
        start, stop = key.start, key.stop
        if (key.step or 1) != 1:
            raise NotImplementedError("cannot slice with a slice step")
        if start is None:
            start = 0
        else:
            start = self._searchsorted_monotonic(start, 'left')
        if stop is None:
            stop = len(self)
        else:
            stop = self._searchsorted_monotonic(stop, 'right')
    else:
        # scalar or index-like
        start = self._searchsorted_monotonic(key, 'left')
        stop = self._searchsorted_monotonic(key, 'right')
    return start, stop
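The wrapping suggested in this comment could look roughly like the sketch below. `as_platform_int` is a hypothetical numpy-only stand-in for the ensure_platform_int helper named above, and the `start`/`stop` arrays and the `np.where` step are illustrative values, not the actual get_indexer code.

```python
import numpy as np


def as_platform_int(values):
    """Hypothetical stand-in for an ensure_platform_int-style wrapper."""
    arr = np.asarray(values)
    # copy=False returns the input unchanged when it is already intp.
    return arr.astype(np.intp, copy=False)


# Illustrative bounds, as if returned for three keys (assumed values).
start = np.array([0, 1, 3], dtype=np.int64)
stop = np.array([1, 3, 4], dtype=np.int64)

# Keep the position where exactly one interval matches, else -1.
indexer = as_platform_int(np.where(start + 1 == stop, start, -1))
```

Casting once at the point where the indexer is returned keeps the callers free of dtype handling, which seems to be the intent of doing it "in get_indexer".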

TomAugspurger · 2018-11-06T14:36:36Z

Merged master, to see if the code-check failure in https://travis-ci.org/pandas-dev/pandas/jobs/450067817#L2756 goes away. It wasn't clear to me what the issue was.

codecov · 2018-11-06T16:19:33Z

Codecov Report

Merging #23468 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #23468      +/-   ##
==========================================
- Coverage   92.25%   92.24%   -0.01%     
==========================================
  Files         161      161              
  Lines       51181    51224      +43     
==========================================
+ Hits        47217    47254      +37     
- Misses       3964     3970       +6

Flag	Coverage Δ
#multiple	`90.63% <ø> (-0.01%)`	⬇️
#single	`42.28% <ø> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/indexes/timedeltas.py	`89.88% <0%> (-0.84%)`	⬇️
pandas/core/arrays/base.py	`97.35% <0%> (-0.67%)`	⬇️
pandas/core/internals/blocks.py	`93.67% <0%> (-0.36%)`	⬇️
pandas/core/indexes/datetimes.py	`96.13% <0%> (-0.29%)`	⬇️
pandas/core/arrays/datetimelike.py	`95.9% <0%> (-0.19%)`	⬇️
pandas/core/arrays/categorical.py	`95.09% <0%> (-0.13%)`	⬇️
pandas/core/arrays/sparse.py	`91.71% <0%> (-0.13%)`	⬇️
pandas/io/json/json.py	`93.11% <0%> (-0.02%)`	⬇️
pandas/core/indexes/base.py	`96.45% <0%> (-0.01%)`	⬇️
pandas/core/reshape/reshape.py	`99.54% <0%> (-0.01%)`	⬇️
... and 26 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bd98841...7e83460.

jreback · 2018-11-07T14:06:28Z

@jschendel are you able to fix this rather than this work-around?

jschendel · 2018-11-07T19:27:45Z

@jreback : will look into this tonight. sorry, been super bogged down at work.

jschendel · 2018-11-08T03:00:15Z

Modified IntervalTree methods to return 'intp' dtype arrays via an astype after calling Int64Vector.to_array.

TomAugspurger · 2018-11-08T12:00:05Z

Fixed the linting error (untested).

jreback · 2018-11-08T12:30:35Z

lgtm. let's merge on green.

jreback · 2018-11-08T13:10:49Z

thanks @jschendel and @TomAugspurger

jreback · 2018-11-09T11:42:01Z

still failing
https://travis-ci.org/MacPython/pandas-wheels/jobs/452707967
see end of log

jschendel · 2018-11-09T18:13:49Z

Yeah, of course changing the IntervalTree code to return platform int would cascade to other things. Pretty shortsighted of me. Should be an easy fix of 'int64' --> 'intp' in the expected though, so will get that in later today.
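The 'int64' --> 'intp' change to the expected values can be sketched as follows. `assert_same_values_and_dtype` is a hypothetical strict comparison written for this example; it is not the pandas testing helper, and the arrays are made-up values.

```python
import numpy as np


def assert_same_values_and_dtype(left, right):
    """Hypothetical strict check: dtype and values must both match."""
    assert left.dtype == right.dtype, (
        f"dtype mismatch: {left.dtype} != {right.dtype}")
    np.testing.assert_array_equal(left, right)


# What the methods return after the cast to platform int.
result = np.array([0, 4, -1]).astype("intp")

# Expected built with 'intp' (previously 'int64'), so the check
# passes on both 32-bit and 64-bit builds.
expected = np.array([0, 4, -1], dtype="intp")

assert_same_values_and_dtype(result, expected)
```

With `dtype="int64"` in `expected`, the same check would only pass on 64-bit builds.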

It looks like one of the failures is unrelated to my changes though:

_______________________ TestSparseGroupBy.test_aggfuncs ________________________
self = <pandas.tests.sparse.test_groupby.TestSparseGroupBy object at 0xde3f08ec>
    def test_aggfuncs(self):
        sparse_grouped = self.sparse.groupby('A')
        dense_grouped = self.dense.groupby('A')
    
        result = sparse_grouped.mean().to_sparse()
        expected = dense_grouped.mean().to_sparse()
    
>       tm.assert_frame_equal(result, expected)
/venv/lib/python3.6/site-packages/pandas/tests/sparse/test_groupby.py:50: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/venv/lib/python3.6/site-packages/pandas/util/testing.py:1185: in assert_extension_array_equal
    assert_numpy_array_equal(left_valid, right_valid)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
left = array([-0.17387645482451206, 0.3414148016424936], dtype=object)
right = array([-0.17387645482451206, 0.3414148016424937], dtype=object)
err_msg = None
    def _raise(left, right, err_msg):
        if err_msg is None:
            if left.shape != right.shape:
                raise_assert_detail(obj, '{obj} shapes are different'
                                    .format(obj=obj), left.shape, right.shape)
    
            diff = 0
            for l, r in zip(left, right):
                # count up differences
                if not array_equivalent(l, r, strict_nan=strict_nan):
                    diff += 1
    
            diff = diff * 100.0 / left.size
            msg = '{obj} values are different ({pct} %)'.format(
                obj=obj, pct=np.round(diff, 5))
>           raise_assert_detail(obj, msg, left, right)
E           AssertionError: numpy array are different
E           
E           numpy array values are different (50.0 %)
E           [left]:  [-0.17387645482451206, 0.3414148016424936]
E           [right]: [-0.17387645482451206, 0.3414148016424937]
/venv/lib/python3.6/site-packages/pandas/util/testing.py:1146: AssertionError

This error was not present in build #763 and then appeared in build #764; I had not made any changes between those two builds.
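The two arrays in that traceback differ only in the last printed digit of the second value. A small sketch of why an exact comparison fails while a tolerance-based one would pass (the tolerance is an arbitrary choice for illustration, and this says nothing about what caused the difference):

```python
import math

# Values copied from the [left] and [right] lines of the traceback.
left = 0.3414148016424936
right = 0.3414148016424937

# Exact equality: the two literals are different doubles.
exact_equal = left == right

# Relative-tolerance comparison (tolerance chosen for illustration).
close = math.isclose(left, right, rel_tol=1e-9)
```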

…fixed

* upstream/master: (47 commits)
  CLN: remove values attribute from datetimelike EAs (pandas-dev#23603)
  DOC/CI: Add linting to rst files, and fix issues (pandas-dev#23381)
  PERF: Speeds up creation of Period, PeriodArray, with Offset freq (pandas-dev#23589)
  PERF: define is_all_dates to shortcut inadvertent copy when slicing an IntervalIndex (pandas-dev#23591)
  TST: Tests and Helpers for Datetime/Period Arrays (pandas-dev#23502)
  Update description of Index._values/values/ndarray_values (pandas-dev#23507)
  Fixes to make validate_docstrings.py not generate warnings or unwanted output (pandas-dev#23552)
  DOC: Added note about groupby excluding Decimal columns by default (pandas-dev#18953)
  ENH: Support writing timestamps with timezones with to_sql (pandas-dev#22654)
  CI: Auto-cancel redundant builds (pandas-dev#23523)
  Preserve EA dtype in DataFrame.stack (pandas-dev#23285)
  TST: Fix dtype mismatch on 32bit in IntervalTree get_indexer test (pandas-dev#23468)
  BUG: raise if invalid freq is passed (pandas-dev#23546)
  remove uses of (ts)?lib.(NaT|iNaT|Timestamp) (pandas-dev#23562)
  BUG: Fix error message for invalid HTML flavor (pandas-dev#23550)
  ENH: Support EAs in Series.unstack (pandas-dev#23284)
  DOC: Updating DataFrame.join docstring (pandas-dev#23471)
  TST: coverage for skipped tests in io/formats/test_to_html.py (pandas-dev#22888)
  BUG: Return KeyError for invalid string key (pandas-dev#23540)
  BUG: DatetimeIndex slicing with boolean Index raises TypeError (pandas-dev#22852)
  ...

TST: Fix dtype mismatch on 32bit in IntervalTree get_indexer test

f4372a2

jschendel added the Testing, Interval, and 32bit labels Nov 2, 2018

jschendel added this to the 0.24.0 milestone Nov 2, 2018

jreback reviewed Nov 2, 2018

Merge remote-tracking branch 'upstream/master' into jschendel-ivtree-tests-int64

9fedffa

intp casting in ivtree

8f460a4

lint

7e83460

jreback merged commit 70fa75a into pandas-dev:master Nov 8, 2018

jschendel deleted the ivtree-tests-int64 branch November 9, 2018 23:40

jschendel mentioned this pull request Nov 9, 2018

TST: Use intp as expected dtype in IntervalIndex indexing tests #23609

Merged

JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Nov 14, 2018

TST: Fix dtype mismatch on 32bit in IntervalTree get_indexer test (pandas-dev#23468)

3bb2f75

tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018

TST: Fix dtype mismatch on 32bit in IntervalTree get_indexer test (pandas-dev#23468)

cb1b288

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

TST: Fix dtype mismatch on 32bit in IntervalTree get_indexer test (pandas-dev#23468)

3c67ae3

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

TST: Fix dtype mismatch on 32bit in IntervalTree get_indexer test (pandas-dev#23468)

1a7d224
