BLD: numpy master changes breaking #12049

Closed
jreback opened this Issue Jan 15, 2016 · 9 comments

Contributor

jreback commented Jan 15, 2016

(24 hrs ago) good build: https://travis-ci.org/pydata/pandas/jobs/102356098: 1.11.0.dev0+51d2ecd
(1 hr ago) breaking lots of things: https://travis-ci.org/pydata/pandas/jobs/102596904: 1.11.0.dev0+aa6335c

@shoyer IIRC a couple of your PRs were merged in the last day.

here's an example:

======================================================================
FAIL: test_coercion_with_setitem_and_series (pandas.tests.test_indexing.TestSeriesNoneCoercion)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/pydata/pandas/pandas/tests/test_indexing.py", line 5247, in test_coercion_with_setitem_and_series
    expected_series.values, strict_nan=True)
  File "/home/travis/build/pydata/pandas/pandas/util/testing.py", line 866, in assert_numpy_array_equal
    raise_assert_detail(obj, msg, left, right)
  File "/home/travis/build/pydata/pandas/pandas/util/testing.py", line 825, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: numpy array are different
numpy array values are different (33.33333 %)
[left]:  [NaT, 2000-01-02T00:00:00.000000000+0000, 2000-01-03T00:00:00.000000000+0000]
[right]: [NaT, 2000-01-02T00:00:00.000000000+0000, 2000-01-03T00:00:00.000000000+0000]

jreback added this to the 0.18.0 milestone Jan 15, 2016

Contributor

jreback commented Jan 15, 2016

my guess is that we now need to compare M8[ns] arrays via .view('i8'), since the NaT values will then compare equal.
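A minimal sketch of that idea, not the pandas implementation: the helper name `datetime_arrays_equal` is hypothetical, and the sketch assumes both inputs are datetime64 arrays with the same unit. Viewing the data as int64 sidesteps NaT comparison semantics, because NaT is stored as an ordinary integer sentinel.

```python
import numpy as np

def datetime_arrays_equal(left, right):
    """Hypothetical helper: compare datetime64 arrays, treating NaT == NaT."""
    left = np.asarray(left)
    right = np.asarray(right)
    # Different shapes or units cannot be compared through a raw integer view.
    if left.shape != right.shape or left.dtype != right.dtype:
        return False
    # Reinterpret the 8-byte datetime payload as int64 and compare the integers.
    return bool((left.view("i8") == right.view("i8")).all())

a = np.array(["NaT", "2000-01-02", "2000-01-03"], dtype="M8[ns]")
b = a.copy()
c = np.array(["NaT", "2000-01-02", "2000-01-04"], dtype="M8[ns]")

same = datetime_arrays_equal(a, b)
different = datetime_arrays_equal(a, c)
```

Here `same` compares the two arrays shown in the failing test above, and `different` changes only the last timestamp.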

Member

shoyer commented Jan 15, 2016

Yep, if this is datetime64 related it's my fault :). I thought we were already using .view('i8') before making comparisons but I guess I was wrong. If so (especially if this breaks user-facing stuff) we may need to hold off on these numpy fixes for a little longer (deprecation cycle?). Sigh...


Contributor

jreback commented Jan 15, 2016

we can adapt. do you want me to open an issue on numpy?

In [1]: np.__version__
Out[1]: '1.10.2'

In [5]: arr = np.array([np.nan])

In [6]: np.array_equal(arr,arr)
Out[6]: False

In [7]: arr = np.array([np.datetime64('NaT')])

In [8]: np.array_equal(arr,arr)
Out[8]: True

ok so your change makes sense

In [1]: np.__version__
Out[1]: '1.11.0.dev0+aa6335c'

In [7]: arr = np.array([np.nan])

In [8]: np.array_equal(arr,arr)
Out[8]: False

In [9]: arr = np.array([np.datetime64('NaT')])

In [10]: np.array_equal(arr,arr)
Out[10]: False

Member

shoyer commented Jan 15, 2016

Actually, looks like we might just be able to drop our assert_numpy_array_equivalent entirely in favor of numpy's assert_array_equal, which has supported NaN equality since at least August 2011 (numpy 1.7, I think):
numpy/numpy@67ece6b
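As a rough illustration of the difference being described, using float NaN only (how NaT is treated by the testing helper depends on the NumPy version, so it is left out here): `np.array_equal` reports NaN-containing arrays as unequal, while `numpy.testing.assert_array_equal` accepts NaN in matching positions.

```python
import numpy as np
from numpy.testing import assert_array_equal

arr = np.array([1.0, np.nan, 3.0])

# Plain equality: NaN != NaN, so the arrays are reported as different.
plain_equal = np.array_equal(arr, arr.copy())

# The testing helper treats NaN in the same position as a match,
# so this call returns without raising.
assert_array_equal(arr, arr.copy())

# A genuine value mismatch still raises AssertionError.
try:
    assert_array_equal(arr, np.array([1.0, 2.0, 3.0]))
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```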

Member

shoyer commented Jan 15, 2016

Unfortunately, this bug exists in our array_equivalent utility function, which we use for the equals method. This means that with this change datetime64 equality checks involving NaT will be broken. As much as I would love to just roll out the NumPy fix, doing it like this will assuredly result in unhappy users and unnecessary aggravation when sanity checks and test suites fail.

To fix this, I propose:

  • We roll back the NaT comparison fix in NumPy for now, issuing a deprecation warning instead for a numpy release or two.
  • We add the fix to array_equivalent in pandas, doing the appropriate cast to int64 to avoid needing to catch the deprecation warning (which can have performance consequences).

Contributor

jreback commented Jan 15, 2016

yep, prob need a special case for M8/m8 (alternatively we could have a conditional check on the numpy version).

Ok, lmk if you need anything on the numpy side.
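The special case for M8/m8 discussed above could look roughly like the sketch below. This is an assumption-laden illustration, not the code that was merged into pandas: `array_equivalent_sketch` is a hypothetical name, and it only handles datetime64/timedelta64 and float/complex inputs before falling back to `np.array_equal`.

```python
import numpy as np

def array_equivalent_sketch(left, right):
    """Hypothetical sketch: equality where NaN/NaT in matching positions agree."""
    left, right = np.asarray(left), np.asarray(right)
    if left.shape != right.shape:
        return False

    # Special case for M8 (datetime64) and m8 (timedelta64): compare the
    # underlying int64 values so the result does not depend on NaT semantics.
    if left.dtype.kind in "mM" or right.dtype.kind in "mM":
        if left.dtype != right.dtype:
            return False
        return bool(np.array_equal(left.view("i8"), right.view("i8")))

    # Float and complex: values match, or both sides are NaN at that position.
    if left.dtype.kind in "fc" and right.dtype.kind in "fc":
        both_nan = np.isnan(left) & np.isnan(right)
        return bool(((left == right) | both_nan).all())

    # Everything else: ordinary elementwise equality.
    return bool(np.array_equal(left, right))

dates = np.array(["NaT", "2000-01-02"], dtype="M8[ns]")
floats = np.array([np.nan, 1.0])

dates_match = array_equivalent_sketch(dates, dates.copy())
floats_match = array_equivalent_sketch(floats, floats.copy())
floats_shifted = array_equivalent_sketch(floats, floats[::-1])
```

Casting through int64 avoids branching on the NumPy version, which is the alternative mentioned above; the trade-off is that arrays with different datetime units are simply reported as not equivalent in this sketch.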

njsmith commented Jan 15, 2016

BTW, numpy's travis jobs now continuously upload pre-built (as wheels) snapshots of numpy master to http://travis-dev-wheels.scipy.org/ -- might simplify your test infrastructure for testing against numpy master. (Looks like we're only doing this for py35 right now, but it'd be easy to add more builds if you need them.)

njsmith commented Jan 15, 2016

It looks like the magic incantation is:

pip install --pre --upgrade --no-index --timeout=60 --trusted-host travis-dev-wheels.scipy.org -f http://travis-dev-wheels.scipy.org/ numpy

Contributor

jreback commented Jan 15, 2016

@njsmith ok, thanks!
yeh that would be ideal to have 2.7/3.5 wheels (right now we are just testing vs 2.7 numpy master)

jreback added a commit to jreback/pandas that referenced this issue Jan 16, 2016

COMPAT: numpy compat with NaT != NaT, #12049 (358a334)

jreback closed this in #12058 Jan 16, 2016

jreback added a commit that referenced this issue Jan 16, 2016

Merge pull request #12058 from jreback/array_equivalent
COMPAT: numpy compat with NaT != NaT, #12049 (ce37586)