Support for np.cov #3345

rjenc29 · 2018-09-27T11:54:09Z

Initial commit for CI.

rjenc29 · 2018-10-17T12:20:42Z

I've added a bunch of tests for different types / shapes of array-like inputs and some pre-processing logic to get the tests to pass (I expect the current CI to be clean).

I also made the tests @needs_blas and ripped out the naive matrix multiply stuff.

So we should be getting nearer the mark - let me know what you think.

If it's alright with you, I will remove the WIP from the PR title later today.

Cheers!

stuartarchibald · 2018-10-17T16:38:44Z

Sounds good, thanks! Yes, please feel free to remove the WIP when you are happy for review to take place. It sounds like most of the issues noted above are now covered?

rjenc29 · 2018-10-17T17:02:57Z

Yep, I think the issues noted are all covered one way or another and there are tests covering a bunch of other use cases.

Quite possible there are gaps / potential improvements, but it's probably a valid basis to review - I won't commit any more changes for the time being.

stuartarchibald · 2018-10-17T17:06:40Z

Great, thanks. Please merge in master to resolve conflicts and I'll review.

stuartarchibald

Thanks for the patches. I've done an initial review, it's a good impl. I've added some minor comments to clean up the code in a couple of places and there's a couple of potential bugs relating to typing to address, otherwise good to go. Thanks again for your efforts.

stuartarchibald · 2018-10-15T13:57:10Z

numba/tests/test_np_functions.py

+            _check(params)
+
+    @unittest.skipUnless(np_version >= (1, 10), "cov needs Numpy 1.10+")
+    def test_cov_egde_cases(self):


s/egde/edge/

stuartarchibald · 2018-10-26T11:11:44Z

docs/source/reference/numpysupported.rst

@@ -263,6 +263,7 @@ The following top-level functions are supported:
 * :func:`numpy.convolve` (only the 2 first arguments)
 * :func:`numpy.copy` (only the first argument)
 * :func:`numpy.correlate` (only the 2 first arguments)
+* :func:`numpy.cov` (only the 5 first arguments, requires NumPy >= 1.10)


Should note requirement of SciPy 0.16+.

stuartarchibald · 2018-10-26T11:18:54Z

numba/targets/arraymath.py

+            m_arr = np.atleast_2d(_asarray(m))
+
+            # transpose if asked to and not a (1, n) vector
+            if not rowvar and m_arr.shape[0] != 1:


Does the check for (1, n) need to happen, 1D array.T == 1D array?

Yeah, now that the "2D but single row" data shape is not allowed, this check is redundant - have removed it.
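For reference, the shape behaviour the reviewer is asking about can be checked directly in NumPy. This is a minimal sketch with illustrative variable names, not code from the patch: transposing a 1D array is a no-op, whereas the same data promoted with `np.atleast_2d` becomes a single row whose transpose is a column.

```python
import numpy as np

x = np.arange(4)            # 1D input, shape (4,)
row = np.atleast_2d(x)      # promoted to a single row, shape (1, 4)

print(x.T.shape)            # (4,)   transposing a 1D array changes nothing
print(row.shape)            # (1, 4)
print(row.T.shape)          # (4, 1) transposing the promoted row does
```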

stuartarchibald · 2018-10-26T11:26:34Z

numba/targets/arraymath.py

+               "simply pass the row as a 1D array, i.e. m[0].")
+        raise RuntimeError(msg)
+
+_handle_m_dim_nop = register_jitable(lambda x:x)


space after the :

stuartarchibald · 2018-10-26T11:54:13Z

numba/targets/arraymath.py

+            if y in (None, types.none):
+                return np_cov_impl_single_variable
+
+        if isinstance(m, (types.Integer, types.Float, types.Complex, types.Boolean)):


I think checking inheritance from types.Number in the above condition should cover the numeric types (but not Boolean).

stuartarchibald · 2018-10-26T12:01:41Z

numba/tests/test_np_functions.py

+        y = np.arange(4)
+        with self.assertRaises(ValueError) as raises:
+            cfunc(m, y=y)
+        self.assertIn('m and y must have the same number of variables', str(raises.exception))


Perhaps this should refer to the dimension size mismatch? Is variables a bit generic/ambiguous?

I changed the wording a bit - I meant 'variables' in the statistics sense, but I agree the wording's poor. Hopefully it's now less poor :)
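As background on the 'variables' wording, here is a small sketch of how NumPy itself treats `m` and `y` with the default `rowvar=True`: each row is a variable and each column an observation, so `y` has to supply the same number of observations as `m`. The arrays are made up for illustration, and the sketch only checks the exception type, not the message text.

```python
import numpy as np

# With the default rowvar=True, rows are variables and columns are observations.
m = np.array([[0.0, 1.0, 2.0],
              [2.0, 1.0, 0.0]])      # 2 variables, 3 observations

y_ok = np.array([1.0, 3.0, 2.0])     # 1 extra variable, 3 observations
cov_ok = np.cov(m, y_ok)             # 3 x 3 covariance matrix

y_bad = np.arange(4)                 # 4 observations: incompatible with m
try:
    np.cov(m, y_bad)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(cov_ok.shape)
print(type(mismatch_error).__name__)
```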

stuartarchibald · 2018-10-26T12:35:29Z

numba/targets/arraymath.py

+            return np.array(variance)
+
+        # identify up front if output is 0D
+        if isinstance(m, types.Array) and m.ndim == 1 or isinstance(m, types.Tuple):


I'm a bit puzzled by this, a Tuple type could have more than one dimension?

    In [24]: print(type(numba.typeof(((0.1, 0.2), (0.11, 0.19), (0.09j, 0.21j)))))
    <class 'numba.types.containers.Tuple'>

Also, Numba has a load of Tuple related types.

    In [30]: [x for x in dir(numba.types) if 'tuple' in x.lower()]
    Out[30]: ['BaseAnonymousTuple', 'BaseNamedTuple', 'BaseTuple', 'NamedTuple', 'NamedTupleClass', 'NamedUniTuple', 'Tuple', 'UniTuple', 'UniTupleIter']

(ignore the iter and class entries). I think this probably needs to check if m subclasses a BaseTuple and then if the types of the items in the tuple inherit from Number/Boolean, which would indicate a "1D" scenario?

    if isinstance(m, types.BaseTuple):
        if all(isinstance(x, (types.Number, types.Boolean)) for x in m.types):
            if y in (None, types.none):
                return np_cov_impl_single_variable

might help?

Yeah, this was a bug on my part. The tests all miraculously passed due to the way the if gates were set up and the various test cases resolving to UniTuple vs Tuple type etc.

Should be sorted now, or at least nearer to being sorted. I added your example explicitly in tests.

stuartarchibald · 2018-10-26T13:06:56Z

numba/tests/test_np_functions.py

+        def m_variations():
+            # array inputs
+            yield np.array([[0, 2], [1, 1], [2, 0]]).T
+            yield self.rnd.randn(100).reshape(5, 20)


Perhaps add:

    yield np.asfortranarray(np.array([[0, 2], [1, 1], [2, 0]]).T)
    yield self.rnd.randn(100).reshape(5, 20)[:,::2]

for a Fortran-order array and a slice. Note that the Fortran-order array specified a few lines below is also C contiguous, by virtue of it being a single "column" of data.

Done and good spot with the Fortran order blunder.
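The contiguity point can be seen from the array flags. The sketch below builds the two suggested inputs plus the single-column case and prints their layout flags. It assumes `self.rnd` behaves like a `np.random.RandomState`, and uses a local `rnd` as a stand-in.

```python
import numpy as np

rnd = np.random.RandomState(0)  # assumed stand-in for the test suite's self.rnd

f_order = np.asfortranarray(np.array([[0, 2], [1, 1], [2, 0]]).T)
sliced = rnd.randn(100).reshape(5, 20)[:, ::2]
column = np.linspace(-3, 3, 33).reshape(33, 1, order='F')

# f_order is Fortran contiguous only, sliced is neither,
# and the single column is both C and Fortran contiguous.
for name, arr in [('f_order', f_order), ('sliced', sliced), ('column', column)]:
    print(name, arr.shape, arr.flags.c_contiguous, arr.flags.f_contiguous)
```

So a single-column array does not exercise a layout that is exclusively Fortran ordered, which is why the extra inputs are worth adding.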

stuartarchibald · 2018-10-26T13:07:59Z

numba/tests/test_np_functions.py

+            yield np.linspace(-3, 3, 33).reshape(33, 1, order='F')
+
+            # non-array inputs
+            yield ((0.1, 0.2), (0.11, 0.19), (0.09, 0.21))


Perhaps add a non-homogeneous tuple in here? Like:

((0.1, 0.2), (0.11, 0.19), (0.09j, 0.21j))

? It should trip the issue described above WRT types.Tuple.
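For comparison, this is how plain NumPy (outside Numba) handles the suggested mixed real/complex tuple: it is coerced to a 2D complex array with three variables, so the expected result is a 3 x 3 complex matrix rather than a 0D value. A minimal sketch of the reference behaviour only; it does not exercise Numba's typing.

```python
import numpy as np

m = ((0.1, 0.2), (0.11, 0.19), (0.09j, 0.21j))

arr = np.asarray(m)     # the mixed tuple is promoted to a complex 2D array
result = np.cov(m)      # three variables, two observations each

print(arr.shape, arr.dtype)
print(result.shape, result.dtype)
```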

stuartarchibald · 2018-10-26T13:27:42Z

numba/targets/arraymath.py

+        m_dt = determine_dtype(m)
+        y_dt = determine_dtype(y)
+        dtype = np.result_type(m_dt, y_dt, np.float64)
+


Think ddof needs type checking to be an integer type, non-integral DOFs don't make sense do they? Seems like NumPy will accept floats so long as they are integral value, perhaps we can just accept types.Integer for now and catch this at compile time? A unit test checking this would then be good, thanks.

I added something (possibly a bit crude) which should allow int, bool and float if integral value - with some explicit tests. Let me know what you think.

I can't think of a case where anything other than 0 and 1 make sense, in the absence of weights (which I haven't implemented anyway).

Great, thanks.
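To make the role of `ddof` concrete: without weights, NumPy divides the centred scatter matrix by `N - ddof`, where `N` is the number of observations. The sketch below checks that against `np.cov` for `ddof` of 0 and 1; the data and variable names are illustrative.

```python
import numpy as np

m = np.array([[0.0, 1.0, 2.0],
              [2.0, 1.0, 0.0]])
n_obs = m.shape[1]                            # observations are columns (rowvar=True)

centred = m - m.mean(axis=1, keepdims=True)   # subtract each variable's mean
scatter = centred @ centred.T                 # unnormalised sum of products

for ddof in (0, 1):
    expected = scatter / (n_obs - ddof)
    assert np.allclose(np.cov(m, ddof=ddof), expected)
```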

rjenc29 · 2018-10-28T18:34:55Z

Thanks for the detailed feedback. The last bunch of commits should address all points raised. I will keep an eye on CI - apologies in advance for any jet-lag related gremlins on my part.

stuartarchibald

Thanks for the fixes. There's a few minor things I've spotted in the new changes, but otherwise good. Once these small things are resolved this can be merged. Thanks again.

stuartarchibald · 2018-10-29T15:08:27Z

numba/targets/arraymath.py

-                # 'variables' as the constraint on rows or columns depends on
-                # whether rowvar is True or False...
+                msg = ("m and y have incompatible dimensions and thus "
+                       "cannot be concatenated")


Can probably drop the part about concatenation, this is an implementation detail?

stuartarchibald · 2018-10-29T15:17:35Z

numba/targets/arraymath.py

+@register_jitable
+def _handle_ddof(ddof):
+    if not np.isfinite(ddof):
+        raise TypeError('Cannot convert non-finite ddof to integer')


Should this be a ValueError? The function is accepting a floating point type, it's just that it must be finite?

Yeah, I made this a ValueError initially and can't quite recall why I changed it - so it's back to a ValueError

stuartarchibald · 2018-10-29T15:18:03Z

numba/targets/arraymath.py

+    if not np.isfinite(ddof):
+        raise TypeError('Cannot convert non-finite ddof to integer')
+    if ddof - int(ddof) != 0:
+        raise TypeError('ddof must be integer')


Same here, ValueError? Also, perhaps the message should read 'ddof must be an integral value', as floating point is being accepted?
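The runtime checks being discussed amount to the following plain-Python sketch. `check_ddof` is a hypothetical name, the second message uses the wording suggested in this review rather than the merged text, and the real helper is a Numba `register_jitable` function using `np.isfinite`, which this sketch does not reproduce.

```python
import math


def check_ddof(ddof):
    """Hypothetical helper: return ddof as an int, or raise ValueError."""
    if not math.isfinite(ddof):
        # nan and +/-inf have the right type but an unusable value
        raise ValueError('Cannot convert non-finite ddof to integer')
    if ddof - int(ddof) != 0:
        # floats are accepted only when they hold an integral value
        raise ValueError('ddof must be an integral value')
    return int(ddof)


print(check_ddof(1.0))   # 1
print(check_ddof(True))  # 1
```

Values such as `1.1`, `float('nan')` or `float('inf')` raise `ValueError` here, since the type is acceptable and only the value is not.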

stuartarchibald · 2018-10-29T15:28:24Z

numba/targets/arraymath.py

+            elif isinstance(ddof, types.Float):
+                _DDOF_HANDLER = _handle_ddof
+            else:
+                raise TypeError('ddof must be integer')


Think this can be a TypingError, as it'll be caught at compile time? Would 'ddof must be a numerical scalar type' make more sense as the message too, since it's a typing problem?

stuartarchibald · 2018-10-29T15:35:56Z

numba/tests/test_np_functions.py

+            msg = 'Cannot convert non-finite ddof to integer'
+            _check(m, ddof, msg)
+
+        for ddof in 'bacon', np.arange(4), 1.1, -3.142:


Nit: Perhaps s/bacon/junk/ as it's a more common nonsense input value?

Yeah, removed it - although I am always keen to include breakfast nouns in tests :)

stuartarchibald · 2018-10-29T15:44:34Z

numba/targets/arraymath.py

+        m_dt = determine_dtype(m)
+        y_dt = determine_dtype(y)
+        dtype = np.result_type(m_dt, y_dt, np.float64)
+


Great, thanks.

stuartarchibald · 2018-10-29T15:46:14Z

numba/targets/arraymath.py

+                return np_cov_impl_single_variable
+
+        if isinstance(m, types.Sequence):
+            if not isinstance(m.key[0], types.Sequence) and y in (None, types.none):


This may cause issues at some point, but we can defer concern until sequence type detection is altered.

Yes, there may be a tax to pay - but there are quite a few tests so it should at least be apparent when it's affected and needs a rethink

rjenc29 · 2018-10-29T23:02:13Z

I think I've actioned the above - all seems sensible to me. Let me know any residuals and I will close them out ASAP. Cheers!

stuartarchibald

Thanks for all the work on this and your persistence in the correct handling of all the details. This can be merged once CI passes. (CI passing requires the LLVM 7 upgrade, I'll kick off a new build once that's done but do not foresee any issues). Thanks again for the contribution!

seibert · 2018-10-31T18:55:58Z

merging since we've rolled back to LLVM 6

rjenc29 added 30 commits September 24, 2018 11:42

initial commit

8020e5f

implement y

4146ae1

tweak travis and tests.sh

9e28566

implement rowvar

6214f6a

handle non array inputs

ea5d7c7

baseline for CI

87bfcec

handle negative ddof

7c184e4

minor tweak

07748c5

handle complex input

0de4791

dtype evaluation

3fffaf4

add guard for array dims and associated tests

e24666b

require two or more variables

794a8ad

handle 1D array input

eb52932

handle empty array

7312260

handle empty array v2

8fad984

handle empty array m and y

14303ae

temporarily remove test case

27e31f0

temporarily remove test case

03ea0dc

temporarily remove test cases

39f1297

add test back

a276817

add test back

9fb51fa

add test back

9d17159

add test back

1735006

verify failing test

96a710f

verify failing test

6952b87

handle failing test

a6c3f1c

prep for PR

e3fe081

concatenate bugfix

3d286c5

revert travis and tests.sh

b3c1994

update tests

838b9b3

rjenc29 changed the title ~~[WIP] Support for np.cov~~ Support for np.cov Oct 17, 2018

merge master

611b62a

stuartarchibald requested changes Oct 26, 2018


rjenc29 added 10 commits October 28, 2018 09:12

fix typo and advertise SciPy requirement in docs

b824f7a

address PEP8 issue

739c89d

simplify type check

ce38b07

update exception message

9c7d3ad

added tests

ab87470

validate ddof

8f273fd

add references to numpy tests

24b9db9

bugfix tuple handling

b3292d9

additional tests and handle tuple of single tuple

50e2396

factor out common data prep

e9111ee

rjenc29 added 3 commits October 29, 2018 08:08

tweak to ddof handling

8921894

Merge branch 'master' into cov

fc5c55f

add required guards to test

5903fbd

stuartarchibald reviewed Oct 29, 2018


rjenc29 added 3 commits October 29, 2018 22:17

revert merge master and reapply test guards

e33f296

change exception types

1451718

Merge master back in to feature branach

e825651

stuartarchibald approved these changes Oct 30, 2018


seibert merged commit b44b5e6 into numba:master Oct 31, 2018

stuartarchibald mentioned this pull request Nov 14, 2018

Numba nopython mode does not support numpy.cov #3496

Closed
