Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for np.clip and ndarray.clip. #3468

Closed
wants to merge 5 commits into from

Conversation

arvoelke
Copy link
Contributor

@arvoelke arvoelke commented Nov 2, 2018

As title.

Fixes #3465

@arvoelke arvoelke mentioned this pull request Nov 2, 2018
@arvoelke
Copy link
Contributor Author

arvoelke commented Nov 2, 2018

Note that I had to comment out a couple tests because they are currently broken! Something to do with None types. Help / pointers would be appreciated. Here is a snippet of the error from uncommenting the cfunc(a, None, None) test:

py.test numba/tests/test_array_methods.py::TestArrayMethods::test_clip
           numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
E           Invalid use of Function(<function clip at 0x7f5ab69b6158>) with argument(s) of type(s): (array(float64, 1d, C), none, none, none)
E            * parameterized
E           In definition 0:
E               TypingError: Failed in nopython mode pipeline (step: nopython frontend)
E           Invalid use of Function(<built-in function lt>) with argument(s) of type(s): (float64, none)
E           Known signatures:
E            * (bool, bool) -> bool
E            * (int8, int8) -> bool
E            * (int16, int16) -> bool
E            * (int32, int32) -> bool
E            * (int64, int64) -> bool
E            * (uint8, uint8) -> bool
E            * (uint16, uint16) -> bool
E            * (uint32, uint32) -> bool
E            * (uint64, uint64) -> bool
E            * (float32, float32) -> bool
E            * (float64, float64) -> bool
E            * parameterized
E           In definition 0:
E               All templates rejected
E           In definition 1:
E               All templates rejected
E           In definition 2:
E               All templates rejected
E           In definition 3:
E               All templates rejected
E           In definition 4:
E               All templates rejected
E           In definition 5:
E               All templates rejected
E           In definition 6:
E               All templates rejected
E           This error is usually caused by passing an argument of a type that is unsupported by the named function.
E           [1] During: typing of intrinsic-call at /home/arvoelke/CTN/arvoelke-numba/numba/targets/arrayobj.py (1713)
E           
E           File "numba/targets/arrayobj.py", line 1713:
E               def np_clip_impl(a, a_min, a_max, out=None):
E                   <source elided>
E                   for i in range(len(a)):
E                       if a_min is not None and a[i] < a_min:
E                       ^
E           
E               raised from /home/arvoelke/CTN/arvoelke-numba/numba/typeinfer.py:864

@arvoelke
Copy link
Contributor Author

arvoelke commented Nov 2, 2018

I'm also encountering some strange behaviour on this branch when it comes to using this overloaded np.clip within larger programs. For example, when adding this line to my test function:

jit(nopython=True)(lambda a: np.expm1(np.clip(a, -5, 5)))(a)

This raises the error:

...
E           Invalid use of Function(<ufunc 'expm1'>) with argument(s) of type(s): (?array(float64, 1d, C))
E            * parameterized
E           In definition 0:
E               TypingError: can't resolve ufunc expm1 for types (?array(float64, 1d, C),)
E               raised from /home/arvoelke/CTN/arvoelke-numba/numba/typing/npydecl.py:105
E           This error is usually caused by passing an argument of a type that is unsupported by the named function.
E           [1] During: resolving callee type: Function(<ufunc 'expm1'>)
...

The strange thing is I can get this to work if I replace the np.clip with a call to np_clip_impl decorated with @njit. And furthermore this inner-decorator turns out to be necessary. So there seems to be some difference in the type-inference between my @overloaded implementation and my @njited implementation.

@codecov-io
Copy link

codecov-io commented Nov 2, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@865b87a). Click here to learn what that means.
The diff coverage is 61.9%.

@@            Coverage Diff            @@
##             master    #3468   +/-   ##
=========================================
  Coverage          ?   80.61%           
=========================================
  Files             ?      392           
  Lines             ?    79833           
  Branches          ?     9076           
=========================================
  Hits              ?    64359           
  Misses            ?    14060           
  Partials          ?     1414

raise ValueError("array_clip: must set either max or min")
if out is None:
out = np.empty_like(a)
for i in range(len(a)):
Copy link
Contributor Author

@arvoelke arvoelke Nov 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: use for index, val in np.ndenumerate(a): and add multi-dimensional tests. Currently blocked on #3469 for the scalar case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that fixing np.ndenumerate(scalar) may well be complicated, could the implementation just branch to an function that specifically deals with scalars as a workaround? e.g.

@overload(np.clip)
def np_clip(a, a_min, a_max, out=None):
    if isinstance(a, types.Number):
       def np_clip_impl(a, a_min, a_max, out=None):
            # scalar impl
    elif isinstance(a, types.Array):
        def np_clip_impl(a, a_min, a_max, out=None):
            # array impl
    return np_clip_impl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the most recent changes (see thread) I can't see a way of doing this cleanly, since this blows up the amount of case-work for handling a_min and/or a_max being None. Suggestions welcome, or this could be deferred to #3469?

@arvoelke
Copy link
Contributor Author

arvoelke commented Nov 7, 2018

Could someone please provide me with a lead on why jit(nopython=True)(lambda a: np.expm1(np.clip(a, -5, 5)))(a) throws the above error on this branch? Despite both np.clip(a, -5, 5) and np.expm1(a) working in nopython mode independently, they break when composed. Related PRs / issues / code / docs would be appreciated.

@seibert
Copy link
Contributor

seibert commented Nov 11, 2018

Speculating here, I think it is related to how Numba handles "optional" types, which are types where the value is either of a specific type (like float64) or None. It is possible that type inference is behaving oddly for some reason.

@sklam: Can you offer some advice here?

@stuartarchibald
Copy link
Contributor

I think the "optional" return type is because type inference has to accommodate out being None or some known type, the returned type is effectively array or None.

The issue present arises because ufuncs have no handling for optional types, as an example:

from numba import njit
import numpy as np

def foo(A):
    return np.cos(A)

njit("(float64[:],)")(foo)(np.arange(10.)) # fine
njit("(optional(float64[:]),)")(foo)(np.arange(10.)) # broken

To force type inference to determine the type, specialisation is needed:

def _alloc_out(a, X): # dummy function that takes the arg we need to resolved and information to do so
    pass

@overload(_alloc_out)
def _alloc_out_impl(a, X): # overload the dummy function returning an implementation determined by whether `X` is a None-like type.
    if X is None or isinstance(X, types.NoneType):
        def impl(a, X):
            return np.empty_like(a)
    else:
        def impl(a, X):
            return X
    return impl

@overload(np.clip)
def np_clip(a, a_min, a_max, out=None):
    def np_clip_impl(a, a_min, a_max, out=None):
        if a_min is None and a_max is None:
            raise ValueError("array_clip: must set either max or min")
        ret = _alloc_out(a, out) # Force typing to explicitly resolve the type of `ret`, the function call will be inlined
        for i in range(len(a)):
            if a_min is not None and a[i] < a_min:
                ret[i] = a_min
            elif a_max is not None and a[i] > a_max:
                ret[i] = a_max
            else:
                ret[i] = a[i]
        return ret
    return np_clip_impl

by delegating the compilation of out through a function that explicitly handles the None vs. known array (i.e. resolves the optional), the type can be fully determined.

@arvoelke
Copy link
Contributor Author

Thanks @stuartarchibald. I pushed a commit with your change. That did the trick in getting jit(nopython=True)(lambda a: np.expm1(np.clip(a, -5, 5)))(a) to work! Your explanation also makes sense. I didn't add this explicitly as a test case though since it seems out of place? I just tested it manually offline.

I also had to do something similar to handle the possibilities of a_min and/or a_max being None. Otherwise my tests didn't pass. Are there ways to simplify my code while still passing all the tests?

It also supports multi-dimensional arrays now. The scalar case is broken (as in #3469), which I could fix with more type-specific implementations, but this would double the amount of code to handle all the possibilities again -- unless I'm missing a cleaner way of doing this.

@arvoelke
Copy link
Contributor Author

That did the trick in getting jit(nopython=True)(lambda a: np.expm1(np.clip(a, -5, 5)))(a) to work!

After some more follow-up, it appears that:

jit(nopython=True)(lambda a: np.expm1(a.clip(-5, 5)))(a)

is still broken for the same reason (difference is between np.clip and a.clip). Do I need to copy-paste this approach to get a type-specific implementation for the @overload_method(types.Array, 'clip') as well?

@stuartarchibald
Copy link
Contributor

@arvoelke RE #3468 (comment) I took a look at this and was a bit puzzled, so dug around a bit and think it's a bug, reported it here #3489.

For now, please could you try just dropping the out kwarg in the array method implementation and if that works it can be documented as such. That gets the functionality at least mostly working whilst #3489 is fixed. How does that sound?

Also, as it stands right now, the cut-off for the 0.41.0 Release Candidate is Monday 18th Nov if you were hoping for it to be in the next release? (we'll endeavour to get review turned around in time!)

Thanks.

@arvoelke
Copy link
Contributor Author

arvoelke commented Nov 15, 2018

For now, please could you try just dropping the out kwarg in the array method implementation and if that works it can be documented as such. That gets the functionality at least mostly working whilst #3489 is fixed. How does that sound?

I gave this a shot, but it appears as though the above snippet still fails when a_min or a_max are left out, i.e., jit(nopython=True)(lambda a: np.expm1(a.clip(-5)))(a) still fails. This is with the code changed to:

@overload_method(types.Array, 'clip')
def array_clip(a, a_min=None, a_max=None):
    # TODO: out argument not supported (see issue #3489)
    def impl(a, a_min=None, a_max=None):
        return np.clip(a, a_min, a_max)
    return impl

Reminder that these optional arguments have defaults for ndarray.clip but not for np.clip (they can be None for np.clip but they are required).

If on the other hand I remove the defaults, as in:

@overload_method(types.Array, 'clip')
def array_clip(a, a_min, a_max):
    # TODO: out argument not supported (see issue #3489)
    def impl(a, a_min, a_max):
        return np.clip(a, a_min, a_max)
    return impl

then jit(nopython=True)(lambda a: np.expm1(a.clip(-5, None)))(a) works just fine.

So I assume this is connected to #3489 in the same manner.

Since these defaults are a pretty significant feature of ndarray.clip, should we perhaps omit this implementation for now, and document only np.clip support under the list of other functions? Or is it better to have a mostly-working implementation associated with a known bug?

@arvoelke
Copy link
Contributor Author

arvoelke commented Nov 15, 2018

In the interest of streamlining our conversation and the review process, I've pushed a commit that removes ndarray.clip support, documents np.clip support instead, adds a couple more test cases, and documents the two outstanding bugs in the code. We can always roll-back this commit if that's no good.

@arvoelke
Copy link
Contributor Author

azure produced 3 build failures, all in the same way for one of the new tests I added:

2018-11-15T21:08:03.6820924Z ======================================================================
2018-11-15T21:08:03.6820999Z FAIL: test_clip (numba.tests.test_array_methods.TestArrayMethods)
2018-11-15T21:08:03.6821062Z ----------------------------------------------------------------------
2018-11-15T21:08:03.6821165Z Traceback (most recent call last):
2018-11-15T21:08:03.6821219Z   File "D:\a\1\s\numba\tests\test_array_methods.py", line 934, in test_clip
2018-11-15T21:08:03.6821274Z     jit(nopython=True)(lower_clip_result)(a))
2018-11-15T21:08:03.6821383Z   File "C:\Miniconda\envs\testenv\lib\site-packages\numpy\testing\_private\utils.py", line 347, in assert_equal
2018-11-15T21:08:03.6821448Z     return assert_array_equal(actual, desired, err_msg, verbose)
2018-11-15T21:08:03.6821555Z   File "C:\Miniconda\envs\testenv\lib\site-packages\numpy\testing\_private\utils.py", line 865, in assert_array_equal
2018-11-15T21:08:03.6821615Z     verbose=verbose, header='Arrays are not equal')
2018-11-15T21:08:03.6821675Z   File "C:\Miniconda\envs\testenv\lib\site-packages\numpy\testing\_private\utils.py", line 789, in assert_array_compare
2018-11-15T21:08:03.6821773Z     raise AssertionError(msg)
2018-11-15T21:08:03.6821819Z AssertionError: 
2018-11-15T21:08:03.6821872Z Arrays are not equal
2018-11-15T21:08:03.6821961Z skipped CUDA tests
2018-11-15T21:08:03.6822006Z skipped CUDA tests
2018-11-15T21:08:03.6822050Z skipped HSA tests
2018-11-15T21:08:03.6822093Z skipped HSA tests
2018-11-15T21:08:03.6822988Z 
2018-11-15T21:08:03.6823066Z (mismatch 4.950495049504951%)
2018-11-15T21:08:03.6823127Z  x: array([ -0.993262,  -0.993262,  -0.993262,  -0.993262,  -0.993262,
2018-11-15T21:08:03.6823228Z         -0.993262,  -0.993262,  -0.993262,  -0.993262,  -0.993262,
2018-11-15T21:08:03.6823284Z         -0.993262,  -0.993262,  -0.993262,  -0.993262,  -0.993262,...
2018-11-15T21:08:03.6823338Z  y: array([ -0.993262,  -0.993262,  -0.993262,  -0.993262,  -0.993262,
2018-11-15T21:08:03.6823434Z         -0.993262,  -0.993262,  -0.993262,  -0.993262,  -0.993262,
2018-11-15T21:08:03.6823489Z         -0.993262,  -0.993262,  -0.993262,  -0.993262,  -0.993262,...
2018-11-15T21:08:03.6823520Z 
2018-11-15T21:08:03.6824181Z ======================================================================

This test calls the function:

def lower_clip_result(a):
    return np.expm1(np.clip(a, -5, 5))

Maybe I should use a function other than expm1? Seems like something to do with numerical precision on the np15 builds.

@seibert
Copy link
Contributor

seibert commented Nov 15, 2018

you can use np.testing.assert_array_almost_equal for the test

@arvoelke
Copy link
Contributor Author

you can use np.testing.assert_array_almost_equal for the test

Thanks. Passed locally and pushed. Surprised that even passed before!

@arvoelke
Copy link
Contributor Author

Also, as it stands right now, the cut-off for the 0.41.0 Release Candidate is Monday 18th Nov if you were hoping for it to be in the next release? (we'll endeavour to get review turned around in time!)

All of the builds passed this time. Is this ready for review, or should we hold off to fix the ndarray.clip issue?

@seibert
Copy link
Contributor

seibert commented Nov 16, 2018

I think we need to implement both to avoid confusion for users. We are planning for 0.42 to be a short dev cycle, with a release in the 3rd week of December, so if we miss 0.41, it will make it there.

@stuartarchibald stuartarchibald added this to the Numba 0.42 RC milestone Nov 19, 2018
@seibert seibert added this to In Progress in Active Dec 11, 2018
@stuartarchibald stuartarchibald added the Blocked awaiting long term feature For PRs/Issues that require the implementation of a long term plan feature label Jan 18, 2019
@stuartarchibald stuartarchibald removed the Blocked awaiting long term feature For PRs/Issues that require the implementation of a long term plan feature label Feb 5, 2019
@stuartarchibald
Copy link
Contributor

stuartarchibald commented Feb 5, 2019

Since #3704 was merged it should be possible to now implement the part of this feature that was blocked. @arvoelke would you be interested in having another go (after merging master into this branch) and letting us know how you get on? (Thanks!)

@stuartarchibald stuartarchibald changed the base branch from master to master_30_april_2019 April 30, 2019 08:30
@stuartarchibald stuartarchibald changed the base branch from master_30_april_2019 to master April 30, 2019 08:30
@seibert seibert modified the milestones: Numba 0.45 RC, Numba 0.46 RC Jul 2, 2019
@stuartarchibald
Copy link
Contributor

Just checking in on the state of this PR. @arvoelke was wondering if you have time to address the above, if not, don't worry, one of the core developers can take over to do the fixes. Thanks.

@arvoelke
Copy link
Contributor Author

My bad for letting this slip. It would be good if someone else could take over. A big thing that was slowing me down here was the use-case "in which a_min and/or a_max can be array_like." See above for documentation / etc. If the concern here isn't about matching the entire spec for np.clip then the main thing remaining is applying the idea from #3468 (comment). If there are questions while wrapping this up about the history of attempts in this PR just let me know. Still very interested in following this along. Thanks!

@stuartarchibald
Copy link
Contributor

@arvoelke thanks for letting us know and your work on this so far. One of the core devs will pick it up.

@makslevental
Copy link

any chance this will ever get merged?

@esc
Copy link
Member

esc commented Apr 6, 2020

any chance this will ever get merged?

Thanks for asking about this! Yes, there is definitely a chance that it will become merged. As soon as it becomes a priority of one of the core devs or when someone from the community picks it up and drives it to completion.

@arvoelke
Copy link
Contributor Author

arvoelke commented Apr 6, 2020

any chance this will ever get merged?

If this is something you need immediately (i.e., before merging), you're okay with calling np.clip as opposed to using the method a.clip, your array is one-dimensional, and specify both a_min and a_max as scalars (as opposed to being optional, or arrays)... then you could use the snippet in this comment here: #3465 (comment). Copied below for clarity:

@extending.overload(np.clip)
def np_clip(a, a_min, a_max, out=None):
    def np_clip_impl(a, a_min, a_max, out=None):
        if out is None:
            out = np.empty_like(a)
        for i in range(len(a)):
            if a[i] < a_min:
                out[i] = a_min
            elif a[i] > a_max:
                out[i] = a_max
            else:
                out[i] = a[i]
        return out
    return np_clip_impl

Extending this to handle all of the different ways in which np.clip and a.clip can be called has been the crux of this PR. If needed you can see how most of these cases can be handled in the commits and comments here. :)

@kernc
Copy link

kernc commented Aug 15, 2020

How about implementing np.clip in terms of np.maximum and np.minimum since those two are already implemented?

def np_clip(a, a_min, a_max, out=None):
    if out is None:
        out = np.empty_like(a)
    out[:] = np_minimum(np_maximum(a, a_min), a_max)
    return out

This seems to cover various input cases ...

@gmarkall
Copy link
Member

gmarkall commented Mar 8, 2021

PR #6808 started to continue this work.

@stuartarchibald
Copy link
Contributor

Closing this as #6808 continues. Thanks for your work on the initial implementation @arvoelke .

@stuartarchibald stuartarchibald added abandoned PR is abandoned (no reason required) and removed 2 - In Progress labels Mar 22, 2021
@stuartarchibald stuartarchibald removed this from In Progress in Active Mar 22, 2021
sklam added a commit that referenced this pull request May 19, 2021
#3468 continued: Add support for `np.clip`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
abandoned PR is abandoned (no reason required)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for np.clip
8 participants