Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add np.divmod ufunc #9063

Merged
merged 5 commits into from
May 8, 2017
Merged

ENH: add np.divmod ufunc #9063

merged 5 commits into from
May 8, 2017

Conversation

shoyer
Copy link
Member

@shoyer shoyer commented May 6, 2017

Fixes #8932

TODO:

  • add it to NDArrayOperatorsMixin
  • update the ufunc overrides NEP
  • use it to implement ndarray.__divmod__?

See Also
--------
floor_divide : Equivalent of Python ``//`` operator.
remainder : Remainder complementary to floor_divide.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also equivalent to the % operator, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add the equivalent for divmod(in, 1.):

modf : Return the fractional and integral parts of an array, element-wise.

}
else {
/* handle mixed case the way Python does */
const @type@ rem = in1 % in2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should be using div, ldiv, and lldiv here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scrap that, apparently it's better to leave the compiler to it. Might be sensible to have a const @type@ quo = in1 / in2; line right beside this one though, just to help it out

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that is definitely clearer. Done.

check(np.frexp(ArrayLike(2 ** -3)))
check(np.frexp(ArrayLike(np.array(2 ** -3))))
mantissa, exponent = np.frexp(2 ** -3)
expected = (ArrayLike(mantissa), ArrayLike(exponent))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use wrap_array_like here, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I think it's actually clearer to call ArrayLike() separately in cases where we know the arity of the arguments

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good, but I remember that we had quite a few problems with ensuring floor_div and remainder were consistent with each other and with cpython (#7258); I suggest to add tests for divmod for all the test cases that were introduced in #7258.

See Also
--------
floor_divide : Equivalent of Python ``//`` operator.
remainder : Remainder complementary to floor_divide.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add the equivalent for divmod(in, 1.):

modf : Return the fractional and integral parts of an array, element-wise.

@shoyer
Copy link
Member Author

shoyer commented May 6, 2017

I just did a quick benchmark, before switching the implementation of ndarray.__divmod__. Somewhat surprisingly, np.divmod is more than 2x faster in my micro-benchmark:

In [1]: import numpy as np

In [2]: x = np.arange(100000)

In [3]: %timeit np.floor_divide(x, 10)
1.33 ms ± 23.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: %timeit np.remainder(x, 10)
1.35 ms ± 10.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [5]: %timeit divmod(x, 10)
2.7 ms ± 49.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [6]: %timeit np.divmod(x, 10)
1.19 ms ± 14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

I'm not quite sure how that's possible, but there it is!

@eric-wieser
Copy link
Member

That doesn't surprise me at all:

  • No python operator resolution to wait on
  • Only invoking the overhead of the ufunc machinery once, rather than twice
  • Looping over the inputs once rather than twice
  • Many processors calculate % and \ simultaneously, and then just give back the result that you ask for - so half as many ALU instructions here too (GCC can optimize once the div and mod are not in separate ufuncs).

Divisor array.
out : tuple of ndarray, optional
Arrays into which the output is placed. Their types are preserved and
must be of the right shape to hold the output. See doc.ufuncs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that See doc.ufuncs doesn't actually produce a link of any kind, so should probably not be there at all, to avoid furthering the myth that it works.

Once we merge #9026, we'll get those links for every ufunc anyway.

@shoyer
Copy link
Member Author

shoyer commented May 6, 2017

That doesn't surprise me at all:

Each of these would give you a 2x speedup, but usually there's also some small fixed overhead. I would expect the time required would go from fixed_setup + 2 * compute_divmod to fixed_setup + compute_divmod. Here it looks like the fixed overhead is negative?!?

@mhvk
Copy link
Contributor

mhvk commented May 6, 2017

@shoyer - floor_div and remainder both effectively do a full divmod to ensure the results are correct -- see #7258 -- so a factor two improvement is in fact expected!

@shoyer
Copy link
Member Author

shoyer commented May 6, 2017

@mhvk yes, but how do you account for 2.27x speedup? :)

@eric-wieser
Copy link
Member

eric-wieser commented May 6, 2017

Here it looks like the fixed overhead is negative?!?

Only that the fixed overhead is smaller in the case of np.divmod - which is what I would expect, since you removed the overhead of PyNumber_DivMod in that test


Returns
-------
out1 : ndarray
Element-wise result of floor division.
Element-wise quotient resulting from floor division.
out2 : ndarray
Element-wise remainder from division.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're mentioning floor division above, then presumably you should do so here too?

@shoyer
Copy link
Member Author

shoyer commented May 6, 2017

I still don't get how np.divmod is faster than calling np.floor_divide or np.remainder and throwing away half the result, but I don't need to understand it to appreciate it :).

I'm about to add this as the implementation forndarray.__divmod__ and add the remaining test cases from #7258.

One question for the bikeshedders: what do we call this function? NumPy ufuncs have full non-abbreviated names, e.g., np.divide and np.remainder for the two separated operators. This suggests several possible alternatives to np.divmod:

  • np.divmod: consistent with Python's name for the operation, but masks the builtin divmod if write from numpy import *.
  • np.divide_remainder: consistent with the equivalent ufuncs that return one result, but one name is a verb and the other is a noun, which is an awkward combination. Unfortunately, there doesn't seem to be a single verb for "to take the remainder".
  • np.division_modulo: consistent with the Python's operator names
  • np.quotient_remainder: Both nouns.

@eric-wieser
Copy link
Member

eric-wieser commented May 6, 2017

NumPy ufuncs have full non-abbreviated names,

This actually catches me out a lot (especially np.mul not being a thing), so I'd favor sticking with divmod to minimize that pain. np.abs, np.all, and np.any already result in name-hiding, so I don't think that needs to be a concern.

# TODO: test div on Python 2, only
operator.mod,
divmod,
pow,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It puzzles me that operator.pow is a thing, but operator.divmod is not

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these are operators in the sense that they have a corresponding symbol. (Hopefully, we'll have a operator.matmul in here shortly...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. In particular, operator.pow has two arguments, but pow has three

@shoyer
Copy link
Member Author

shoyer commented May 6, 2017

Please take another look. I think I've finished up the changes I wanted to include here.

(Fixing the docstring for np.isin was required to get the doc build to run)

Returns
-------
out1 : ndarray
Element-wise quotient resulting from floor division.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: resulting should probably be in both places or neither

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The quotient is the result of floor division, but the remainder is a by-product. So I think the current language makes sense (but I'm open to alternatives if you have a concrete suggestion).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced results and byproducts are disjoint sets, but I'm happy to leave this as is based on your rationale.

Copy link
Member

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor nitpicks from me - patch looks great!

assert_(b < rem <= 0, msg)
else:
assert_(b > rem >= 0, msg)
for op in [floordiv_and_mod, divmod]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps before this loop, or even at the file level:

signs = {
    d: (+1,) if dt in np.typecodes['UnsignedInteger'] else (+1, -1)
    for d in dt
}

And then itertools.product(signs[dt1], signs[dt2]) below, which removes the continues

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, better would be, in the class level:

def _signs(self, dt):
    if dt in np.typecodes['UnsignedInteger']:
        return (+1,)
    else:
        return (+1, -1)

Constructing that dict is kinda clunky, and you'd end up repeating it over multiple tests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

This ufunc corresponds to the Python builtin `divmod`, and is used to implement
`divmod` when called on numpy arrays. ``np.divmod(x, y)`` calculates a result
equivalent to ``(np.floor_divide(x, y), np.remainder(x, y))`` but is
approximately twice as fast as calling the functions separately.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth pointing out that the builtin divmod now dispatches to this - I'm an idiot, you've already done this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And before release I'm going to gather all the new ufuncs into a New ufuncs section. There are enough of them in the 1.13 release to justify that.

@mhvk
Copy link
Contributor

mhvk commented May 7, 2017

@shoyer - this looks good! I like the loops over divmod and (floor_div, remainder) -- a very direct way to ensure they are identical to each other! Only the few nitpicks by @eric-wieser left, I think.

@@ -663,14 +663,14 @@ Symbol Operator NumPy Ufunc(s)
(Python 2)
``//`` ``floordiv`` :func:`floor_divide`
``%`` ``mod`` :func:`mod`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we use mod here and not remainder? If mod is preferable, then should we reverse the alias, so that np.mod.__name__ == 'mod'?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd vote for just using remainder here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, no particular reason for this. Switched to use remainder.

--------
floor_divide : Equivalent of Python ``//`` operator.
remainder : Equivalent of Python ``%`` operator.
modf : Equivalent to ``divmod(x, 1.0)``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a reference from these functions back to divmod too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modf function comes from the C library and doesn't agree with the Python divmod for negative floats.

In [1]: np.modf(-1.5)
Out[1]: (-0.5, -1.0)

In [2]: divmod(-1.5, 1)
Out[2]: (-2.0, 0.5)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note the reversed output values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Member

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guessing you plan to rebase after a final review?

if dt in np.typecodes['UnsignedInteger']:
return (+1,)
else:
return (+1, -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Darn, I'd missed that this was in two different files, which makes extracting it a little less useful. I guess this is still a minor improvement though

remainder : Equivalent to Python's ``%`` operator.
modf : Like ``divmod(x, 1.0)`` for positive ``x``, but returns
``(remainder, quotient)`` instead of
``(quotient, remainder)``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can modf link to divmod as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention of the C library would be helpful: The C library ``modf`` function. Like ....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this what the modf docstring is for? :)

Copy link
Member

@charris charris May 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not convinced the alignment of the colons buys anything here except extra spaces.

@shoyer
Copy link
Member Author

shoyer commented May 7, 2017

Guessing you plan to rebase after a final review?

Yes, indeed

@@ -2474,6 +2475,10 @@ def add_newdoc(place, name, doc):
-----
For integer input the return values are floats.

See Also
--------
divmod : Simultaneous floor division and remainder.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was envisaging likening divmod(x, 1) to modf here, in the same way we do the reverse in divmod. If anything, this direction is more important to get the message across, since most users probably want the behaviour of divmod on negative values

@eric-wieser
Copy link
Member

LGTM. Ready for a final rebase, I think

Examples
--------
>>> np.divmod(np.arange(5), 3)
array([0, 0, 0, 1, 1]), array([0, 1, 2, 0, 1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke too soon - missing parentheses here

@shoyer
Copy link
Member Author

shoyer commented May 8, 2017

OK, git history has been rationalized. Feel free to merge once tests pass.

@eric-wieser eric-wieser merged commit 11f3ebf into numpy:master May 8, 2017
@eric-wieser
Copy link
Member

Thanks @shoyer!

@homu homu mentioned this pull request May 8, 2017
@shoyer shoyer deleted the divmod branch May 8, 2017 00:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants