Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cumsum (and accumulation operators generally?) should have an option to leave the identity in the first spot #6044

Open
njsmith opened this issue Jul 4, 2015 · 5 comments

Comments

@njsmith
Copy link
Member

njsmith commented Jul 4, 2015

I just found myself writing:

points_on_path = ...
point_to_point_distances = np.hypot(points_on_path[1:, :], point_ons_path[:-1, :])
arclengths = np.cumsum(point_to_point_distances)
arclengths = np.concatenate(([0], arclengths)
# now arclengths.shape == points_on_path.shape

and it occured to me that this is definitely not the first time I've found myself doing the awkward concatenate-a-zero dance in order to get a cumsum array whose shape matches what I want. Maybe we should have a convenience argument to cumsum (and to ufunc.accumulate in general I guess) to insert the operation's identity as the first argument.

No brain to figure out the details of how this would work right now, but filing a bug to remind me/us...

@jaimefrio
Copy link
Member

Amen, brother! +1

Our ufunc.accumulate does an inclusive scan, the alternative would be an exclusive scan, adding the identity as first item, and discarding the sum of all items at the end. At least that's the terminology I picked from my do-things-with-a-GPU readings.

@njsmith
Copy link
Member Author

njsmith commented Jul 5, 2015

In this case I actually want both the identity and the final sum to both be
present. But this suggests something like 'include_final=True,
include_identity=False', where you can toggle the presence of the first and
last items independently?
(Any examples off the top of your head for when the "exclusive scan" is
useful?)

On Sat, Jul 4, 2015 at 6:09 PM, Jaime notifications@github.com wrote:

Amen, brother! +1

Our ufunc.accumulate does an inclusive scan, the alternative would be an
exclusive scan, adding the identity as first item, and discarding the sum
of all items at the end. At least that's the terminology I picked from my
do-things-with-a-GPU
http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html readings.


Reply to this email directly or view it on GitHub
#6044 (comment).

Nathaniel J. Smith -- http://vorpus.org

@jaimefrio
Copy link
Member

Starting indices, e.g. for CSR or CSC sparse matrices don't typically require the last value for anything. I think it also comes up in some of the extra returns of unique.

@mdickinson
Copy link
Contributor

definitely not the first time I've found myself doing the awkward concatenate-a-zero dance in order to get a cumsum array whose shape matches what I want

Ditto. Indeed, I'm struggling to remember any occasion when I used cumsum and didn't want the extra zero. I've also encountered cases where I didn't want the last item, but those are rarer, and slicing off an unwanted last item feels much easier than inserting the missing zero at the beginning.

Big +1 from me to include_identity. include_final feels like a nice-to-have, but not an essential, and possibly not worth expanding the API for.

@WarrenWeckesser
Copy link
Member

See the comments in #14542 for a related discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants