remove amap, each_row etc. #2204

Closed
JeffBezanson opened this Issue · 20 comments

7 participants

@JeffBezanson
Owner

We have amap, each_row, each_col, and each_vec, all of which can be combined into a single function that applies a function over some dimension. Let the bikeshedding begin.

@JeffBezanson
Owner

There are two basic functions: the amap style, which maps a function over each slice A[:, :, ..., i, ..., :, :] forall i, or the each_vec style, which transforms vectors along some dimension A[i, j, ..., :, ..., l, m].

The second one is what's needed for sort(A, 1), cumsum, etc. Unfortunately hist does not fit; it almost does the second style, but also resizes the vector'd dimension according to the number of bins.
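The two styles can be sketched in present-day Julia as follows. This is only an illustration, assuming a recent Julia 1.x release where `cumsum` and `sort` take a `dims` keyword; the array `A` and the variable names are made up for the example and do not come from the thread.

```julia
# Illustrative 2×3×4 array (hypothetical data).
A = reshape(collect(1.0:24.0), 2, 3, 4)

# Slice style: apply a function to each slice A[:, :, i] along dimension 3.
slice_sums = [sum(A[:, :, i]) for i in 1:size(A, 3)]

# Vector style: transform each vector A[:, j, k] along dimension 1.
# The result has the same shape as A.
cums = cumsum(A; dims=1)
rev_sorted = sort(A; dims=1, rev=true)
```

The slice style reduces each slice to one value here, while the vector style keeps the array shape, which is why `hist` (whose output length depends on the number of bins) fits neither cleanly.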

These functions should be called something like mapslices and mapvecs. Those names aren't great but they are getting close.
cc @StefanKarpinski @ViralBShah

@JeffBezanson JeffBezanson was assigned
@ViralBShah
Owner

I like mapslices and mapvecs. The names are quite clear.

@panlanfeng

Hey guys, I just updated Julia and found that mean(array, dims) and amap have disappeared. I found mean(array, dims) in Stats.jl, but could you tell me where amap is now, or what its new name is? I wrote a LARS algorithm in Julia and my program depends on those two functions. Thank you!

@JeffBezanson

You can get to it with Base.amap. I'm curious: do you actually want the result in the way amap gives it, as a cell array?

@panlanfeng

A cell array is OK if the original array is two-dimensional. In R, if x is a 4*5*6 array, apply(x, c(1, 2), mean) will return a 4*5 array. It seems amap cannot do this now. What's your plan for this kind of function?

@panlanfeng

As an R user, I can't live without the *ply family of functions.

@JeffBezanson

My plan would be to do that with mapvecs(mean, x, 3), which maps mean over each vector along dimension 3. I want to return a single array instead of the cell array, so I was checking to see if there is a good reason to want the cell array instead.
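For readers on a current Julia release, the R call apply(x, c(1, 2), mean) can be approximated as below. This is a sketch under assumptions: `mapvecs` never shipped under that name, so the example uses `mapslices` with the `dims` keyword as it exists in Julia 1.x, where `dims` names the dimensions handed to the function. The array `x` is hypothetical data.

```julia
using Statistics  # standard library; provides mean

# Hypothetical 4×5×6 array.
x = reshape(collect(1.0:120.0), 4, 5, 6)

# Apply mean to each vector x[i, j, :] along dimension 3.
# The result is a single array of size (4, 5, 1), not a cell array.
m = mapslices(mean, x; dims=3)

# Drop the singleton dimension to get the 4×5 matrix that R's apply returns.
m2 = dropdims(m; dims=3)
```

For a plain reduction such as the mean, `mean(x; dims=3)` gives the same values directly; `mapslices` is the general tool when the function is not a built-in reduction.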

@mschauer

You mention two basic styles, "mapslices" and "mapvecs".
A third that comes to mind is the APL style, which applies the function to the (ndims(A)-n)-dimensional contiguous subarrays along the leading axes:
A[:, :, ..., :, i1, i2, ..., in] forall i1, i2, ..., in

The second question is whether one should be able to apply array-valued functions with mapslices.
Finally, I wanted to mention that these functions have dyadic or n-adic counterparts, which should have the same or similar names.

@JeffBezanson

@mschauer What would be a good name for that function?

@mschauer

Hm, I'll think about it. APL/J uses the rather idiosyncratic notion of the "rank" of a function: the dimension of the subarrays ("cells") the function works on.

http://en.wikipedia.org/wiki/Rank_(J_programming_language)

Maybe mapsubarrays.

@JeffBezanson

How about maprank(f, A, r) where r is the apl-style rank of the function? e.g. maprank(f, A, 3) operates on A[:,:,:,I1,I2,...,In].
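A minimal sketch of what the proposed maprank could look like, written against current Julia. The helper below is hypothetical: `maprank` is not part of Base, and the definition assumes the Julia 1.x form of `mapslices`, whose `dims` keyword lists the dimensions passed to the function as a slice.

```julia
# Hypothetical helper, not part of Base: apply f to every subarray spanning
# the first r dimensions, i.e. A[:, ..., :, I1, ..., In].
maprank(f, A, r) = mapslices(f, A; dims=collect(1:r))

# Illustrative 2×3×4 array.
A = reshape(collect(1.0:24.0), 2, 3, 4)

# Scalar-valued f: one value per leading 2×3 matrix, result size (1, 1, 4).
totals = maprank(sum, A, 2)

# Array-valued f: each 2×3 matrix is rescaled to sum to one, shape unchanged.
normalized = maprank(M -> M ./ sum(M), A, 2)
```

The second call also touches on the earlier question of array-valued functions: as long as every call returns an array of the same size, the results can be assembled into one array.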

@diegozea

Be aware of the notion of rank in statistics: http://en.wikipedia.org/wiki/Ranking
For example, mapranks sounds like a map function for an Array of categorical or rank data.

@JeffBezanson

That's a fair point, and we don't use rank to refer to dimensions anywhere else. Maybe mapleading, since it applies to leading dimensions?

@timholy
Owner

All of these cases seem like they could collapse into a single one: mapslices(f, A, dims). This most recent example could be mapslices(f, A, 4:ndims(A)), your mapvecs would be mapslices(f, A, setdiff(1:ndims(A), i)) and the original mapslices would be unchanged.

The middle case might be improved by defining a Dim type and the operation !Dim(i).
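The unification can be sketched as follows. Two assumptions are worth flagging: the examples in this comment treat the third argument as the dimensions iterated over, whereas `mapslices` in current Julia 1.x takes a `dims` keyword naming the dimensions kept as `:`; and the wrapper name `mapslices_iter` is hypothetical, introduced only to translate between the two conventions with `setdiff`.

```julia
# Hypothetical wrapper using this comment's convention: `iterdims` lists the
# dimensions iterated over; every remaining dimension is part of the slice.
mapslices_iter(f, A, iterdims) =
    mapslices(f, A; dims=setdiff(1:ndims(A), iterdims))

# Illustrative 2×3×4 array with values in descending order.
A = reshape(collect(24.0:-1.0:1.0), 2, 3, 4)

# "mapvecs" along dimension 1: iterate over all dimensions except 1.
sorted = mapslices_iter(sort, A, setdiff(1:ndims(A), 1))

# APL-style leading-dimension case: iterate over dimension 3 only,
# so f sees each 2×3 matrix A[:, :, k].
totals = mapslices_iter(sum, A, 3:ndims(A))
```

The two conventions are complements of each other, so either one can express all three styles.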

@JeffBezanson

Excellent observation. We should probably just provide that, and optimize the common cases.

@StefanKarpinski

I think that mapslices is the best name we've come up with so far. The in-place modification version can be named mapslices!, although that is, of course, applicable only in a subset of cases where mapslices would work.

@mschauer

Indeed, that looks nice, with the slight regret that, since arrays are column-major, the cache-friendly access
mapslices(f, A, 4:ndims(A)) looks unnatural compared to, say, mapslices(f, A, 1:4).

@JeffBezanson

We could just invert the meaning of the third argument; it specifies either the : dimensions or the dimensions to iterate over. I have trouble remembering which is which anyway.

@JeffBezanson JeffBezanson closed this issue from a commit
@JeffBezanson JeffBezanson add mapslices(). closes #2204
this is a beautifully general function. we can use it for #2484
and similar missing functions.
it is also a really good function to focus on optimizing, and for honing
things like indexing behavior.
c09d568
@ViralBShah
Owner

It would be great to have some documentation on mapslices in the help.

@mschauer

Here it is? c09d568#L2R2294

PS: "this is a beautifully general function": justified :+1:

@jiahao jiahao referenced this issue from a commit
@JeffBezanson JeffBezanson add mapslices(). closes #2204
this is a beautifully general function. we can use it for #2484
and similar missing functions.
it is also a really good function to focus on optimizing, and for honing
things like indexing behavior.
e8cff8e