remove amap, each_row etc. #2204

Closed
JeffBezanson opened this issue Feb 6, 2013 · 20 comments
Labels: breaking (This change will break code)

@JeffBezanson
Member

We have amap, each_row, each_col, and each_vec, all of which can be combined into a single function that applies a function over some dimension. Let the bikeshedding begin.

@JeffBezanson
Member Author

There are two basic functions: the amap style, which maps a function over each slice A[:, :, ..., i, ..., :, :] forall i, or the each_vec style, which transforms vectors along some dimension A[i, j, ..., :, ..., l, m].

The second one is what's needed for sort(A, 1), cumsum, etc. Unfortunately hist does not fit; it almost does the second style, but also resizes the vector'd dimension according to the number of bins.

These functions should be called something like mapslices and mapvecs. Those names aren't great but they are getting close.
cc @StefanKarpinski @ViralBShah
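
A minimal sketch of the two styles described above, written in current Julia syntax (the `dims` keyword) rather than the 2013 API under discussion. The array `A` and the variable names are illustrative assumptions, not code from this issue.

```julia
# Illustrative 2×3×4 array holding 1..24 in column-major order.
A = reshape(collect(1:24), 2, 3, 4)

# amap style: apply a function to each slice A[:, :, i] for all i.
# Written here as a plain comprehension; the result is one value per slice.
slice_sums = [sum(A[:, :, i]) for i in axes(A, 3)]

# each_vec style: transform every vector A[:, j, k] along dimension 1.
# The output keeps the shape of the input, as with sort or cumsum.
B = cumsum(A; dims=1)
```

The first style produces one result per slice, while the second keeps the array shape and only rewrites the vectors along the chosen dimension.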

@ghost ghost assigned JeffBezanson Feb 9, 2013
@ViralBShah
Member

I like mapslices and mapvecs. The names are quite clear.

@panlanfeng

Hey guys, I just updated Julia and found that mean(array, dims) and amap have disappeared. I found mean(array, dims) in Stats.jl, but could you tell me where amap is now, or what its new name is? I wrote a LARS algorithm in Julia and my program depends on those two functions. Thank you!

@JeffBezanson
Member Author

You can get to it with Base.amap. I'm curious: do you actually want the result in the way amap gives it, as a cell array?

@panlanfeng

A cell array is OK if the original array is two-dimensional. In R, if x is a 4*5*6 array, apply(x, c(1, 2), mean) will return a 4*5 array. It seems amap cannot do this now. What's your plan for this kind of function?

@panlanfeng

As an R user, I can't live without the *ply series of functions.

@JeffBezanson
Member Author

My plan would be to do that with mapvecs(mean, x, 3), which maps mean over each vector along dimension 3. I want to return a single array instead of the cell array, so I was checking to see if there is a good reason to want the cell array instead.
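
For readers on current Julia, the same operation can be sketched with `mapslices`, whose `dims` keyword names the dimensions of each slice passed to the function. This is not the `mapvecs` proposed above (that name is only a proposal in this thread); the array `x` is an illustrative assumption.

```julia
using Statistics  # standard library; provides `mean`

# Illustrative 4×5×6 array, matching the R example in this thread.
x = reshape(collect(1.0:120.0), 4, 5, 6)

# Apply `mean` to each vector x[i, j, :] along dimension 3.
# The result is a single array with a singleton third dimension.
m = mapslices(mean, x; dims=3)

# Drop the singleton dimension to get a 4×5 matrix, the shape that
# R's apply(x, c(1, 2), mean) returns.
m2 = dropdims(m; dims=3)
```

For plain reductions such as `mean`, calling `mean(x; dims=3)` directly gives the same values without going through `mapslices`.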

@mschauer
Contributor

You mention two basic styles, "mapslices" and "mapvecs".
A third that comes to mind is the APL style, which applies the function to the (ndims(A)-n)-dimensional contiguous subarrays along the leading axes:
A[:, :, ..., :, i1, i2, ..., in] forall i1, i2, ..., in

The second question is whether one should be able to apply array-valued functions with mapslices.
Finally, I wanted to mention that these functions have dyadic or n-adic counterparts, which should have the same or similar names.

@JeffBezanson
Member Author

@mschauer What would be a good name for that function?

@mschauer
Contributor

Hm, I'll think about it. APL/J uses the rather idiosyncratic notion of "rank" of a function as the dimension of the subarrays ("cells") the function is working on.

http://en.wikipedia.org/wiki/Rank_(J_programming_language)

Maybe mapsubarrays.

@JeffBezanson
Member Author

How about maprank(f, A, r) where r is the apl-style rank of the function? e.g. maprank(f, A, 3) operates on A[:,:,:,I1,I2,...,In].
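
A rough sketch of what such a function might look like if built on current Julia's `mapslices`. The name `maprank` comes from the proposal above and is hypothetical; it is not part of Base, and this one-line definition is only an assumption about the intended behavior.

```julia
# Hypothetical helper named after the proposal in this thread; not in Base.
# Applies f to each leading r-dimensional subarray A[:, ..., :, i1, ..., in].
maprank(f, A, r) = mapslices(f, A; dims=1:r)

# Illustrative 2×3×4 array holding 1..24 in column-major order.
A = reshape(collect(1:24), 2, 3, 4)

# With r = 2, f sees each 2×3 matrix A[:, :, k], giving one result per k.
s = maprank(sum, A, 2)
```

Because `f` returns a scalar here, the sliced dimensions collapse to size 1 and `s` has size `(1, 1, 4)`.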

@diegozea
Contributor

Be aware of the notion of rank in statistics: http://en.wikipedia.org/wiki/Ranking
For example, mapranks sounds like a map function for an Array of categorical or rank data.

@JeffBezanson
Member Author

That's a fair point, and we don't use rank to refer to dimensions anywhere else. Maybe mapleading, since it applies to leading dimensions?

@timholy
Member

timholy commented Mar 17, 2013

All of these cases seem like they could collapse into a single one: mapslices(f, A, dims). This most recent example could be mapslices(f, A, 4:ndims(A)), your mapvecs would be mapslices(f, A, setdiff(1:ndims(A), i)) and the original mapslices would be unchanged.

The middle case might be improved by defining a Dim type and the operation !Dim(i).
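
The collapse into one function can be sketched in current Julia. One caveat: in this comment the third argument lists the dimensions to iterate over, whereas current Julia's `mapslices` takes the dimensions of the slice itself via the `dims` keyword, so the two are complements of each other. The wrapper `mapover` below is a hypothetical name introduced only to show the `setdiff` translation.

```julia
# Illustrative 2×3×4 array holding 1..24 in column-major order.
A = reshape(collect(1:24), 2, 3, 4)

# Hypothetical wrapper (not in Base): takes the dimensions to iterate over
# and converts them to the slice dimensions that mapslices expects.
mapover(f, A, iterdims) = mapslices(f, A; dims=setdiff(1:ndims(A), iterdims))

# mapvecs style: f sees each vector A[i, j, :] along dimension 3.
v = mapslices(sum, A; dims=3)

# The same computation, phrased as "iterate over dimensions 1 and 2".
v2 = mapover(sum, A, 1:2)

# Leading-dimensions style: iterate over dims 3:ndims(A), so f sees A[:, :, k].
w = mapover(sum, A, 3:ndims(A))
```

Either convention expresses all three cases; the only difference is which set of dimensions the caller has to spell out.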

@JeffBezanson
Member Author

Excellent observation. We should probably just provide that, and optimize the common cases.

@StefanKarpinski
Member

I think that mapslices is the best name we've come up with so far. The in-place modification version can be named mapslices!, although that is, of course, applicable only in a subset of cases where mapslices would work.

@mschauer
Contributor

Indeed, that looks nice, with one slight regret: since arrays are column-major, the cache-friendly access
mapslices(f, A, 4:ndims(A)) looks unnatural compared to, say, mapslices(f, A, 1:4).

@JeffBezanson
Member Author

We could just invert the meaning of the third argument; it specifies either the : dimensions or the dimensions to iterate over. I have trouble remembering which is which anyway.

@ViralBShah
Member

It would be great to have some documentation on mapslices in the help.

@mschauer
Contributor

Here it is? c09d568#L2R2294

PS: "this is a beautifully general function": justified 👍
