New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drop dimensions indexed with a scalar? #5949

Closed
timholy opened this Issue Feb 25, 2014 · 73 comments

Comments

@timholy
Member

timholy commented Feb 25, 2014

It seems likely that 0.4 will see a transition to array views (#5556). This change will surely break some code, e.g., for anyone who was relying on the behavior of setindex! applied to the output of A[3:15, 7]. Perhaps we should take the occasion to simultaneously think about another (much worse) breakage.

As we've gotten less concerned about breaking Matlab compatibility (and as I've learned more about the wider scientific ecosystem), I've increasingly begun to wonder whether NumPy's slicing rules are the right way to go. It's very simple: if you index along a particular dimension with an AbstractVector, that dimension is retained; if you index with a scalar, that dimension is dropped. So for a 2d array A, A[:,:] produces 2d output, A[3,:] and A[:,3] both produce 1d output, and A[3:3,:], A[:,3:3] both produce 2d output. Note this scheme is type-stable, so AFAICT there are no performance gotchas.

But wow would it be a breaking change.

@lindahua

This comment has been minimized.

Member

lindahua commented Feb 25, 2014

Looking at this in isolation, it sounds fine to me. However, problems may arise if we consider it in conjunction with broadcasting.

Consider this:

A .+ A[1,:]

At the first glance, I would suppose that the behavior is to add the first row of A to each row. However, if we squeeze out the first dimension, this would fail.

To extend this little bit, shall we squeezed reduced dimension for reduction operation? What would be the shape of sum(A, 1)? When we want to make decision about this, I would invite people to consider the following example that represents a very common operation in practice:

A .- mean(A, dim)
@timholy

This comment has been minimized.

Member

timholy commented Feb 25, 2014

It would simply become A .+ A[1:1,:]. Range-indexing wouldn't squeeze, Int-indexing would.

Another perspective on this: currently slice works this way, and sub never squeezes. If we're basically getting rid of those functions in preference to getindex, then slice's behavior may be the one we need to emulate, since sub doesn't give you any choices.

In my view, it's pretty clear that reductions shouldn't squeeze, but I recognize this is potentially contentious.

@StefanKarpinski

This comment has been minimized.

Member

StefanKarpinski commented Feb 27, 2014

See also #4774 and #3262. I kind of wish there weren't so many issues about basically the same thing.

@StefanKarpinski

This comment has been minimized.

Member

StefanKarpinski commented Feb 27, 2014

I would be happier with not dropping any dimensions and having slicing always return a tensor of the same rank as the tensor that is being sliced. Are there any downsides to that? Vectors would tend to get turned into column matrices, but that's fine, imo. We could also make [1;2;3] and the corresponding form with newlines construct a column matrices instead of vectors.

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Feb 27, 2014

I think that would be ok; I suspect it's more important to formalize the embedding of N-d tensors in (N+1)-d tensors with a trailing singleton dimension. Then extra dimensions can generally be ignored, rather than explicitly dropped.

@nalimilan

This comment has been minimized.

Contributor

nalimilan commented Feb 27, 2014

I'm also a big supporter of @timholy's proposal. Since in Julia you can distinguish scalars from vectors and ranges, it sounds natural to allow using this difference to indicate what the shape of the resulting slice should be.

@StefanKarpinski

This comment has been minimized.

Member

StefanKarpinski commented Feb 27, 2014

I have an unorthodox proposal: how about someone who's in favor of this go ahead and implement a Julia array type that has this indexing behavior. It doesn't have to be completely functional – I just think it would help a lot to have a mockup to play with. Then people (like me), who are skeptical of this idea can try it out and see if it's really as odious as it sounds ;-)

@timholy

This comment has been minimized.

Member

timholy commented Feb 27, 2014

I opened this issue because I'm basically intending to do just that; this is the "shot across the bow" from someone with a dangerous track record 😄 when it comes to reworking much of Julia's array indexing. Like you, I don't actually know how this will work out, and I am kind of curious to know whether I will like it.

But my general feeling is that perhaps this should be timed with other disruptions, like switching to views---both will break code, this much more than views. That said, if you really would play with this well in advance of getting serious about merging it, then perhaps one doesn't need to wait.

@timholy

This comment has been minimized.

Member

timholy commented Feb 27, 2014

I think the issue has served its purpose, and the next step is just to do it. So this can be closed.

@timholy timholy closed this Feb 27, 2014

@timholy

This comment has been minimized.

Member

timholy commented Feb 27, 2014

@nalimilan, while I agree with you in that this proposal is logical and adds power---which is why I think it's worth exploring---the part I'm worried about is that it may be so much more common to want not to slice dimensions that I'll get annoyed at writing ranges all the time. Moreover, : has quite low precedence and gets used both for ranges and for conditionals (the x?a:b syntax), so this will inevitably add to the number of parentheses in code.

@timholy

This comment has been minimized.

Member

timholy commented Feb 28, 2014

Actually, let me use this to ask one more thing: if I make the core changes to base/, anyone willing to take the lead in modifying the scripts in test/ to be consistent with the new behavior?

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Feb 28, 2014

Should we go full APL and make the rank of the result the sum of the ranks of the indexes?

@tshort

This comment has been minimized.

Member

tshort commented Feb 28, 2014

Instead of changing what's in base, can you add a new type? (Maybe that's what you plan.)

@timholy

This comment has been minimized.

Member

timholy commented Feb 28, 2014

@JeffBezanson, can you point me to a link? I think I understand what you're asking, but not sure. I'd have to think about the implementation, but it sounds doable.

@tshort, I don't think it will be an effective experiment if we don't really try it. IF we do try it (and it's by no means too late to convince me otherwise), my plan would be to post a branch and people could play with it. I'm also mulling over whether there's a way to make the transition safely with an appropriate macro or two. It's not obvious there is, but I think it's worth careful consideration.

@StefanKarpinski

This comment has been minimized.

Member

StefanKarpinski commented Feb 28, 2014

I we do make this change, then it only makes sense to go "full APL", imo.

@toivoh

This comment has been minimized.

Member

toivoh commented Feb 28, 2014

Can someone provide a link on what APL indexing semantics actually are?
Given @JeffBezanson's description, it seems more profound than just
dropping singleton dimensions or not.

@toivoh

This comment has been minimized.

Member

toivoh commented Feb 28, 2014

Also, since there seem to be definite use cases for both dropping and keeping dimensions, how about that we provide both operations? This could even make the transition non-breaking.

Or does slice already do what is wanted? Could it? (I didn't even realize that we had it, so I'm not sure exactly what it does)

@JeffBezanson

This comment has been minimized.

@lindahua

This comment has been minimized.

Member

lindahua commented Feb 28, 2014

Agree that we may seriously consider adopting the APL rules, which are much more expressive.

@timholy

This comment has been minimized.

Member

timholy commented Mar 1, 2014

Having looked at the rules briefly, yes, it sounds like a good idea to go all the way if we do this.

@toivoh, the new behavior would imitate slice:

julia> A = reshape(1:4, 2, 2)
2x2 Array{Int64,2}:
 1  3
 2  4

julia> A[1,:]
1x2 Array{Int64,2}:
 1  3

julia> sub(A, 1, :)
1x2 SubArray{Int64,2,Array{Int64,2},(Range1{Int64},Range1{Int64})}:
 1  3

julia> slice(A, 1, :)
2-element SubArray{Int64,1,Array{Int64,2},(Int64,Range1{Int64})}:
 1
 3

julia> slice(A, 1:1, :)
1x2 SubArray{Int64,2,Array{Int64,2},(Range1{Int64},Range1{Int64})}:
 1  3

What's your idea for making this non-breaking? I currently have what might be an effective, but somewhat complex, plan for migration. The key is to use deprecation on indexing operations whose behavior is changing, but provide users a mechanism for indicating that other indexing operations should follow the new behavior and therefore not be subject to the deprecation behavior:

  • getindex(A, i::Int, j::AbstractVector) would return a warning, but follow the old behavior (using @deprecate)
  • I or someone would have to create a new expression type, :newref, that gets converted into getindex_apl/setindex!_apl operations. This is the part I'm currently least sure about implementing.
  • Provide macros @ni and @newindex. The first is used on single expressions, @ni A[3,:] to create a :newref expression. The second wraps whole blocks of code and converts all :refs into :newrefs. The safest migration path would be to use @ni on an expression-by-expression basis until you're sure you've converted them all, then clear them all out by wrapping the entire thing in @newindex. And new code could start with @newindex right away, of course.

A bit ugly, but I'm not sure I see a good alternative.

@timholy

This comment has been minimized.

Member

timholy commented Mar 1, 2014

I'm going to be occupied for the next several weeks with other tasks, and in any event this is 0.4 material. Let me make sure that folks noticed my request for collaboration on this issue, specifically on converting the files in /test to the new behavior: CC @isorber, @wenxgwen, @hayd, @BobPortmann, @nalimilan.

While I am curious to know how this indexing would work out in practice, much of my motivation for being willing to try it is because of the complaints about the current scheme. (I've implemented much of Julia's current array indexing, so I feel a certain amount of responsibility for handling complaints about how it works.) I'm reasonably happy with Julia's current behavior; if none of you are willing to share the pain of this transition, then I know I don't need to feel guilty about anything 😄 and just might find better uses for my time.

@timholy

This comment has been minimized.

Member

timholy commented Mar 1, 2014

Looks like that should have been CC @lsorber.

@timholy

This comment has been minimized.

Member

timholy commented Mar 1, 2014

Forgot @malmaud

@lindahua

This comment has been minimized.

Member

lindahua commented Mar 1, 2014

@timholy If we agree to change behavior (in whatever way), better to do it along with the array view framework. If we are going to deprecate sub-arrays in favor of array views (with the GenericView, it should 100% covers what a sub-array can do), then we can implement this behavior in array views, and delegate getindex to view.

@malmaud

This comment has been minimized.

Contributor

malmaud commented Mar 1, 2014

@timholy Full APL sounds good to me.

@MikeInnes

This comment has been minimized.

Member

MikeInnes commented Mar 2, 2014

So a[[i], :] won't work because it can't be inlined/type inferred - but could a tuple work here? i.e. a[(i,), :]. An extra character, but it doesn't look so bad.

@lindahua

This comment has been minimized.

Member

lindahua commented Mar 2, 2014

As I said, I have no strong opinion on one way or the other.

However, I agree with @StefanKarpinski that the argument for dropping all singular dimensions (as opposed to only dropping the trailing dimensions) is not strong enough.

Frankly, I still have a hard time understanding why dropping trailing dimensions (without dropping non-trailing singular dimension) is such a big problem. I have never encountered any real problems caused by this behavior even when writing very generic codes.

I am not against this proposal. But I believe such a breaking change needs more justification.

@nalimilan

This comment has been minimized.

Contributor

nalimilan commented Mar 2, 2014

Another option: make x: mean x:x. This syntax used to mean x:end but it's currently deprecated. It would probably be cleaner to find a way to inline a[[x], :], though.

@lindahua IMHO dropping trailing dimensions introduces more complexity and I don't think in most cases it makes sense. For example, you may often want to extract a vector from a row as well as from a column of a matrix.

@StefanKarpinski

This comment has been minimized.

Member

StefanKarpinski commented Mar 3, 2014

dropping trailing dimensions introduces more complexity

This is the very claim that's not substantiated. I don't actually find myself wanting to extract rows as pure vectors. Ever, really. If the thing is a row, it's rowness is pretty important. If I need it to act like a column, then I transpose it. The only reason one might need it to really be a vector and not a column matrix is to resize it. But that a radically different use case (stack/queue) and I think it's reasonable to apply the vec function before changing modes like that. What is the use case you're encountering where you need something to be a vector and not a row or column vector?

@timholy

This comment has been minimized.

Member

timholy commented Mar 3, 2014

There certainly are cases where you want to drop a dimension. Consider a 3d image where you want to take an xz slice.

If folks dislike dropping on scalars, alternatively we could introduce a type,

A[DropMe(3),:]

Or something shorter, of course. Once #3721 gets fixed, I vote we exploit unicode and choose pizza. After all, we are taking slices 😄.

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Mar 3, 2014

My gut feeling about that case is that you're doing two steps: you want to select a certain xz plane, and then treat it like an xy plane. It's not obvious to me that one most often wants dimensions permuted into the leading dimensions.

@timholy

This comment has been minimized.

Member

timholy commented Mar 3, 2014

Depends on what code you want to interact with next---lots of people write image-processing algorithms thinking only of 2d (as a 4d images person, I happen to think this is abominable, but whatever). I'd go so far as to say most code you see online for image processing makes that assumption.

Images doesn't care what you do: since it names all the dimensions anyway, there's never any risk of ambiguity. But that's a special case.

@timholy

This comment has been minimized.

Member

timholy commented Mar 3, 2014

...though I should add that if what you're trying to do is confer meaning, naming dimensions is much more robust than trying to do it by position. Just wait until someone feeds you a huge file using a different storage order than you expected.

So in a sense maybe Images isn't a special case; if we really want to give meaning to dimensions, maybe we should think in terms of names.

@StefanKarpinski

This comment has been minimized.

Member

StefanKarpinski commented Mar 3, 2014

Now that's a rather interesting idea. I rather like it, but it would be awkward to always have to name your dimensions. However, it might be an interesting angle to think about the problem from. What if 1, 2 and so on are just default dimension names? Then you could have a pure vector whose only dimension is 2. This is a bit like absent dimensions, but seems more sensible.

@nalimilan

This comment has been minimized.

Contributor

nalimilan commented Mar 3, 2014

dropping trailing dimensions introduces more complexity

This is the very claim that's not substantiated. I don't actually find myself wanting to extract rows as pure vectors. Ever, really. If the thing is a row, it's rowness is pretty important.

@StefanKarpinski I don't deny that. But I don't understand why the "columnness" of columns of a matrix wouldn't be equally important.

I guess we have very different use cases. I work with contingency tables all the time, where the choice of putting one variable as rows and one as columns is completely arbitrary. I'd like them to behave the same.

The idea of having dimension names so that a vector's only dimension could be named 2 is interesting, but orthogonal to the issue of the special treatment of trailing dimensions.

@lindahua

This comment has been minimized.

Member

lindahua commented Mar 3, 2014

From all these discussion (together with my own experience), my take is that when taking a row of a matrix, people may desire a 1 x n shape matrix/row vector, while others may want a vector of length n. Perhaps, you may feel at home with some particular convention, but please bear in mind that there can be other people in other domains that might be more comfortable with other ways.

We really need an approach with which people can freely and easily express their intention, instead of forcing one way while leaving a lot of other people unhappy.

@timholy

This comment has been minimized.

Member

timholy commented Mar 3, 2014

To me it's beginning to sound like the best choice might be (1) to stop dropping trailing dimensions, and (2) implement either (a) getslice/setslice! (which does not have a briefer syntax) or (b) A[DropMe(2), :] (though I still like the idea of pizza :) ).

@MichaelHatherly

This comment has been minimized.

Member

MichaelHatherly commented Mar 3, 2014

What about using negative indices to denote dropped dimensions? Would that be a bad idea or just not possible? I haven't really thought it through very much yet since I only just found this thread.
What I basically mean is use negative integers to drop the corresponding positive indexed dimension, and retain the dimension when it is indexed using positive integers.
Here's an example:

A[2:4, -2, 3]

So the range behaviour 2:4 would remain the same. -2 (the - suggesting "removal") would drop that dimension and 3 would be retained.
Have I maybe missed something really important? Obviously if someone mistakenly indexed an array with a negative (in a loop perhaps) when they actually meant to use a positive it could give weird results.

Edit:
Another thought: would doing it this way cause problems with type-stability of indexing?

@ivarne

This comment has been minimized.

Member

ivarne commented Mar 3, 2014

@MichaelHatherly, that would be very confusing for people converting from python. They need a BoundsError(), and a suggestion to try end-x to index from back.

Just to throw out yet another syntax for creating single value ranges, that can work outside of indexing also:

colon(x,f::Function) = colon(x,x)
#And now we can use
a[frobozz + 2gazonk:&, 2]

We will probably want the implementation to be more specific, but the point is the syntax.

@nalimilan

This comment has been minimized.

Contributor

nalimilan commented Mar 3, 2014

@lindahua Fully agreed.

@MichaelHatherly

This comment has been minimized.

Member

MichaelHatherly commented Mar 3, 2014

@ivarne, your right. I think Ruby and Mathematica (maybe) also have that style of indexing, so it would be quite confusing. Probably not something prominent documentation would solve either since people would still try anyway.

@timholy

This comment has been minimized.

Member

timholy commented Mar 3, 2014

That's a clever solution, @ivarne.

@toivoh

This comment has been minimized.

Member

toivoh commented Mar 3, 2014

I also think that we should cater to both people/use cases who want to drop dimensions, and who dont.
I think that a DropMe or DontDropMe (such as Range1) type could work out ok, but I also suspect that having separate dropping and non-dropping operations would work better.

To me, the reason not to drop is that you want to preserve the dimensions that are already present, and then it doesn't make sense to drop some of them. It would seem to me that separate operations would also make code less verbose and error prone than mixing the two in []/getindex.
I also think that it could make the implementation considerably easier, which is not to be neglected consider what I have heard about the current complexity of the indexing code.

Of course, it could be very nice to go to a setting like named dimensions where dropping a dimensions just never makes sense, but for that case I think that we still need to figure out if matrix algebra could work in a sane way, and if such a scheme would add too much complexity to the basic Array type. (That would have to be supported by all AbstractArray descendants as well)

jiahao added a commit to jiahao/julia-paper-arrays that referenced this issue Apr 12, 2014

More edits
- change float environment of dispatch statistics table from figure to float
- move 'symbolic programming' subsection to be second to last ('implications' feels like a more natural ending)
- more precise subsection headings
- remove comments addressing points already made and rephrased
- tighten spacing of "(n-1)-array"
- less repetitive phrasing
- eliminate most parenthetical comments
- hyphenation
- cite issue JuliaLang/julia#5949

jiahao added a commit to jiahao/julia-paper-arrays that referenced this issue Apr 12, 2014

Update JuliaLang/julia#5949 footnote
- footnote comes after punctuation
- change tense since issue is closed
@jtravs

This comment has been minimized.

Contributor

jtravs commented Apr 16, 2014

This issue was brought to my attention by https://groups.google.com/d/topic/julia-users/aUahMSjJ-o0/discussion

My unsolicited opinion: not dropping singleton dimensions is the biggest wart in Julia from my POV. I often take slices from many dimensioned arrays and always choke on the extra syntax/effort required to drop the dimensions.

I like the idea of the APL rules - and would be willing to help @timholy fix the tests etc. as a vote of support!

@pwl

This comment has been minimized.

Contributor

pwl commented Apr 16, 2014

An example in favor of dropping singleton dimensions comes from solving partial differential equations, where it is common to use multidimensional array to represent function values on Cartesian mesh (of arbitrary dimension). In this context, array slices represent the slices of the Cartesian mesh. Say I would like to take a look at function values on a particular 2d section of a larger 3d mesh. Without dropping the singleton dimensions, dealing with such slice depends on which plane I choose and passing the slice to a function that would perform some operations on it (e.g. plot the slice) requires some kind of wrapping. I posted the example in the mailing list: https://groups.google.com/d/msg/julia-users/aUahMSjJ-o0/kle-wuGJVXgJ

In this case, dropping singular dimensions has a well defined geometric interpretation.

@timholy

This comment has been minimized.

Member

timholy commented Apr 16, 2014

Offers of help are wonderful. In general I think any transition needs to be made in conjunction with a transition to better ArrayViews, which is one of several reasons I haven't made any moves on this front (see below for another).

My current opinion, largely based on the earlier discussion, is that the change to dropping dimensions in expressions like A[3,:] is perhaps not worth the pain. My current favorite strategy is:

  • Provide getslice and setslice!, which follow full APL rules. So you would write getslice(A, 3, :) instead of A[3,:]. More verbose, I know, but that way the people who hate the idea of dropping aren't left out, and it's backwards-compatible.
  • Consider changing our policy on whether trailing dimensions are dropped with A[:,3]. People seem to find the "drop only trailing dimensions" a bit complex/arbitrary.

Finally, implementing APL rules will require #6437, and there's little point doing anything until that's resolved.

@pwl

This comment has been minimized.

Contributor

pwl commented Apr 16, 2014

Although I read the whole discussion I honestly still fail to see what is wrong with requiring the programmer to explicitly state the intention of getting the 2d array via A[3:3,:] instead of A[3,:] (which could be reserved for dropping). To me, the only real downside of this approach would be getting slices with elaborate indexes like A[(frobozz + 2gazonk):(frobozz + 2gazonk),:] as pointed by @GunnarFarneback. But then taking, for example, the diagonal element A[frobozz + 2gazonk,frobozz + 2gazonk] is also not very readable. I think that in such situations it is better to put the index in a temporary variable anyway like i=frobozz + 2gazonk and refer to A[i:i,:] or A[i,i] respectively.

@jtravs

This comment has been minimized.

Contributor

jtravs commented Apr 17, 2014

Another syntax suggestion, if issue #2403 gets implemented, then the call operator could be overloaded for Arrays for the slice syntax. Then A[:,i,j,:] can behave as now (maybe without dropping trailing singleton dimensions) and A(:,i,j,:) can perform the slicing or full APL version?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment