plane, row, col notation in all docstrings
(Don't remove the checklist below.)
@soupault I appreciate the effort to make our docs more consistent, but I am not a fan of the convention here. "M, N, P" was chosen to signify arbitrary letters in a sequence, including N, which is the standard letter for a count of elements. That P happens to correspond to planes is an unfortunate coincidence. I might suggest Np, Nr, Nc. However, I like the arbitrary letters because they generalise to N dimensions (not just 2 and 3). What is the
So, I do want to unify all of the docstrings, but I would suggest (M, N), (M, N, P), and (M, N, ..., P) (with optional 3 or Nch where appropriate) everywhere. What do you think?
@jni Thank you for the feedback! I knew this PR would cause a discussion, so let's get straight to it :).
Let's take a look together :)
Which convention looks more consistent to you? As it is, I like
In my opinion, when a-b order is used, the meaning of the different letters completely disappears. In the proposed approach,
Another point is that in
Regarding your other comments:
To sum up:
@soupault thanks for your awesome table! =)
There's a fourth column, which is a modified version of this PR:
I know you don't think this distinction is necessary, but for me it's helpful so I thought I'd bring it to the forefront for the discussion. =)
Looking at the first two columns, I did realise that I don't have a major preference, so if others vote for column 1, I will happily yield.
I agree with @jni, pertaining to the distinction between a rank-3 array intended to represent a 3D volume and a rank-3 array which represents channels or a collection of images with two spatial dimensions.
The ship has sailed and I realize why we opted to place z-axis in front of the other spatial axes for performance, but it remains unfortunate from an accessibility standpoint. Users understand row, column, (plane) but when you lead with plane, it is significantly less clear.
As far as the overall approach is concerned, consider that we have multiple years of conference tutorials and other instructional materials built around the row/col convention. Row/col is massively beneficial for users without a MATLAB background. I do think we should retain r/c for consistency; thus, I actually favor the overall 'N_something' approach. But instead of capitalizing (thus emphasizing) N, what if we reversed it? The first line in the table above then becomes [nP, ]nR, nC.
ZYX/PRC notation is also unfortunate from a geometric standpoint, especially pertaining to n-dimensional quasipolar coordinates (see section 2.2 for definition).
In this definition,
Placing planes before rows and columns disrupts this correspondence between the ordering of the dimensions and their quasipolar coordinate index. Besides that, it is just generally confusing and different from what most users are used to.
I don't know how relevant this is, but I just wanted to give my 2¢.
@kne42 you're right. That's what I was referencing. However, it turns out there are real performance gains when iterating over the last axis due to the way arrays are allocated in memory. There was a lengthy discussion on this matter in a past issue and performance won over accessibility; we've since moved to this convention. I dissented at the time, but bowed to the majority.
I also like having a distinction for non-spatial axes (for image collections, medical image timeseries, etc.), but I don't have any strong preference among the different columns in the table. The most important thing is that it is consistent across the library! Thanks @soupault for working on making that the case.
FWIW, the microscopy community has generally standardized on
One of the nice things about the numpy striding system is that one can generally choose whatever makes most sense for the memory layout pretty much independently of whatever makes most sense for the indexing convention (i.e. what axes are listed in what order in an index expression). A properly-constructed set of strides allows basically any index order to fit with any memory order.
I thought it was a shame that skimage chose the matlab/C-array style of indexing convention originally rather than taking advantage of the richness of numpy indexing, and I still do think it's a shame, but Josh is right.
Anyhow, in my lab's microscopy code, we use clever striding so we can manipulate numpy arrays with the index order everyone is used to. Whenever we have to hand the arrays over to something more opinionated like skimage (scipy.ndimage doesn't usually make such strong assumptions), we just make sure to slice and transpose or whatnot as needed. Since we generally use the same "correct for performance" memory layout underneath as skimage does (and basically everyone else, with t as the slowest-moving index, then z, y, x, and finally c -- though there is some disagreement about where c "ought" to go), these operations don't require actually shuffling any data around and so are basically zero-cost.
That seems to me to be overstating the community's ability to standardise. =P Just now I am working on a LIF (Leica Image Format) file with order (t, z, c, y, x).
Can you elaborate on this? Honestly, I would very much like it if we could do "the right thing" for any input array order. However, for many algorithms, doing the memory-optimal, axis-order-independent thing is harder than assuming C- or F-order. If I understand correctly, you'd have to pepper
At any rate, C-order is the NumPy default. If F-order had been the default, I would have advocated for xyzt. As mentioned in our array order section, there are significant performance penalties for using the memory-unoptimised ordering: usually about 5x, sometimes higher (~9x in the webpage example). This isn't peanuts.
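For a rough feel of the effect, here is a minimal pure-NumPy sketch (the shapes are made up for illustration; the exact gap depends on array size, dtype, and the operation, and compiled per-pixel loops show much larger ratios than Python-level slicing):

```python
import time
import numpy as np

a = np.random.rand(256, 256, 64)  # C-ordered by default: last axis is fastest in memory

t0 = time.perf_counter()
fast = sum(a[i].sum() for i in range(a.shape[0]))       # each slice is contiguous
t_fast = time.perf_counter() - t0

t0 = time.perf_counter()
slow = sum(a[..., k].sum() for k in range(a.shape[2]))  # each slice strides across memory
t_slow = time.perf_counter() - t0

# Both orders compute the same total; only the memory-traversal pattern differs.
print(np.isclose(fast, slow), t_fast, t_slow)
```

The point is not the exact ratio but that the penalty comes from traversal order relative to memory layout, not from the labels on the axes.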
I also object to calling scikit-image "opinionated" about this: as far as I know (do raise issues if not!) our code works perfectly fine whether you use prc or xyz convention in your arrays. Just pass in the right voxel spacing to the relevant functions and you'll be golden. You might just wait 10 times longer, depending on the algorithm.
Finally, regarding the tradeoff between performance and readability: I actually think prc/zyx is more readable, at least in a SciPy context. We've been conditioned for decades to think that
Anyway, that's enough rehashing of old arguments from me. I'm just hoping to convince the dissenters a little bit that they maybe might enjoy this convention, rather than grudgingly accept it. =P
@kne42 note that Cartesian coordinates typically have the origin on the bottom left, not top left like matrices, so the analogy between xy (Cartesian) and yx (Matrix) is not perfect. This is part of the motivation for my push to the row/column terminology (which by the way is extremely common in matrix literature anyway, more so than y/x).
@jni I'm actually totally fine with leading indexing with planes. My main point of discontent is with the conflation of
Thus, when they see
Of course, the whole deal with this confusion in the first place is due to the inconsistent naming conventions that resulted in the
@jni Hey there! I knew I shouldn't have opened my mouth without being better prepared :)
Anyhow, my main point is that there are (at least) three separate concepts that all too often get conflated when discussing array indexing, particularly in the context of images:
So, when you talk about the Leica Image Format, that's really a case of concept 1, while discussion about array ordering (as in the skimage docs) is a case of concept 2. But why do those docs talk about what order the axes are labeled in (concept 3) as if it has anything to do with performance?
As long as you iterate over an image in order of fastest to slowest moving index (in memory), you get good performance regardless of how the numpy strides tuple is configured. A lot of people (not you, @jni, but many who should know better) don't seem to get this. But arbitrarily strided arrays are one of the particular bits that make numpy really great to use!! Moreover, numpy's C API has various nd iterator tools, so that one can iterate over an arbitrary numpy array in a memory-efficient fashion.
And in particular, numpy striding lets users break free from the tyrannical connection between concepts 2 and 3. Using striding, you can let an array be laid out in a memory-optimal fashion while indexing it in whatever way is most appropriate to a given application. If an algorithm is smart about iteration (i.e. either using numpy's C iterators, or doing a bit of inspection on the stride array), then you can just pass your array to the algorithm and all is well. If the algorithm makes assumptions about memory layout matching the index ordering, then it's almost as easy: a quick transpose operation (or otherwise rearranging the axis order without actually touching the memory), and then you can pass off your numpy array to the algorithm.
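As a concrete sketch of that last point: an axis permutation in NumPy is a view, so "rearranging the axis order" before handing an array off costs essentially nothing (the toy shape below is made up for illustration):

```python
import numpy as np

img = np.zeros((480, 640), dtype=np.uint8)  # C-ordered scanline layout

# Transposing rearranges the strides but copies no pixel data.
img_t = img.T

print(img_t.base is img)            # True: same underlying memory
print(img.strides, img_t.strides)   # (640, 1) vs (1, 640)
print(img_t.shape)                  # (640, 480)
```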
So I think it's reasonable to state that many microscopy folks would generally like to logically index their data as
And if there are use-cases where it makes more sense to say
Here we are in 100% agreement. Array index ordering should be treated as authoritative. There should never be a distinction between "numpy coordinates" and "real coordinates". If you index
But I think it's silly to insist that if I have an image that's stored in the standard scanline order in memory (i.e. row by row) that it's only "appropriate" to index that chunk of memory as
And what if I have an RGB image in scanline order (e.g. rows of pixels as
So, I do wish that skimage had made more of an effort up front to be stride-agnostic and coordinate-convention-agnostic. I never run into these problems when using
However, at the end of the day, it hardly matters whether I call
@zpincus I'm trying to fully comprehend your argument, so pardon any misunderstanding on my part. There are two typical situations: Python code that operates on an image and expects the image to be indexable in a certain way (
I share your wish that we were more prescient ten years ago, but such is the challenge of living blinded by the present :)
@stefanv First off, I realize that this wasn't really the right forum for me to bring this up, so my apologies for derailing relevant discussion about this PR!
Nevertheless, I'll continue :)
My overall point is that I think the
Well put. I'd just note that these two situations often get conflated for some reason, with one being used to justify the other. But in numpy, these issues are much more separable than in C or Fortran or Matlab, where indexing is more closely tied to memory layout.
Regarding the first situation: I propose that it's a "code smell" for code that doesn't care about memory layout to have any expectations about how an array should be indexed. The main exceptions are (a) dealing with color/multichannel images, where the color dimension is semantically distinct in a way that a z or t axis is not; and (b) visualization tools.
But are there other justifiable cases where code "needs" to know what my axis labels are? I agree with @jni's point that functions should just directly index into numpy arrays. However, I disagree that this requires that all functions use a
One might think that code that needs to receive / provide geometric coordinates (such as image drawing) must commit to using
Regarding the second situation (where memory layout matters): I posit that it's rarely much more work for a function to iterate efficiently over an arbitrary strided array than to just iterate over a C-contiguous one. A few simple helper functions can make this pretty trivial, by e.g. making a view on the same memory in which the last index is guaranteed to be the fastest-moving in memory.
In particular, I think it's silly to insist on contiguity. Say I have a contiguous array, but then want to apply an algorithm to a slice of it, say
Given that cython is pretty smart about strided indexing, all that's required is a helper function to provide a view on any given array in which the axes are in slow-to-fast order. Then one merely needs to write cython where the iteration is nested in the same order (i.e. the current best practice).
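A hypothetical version of such a helper (not actual skimage code, and assuming all strides are positive) might look like this:

```python
import numpy as np

def slow_to_fast_view(a):
    """Return a view of `a` with its axes permuted so the strides decrease,
    making the last axis the fastest-moving in memory.

    Hypothetical helper for illustration; assumes all strides are positive.
    """
    order = np.argsort(a.strides)[::-1]  # largest stride (slowest axis) first
    return a.transpose(order)

f = np.asfortranarray(np.arange(12).reshape(3, 4))  # Fortran-ordered example
v = slow_to_fast_view(f)

print(v.flags['C_CONTIGUOUS'])  # True: now iterable row-by-row cache-efficiently
print(v.base is f)              # True: it's a view, no data was copied
```

Cython code iterating with the nesting matching the axis order of such a view would then touch memory in the optimal order regardless of the caller's layout.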
@zpincus I have not helped things by going off on my layout-efficiency tangent, but I think you have a misunderstanding of what xy/rc mean in the context of skimage. The point is not actually that we want to be prescriptive about axis order. (Actually, if you look above, I voted for the nomenclature with the least referential baggage, "M, N, P". Perhaps you want to cast your vote for that also?)
The point is that we
The xy/rc distinction is necessary, therefore, so that at least people who are using the library one way or another know which functions require them to swap axes around. Perhaps in the far future we can redo all our docs so that we stop referring to "rows and columns" and instead we go for "axis 0 and axis 1". But is that really much better?
@@            Coverage Diff            @@
##           master    #3031     +/-  ##
=========================================
- Coverage    85.9%    85.9%   -0.01%
=========================================
  Files         336      336
  Lines       27301    27300       -1
=========================================
- Hits        23454    23453       -1
  Misses       3847     3847
@jni I don't think there's really that much disagreement here. I also would prefer "M,N,P", or "axis 0...n" for minimal baggage, but I agree that failing that, the key thing is consistency. So the set of changes in this PR will be good on that front.
My main point in starting this conversation was to remind folks that numpy makes it pretty easy to index any particular chunk of memory in whatever way makes most sense for a given application. One can then change the indexing convention (without having to touch the actual data in memory) when handing off an array to other code that uses a different convention.
That said, I'm not sure I follow you here:
Could you give an example of what you mean by this? I do this sort of thing every day and don't seem to have the problems you suggest I ought to.
Let's say that I have a single-channel image stored in the standard scanline order in memory (i.e. the pixels are laid out as contiguous rows in memory). Let's further say that the image is 640 pixels wide and 480 tall, and the dtype is
Using numpy, I can construct a Fortran-ordered view on this memory of shape
Further, indexing this as
So far so good, right? It certainly doesn't seem like "the first coordinate is the rows" here...
Moreover, all remains well in the world when I construct a C-ordered view on that same memory. This view would have shape
Moreover, if I use
What if I make a new C-ordered array, of shape
The only thing that's a little odd is that when you print out a Fortran-ordered view of a scanline-ordered image in the terminal, you get the transpose of what you'd see in Photoshop. But for anything beyond toy 5-by-5 images, this is really not a big deal since you never actually do it.
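A small sketch of the two-views-on-one-buffer idea described above (dimensions and values are made up, and `xy`/`yx` are hypothetical names):

```python
import numpy as np

# One scanline-ordered buffer: 480 rows of 640 pixels, stored row by row.
buf = np.arange(480 * 640, dtype=np.uint16)

xy = buf.reshape(640, 480, order='F')  # Fortran-ordered view: index as [x, y]
yx = buf.reshape(480, 640, order='C')  # C-ordered view: index as [y, x]

# Same memory, two indexing conventions: the same pixel is reachable either way.
print(xy[10, 20] == yx[20, 10])            # True
print(xy.base is buf and yx.base is buf)   # True: both are views, no copies
```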
The key point is that the issue of "Fortran" or "C" ordering (or more complex striding schemes) is about the indexing convention, which in numpy is more or less separate from the memory layout (i.e. is the image in some kind of scanline order or not), which itself is reasonably separate from how the image ought to be viewed in Photoshop or matplotlib or whatever. (Check out the TIFF spec sometime, if you dare, to see how distinct all of these aspects of an image are.)
Now, obviously memory layout drives the cache-efficient traversal order, but how or why that should have anything to do with the indexing convention or visualization is beyond me.
Literally the only places where numpy "cares" about the ordering are array printing and the fact that C order is the default when creating new arrays. I do admit that the latter means that incautious users who want to use
So, given all of this, I really don't think I fully agree with your conclusion that
I'm pretty sure that the only functions for which this distinction really is necessary are visualization and image IO. Can you all think of other classes of functions that need to know whether axis 0 represents vertical or horizontal (or temporal, &c.) strips of pixels? Certainly rasterization / image-drawing functions don't need to know any of this. They just need to know what axes to draw along, not whether those axes are "rows" or "columns". (
As such, I do honestly believe that "axis 0...n" is the best notation except for visualization and IO. And in those cases, I think the best thing to do is for the functions to have a parameter that controls how axes are interpreted, rather than making a particular assumption.
Thank you for that amazing response.
You are, of course, right in your assessment that this whole confusion, my whole confusion, has to do with display of the arrays. That is, both Matplotlib and NumPy printing work in row/column order, and that had coloured my whole view of our coordinate systems. As you say, there are a few different issues mixed up here, and they were all jumbled in my head until your post. (They are probably still a bit jumbled, but much less so. ;)
So, let me clarify, in light of this new understanding, why some parts of skimage are still incorrect: in parts of the
Do you agree that this is bad?
That is the primary problem that I tried to address in moving to
Anyway, I think I'm going to put your vote down for "M, N, P". =P In my opinion, "N0, N1, N2" is too unwieldy for a docstring.
(Regarding scanline efficiency, that remains a problem with x/y, but you are right that it is a much more minor point.)
@jni No worries -- it took me years of strenuously thinking about this exact question to come to this viewpoint. It's far from obvious or intuitive... (The good news is that once one fully absorbs how numpy striding works for images, you can do amazing/absurd things. Like make an (x, y, 3, 3)-shaped view on an (x, y)-shaped image, where the sub-array [i, j, :, :] of the view is the 3x3 neighborhood centered on pixel [i, j] in the original image...)
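For the curious, that neighborhood trick can be sketched with NumPy's `sliding_window_view` (NumPy >= 1.20), which wraps exactly this kind of striding; the tiny 5x5 image here is just for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

img = np.arange(25).reshape(5, 5)

# windows[i, j] is the 3x3 block whose top-left corner is img[i, j],
# i.e. the neighborhood centred on img[i + 1, j + 1]. No data is copied;
# it's pure striding.
windows = sliding_window_view(img, (3, 3))

print(windows.shape)                                  # (3, 3, 3, 3)
print(np.array_equal(windows[1, 1], img[1:4, 1:4]))   # True
```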
Anyhow, yes, my vote would be for (M, N, P...) indexing. I agree that "n0, n1" etc is too unwieldy.
Further, I totally agree about your example being a clear instance of something that ought to be fixed: the library assumes that coordinates are specified in (x, y) tuples, but that the array will be indexed [y, x]. In contrast, this is where
My only objection to
For pedagogical purposes, it helps to have coordinates that make some intuitive sense. When we talk about rows, columns, depth, planes, etc. new users get it. If we tell them to think in terms of N-dimensional arrays and coordinates—well, that's all a bit overwhelming.
Is this benefit enough to justify talking about rows and columns? I don't know. The gap could perhaps be addressed effectively by a section in the user guide.
I suspect that we are currently making indexing/layout assumptions inside our Cython modules as well, so that's something to flag for future attention.
Anyway, these points are somewhat orthogonal to this PR—the convention introduced here seems perfectly fine.
I agree with Stefan; intuitive coordinates may be pedagogically helpful, so long as it's clearly stated, for advanced users and (more importantly) devs, that they should not rely on axes "meaning" anything in particular.
I also agree that this is more or less orthogonal to this PR, as well.
Just to add one more view: In deep learning, the conventions are:
@soupault by my reading of the above discussion, (M, N, P) is the least contentious. Though I am of course a biased reader... ;) (Incidentally, the "symbolic dimension" section of the XND ndtypes docs fit this scheme: https://ndtypes.readthedocs.io/en/latest/ndtypes/types.html#symbolic-dimension)
I think you mean (time, plane, row, col, channel), first of all. =P But second, I'm not in love with it. @zpincus made a
Sounds too ambiguous for me, but I'm happy with the points @zpincus brought to the table.