Commit

Merge pull request #177 from asmeurer/docs-fixes
More docs fixes
asmeurer authored May 20, 2024
2 parents d6c87eb + aa8b264 commit 0ba06d3
Showing 7 changed files with 68 additions and 67 deletions.
2 changes: 1 addition & 1 deletion docs/indexing-guide/index.md
@@ -39,7 +39,7 @@ for each of the remaining index types, the basic indices:
[ellipses](multidimensional-indices/ellipses.md), and
[newaxis](multidimensional-indices/newaxis.md); and the advanced indices:
[integer arrays](multidimensional-indices/integer-arrays.md) and [boolean
arrays](multidimensional-indices/boolean-arrays.md).
arrays](multidimensional-indices/boolean-arrays.md) (i.e., masks).

Finally, a page on [other topics relevant to indexing](other-topics.md) covers
a set of miscellaneous topics about NumPy arrays that are useful for
54 changes: 30 additions & 24 deletions docs/indexing-guide/multidimensional-indices/boolean-arrays.md
@@ -156,7 +156,7 @@ It's important to not be fooled by this way of constructing a mask. Even
though the *expression* `(a > 0) & (a % 2 == 1)` depends on `a`, the resulting
*array itself* does not---it is just an array of booleans. **Boolean array
indexing, as with [all other types of indexing](../intro.md), does not depend
on the values of the array, only in the positions of its elements.**
on the values of the array, only on the positions of its elements.**
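
As a minimal sketch of this point (the arrays `a` and `b` below are made up for illustration and are not taken from the guide), a mask built from one array can be applied to a different array of the same shape:

```python
import numpy as np

a = np.array([-3, 1, 4, 7, -2, 5])

# The mask is just an array of booleans. Once it is built, it no longer
# refers to `a` in any way.
mask = (a > 0) & (a % 2 == 1)

# A different (hypothetical) array with the same shape as the mask.
b = np.array([10, 20, 30, 40, 50, 60])

selected_from_a = a[mask]  # the positive odd values of a
selected_from_b = b[mask]  # the elements of b at those same positions

print(mask)
print(selected_from_a)
print(selected_from_b)
```

The second selection only uses the positions where `mask` is `True`; the values stored in `b` play no role in deciding what is selected.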

This distinction might feel overly pedantic, but it matters once you realize
that a mask created with one array can be used on another array, so long as it
@@ -186,10 +186,10 @@ both. -->
>>> plt.scatter(x, y, marker=',', s=1)
<matplotlib.collections.PathCollection object at ...>
If we want to show only those x values that are positive, we could easily do
this by modifying the ``linspace`` call that created ``x``. But what if we
want to show only those ``y`` values that are positive? The only way to do
this is to select them using a mask:
If we want to show only those :math:`x` values that are positive, we could
easily do this by modifying the ``linspace`` call that created ``x``. But what
if we want to show only those :math:`y` values that are positive? The only way
to do this is to select them using a mask:
.. plot::
:context: close-figs
@@ -359,19 +359,21 @@ Masking a subset of dimension is not as common as masking the entire array
"array of subarrays". For instance, suppose we have a video with 1920 x 1080
pixels and 500 frames. This might be represented as an array of shape `(500,
1080, 1920, 3)`, where the final dimension, 3, represents the 3 RGB color
values of a pixel. We can think of this array as 500 `(1080, 1920, 3)`
"frames". Or as 500 x 1080 x 1920 3-tuple "pixels". Or we could slice along
the last dimension and think of it as 3 `(500, 1080, 1920)` video "channels",
one for each primary color.
values of a pixel. We can think of this array as 500 different 1080 &times;
1920 &times; 3 "frames". Or as a 500 &times; 1080 &times; 1920 array of
3-tuple "pixels". Or we could slice along the last dimension and think of it
as three 500 &times; 1080 &times; 1920 video "channels", one for each primary
color.
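
The three ways of viewing the video array can be sketched with ordinary indexing. This example uses a much smaller stand-in shape, `(5, 4, 6, 3)`, to keep memory use trivial; the variable names are illustrative only:

```python
import numpy as np

# A tiny stand-in for the video: 5 frames of 4 x 6 pixels, 3 color values each.
video = np.zeros((5, 4, 6, 3))

frame = video[0]        # one "frame": all pixels and channels of frame 0
pixel = video[0, 0, 0]  # one "pixel": the 3 color values at a single position
red = video[..., 0]     # one "channel": the first color value of every pixel

print(frame.shape, pixel.shape, red.shape)
```

In each case the indexed dimensions are the "stacking" dimensions and the dimensions left over form the subarray.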

In each case, we imagine that our array is really an array (or a stack or
batch) of subarrays, where some of our dimensions are the "stacking"
dimensions and some of them are the array dimensions. This way of thinking is
also common when doing linear algebra on arrays. The last two dimensions
(typically) are considered matrices, and the leading dimensions are batch
dimensions. An array of shape `(10, 5, 4)` might be thought of as ten 5 x 4
matrices. NumPy linear algebra functions like `solve` and the `@` matmul
operator will automatically operate on the last two dimensions of an array.
dimensions. An array of shape `(10, 5, 4)` might be thought of as ten 5
&times; 4 matrices. NumPy linear algebra functions like `solve` and the `@`
matmul operator will automatically operate on the last two dimensions of an
array.
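
A small sketch of this batching behavior, using randomly generated matrices (the names `stack`, `m`, `A`, and `b` are assumptions made for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten 5 x 4 matrices, stacked along the first (batch) dimension.
stack = rng.random((10, 5, 4))
m = rng.random((4, 3))

# `@` multiplies each 5 x 4 matrix in the stack by the 4 x 3 matrix.
product = stack @ m

# Ten 4 x 4 systems. Adding 4 to the diagonal makes every matrix strictly
# diagonally dominant, so each one is guaranteed to be invertible.
A = rng.random((10, 4, 4)) + 4 * np.eye(4)
b = rng.random((10, 4, 1))

# `solve` solves all ten systems at once, one per leading index.
x = np.linalg.solve(A, b)

print(product.shape, x.shape)
```

Here the leading dimension of size 10 is carried through untouched, while the last two dimensions are treated as the matrices.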

So, how does this relate to using a boolean array index to select only a
subset of the array dimensions? Well, we might want to use a boolean index to
@@ -437,18 +439,22 @@ saturation of only those pixels:
>>> hsv_image[high_sat_mask, 1] = np.clip(hsv_image[high_sat_mask, 1] + 0.3, 0, 1)
>>> # Convert back to RGB
>>> enhanced_color_image = color.hsv2rgb(hsv_image)
>>> imshow(enhanced_color_image, "Saturated Image")
>>> imshow(enhanced_color_image, "Saturated Image (Better)")
```

Here, `hsv_image.shape` is `(512, 512, 3)`, so our mask `hsv_image[:, :, 1] >
0.6` has shape `(512, 512)`, i.e., the shape of the first two dimensions. In
other words, the mask has one value for each pixel, either `True` if the
saturation is `> 0.6` or `False` if it isn't. To add 0.3 to only those pixels
above the threshold, we mask the original array with `hsv_image[high_sat_mask,
1]`. The `high_sat_mask` part of the index selects only those pixel values
that have high saturation, and the `1` in the final dimension selects the
saturation channel for those pixels.
0.6`[^high_sat_mask-footnote] has shape `(512, 512)`, i.e., the shape of the
first two dimensions. In other words, the mask has one value for each pixel,
either `True` if the saturation is `> 0.6` or `False` if it isn't. To add
`0.3` saturation to only those pixels above the threshold, we mask the
original array with `hsv_image[high_sat_mask, 1]`. The `high_sat_mask` part of
the index selects only those pixel values that have high saturation, and the
`1` in the final dimension selects the saturation channel for those pixels.
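
The same indexing pattern can be tried without any image libraries. The sketch below assumes a tiny random array standing in for `hsv_image`; it is not the image used in the guide:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4 x 5 "image" with 3 channels (hue, saturation, value).
hsv = rng.random((4, 5, 3))
before = hsv.copy()

# One boolean per pixel: the mask has the shape of the first two dimensions.
high_sat_mask = hsv[:, :, 1] > 0.6

# The mask picks the pixels and the trailing 1 picks the saturation channel.
hsv[high_sat_mask, 1] = np.clip(hsv[high_sat_mask, 1] + 0.3, 0, 1)

print(high_sat_mask.shape)
print(hsv[high_sat_mask, 1].shape)
```

Only the saturation values of the masked pixels change. The hue and value channels, and every pixel outside the mask, are left as they were.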

[^high_sat_mask-footnote]: We could have also written `(hsv_image > 0.6)[:, :,
1]`, although this would be less efficient because it would unnecessarily
compute `> 0.6` for the hue and value channels.

(nonzero-equivalence)=
### `nonzero()` Equivalence
@@ -674,9 +680,9 @@ Or if it had no actual `0`s:[^0-d-mask-footnote]
array([1, 1, 2])
```

But even if `a` is a 0-D array, i.e., a single scalar value, we would expect
this sort of thing to still work, since, as we said, `a[a == 0] = -1` should
work for *any* array. And indeed, it does:
But even if `a` is a 0-D array, i.e., a single scalar value, we would still
expect this sort of thing to work, since, as we said, `a[a == 0] = -1`
should work for *any* array. And indeed, it does:

```py
>>> a = np.asarray(0)
@@ -714,7 +720,7 @@ array([], dtype=int64)
```

In this case, `a[a == 0] = -1` would assign `-1` to all the values in `a[a
== 0]`, which would be no values, so `a` would remain unchanged:
== 0]`, i.e., no values, so `a` would remain unchanged:

```py
>>> a[a == 0] = -1
46 changes: 17 additions & 29 deletions docs/indexing-guide/multidimensional-indices/integer-arrays.md
@@ -108,26 +108,12 @@ For example:
```

In particular, even when the index array `idx` has more than one dimension, an
integer array index still only selects elements from a single axis of `a`.

```
>>> a = np.array([[100, 101, 102],
... [103, 104, 105]])
>>> idx = np.array([0, 0, 1])
>>> a[idx] # Index the first dimension
array([[100, 101, 102],
[100, 101, 102],
[103, 104, 105]])
>>> a[:, idx] # Index the second dimension
array([[100, 100, 101],
[103, 103, 104]])
```

It would appear that this limits the ability to arbitrarily shuffle elements
of `a` using integer indexing. For instance, suppose we want to create the
array `[105, 100]` from the above 2-D `a`. Based on the above examples, it
might not seem possible. The elements `105` and `100` are not in the same row
or column of `a`.
integer array index still only selects elements from a single axis of `a`. It
would appear that this limits the ability to arbitrarily shuffle elements of
`a` using integer indexing. For instance, suppose we want to create the array
`[105, 100]` from the above 2-D `a`. Based on the above examples, it might not
seem possible, since the elements `105` and `100` are not in the same row or
column of `a`.

However, this is doable by providing multiple integer array
indices:
@@ -136,11 +122,12 @@ indices:
> **When multiple integer array indices are provided, the elements of each
> index are selected correspondingly for that axis.**
It's perhaps most illustrative to
show this as an example. Given the above `a`, we can produce the array `[105,
100]` using.
It's perhaps most illustrative to show this as an example. Given the above
`a`, we can produce the array `[105, 100]` using:

```
>>> a = np.array([[100, 101, 102],
... [103, 104, 105]])
>>> idx = (np.array([1, 0]), np.array([2, 0]))
>>> a[idx]
array([105, 100])
@@ -415,9 +402,9 @@ array([105, 100])
However, you might have noticed that this behavior is somewhat unusual
compared to other index types. For all other index types we've discussed so
far, such as [slices](../slices.md) and [integer indices](../integer-indices.md),
each index applies "independently" along each dimension. For example, `x[0:3,
0:2]` applies the slice `0:3` to the first dimension of `x` and `0:2` to the
second dimension. The resulting array has `3*2 = 6` elements, because there
each index applies "independently" along each dimension. For example, `x[0:2,
0:3]` applies the slice `0:2` to the first dimension of `x` and `0:3` to the
second dimension. The resulting array has `2*3 = 6` elements, because there
are 2 subarrays selected from the first dimension with 3 elements each. But in
the above example, `a[[1, 0], [2, 0]]` only has 2 elements, not 4. And
something like `a[[1, 0], [2, 0, 1]]` is an error.
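
The contrast can be checked directly. This sketch reuses the 2-D `a` from the earlier example; the variable names are illustrative:

```python
import numpy as np

a = np.array([[100, 101, 102],
              [103, 104, 105]])

# Slices apply independently per axis: 2 rows times 3 columns.
sliced = a[0:2, 0:3]

# Integer arrays are paired up element by element: (1, 2) and (0, 0).
paired = a[[1, 0], [2, 0]]

# Index arrays whose shapes cannot be matched up raise an IndexError.
try:
    a[[1, 0], [2, 0, 1]]
    mismatch_raised = False
except IndexError:
    mismatch_raised = True

print(sliced.size, paired, mismatch_raised)
```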
@@ -548,14 +535,15 @@ Conversely, a slice like `2:9` is equivalent to the outer index `[2, 3,

[^slice-outer-index-footnote]: They aren't actually equivalent, because [a
slice creates a view and an integer array index creates a
copy](views-vs-copies). If your index can be represented as a slice, it's
better to use an actual `slice`.
copy](views-vs-copies), not to mention the fact that slices
[clip](clipping) and integer arrays have bounds checks. If your index can
be represented as a slice, it's usually better to use an actual `slice`.
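
Both differences mentioned in this footnote can be observed on a small array. The following is a sketch with a made-up 1-D array, not an example from the guide:

```python
import numpy as np

a = np.arange(10)

by_slice = a[2:9]
by_array = a[[2, 3, 4, 5, 6, 7, 8]]

# Both select the same values...
same_values = np.array_equal(by_slice, by_array)

# ...but only the slice shares memory with `a` (it is a view).
slice_is_view = np.shares_memory(a, by_slice)
array_is_view = np.shares_memory(a, by_array)

# A slice that runs past the end is clipped to the size of the array.
clipped = a[2:100]

# An out-of-bounds integer in an integer array index is an error.
try:
    a[[2, 100]]
    out_of_bounds_raised = False
except IndexError:
    out_of_bounds_raised = True

print(same_values, slice_is_view, array_is_view)
print(clipped.shape, out_of_bounds_raised)
```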

### Assigning to an Integer Array Index

As with all index types discussed in this guide, an integer array index can be
used on the left-hand side of an assignment. This is useful because it allows
you to surgically inject new elements into your array.
you to surgically inject new elements into existing positions in your array.

```py
>>> a = np.array([100, 101, 102, 103]) # as above
10 changes: 6 additions & 4 deletions docs/indexing-guide/multidimensional-indices/newaxis.md
@@ -102,23 +102,24 @@ array([[[0],
[7]]])
```

Let's look at each of these more closely:

- `a[np.newaxis, 0, :2]`: the new axis is inserted before the first axis, but
1. `a[np.newaxis, 0, :2]`: the new axis is inserted before the first axis, but
the `0` and `:2` still index the original first and second axes. The resulting
shape is `(1, 2, 4)`.

- `a[0, np.newaxis, :2]`: the new axis is inserted after the first axis, but
2. `a[0, np.newaxis, :2]`: the new axis is inserted after the first axis, but
because the `0` removes this axis when it indexes it, the resulting shape is
still `(1, 2, 4)` (and the resulting array is the same).

- `a[0, :2, np.newaxis]`: the new axis is inserted after the second axis,
3. `a[0, :2, np.newaxis]`: the new axis is inserted after the second axis,
because the `newaxis` comes right after the `:2`, which indexes the second
axis. The resulting shape is `(2, 1, 4)`. Remember that the `4` in the shape
corresponds to the last axis, which isn't represented in the index at all.
That's why in this example, the `4` still comes at the end of the resulting
shape.

- `a[0, :2, ..., np.newaxis]`: the `newaxis` is after an ellipsis, so the new
4. `a[0, :2, ..., np.newaxis]`: the `newaxis` is after an ellipsis, so the new
axis is inserted at the end of the shape. The resulting shape is `(2, 4, 1)`.
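
The four shapes listed above can be verified with a short script. It assumes `a` has shape `(3, 2, 4)`, which is consistent with the shapes quoted in the list but is not shown in this excerpt:

```python
import numpy as np

# Assumed definition of `a`; any array of shape (3, 2, 4) gives the same shapes.
a = np.arange(24).reshape((3, 2, 4))

shape_1 = a[np.newaxis, 0, :2].shape
shape_2 = a[0, np.newaxis, :2].shape
shape_3 = a[0, :2, np.newaxis].shape
shape_4 = a[0, :2, ..., np.newaxis].shape

# The first two indices also produce identical arrays.
first_two_equal = np.array_equal(a[np.newaxis, 0, :2], a[0, np.newaxis, :2])

print(shape_1, shape_2, shape_3, shape_4)
```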

In general, in a tuple index, the axis that each index selects corresponds to
@@ -155,6 +156,7 @@ In summary,
non-`newaxis` indices in the tuple index are indexed as if the `newaxis`
indices were not there.**

(where-newaxis-is-used)=
## Where `newaxis` is Used

What we haven't said yet is why you would want to do such a thing in the first
8 changes: 4 additions & 4 deletions docs/indexing-guide/multidimensional-indices/tuples.md
@@ -199,9 +199,9 @@ every index type as a single element tuple index. An integer index `0` is
`a[0:3,]`. This is a good way to think about indices because it will help you
remember that non-tuple indices operate as if they were the first element of a
single-element tuple index, namely, they operate on the first axis of the
array. Remember, however, that this does not apply to Python built-in types;
for example, `l[0,]` and `l[0:3,]` will both produce errors if `l` is a
`list`, `tuple`, or `str`.
array. Remember, however, that this does not apply to Python built-in types:
`l[0,]` and `l[0:3,]` will both produce errors if `l` is a `list`, `tuple`, or
`str`.
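
A quick way to see the difference is to try the same single-element tuple indices on a NumPy array and on the built-in sequence types. This is a sketch with made-up values; `seq` plays the role of `l` above:

```python
import numpy as np

a = np.array([10, 20, 30, 40])

# On a NumPy array, a one-element tuple index acts on the first axis,
# exactly like the bare index.
same_integer = bool(a[0,] == a[0])
same_slice = np.array_equal(a[0:3,], a[0:3])

# Built-in sequences reject tuple indices with a TypeError.
failures = []
for seq in ([10, 20, 30, 40], (10, 20, 30, 40), "abcd"):
    for index in ((0,), (slice(0, 3),)):  # same as seq[0,] and seq[0:3,]
        try:
            seq[index]
        except TypeError:
            failures.append(type(seq).__name__)

print(same_integer, same_slice, failures)
```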

Up to now, we looked at the tuple index `(1, 0, 2)`, which selected a single
element. And we considered sub-tuples of this, `(1,)` and `(1, 0)`, which
@@ -355,7 +355,7 @@ argument to retain the dimension as a size-1 dimension instead.
array.

There are two final facts about tuple indices that should be noted before we
move on to the other basic index types. First, as we noticed above,
move on to the other basic index types. First, as we saw above,

> **if a tuple index has more elements than there are dimensions in an array,
it raises an `IndexError`.**
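
For instance, with a hypothetical 2-D array, a tuple index with two elements works, while one with three elements fails:

```python
import numpy as np

a = np.arange(6).reshape((2, 3))  # two dimensions

element = a[1, 2]  # as many indices as dimensions: fine

try:
    a[1, 2, 0]  # one index too many
    too_many_raised = False
except IndexError:
    too_many_raised = True

print(element, too_many_raised)
```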
5 changes: 5 additions & 0 deletions docs/indexing-guide/other-topics.md
@@ -109,6 +109,11 @@ It can be useful to think of broadcasting as repeating "stacks" of smaller
arrays in this way. The size `1` dimension rule allows these "stacks" to be
along any dimensions of the array, not just the last ones.

When it comes to indexing, one of the most useful types of index for use with
broadcasting is [newaxis](./multidimensional-indices/newaxis.md), which lets
you easily insert size `1` dimensions into an array to make it broadcastable
in a specific way. See [](where-newaxis-is-used).
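
As one illustration of this (the arrays `x` and `y` are invented for the example), `newaxis` can turn two 1-D arrays into shapes that broadcast to a full 2-D table of products:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])

# x becomes shape (3, 1) and y becomes shape (1, 2); together they
# broadcast to shape (3, 2).
table = x[:, np.newaxis] * y[np.newaxis, :]

print(table.shape)
print(table)
```

Without the inserted size `1` dimensions, `x * y` would fail, because shapes `(3,)` and `(2,)` are not broadcast compatible.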

See the [NumPy
documentation](https://numpy.org/doc/stable/user/basics.broadcasting.html) for
more examples of broadcasting.
10 changes: 5 additions & 5 deletions docs/indexing-guide/slices.md
@@ -1299,7 +1299,7 @@ Something like the following would work:
```py
>>> mid = len(a)//2
>>> n = 4
>>> a[mid - n//2: mid + n//2]
>>> a[mid - n//2:mid + n//2]
['b', 'c', 'd', 'e']
```

@@ -1311,7 +1311,7 @@ However, let's look at what happens when `n` is larger than the size of `a`:

```py
>>> n = 8
>>> a[mid - n//2: mid + n//2]
>>> a[mid - n//2:mid + n//2]
['g']
```

@@ -1346,10 +1346,10 @@ manually clip with `max(mid - n//2, 0)`:

```py
>>> n = 4
>>> a[max(mid - n//2, 0): mid + n//2]
>>> a[max(mid - n//2, 0):mid + n//2]
['b', 'c', 'd', 'e']
>>> n = 8
>>> a[max(mid - n//2, 0): mid + n//2]
>>> a[max(mid - n//2, 0):mid + n//2]
['a', 'b', 'c', 'd', 'e', 'f', 'g']
```

@@ -1457,7 +1457,7 @@ the *maximum* length of a slice. If the shape or length of the input is known,
{meth}`len(ndindex.Slice(...).reduce(shape)) <ndindex.Slice.reduce>` will
compute the true length of the slice. Of course, if you already have a list or
a NumPy array, you can just slice it and check the shape. Slicing a NumPy
array always produces a [view on the array](views-vs-copies), so it is a very
array always produces a [view on the array](views-vs-copies), so it is an
inexpensive operation. Slicing a `list` does make a copy, but it's a shallow
copy so it isn't particularly expensive either.
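
The difference between a view and a shallow copy can be sketched as follows (all names here are illustrative):

```python
import numpy as np

# Slicing a NumPy array gives a view: no data is copied, and writes
# through the view are visible in the original array.
arr = np.arange(10)
arr_slice = arr[2:5]
arr_slice[0] = -1
view_shares_memory = np.shares_memory(arr, arr_slice)

# Slicing a list makes a new list, but the elements themselves are not
# copied: the new list refers to the same objects.
inner = [1, 2]
lst = [inner, [3, 4], [5, 6]]
lst_slice = lst[0:2]
same_element_object = lst_slice[0] is inner

# Rebinding an entry of the sliced list does not affect the original list.
lst_slice[0] = None
original_unchanged = lst[0] is inner

print(arr[2], view_shares_memory, same_element_object, original_unchanged)
```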

