-
Notifications
You must be signed in to change notification settings - Fork 10
/
other-topics.md
669 lines (567 loc) · 23.5 KB
/
other-topics.md
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
# Other Topics Relevant to Indexing
There is a great deal of functionality in NumPy, and most of it is
out-of-scope for this guide. However, a few additional features are useful to
understand in order to fully utilize indexing to its fullest potential, or as
important caveats to be aware of when doing indexing operations.
(broadcasting)=
## Broadcasting
Broadcasting is a powerful abstraction that applies to all operations in
NumPy. It allows arrays with mismatched shapes to be combined together as if
one or more of their dimensions were simply repeated the appropriate number of
times.
Normally, when we perform an operation on two arrays with the same shape, it
does what we'd expect, i.e., the operation is applied to each corresponding
element of each array. For example, if `x` and `y` are both `(2, 2)` arrays,
`x + y` results in a `(2, 2)` array with the sum of the corresponding
elements.
```py
>>> import numpy as np
>>> a = np.array([[1, 2],
... [3, 4]])
>>> b = np.array([[101, 102],
... [103, 104]])
>>> a + b
array([[102, 104],
[106, 108]])
```
However, you may have noticed that `x` and `y` don't always have to be two
arrays with the same shape. For example, you can add a single scalar element
to an array, and it will add it to each element.
```py
>>> a + 1
array([[2, 3],
[4, 5]])
```
In the above example, we can think of the scalar `1` as a shape `()` array,
whereas `x` is a shape `(2, 2)` array. Thus, `x` and `1` do not have the same
shape, but `x + 1` is allowed via repeating `1` across every element of `x`.
This means NumPy is taking `1` and treating it as if it were the shape `(2,
2)` array `[[1, 1], [1, 1]]`.
Broadcasting is a generalization of this behavior. Specifically, instead of
repeating just a single number into an array, we can repeat just some
dimensions of an array into a bigger array. For example, here we multiply `x`,
a shape `(3, 2)` array, with `y`, a shape `(2,)` array. `y` is virtually
repeated into a shape `(3, 2)` array with each element of the last dimension
repeated 3 times.
```py
>>> x = np.array([[1, 2],
... [3, 4],
... [5, 6]])
>>> x.shape
(3, 2)
>>> y = np.array([0, 2])
>>> x*y
array([[ 0, 4],
[ 0, 8],
[ 0, 12]])
```
We can see how broadcasting works using `np.broadcast_to`
```py
>>> np.broadcast_to(y, x.shape)
array([[0, 2],
[0, 2],
[0, 2]])
```
This is what the array `y` looks like before it is combined with `x`. Although
note that broadcasting is implemented efficiently, so that the "repeated"
elements actually refer to the same objects in memory. See [](views-vs-copies)
and [](strides) below.
Broadcasting always happens automatically in NumPy whenever two arrays with
different shapes are combined using any function or operator, assuming those
shapes are broadcast compatible. The rule for broadcast compatibility is that
the shorter of the shapes are prepended with length 1 dimensions so that they
have the same number of dimensions. Then any dimensions that are size `1` in a
shape are replaced with the corresponding size in the other shape. The other
non-`1` sizes must be equal or broadcasting is not allowed.
In the above example, we broadcast `(3, 2)` with `(2,)` by first extending
`(2,)` to `(1, 2)` then broadcasting the size `1` dimension to the
corresponding size in the other shape, `3`, giving a broadcasted shape of `(3,
2)`. In more advanced examples, both shapes may have broadcasted dimensions.
For instance, `(3, 1)` can broadcast with `(2,)` to create `(3, 2)`. The first
shape repeats the first axis 2 times along the second axis, while the second
shape inserts a new axis at the beginning that repeats 3 times.
We can think of `(3, 2)` as a "stack" of 3 shape `(2,)` arrays. Just as the
scalar `1` got repeated to match the full shape of `a` above, the shape `(2,)`
array `y` gets repeated into a `(3,)` "stack" so it matches the shape of `x`.
Additionally, size `1` dimensions are a signal to NumPy that that dimension is
allowed to be repeated, or broadcasted, when used with another array where
the other dimensions match.
It can be useful to think of broadcasting as repeating "stacks" of smaller
arrays in this way. The automatic prepending rule lets you automatically treat
array "stacking" along the last dimension, which is the most common, but the
size `1` dimension rule allows these stacks to be along any dimensions of
the array, not just the last ones.
When it comes to indexing, one of the most useful types of index for use with
broadcasting is [newaxis](./multidimensional-indices/newaxis.md), which lets
you easily insert size `1` dimensions into an array to make the broadcastable
in a specific way. See [](where-newaxis-is-used).
See the [NumPy
documentation](https://numpy.org/doc/stable/user/basics.broadcasting.html) for
more examples of broadcasting.
(views-vs-copies)=
## Views vs. Copies
There is an important distinction between basic indices (i.e.,
[integers](integer-indices.md), [slices](slices.md),
[ellipses](multidimensional-indices/ellipses.md),
[newaxis](multidimensional-indices/newaxis.md)) and [advanced
indices](advanced-indices) (i.e., [integer array
indices](multidimensional-indices/integer-arrays.md) and [boolean array
indices](multidimensional-indices/boolean-arrays.md)) in NumPy that must be
noted in some situations. Specifically, basic indices always create a **view**
into an array[^view-scalar-footnote], while advanced indices always create a
**copy** of the underlying array. [Tuple](multidimensional-indices/tuples.md)
indices (i.e., multidimensional indices) will create a view if they do not
contain an advanced index and a copy if they do.
[^view-scalar-footnote]:
<!-- This is the only way to cross reference a footnote across documents -->
(view-scalar-footnote-ref)=
There is one exception to this rule, which is that an
index that would return a scalar returns a copy, since scalars are
supposed to be immutable.
```py
>>> a = np.arange(20)
>>> a[0] # doctest: +SKIPNP1
np.int64(0)
>>> print(a[0].base)
None
>>> a[0, ...]
array(0)
>>> a[0, ...].base is a
True
```
A **view** is a special type of array whose data points to another array. This
means that if you mutate the data in one array, the other array will also have
that data mutated as well. For example,
```py
>>> a = np.arange(24).reshape((3, 2, 4))
>>> b = a[:, 0] # b is a view of a
>>> a
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
<BLANKLINE>
[[ 8, 9, 10, 11],
[12, 13, 14, 15]],
<BLANKLINE>
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
>>> b[:] = 0 # Mutating b also mutates a
>>> a
array([[[ 0, 0, 0, 0],
[ 4, 5, 6, 7]],
<BLANKLINE>
[[ 0, 0, 0, 0],
[12, 13, 14, 15]],
<BLANKLINE>
[[ 0, 0, 0, 0],
[20, 21, 22, 23]]])
```
Mutating `b` also changed `a`, because both arrays point to the same memory.
Note that this behavior is exactly the opposite of the way Python lists work.
With Python lists, `a[:]` is a shorthand to copy `a`. But with NumPy, `a[:]`
creates a view into `a` (to copy an array with NumPy, use `a.copy()`). Python
lists do not have a notion of views.
```py
>>> a = [1, 2, 3] # list
>>> b = a[:] # a copy of a
>>> b[0] = 0 # Modifies b but not a
>>> b
[0, 2, 3]
>>> a
[1, 2, 3]
>>> a = np.array([1, 2, 3]) # NumPy array
>>> b = a[:] # A view of a
>>> b[0] = 0 # Modifies both b and a
>>> b
array([0, 2, 3])
>>> a
array([0, 2, 3])
>>> c = a.copy() # A copy of a
>>> c[0] = -1 # Only modifies c
>>> c
array([-1, 2, 3])
>>> a
array([0, 2, 3])
```
Views don't just come from indexing. For instance, when you reshape an array,
that also creates a view.
```py
>>> a = np.arange(24)
>>> b = a.reshape((3, 2, 4))
>>> b[0] = 0
>>> b
array([[[ 0, 0, 0, 0],
[ 0, 0, 0, 0]],
<BLANKLINE>
[[ 8, 9, 10, 11],
[12, 13, 14, 15]],
<BLANKLINE>
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
>>> a
array([ 0, 0, 0, 0, 0, 0, 0, 0, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23])
```
Many other operations also create views, for example
[`np.transpose`](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html),
[`a.T`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.T.html),
[`np.ravel`](https://numpy.org/doc/stable/reference/generated/numpy.ravel.html),
[`broadcast`](https://numpy.org/doc/stable/reference/generated/numpy.broadcast.html),
and
[`a.view`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.view.html).[^view-functions-footnote]
[^view-functions-footnote]: Some of these functions will sometimes return a
copy because returning a view is not possible. For example, it is not
always possible to represent a reshape as a [strides](strides)
manipulation if the strides are already non-contiguous.
To check if an array is a view, check `a.base`. It will be `None` if it is not
a view and will point to the base array otherwise. A view of a view will have
the same base as the original array.
```py
>>> a = np.arange(24)
>>> b = a[:]
>>> print(a.base) # a is not a view
None
>>> b.base is a # b is a view of a
True
>>> c = b[::2]
>>> c.base is a # c is a further view of a
True
```
In contrast, an advanced index will always create a copy (even if it would be
possible to represent it as a view). This includes any [tuple
index](multidimensional-indices/tuples.md) (i.e., multidimensional index) that
contains at least one array index.
```py
>>> a = np.arange(10)
>>> b = a[::4]
>>> c = a[[0, 4, 8]]
>>> b
array([0, 4, 8])
>>> c
array([0, 4, 8])
>>> b.base is a # b is a view of a
True
>>> print(c.base) # c is not (it is a copy)
None
```
Whether an array is a view or a copy matters for two reasons:
1. If you ever mutate the array, if it is a view, it will also mutate the data
in the base array, as shown above. Be aware that views affect mutations in
both directions: if `a` is a view, mutating it will also mutate whichever
array it is a view on, but conversely, even if `a` is not a view, mutating
it will modify any other arrays which are views into `a`. And while you can
check if `a` is a view by looking at `a.base`, there is no easy way to
check if `a` has other views pointing at it. The only way to know is to
analyze the program and check any array which is created from `a`.
It's best to minimize mutations in the presence of views, or to restrict
them to a controlled part of the code, to avoid unexpected "[action at a
distance](https://en.wikipedia.org/wiki/Action_at_a_distance_(computer_programming))"
bugs.
Note that you can always ensure that `a` is a new array that isn't a view
and doesn't have any views pointing to it by copying it, using `a.copy()`.
2. Even if you don't mutate data, views are important because they are more
efficient. A view is a relatively cheap thing to make, even if the array is
large. It also saves on memory usage.
For example, here we have an array with about 800 kB of data, and it takes
over 200x longer to copy it than to create a view (using
[IPython](https://ipython.org/)'s `%timeit`):
```
In [1]: import numpy as np
In [2]: a = np.arange(100000, dtype=np.float64)
In [3]: %timeit a.copy()
25.9 µs ± 645 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [4]: %timeit a[:]
127 ns ± 1.62 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
```
(strides)=
## Strides
The reason so many types of indexing into arrays can be represented as a
[view](views-vs-copies) without creating a copy is that NumPy arrays aren't
merely a pointer to a blob of memory. They are a pointer along with something
called *strides*. The strides tell NumPy how many bytes to skip in memory
along each axis to get to the next element of the array along that dimension.
This along with the *memory offset* (the address in physical memory of the
first byte of data), the *shape*, and the *itemsize* (the number of bytes each
element takes up, which depends on the *dtype*) exactly determines how the
corresponding memory is organized into an array. For example, let's start with
a flat 1-dimensional array with `24` elements whose itemsize is 8 (an `int64`
takes up 8 bytes). Its strides is `(8,)`:
```py
>>> a = np.arange(24)
>>> a.itemsize
8
>>> a.strides
(8,)
>>> a.shape
(24,)
```
Now let's create a view `b`, which is `a` reshaped to shape `(3, 2, 4)`. `b`
uses the exact same memory as `a` (which is just `0 1 2 ... 23`). Its itemsize
is the same because it has the same dtype, but its strides and shape are
different.
```py
>>> b = a.reshape((3, 2, 4))
>>> b
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
<BLANKLINE>
[[ 8, 9, 10, 11],
[12, 13, 14, 15]],
<BLANKLINE>
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
>>> b.itemsize
8
>>> b.strides
(64, 32, 8)
>>> b.shape
(3, 2, 4)
```
This tells NumPy that to get the next element in the first dimension of `b`,
it needs to skip 64 bytes. That's because the first dimension contains 2\*4=8
items each, corresponding to the sizes of the second and third dimensions, and
each item is 8 bytes, so 8\*8=64. For example, the next element in the first
dimension after `0` (index `(0, 0, 0)`) is `8` (index `(1, 0, 0)`), which sits
exactly 64 bytes after it in memory. Similarly, to get the next element in the
second dimension, it should skip 32 bytes (4 elements).
The memory offset of an array can be accessed using `a.ctypes.data`. This is
the address in physical memory where the data (`0 1 2 ... 23`) lives. `a`
and `b` have the same memory offset because they both start with the same
first element:
```py
>>> a.ctypes.data # doctest: +SKIP
105553170825216
>>> b.ctypes.data # doctest: +SKIP
105553170825216
```
When we slice off the beginning of the array, we can see that all this does is
move the memory offset forward, and adjust the shape correspondingly.
```py
>>> a[2:].ctypes.data # doctest: +SKIP
105553170825232
>>> a[2:].shape
(22,)
>>> a[2:].strides # the strides are the same
(8,)
```
Specifically, it moves it by `2*8` (where remember `8` is `a.itemsize`):
```py
>>> a[2:].ctypes.data - a.ctypes.data
16
```
Here the strides are the same. Similarly, if we slice off the end of the
array, all it needs to do is adjust the shape. The memory offset is the same,
because it still starts at the same place in memory.
```py
>>> a[:2].ctypes.data # the memory offset is the same # doctest: +SKIP
105553170825216
>>> a[:2].shape
(2,)
>>> a[:2].strides # the strides are the same
(8,)
```
If we instead slice with a step, this adjusts the strides. For instance
`a[::2]` will double the strides, making it skip every other element (but the
offset will be again unchanged because it still starts at the first element of
`a`):
```py
>>> a[::2].strides
(16,)
>>> a[::2].ctypes.data # the memory offset is the same # doctest: +SKIP
105553170825216
>>> a[::2].shape
(12,)
```
If we use a *negative* step, the strides will become negative. This will cause
NumPy to work backwards in memory as it accesses the elements of the array.
The memory offset also changes here so that it starts with the last element of
`a`:
```py
>>> a[::-2].strides
(-16,)
>>> a[::-2].ctypes.data # doctest: +SKIP
105553170825400
```
From this, you are hopefully convinced that every possible slice is just a
manipulation of the memory offset, shape, and strides. It's not hard to see
that this also applies to integer indices (which just removes the stride for
the corresponding axis, adjusting the shape and memory offset accordingly) and
newaxis (which just adds `0` to the strides):
```py
>>> b.strides
(64, 32, 8)
>>> b[2].strides
(32, 8)
>>> b[np.newaxis].strides
(0, 64, 32, 8)
```
This is why [basic indexing](basic-indices) always produces a
[view](views-vs-copies): because it can always be represented as a
manipulation of the strides (plus shape and offset).
Another important fact about strides is that [broadcasting](broadcasting) can
be achieved by manipulating the strides, namely by using a `0` stride to
repeat the same data along a given axis.
```py
>>> c = a.reshape((1, 12, 2))
>>> d = np.broadcast_to(c, (5, 12, 2))
>>> c.shape
(1, 12, 2)
>>> c.strides
(192, 16, 8)
>>> d.shape
(5, 12, 2)
>>> d.strides
(0, 16, 8)
>>> c.base is a
True
>>> d.base is a
True
```
If you've ever used `broadcast_to()`, you might have noticed that it returns a
read-only array. That's because writing into it would not do what you'd
expect, since the repeated elements literally refer to the same memory.
This shows why [broadcasting](broadcasting) is so powerful: it can be done
without any actual copy of the data. When you perform an operation on two
arrays, the broadcasting is implicit, but even explicitly creating a
broadcasted array is cheap, because all it does is create a view with
different strides.
You can also manually create a view with any strides you want using [stride
tricks](https://numpy.org/doc/stable/reference/generated/numpy.lib.stride_tricks.as_strided.html).
Most views that you would want to create can be made with a combination of
indexing, `reshape`, `broadcast_to`, and `transpose`, but it's possible to use
strides to represent some things which are not so easy to do with just these
functions, for example, [sliding
windows](https://numpy.org/doc/stable/reference/generated/numpy.lib.stride_tricks.sliding_window_view.html)
and [convolutions](https://stackoverflow.com/a/43087507/161801). However, if
you do use stride tricks, be careful of the caveats (see the [notes section of
the `as_strided`
docs](https://numpy.org/doc/stable/reference/generated/numpy.lib.stride_tricks.as_strided.html)).
(c-vs-fortran-ordering)=
## C vs. Fortran Ordering
NumPy has an internal distinction between C order and Fortran
order.[^c-order-footnote] C-ordered arrays are stored in memory such that the
last axis varies the fastest. For example, if `a` has 3 dimensions, then its
elements are stored in memory like `a[0, 0, 0], a[0, 0, 1], a[0, 0, 2], ...,
a[0, 1, 0], a[0, 1, 1], a[0, 1, 2], ...`. Fortran ordering is the opposite:
the elements are stored in memory so that the first axis varies fastest, like
`a[0, 0, 0], a[1, 0, 0], a[2, 0, 0], ..., a[0, 1, 0], a[1, 1, 0], a[2, 1, 0]
...`.
[^c-order-footnote]: C ordering and Fortran ordering are also sometimes
"row-major" and "column-major" ordering, respectively. However, this
terminology is confusing when the array has more than two axes or when it
does not represent a mathematical matrix. It's better to think of ordering
in terms of which axes vary the fastest---the last for C ordering and the
first for Fortran ordering. Also, I don't know about you, but I can never
remember which is supposed to be "row-" and "column-" major, but I do
remember how array indexing works in C, so just thinking about that requires
no mnemonic.
The most important thing to note about C and Fortran ordering for our purposes
is that
> **the internal ordering of an array does not change any indexing
> semantics.**
The same index will select the same elements on `a` regardless of whether it
uses C or Fortran ordering internally.
More generally, the actual memory layout of an array has no bearing on
indexing semantics. Indexing operates on the logical abstraction of the array
as presented to the user, even if the true memory doesn't look anything like
that because the array is a [view](views-vs-copies) or has some other layout
due to [stride tricks](strides).
In particular, this also applies to [boolean array
indices](multidimensional-indices/boolean-arrays.md). A boolean mask always
[selects the elements in C order](boolean-array-c-order), even if the
underlying arrays use Fortran ordering.
```py
>>> a = np.arange(9).reshape((3, 3))
>>> a
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> a_f = np.asarray(a, order='F')
>>> a_f # a_f looks the same as a, but the internal memory is ordered differently
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> idx = np.array([
... [False, True, False],
... [ True, True, False],
... [False, False, False]])
>>> idx_f = np.asarray(idx, order='F') # It doesn't matter if the index array is Fortran-ordered either
>>> a[idx] # These are all the same
array([1, 3, 4])
>>> a_f[idx]
array([1, 3, 4])
>>> a[idx_f]
array([1, 3, 4])
>>> a_f[idx_f]
array([1, 3, 4])
```
~~~~{admonition} Aside
If you read the previous section on [strides](strides), you may have guessed
that the difference between C-ordered and Fortran-ordered arrays is a
difference of...strides!
```py
>>> a.strides
(24, 8)
>>> a_f.strides
(8, 24)
```
In a C-ordered array the strides decrease and in a Fortran-ordered array they
increase, because a smaller stride corresponds to "faster varying".
~~~~
**What ordering *does* affect is the performance of certain operations.** In
particular, the ordering determines whether it is more optimal to index along
the first or last axes of an array. For example, `a[0]` selects the first
subarray along the first axis (recall that `a[0]` is a [view](views-vs-copies)
into `a`, so it references the exact same memory as `a`). For a C-ordered
array, which is the default, this subarray is contiguous in memory. This is
because the indices on the last axes vary the fastest (i.e., are next to each
other in memory), so selecting a subarray of the first axis picks elements
which are still contiguous. Conversely, for a Fortran-ordered array, `a[0]` is
not contiguous, but `a[..., 0]` is.
```
>>> a[0].data.contiguous
True
>>> a_f[0].data.contiguous
False
>>> a_f[..., 0].data.contiguous
True
```
Operating on contiguous memory allows the CPU to place the entire memory block
in the cache at once, and is more performant as a result. The performance
difference won't be noticeable for our small example `a` above, as it is small
enough to fit entirely in the cache, but it becomes significant for larger
arrays. Compare the time to sum along `a[0]` or `a[..., 0]` for a
3-dimensional array `a` with a million elements using C and Fortran ordering
(using [IPython](https://ipython.org/)'s `%timeit`):
```
In [1]: import numpy as np
In [2]: a = np.ones((100, 100, 100)) # a has C order (the default)
In [3]: %timeit np.sum(a[0])
8.57 µs ± 121 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [4]: %timeit np.sum(a[..., 0])
24.2 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [5]: a_f = np.asarray(a, order='F')
In [6]: %timeit np.sum(a_f[0])
26.3 µs ± 952 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [7]: %timeit np.sum(a_f[..., 0])
8.6 µs ± 130 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```
Summing along contiguous memory (`a[0]` for C ordering and `a[..., 0]` for
Fortran ordering) is about 3x faster.
NumPy indexing semantics generally favor using C ordering, as it does not
require an ellipsis to select contiguous subarrays. C ordering also matches
the [list-of-lists intuition](multidimensional-indices/what-is-an-array.md) of
an array, since an array like `[[0, 1], [2, 3]]` is stored in memory as
literally `0, 1, 2, 3` with C ordering. It also aligns well with NumPy's
[broadcasting](broadcasting) rules, where broadcasted dimensions are prepended
by default, allowing one to think of an array as a "stack" of contiguous
subarrays.
C ordering is the default in NumPy when creating arrays with functions like
`asarray`, `ones`, `arange`, and so on. One typically only switches to
Fortran ordering when calling certain Fortran codes, or when creating an
array from another memory source that produces Fortran-ordered data.
Regardless of which ordering you are using, it is worth structuring your data
so that operations are done on contiguous memory when possible.
```{rubric} Footnotes
```
<!-- Footnotes are written inline above but markdown will put them here at the
end of the document. -->