# Numpy Memory Layout

This short notebook examines the difference between the two memory layouts in NumPy: C-major (row-major) and Fortran-major (column-major) order.
It also tests whether the ``numpy`` functions or the corresponding ``ndarray`` methods run faster on "huge" arrays.
This notebook is inspired by [this post](https://stackoverflow.com/questions/26998223/what-is-the-difference-between-contiguous-and-non-contiguous-arrays/26999092#26999092).

In [25]:
import numpy as np

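Define two arrays of shape ``(n, n)``, one in C-major and one in Fortran-major memory layout, i.e. **C-contiguous or Fortran-contiguous memory layout**. Both are built from the same ``np.arange`` data, so the Fortran-major array holds the transpose of the C-major one.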

In [26]:
n = 10_000
arr_c_major = np.arange(n**2).reshape((n, n), order="C")
arr_f_major = np.arange(n**2).reshape((n, n), order="F")

msg = f"C-major array:\n{arr_c_major}\n\nFortran-major:\n{arr_f_major}\n"
print(msg)

C-major array:
[[       0        1        2 ...     9997     9998     9999]
 [   10000    10001    10002 ...    19997    19998    19999]
 [   20000    20001    20002 ...    29997    29998    29999]
 ...
 [99970000 99970001 99970002 ... 99979997 99979998 99979999]
 [99980000 99980001 99980002 ... 99989997 99989998 99989999]
 [99990000 99990001 99990002 ... 99999997 99999998 99999999]]

Fortran-major:
[[       0    10000    20000 ... 99970000 99980000 99990000]
 [       1    10001    20001 ... 99970001 99980001 99990001]
 [       2    10002    20002 ... 99970002 99980002 99990002]
 ...
 [    9997    19997    29997 ... 99979997 99989997 99999997]
 [    9998    19998    29998 ... 99979998 99989998 99999998]
 [    9999    19999    29999 ... 99979999 99989999 99999999]]
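The printed values alone do not show how each array is stored. A minimal sketch of how the layout could be checked through the ``flags`` and ``strides`` attributes follows; it uses a small ``n`` and the shortened names ``arr_c`` and ``arr_f``, which are stand-ins chosen here and are not part of the notebook.

```python
import numpy as np

n = 4  # small stand-in for the notebook's n = 10_000
arr_c = np.arange(n**2).reshape((n, n), order="C")
arr_f = np.arange(n**2).reshape((n, n), order="F")

# Contiguity flags as (C_CONTIGUOUS, F_CONTIGUOUS) for each array.
c_flags = (arr_c.flags["C_CONTIGUOUS"], arr_c.flags["F_CONTIGUOUS"])
f_flags = (arr_f.flags["C_CONTIGUOUS"], arr_f.flags["F_CONTIGUOUS"])

# Strides are the byte steps needed to move one index along each axis.
itemsize = arr_c.itemsize
c_strides = arr_c.strides  # large step between rows, small step within a row
f_strides = arr_f.strides  # small step within a column, large step between columns

# Both arrays come from the same 1-D data, so one is the transpose of the other.
same_values_transposed = np.array_equal(arr_f, arr_c.T)

print("C-major:", c_flags, c_strides)
print("F-major:", f_flags, f_strides)
```

In the C-major array, neighbouring elements of a row are adjacent in memory; in the Fortran-major array, neighbouring elements of a column are.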



## Test of Operation Order on Arrays Using Numpy Function

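Sum the rows of the C-major array (``axis=0``: the rows are added together element-wise, giving one total per column). This is the faster case.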

In [35]:
sum_arr_c_np = %timeit -o np.sum(arr_c_major, axis=0)

43 ms ± 312 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [37]:
sum_arr_c_np.compile_time

5.099999999913507e-05

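Sum the columns of the C-major array (``axis=1``: the columns are added together element-wise, giving one total per row). This is the slower case.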

In [28]:
%timeit np.sum(arr_c_major, axis=1)

50.3 ms ± 155 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


Sum the rows of the Fortran-major array (``axis=0``). This is the slower case.

In [29]:
%timeit np.sum(arr_f_major, axis=0)

50.1 ms ± 110 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


Sum the columns of the Fortran-major array (``axis=1``). This is the faster case.

In [30]:
%timeit np.sum(arr_f_major, axis=1)

42.4 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


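In conclusion, in these measurements row-wise operations (``axis=0``) on C-major arrays and column-wise operations (``axis=1``) on Fortran-major arrays are faster: roughly 42-43 ms versus about 50 ms, i.e. around 15% less time. In both faster cases, the slices being added together (rows of the C-major array, columns of the Fortran-major array) are contiguous in memory.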
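The ``%timeit`` magic only works inside IPython or Jupyter. As a rough sketch, the same four measurements could be repeated in a plain Python script with the standard-library ``timeit`` module. The helper ``best_time`` and the reduced ``n`` are assumptions made for this illustration, and the absolute numbers will differ from the ones shown above.

```python
import timeit

import numpy as np

n = 1_000  # much smaller than the notebook's 10_000 so the sketch runs quickly
arr_c = np.arange(n**2).reshape((n, n), order="C")
arr_f = np.arange(n**2).reshape((n, n), order="F")


def best_time(arr, axis, repeat=5, number=3):
    """Return the best per-call time in seconds for np.sum(arr, axis=axis)."""
    timer = timeit.Timer(lambda: np.sum(arr, axis=axis))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Time every combination of memory layout and reduction axis.
timings = {
    (layout, axis): best_time(arr, axis)
    for layout, arr in (("C", arr_c), ("F", arr_f))
    for axis in (0, 1)
}

for (layout, axis), seconds in timings.items():
    print(f"{layout}-major, axis={axis}: {seconds * 1e3:.3f} ms")
```

With an array this small, most of the data may fit in the CPU cache, so the gap between the layouts can shrink or vanish; increasing ``n`` towards the notebook's value should bring the comparison closer to the measurements above.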

## Test of Numpy Function vs. Ndarray Methods

In [31]:
%timeit np.sum(arr_c_major, axis=0)

42.6 ms ± 180 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [32]:
%timeit arr_c_major.sum(axis=0)

41.8 ms ± 181 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [33]:
%timeit np.sum(arr_f_major, axis=1)

41.6 ms ± 86.5 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [34]:
%timeit arr_f_major.sum(axis=1)

42 ms ± 181 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
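In each of the two pairs above, the function and the method differ by less than 1 ms, and neither is consistently ahead, so these measurements show no meaningful difference between them. One plausible explanation, stated here as an assumption about NumPy internals and not something the timings prove, is that for ``ndarray`` inputs both ``np.sum`` and ``ndarray.sum`` end up performing the same ``np.add.reduce`` reduction. The sketch below only checks that the three call styles return identical results on a small array.

```python
import numpy as np

n = 1_000  # small stand-in for the notebook's n = 10_000
arr_c = np.arange(n**2).reshape((n, n), order="C")

via_function = np.sum(arr_c, axis=0)      # module-level function
via_method = arr_c.sum(axis=0)            # ndarray method
via_ufunc = np.add.reduce(arr_c, axis=0)  # explicit ufunc reduction

results_match = bool(
    np.array_equal(via_function, via_method)
    and np.array_equal(via_method, via_ufunc)
)
print("All three call styles agree:", results_match)
```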
