## Using stride tricks with NumPy
This recipe generalizes the notion of row-major and column-major orders to multidimensional arrays. The general notion is that of **strides**, which describe how the items of a multidimensional array are organized within a one-dimensional data buffer. Strides are mostly an implementation detail, but they can also be used in specific situations to optimize some algorithms.

### How it works
Every array has a number of dimensions, a shape, a data type, and strides. Strides describe how the items of a multidimensional array are organized in the data buffer. There are many different schemes for arranging the items of a multidimensional array in a one-dimensional block. NumPy implements a **strided indexing scheme**, where the byte offset of any element is a **linear combination** of its indices, the coefficients being the strides. In other words, strides describe, for each dimension, how many bytes we need to jump in the data buffer to go from one item to the next.
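The linear-combination rule can be checked by hand. The following is a minimal sketch, assuming a small C-contiguous `int64` array; the helper `byte_offset` is a hypothetical name introduced here for illustration and is not part of NumPy.

```python
import numpy as np

# Hypothetical helper (not a NumPy function): byte offset of an element,
# computed as the sum of index * stride over all dimensions.
def byte_offset(index, strides):
    return sum(i * s for i, s in zip(index, strides))

arr = np.arange(12, dtype=np.int64).reshape(3, 4)  # C-contiguous, 8-byte items
offset = byte_offset((2, 1), arr.strides)          # 2 * 32 + 1 * 8 = 72

# Read the 8 bytes located at that offset directly from a copy of the raw buffer.
value = np.frombuffer(arr.tobytes(), dtype=np.int64, count=1, offset=offset)[0]

print(arr.strides, offset, value, arr[2, 1])
```

The value read from the raw bytes matches `arr[2, 1]`, which is what the strided indexing scheme predicts for a contiguous array.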

In [1]:
import numpy as np
def aid(x):
    # This function returns the memory
    # block address of an array
    return x.__array_interface__['data'][0]

1. Strides are integer numbers describing the byte step in the contiguous block of memory for each dimension.

In [2]:
x = np.zeros(10)
x.strides

(8,)

This vector `x` contains double-precision floating point numbers (`float64`, 8 bytes each), so you need to go 8 bytes forward to go from one item to the next.

2. Now, let's look at the strides of a 2D array:

In [3]:
y = np.zeros((10, 10))
y.strides

(80, 8)

In the first dimension (vertical), you need to go 80 bytes (10 float64 items) *forward* to go from one item to the next, because the items are internally stored in row-major order. In the second dimension (horizontal), you need to go 8 bytes forward to go from one item to the next.
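To connect this with column-major order, here is a short sketch comparing the strides of the same 10 x 10 `float64` array in row-major layout, as a transposed view, and as a column-major copy. The variable names are illustrative only.

```python
import numpy as np

y = np.zeros((10, 10))         # row-major (C order) by default
yt = y.T                       # a view: same buffer, strides swapped
yf = np.asfortranarray(y)      # a copy laid out in column-major (Fortran) order

c_strides = y.strides          # 80 bytes between rows, 8 between columns
t_strides = yt.strides         # 8 bytes between rows, 80 between columns
f_strides = yf.strides         # same strides as the transposed view

print(c_strides, t_strides, f_strides)
print(np.shares_memory(y, yt))  # the transpose does not copy any data
```

Transposing only swaps the strides and the shape, which is why it costs almost nothing regardless of the array size.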

3. Let's show how we can revisit the broadcasting rules from the previous recipe using strides:

In [4]:
n = 1000
a = np.arange(n)

We will create a new array `b`, pointing to the same memory block as `a`, but with a different shape and different strides. This new array will look like a vertically tiled version of `a`. We use a special function in NumPy to change the strides of an array:

In [5]:
# Use a.itemsize (8 bytes for int64) so the stride always matches the dtype
b = np.lib.stride_tricks.as_strided(a, (n, n), (0, a.itemsize))
b

array([[  0,   1,   2, ..., 997, 998, 999],
       [  0,   1,   2, ..., 997, 998, 999],
       [  0,   1,   2, ..., 997, 998, 999],
       ...,
       [  0,   1,   2, ..., 997, 998, 999],
       [  0,   1,   2, ..., 997, 998, 999],
       [  0,   1,   2, ..., 997, 998, 999]])

In [6]:
b.size, b.shape, b.nbytes

(1000000, (1000, 1000), 8000000)

NumPy believes that this array contains one million different elements, whereas the data buffer actually contains the same 1000 elements as `a`.
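The claim that no data is copied can be checked directly. The sketch below assumes the same `a` and `b` as in the recipe and redefines the address helper `aid` locally so that it runs on its own. It also shows `np.broadcast_to`, which builds an equivalent tiled view and returns it read-only, as a possibly safer alternative when the goal is only to tile an array.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def aid(x):
    # Address of the first byte of the array's data buffer
    return x.__array_interface__['data'][0]

n = 1000
a = np.arange(n)
b = as_strided(a, shape=(n, n), strides=(0, a.itemsize))

same_address = aid(a) == aid(b)      # both arrays start at the same address

# An equivalent view built with broadcasting; it is not writeable.
c = np.broadcast_to(a, (n, n))
same_values = np.array_equal(b, c)
c_strides = c.strides                # zero stride along the first dimension
c_writeable = c.flags.writeable

# Because b is only a view, modifying a changes every "row" of b at once.
a[0] = -1
first_column_changed = bool((b[:, 0] == -1).all())

print(same_address, same_values, c_strides, c_writeable, first_column_changed)
```

Since every row of `b` maps to the same memory, writing into an array created with `as_strided` can have surprising effects, so it is usually best to treat such arrays as read-only.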

4. We can now perform an efficient outer product using the same principle as with broadcasting rules:

In [7]:
%timeit b * b.T

771 µs ± 8.19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [9]:
%%timeit
np.tile(a, (n, 1)) * np.tile(a[:, np.newaxis], (1, n))

5.72 ms ± 73.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
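Before comparing timings, it is worth confirming that both expressions compute the same result. This is a minimal check, assuming the same `a`, `b`, and `n` as above; it also includes the plain broadcasting form and `np.outer` as references.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

n = 1000
a = np.arange(n)
b = as_strided(a, shape=(n, n), strides=(0, a.itemsize))

strided = b * b.T                                     # stride-trick version
tiled = np.tile(a, (n, 1)) * np.tile(a[:, np.newaxis], (1, n))  # explicit copies
broadcast = a[np.newaxis, :] * a[:, np.newaxis]       # broadcasting version
reference = np.outer(a, a)                            # built-in outer product

print(np.array_equal(strided, reference),
      np.array_equal(tiled, reference),
      np.array_equal(broadcast, reference))
```

All four approaches produce the same `(n, n)` array. The tiled version differs only in that it first materializes two full `n x n` arrays in memory, which is consistent with it being the slower of the two timings shown above.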
