In [2]:
%%timeit
%matplotlib inline

913 µs ± 326 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [3]:
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt

## Introduction
---
### 1. Simple example:

**Key words**:
- `vectors`
- `arrays`
- `views`
- `ufuncs`

The below is a simple example, `random walk`. A possible **object** oriented approach to define a RandomWalker class and write a walk **method** that would return the current position after each (random) step. 

**Object oriented approach**

In [7]:
# Object oriented approach
class RandomWalker:
    def __init__(self):
        self.position = 0
        
    def walk(self, n):
        self.position = 0
        for i in range(n):
            yield self.position
            self.position += 2*np.random.randint(0, 1) - 1
            
walker = RandomWalker()
walk = [position for position in walker.walk(10000)]
%timeit("[position for position in walker.walk(n=10000)]", globals())

65.1 ns ± 0.93 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [20]:
walker.position

-1000

---

**Procedual Approach**

we can probably save the class definition and concentrate only on the walk method that computes successive positions after each random step.

In [9]:
def random_walk(n):
    position = 0
    walk = [position]
    for i in range(n):
        position += 2*np.random.randint(0, 1) - 1
        walk.append(position)
    return walk

random_walk(10000)
%timeit("random_walk(n=10000)", globals())

65.2 ns ± 1.53 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


This new method saves some CPU cycles but not that much because this function is pretty much the same as in the object-oriented approach and the few cycles we saved probably come from the inner Python object-oriented machinery.

**Vectorized approach**

But we can do better using the `itertools` Python module that offers a set of *functions creating iterators for efficient looping*. If we observe that a random walk is an accumulation of steps, we can rewrite the function by first generating all the steps and accumulate them without any loop:

In [15]:
def random_walk_faster(n=10000):
    from itertools import accumulate
    # only available from Pythonn 3.6
    steps = np.random.choice([-1, +1], n)
    return [0] + list(accumulate(steps))
walk = random_walk_faster(10000)
%timeit("random_walk_faster(n=10000)", globals())

65.2 ns ± 0.642 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [16]:
np.random.choice?

In [18]:
def random_walk_faster2(n=1000):
    steps = np.random.choice([-1,1], n)
    return np.cumsum(steps)
#walk=random_walk_faster2(1000)
%timeit("random_walk_fastest(n=10000)", globals())

65 ns ± 1.51 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In the above function `random_walk_faster2`, We just have to translate `itertools` call into NumPy ones.
The series of notebooks is about vectorization, be it at the code or problem level. This difference is important before looking at custom vectorization.

## 2. Anatomy of an array
### 2.1 Memory layout

**Definition** of the `ndarray` class in NumPy:

An array is mostly a contiguous block of memory whose parts can be accessed using an indexing scheme. Such indexing scheme is in turn defined by a *shape* and a *data type* and this is precisely what is needed when you define a new array:

In [1]:
%matplotlib inline
import numpy as np
import scipy as sp

In [2]:
z = np.arange(9).reshape(3,3).astype(np.int16)
z

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]], dtype=int16)

In [5]:
z.itemsize ##2 bytes(int16)

2

In [6]:
z.shape

(3, 3)

In [7]:
z.ndim # the number of dimension

2

Since `z` is not a view, we can deduce the [strides](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.strides.html#numpy.ndarray.strides) of the array that define the number of bytes to step in each dimension when traversing the array:

In [3]:
strides = z.shape[1]*z.itemsize, z.itemsize
print(strides)

(6, 2)


In [4]:
print(z.strides)

(6, 2)


With all these information, we know how to access a specific item (designed by an index tuple) and more precisely, how to compute the start and end offsets:
```
offset_start = 0
for i in range(ndim):
    offset_start += strides[i]*index[i]
offset_end = offset_start + z.itemsize
```

In [5]:
 # test the above method:
z=np.arange(9).reshape(3,3).astype(np.int16)
index = 1,1
print(z[index].tobytes())

b'\x04\x00'


In [11]:
offset = 0
for i in range(z.ndim):
    offset += z.strides[i]*index[i]
print(offset)

8


In [1]:
from IPython.display import HTML
HTML(filename="03-anatomy.html")