# <img width=400 src="http://www.numpy.org/_static/numpy_logo.png" alt="Numpy"/>


## Why do we need numpy?

* You may have heard that "Python is slow". This is true when looping over many small Python objects.
* Python is dynamically typed and everything is an object, even an `int`. There are no primitive types.
* NumPy's main feature is the `ndarray` class, a fixed-length, homogeneously typed array class.
* NumPy implements a lot of functionality in fast C, Cython and Fortran code that works on these arrays.
* Python with vectorized operations using NumPy can be blazingly fast.

See: [Python is not C](https://www.ibm.com/developerworks/community/blogs/jfp/entry/Python_Is_Not_C?lang=en)

In [1]:
import numpy as np
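
To illustrate the "homogeneously typed" point, here is a minimal sketch (not part of the original notebook): a Python list can hold elements of different types, while an `ndarray` converts all of them to one common `dtype` and stores them in a single data buffer.

```python
import numpy as np

# A list keeps every element as its own Python object; the types may differ
mixed = [1, 2.5, 3]
print([type(v).__name__ for v in mixed])  # ['int', 'float', 'int']

# An ndarray picks one common dtype for all elements
arr = np.array(mixed)
print(arr.dtype)  # float64

# The raw data occupies size * itemsize bytes
print(arr.nbytes, arr.size * arr.itemsize)  # 24 24
```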

## Small example timings

In [28]:
import math


def var(data):
    '''
    Welford's algorithm (as presented by Knuth) for one-pass
    calculation of the population variance.
    Avoids the rounding errors that large numbers cause in the naive
    approach of `sum(v**2 for v in data) / n - (sum(data) / n)**2`
    '''
    
    n = 0
    mean = 0.0
    m2 = 0.0
    
    if len(data) < 2:
        return float('nan')

    for value in data:
        n += 1
        delta = value - mean
        mean += delta / n
        delta2 = value - mean
        m2 += delta * delta2

    return m2 / n 
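
The rounding problem mentioned in the docstring can be made visible with a small experiment. The sketch below is an illustration only: the helper names `naive_var` and `one_pass_var` are made up for this example, and `one_pass_var` repeats the update rule of `var` so that the block runs on its own. The data has a small spread on top of a large offset, which is the case where the naive formula loses precision.

```python
import numpy as np


def naive_var(data):
    # Textbook formula: mean of the squares minus the square of the mean
    n = len(data)
    return sum(v**2 for v in data) / n - (sum(data) / n) ** 2


def one_pass_var(data):
    # Same update rule as `var` above, repeated to keep this block self-contained
    n, mean, m2 = 0, 0.0, 0.0
    for value in data:
        n += 1
        delta = value - mean
        mean += delta / n
        m2 += delta * (value - mean)
    return m2 / n


# Deviations from the mean are -6, -3, 3, 6, so the exact population variance is 22.5
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print(one_pass_var(data))  # 22.5
print(np.var(data))        # 22.5
print(naive_var(data))     # far from 22.5 because of cancellation
```

The squares are of order `1e18`, where neighbouring double-precision numbers are 128 apart, so the naive difference cannot resolve a variance of 22.5.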

In [29]:
%%timeit

l = list(range(1000))
var(l)

242 µs ± 951 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [30]:
%%timeit

a = np.arange(1000)

np.var(a)

27.6 µs ± 112 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
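
`%%timeit` is an IPython magic and only works in a notebook or IPython shell. In a plain Python script, a comparable measurement can be sketched with the standard-library `timeit` module. The function `py_var` below is a hypothetical two-pass baseline written for this example, and the absolute numbers depend on the machine.

```python
import timeit

import numpy as np


def py_var(data):
    # Plain-Python two-pass population variance, used only as a loop-heavy baseline
    mean = sum(data) / len(data)
    return sum((v - mean) ** 2 for v in data) / len(data)


values = list(range(1000))
array = np.arange(1000)

# 5 repetitions of 200 calls each; the minimum is the least disturbed run
t_python = min(timeit.repeat(lambda: py_var(values), number=200, repeat=5)) / 200
t_numpy = min(timeit.repeat(lambda: np.var(array), number=200, repeat=5)) / 200

print(f"pure Python: {t_python * 1e6:.1f} µs per call")
print(f"NumPy:       {t_numpy * 1e6:.1f} µs per call")
```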


### Some useful properties

In [31]:
a = np.arange(5)
a

array([0, 1, 2, 3, 4])

In [3]:
len(a)

5

In [4]:
a.shape

(5,)

In [5]:
a.dtype

dtype('int64')

In [6]:
a.ndim

1

In [7]:
a.size

5
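
For a one-dimensional array, `len`, `shape` and `size` all report the same number. The difference shows up with more than one axis, as in this small sketch (the array `m` is made up for the example):

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

print(m.shape)  # (2, 3)
print(m.ndim)   # 2
print(m.size)   # 6
print(len(m))   # 2, the length of the first axis only

# size is always the product of the entries in shape
assert m.size == np.prod(m.shape)
```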

## Basic math: vectorized

Operations on NumPy arrays are vectorized: they are applied element by element.

In [32]:
2 * a

array([0, 2, 4, 6, 8])

In [33]:
a**2

array([ 0,  1,  4,  9, 16])

In [35]:
a**a

array([  1,   1,   4,  27, 256])

In [36]:
np.cos(a)

array([ 1.        ,  0.54030231, -0.41614684, -0.9899925 , -0.65364362])

**Attention: You need the `cos` from numpy!**

In [37]:
math.cos(a)

TypeError: only length-1 arrays can be converted to Python scalars

Most ordinary Python functions built from basic operators like `*`, `+` and `**` work on arrays unchanged, because NumPy overloads these operators:

In [38]:
def poly(x):
    return x + 2 * x**2 - x**3

poly(a)

array([  0,   2,   2,  -6, -28])

In [43]:
poly(np.pi)

-8.125475224531307
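
Operator overloading covers arithmetic, but not Python's `if` statement, which needs a single truth value. A function with a branch therefore usually has to be rewritten for arrays, for example with `np.where`. The following is a sketch; the function names `clip_negative_scalar` and `clip_negative` are made up for this example.

```python
import numpy as np


def clip_negative_scalar(x):
    # Works for a single number, but `if` cannot decide for a whole array
    if x < 0:
        return 0
    return x


def clip_negative(x):
    # np.where chooses element by element: 0 where x < 0, otherwise x
    return np.where(x < 0, 0, x)


values = np.array([-2, -1, 0, 1, 2])

try:
    clip_negative_scalar(values)
except ValueError as err:
    print("ValueError:", err)

print(clip_negative(values))  # [0 0 0 1 2]
```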

## Arbitrary dimension arrays

In [None]:
# two-dimensional array
y = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

y + y

In [None]:
# since Python 3.5, @ is the matrix product operator
y @ y
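
The difference between `*` and `@` is easy to mix up, so here is a small sketch on a 2×2 matrix (the array `m` is chosen only for this example):

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])

# element-wise product: each entry is multiplied with itself
print(m * m)
# [[ 1  4]
#  [ 9 16]]

# matrix product: rows of the left operand times columns of the right
print(m @ m)
# [[ 7 10]
#  [15 22]]

# np.matmul is the function form of the @ operator
assert np.array_equal(m @ m, np.matmul(m, m))
```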

## Helpers for creating arrays

In [None]:
np.zeros(10)

In [None]:
np.ones((5, 2))

In [None]:
np.full(5, np.nan)

In [None]:
np.empty(5)  # attention: uninitialised memory, be careful

In [None]:
np.linspace(0, 1, 11)

In [None]:
# like range() for arrays:
np.arange(0, 10)

In [None]:
np.logspace(-4, 5, 10)
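
The helpers differ mainly in how they treat the end point and the spacing. The sketch below uses small, arbitrarily chosen arguments to make this visible:

```python
import numpy as np

# linspace includes both end points; the third argument is the number of points
print(np.linspace(0, 1, 5))  # [0.   0.25 0.5  0.75 1.  ]

# arange excludes the stop value, like range()
print(np.arange(0, 10, 2))   # [0 2 4 6 8]

# logspace(a, b, n) places n points evenly in the exponent between 10**a and 10**b
print(np.allclose(np.logspace(-4, 5, 10), 10.0 ** np.linspace(-4, 5, 10)))  # True

# full creates an array filled with one value
print(np.full(3, 7))         # [7 7 7]
```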

## Numpy Indexing

* Element access
* Slicing

In [None]:
x = np.arange(0, 10)

# like lists:
x[4]

In [None]:
# all elements with indices ≥1 and <4:
x[1:4]

In [None]:
# negative indices count from the end
x[-1], x[-2]

In [None]:
# combination:
x[3:-2]

In [None]:
# step size
x[::2]

In [None]:
# trick for reversal: negative step
x[::-1]

In [None]:
y = np.array([x, x + 10, x + 20, x + 30])
y

In [None]:
# comma between indices
y[3, 2:-1]

In [None]:
# only one index ⇒ one-dimensional array
y[2]

In [None]:
# other axis: (: alone means the whole axis)
y[:, 3]

In [None]:
# inspecting the number of elements per axis:
y.shape
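
For reference, the indexing expressions from this section are collected below together with the results they produce for these particular `x` and `y`:

```python
import numpy as np

x = np.arange(0, 10)
y = np.array([x, x + 10, x + 20, x + 30])

print(x[1:4])      # [1 2 3]
print(x[3:-2])     # [3 4 5 6 7]
print(x[::2])      # [0 2 4 6 8]
print(x[::-1])     # [9 8 7 6 5 4 3 2 1 0]

print(y[3, 2:-1])  # [32 33 34 35 36 37 38]
print(y[:, 3])     # [ 3 13 23 33]
print(y.shape)     # (4, 10)
```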

## Changing array content

In [None]:
y

In [None]:
y[:, 3] = 0
y

Using slices on both sides

In [None]:
y[:,0] = x[3:7]
y
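
Assigning through a slice changes the original array because basic slicing returns a view on the same data rather than a copy. A minimal sketch of the consequence (variable names are made up for this example):

```python
import numpy as np

x = np.arange(5)

view = x[1:4]                # basic slicing returns a view on the same data
view[0] = 100
print(x)                     # [  0 100   2   3   4]

independent = x[1:4].copy()  # an explicit copy owns its data
independent[0] = -1
print(x)                     # [  0 100   2   3   4]

print(np.shares_memory(x, view), np.shares_memory(x, independent))  # True False
```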

Transposing reverses the order of the dimensions

In [None]:
y

In [None]:
y.shape

In [None]:
y.T

In [None]:
y.T.shape
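
With more than two axes, the reversal of the whole shape tuple becomes visible. A small sketch with arbitrarily chosen shapes:

```python
import numpy as np

t = np.zeros((2, 3, 4))

print(t.shape)    # (2, 3, 4)
print(t.T.shape)  # (4, 3, 2)

# for a 2-D array this swaps rows and columns: element [i, j] moves to [j, i]
m = np.arange(6).reshape(2, 3)
assert m.T[2, 1] == m[1, 2]
```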

## Masks

* A boolean array can be used to select only the elements where it contains `True`.
* This is a very powerful tool for selecting the elements that fulfill a certain condition.

In [None]:
a = np.linspace(0, 2, 11)
b = np.random.normal(0, 1, 11)

print(b >= 0)
print(a[b >= 0])

In [None]:
a[b < 0] = 0
a
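
Because the example above uses random numbers, its output changes on every run. The following deterministic sketch shows the same idea and additionally how several conditions might be combined:

```python
import numpy as np

a = np.arange(10)

# combine conditions with & (and), | (or), ~ (not); the parentheses are required
mask = (a > 2) & (a % 2 == 0)
print(mask)
print(a[mask])                 # [4 6 8]
print(a[~mask])                # [0 1 2 3 5 7 9]

# count the selected elements
print(np.count_nonzero(mask))  # 3
```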

## Reduction operations

NumPy has many operations that reduce the dimensionality of arrays

In [None]:
x = np.random.normal(0, 1, 10)

In [None]:
np.sum(x)

In [None]:
np.prod(x)

In [None]:
np.mean(x)

Standard deviation (population, `ddof=0` by default)

In [None]:
np.std(x)

Standard error of the mean

In [None]:
np.std(x, ddof=1) / np.sqrt(len(x))

Sample standard deviation (`ddof=1`)

In [None]:
np.std(x, ddof=1)
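
The `ddof` argument ("delta degrees of freedom") only changes the divisor from `n` to `n - ddof`. The sketch below writes both variants out by hand for a small, arbitrarily chosen data set and compares them with `np.std`:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = len(x)
squared_deviations = (x - np.mean(x)) ** 2

# ddof=0 (default): divide by n
print(np.sqrt(np.sum(squared_deviations) / n), np.std(x))

# ddof=1: divide by n - 1
print(np.sqrt(np.sum(squared_deviations) / (n - 1)), np.std(x, ddof=1))
```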

Difference between neighboring elements

In [None]:
z = np.arange(10)**2
np.diff(z)

### Reductions on multi-dimensional arrays


In [None]:
array2d = np.arange(20).reshape(4, 5)

array2d

In [None]:
np.sum(array2d, axis=0)

In [None]:
np.mean(array2d, axis=1)
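
The `axis` argument names the axis that is removed by the reduction. The following sketch repeats the two calls above with their results and adds two related variants:

```python
import numpy as np

array2d = np.arange(20).reshape(4, 5)

# axis=0 collapses the rows: one result per column
print(np.sum(array2d, axis=0))   # [30 34 38 42 46]

# axis=1 collapses the columns: one result per row
print(np.mean(array2d, axis=1))  # [ 2.  7. 12. 17.]

# without axis, the reduction runs over all elements
print(np.sum(array2d))           # 190

# keepdims=True keeps the reduced axis with length 1
print(np.sum(array2d, axis=1, keepdims=True).shape)  # (4, 1)
```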

### Random numbers

* NumPy has a large number of built-in distributions.

In [None]:
np.random.uniform(0, 1, 5)

In [None]:
np.random.normal(5, 10, 5)
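
The calls above draw from NumPy's global random state, so the results differ on every run. For reproducible results, one option is a seeded generator created with `np.random.default_rng` (available since NumPy 1.17). This is a sketch; the seed value 42 is an arbitrary choice.

```python
import numpy as np

# default_rng creates a Generator; the seed 42 is an arbitrary choice
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

sample_a = rng_a.normal(5, 10, 5)  # mean 5, standard deviation 10, 5 values
sample_b = rng_b.normal(5, 10, 5)

# identical seeds give identical sequences
print(np.array_equal(sample_a, sample_b))  # True

# uniform samples lie in the half-open interval [low, high)
u = rng_a.uniform(0, 1, 1000)
print(u.min() >= 0, u.max() < 1)  # True True
```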

## Structured numpy arrays

## Linear Algebra