# Introduction to numpy

numpy is a Python library for vectors, matrices, and general multidimensional arrays. Its array operations run in optimized, compiled code, so they are usually much faster and more memory-efficient than equivalent loops over Python lists.

First, you have to import numpy. A common convention is to import numpy as np.

In [None]:
import numpy as np

To create a numpy-array, you first create a normal Python list and then convert it to a numpy-array via `np.array(list)`.

In [None]:
np.array([1,2,3,4])

Numpy-arrays can hold many data types, such as integers, floats, and strings. All elements of one array share the same data type.

In [None]:
cvalues = [25.3, 24.8, 26.9, 23.9]
C = np.array(cvalues)
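
As a small sketch of how the data type is chosen, the following cell inspects the `dtype` attribute of a few arrays. The variable names are only illustrative, and the exact integer width depends on the platform, so only the kind of the type is checked.

```python
import numpy as np

# Each array stores elements of one common data type (dtype).
ints = np.array([1, 2, 3])
floats = np.array([25.3, 24.8, 26.9, 23.9])
strings = np.array(["a", "bc", "def"])
mixed = np.array([1, 2.5])  # the integer is converted to a float

print(ints.dtype.kind)     # i  (signed integer)
print(floats.dtype)        # float64
print(strings.dtype.kind)  # U  (unicode string)
print(mixed.dtype)         # float64
```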

You can perform basic arithmetic on numpy-arrays. In contrast to normal Python lists, these operations are applied elementwise, so to transform every element you don't have to iterate over the list or write a list comprehension.

In [None]:
C * 9 / 5 + 32

In [None]:
[x * 9/5 + 32 for x in cvalues]
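
To make the contrast with lists concrete, here is a minimal sketch with a made-up list `values`: the same operators repeat or concatenate a list, but work elementwise on an array.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

print(values * 2)             # [1, 2, 3, 1, 2, 3]  (the list is repeated)
print((arr * 2).tolist())     # [2, 4, 6]           (each element is doubled)

print(values + values)        # [1, 2, 3, 1, 2, 3]  (the lists are concatenated)
print((arr + arr).tolist())   # [2, 4, 6]           (elements are added pairwise)
```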

## Create an array with evenly spaced values within a given interval

Like the standard Python `range()`, `np.arange([start, ]stop, [step, ]dtype=None)` creates a numpy-array with evenly spaced values within a given interval. `start` is the included start value, `stop` is the excluded stop value, and `step` is the step size between the values. You can also specify the data type with `dtype`; if it is omitted, it is inferred from the other parameters.

In [None]:
np.arange(3.0)

In [None]:
np.arange(1,5,2)
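
The following sketch shows the parameters side by side. The concrete numbers are arbitrary; the results are converted with `tolist()` only to keep the printed output simple.

```python
import numpy as np

ints = np.arange(1, 10, 3)            # start=1, stop=10 (excluded), step=3
as_float = np.arange(4, dtype=float)  # explicit data type
halves = np.arange(0, 2, 0.5)         # unlike range(), non-integer steps work

print(ints.tolist())      # [1, 4, 7]
print(as_float.tolist())  # [0.0, 1.0, 2.0, 3.0]
print(halves.tolist())    # [0.0, 0.5, 1.0, 1.5]
print(ints.tolist() == list(range(1, 10, 3)))  # True
```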

## Comparison Python-Lists vs. Numpy-Arrays

Numpy-arrays and numpy operations are highly optimized compared to lists. For numerical work on larger amounts of data, using numpy instead of Python loops usually reduces the runtime considerably.

In [None]:
import time

v = [e for e in range(10000)]

start = time.time()
for i in range(10000):
    x = [e+e for e in v]
    v = [e/2 for e in x]
time_lists = time.time() - start

arr = np.array(v)
start = time.time()
for i in range(10000):
    x = arr + arr
    arr = x/2
time_arrays = time.time() - start

print('time_lists:', time_lists)
print('time_arrays:', time_arrays)
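
An alternative way to measure this is the standard library module `timeit`, which repeats a statement and returns the total time. This is only a sketch: the array size and repetition count are arbitrary choices, and the measured times depend on the machine.

```python
import timeit

import numpy as np

values = list(range(10_000))
arr = np.array(values)

# Time the same elementwise addition on a list and on an array.
list_time = timeit.timeit(lambda: [v + v for v in values], number=50)
array_time = timeit.timeit(lambda: arr + arr, number=50)

print("list comprehension:", list_time)
print("numpy array:       ", array_time)
```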

## Arrays with multiple axes

Numpy-arrays can be multidimensional.

You can create a zero-dimensional array (a scalar) that holds only a single value.

In [None]:
np.array(42)

Or you can create a vector with one dimension.

In [None]:
np.array([3.4, 6.9, 99.8, 12.8])

But you can also create a matrix with two dimensions, like nested lists in standard Python.

In [None]:
np.array([[ 3.4,  8.7,  9.9 ],
          [ 1.1, -7.8, -0.7 ],
          [ 4.1, 12.3,  4.8 ]])

Last but not least, numpy supports arrays with three or more dimensions, often called tensors.

In [None]:
np.array([[[ 111, 112 ], [ 121, 122 ]],
          [[ 211, 212 ], [ 221, 222 ]],
          [[ 311, 312 ], [ 321, 322 ]]])

## Shape of an array

The shape of an array gives its size along each dimension. To get the shape, numpy has the function `np.shape(array)`, and every array has the attribute `shape`. An array with the shape `(6, 3)` represents a 6x3 matrix with 6 rows and 3 columns.

In [None]:
x = np.array([[67, 63, 87],
              [77, 69, 59],
              [77, 69, 59],
              [67, 63, 87],
              [67, 63, 87],
              [67, 63, 87]])

np.shape(x)
x.shape # alternative.
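
Besides `shape`, every array also has the attributes `ndim` (number of axes) and `size` (total number of elements). The sketch below prints all three for a scalar, a vector, a matrix, and a three-dimensional array; the variable names are only illustrative.

```python
import numpy as np

scalar = np.array(42)
vector = np.array([3.4, 6.9, 99.8, 12.8])
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
tensor = np.array([[[111, 112], [121, 122]],
                   [[211, 212], [221, 222]],
                   [[311, 312], [321, 322]]])

for arr in (scalar, vector, matrix, tensor):
    print(arr.ndim, arr.shape, arr.size)
# 0 () 1
# 1 (4,) 4
# 2 (2, 3) 6
# 3 (3, 2, 2) 12
```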

## Change shape

By calling `reshape` on an array you can change the dimensions of the array. This is only possible if the new shape has the same total number of elements as the original array.

In [None]:
a = np.arange(12).reshape(3, 4)
print(a)

In [None]:
a.shape = (2, 6)  # assigning to shape changes the array in place
print(a)

The `reshape` operation also supports more than two dimensions. The product of the new dimensions has to equal the product of the old dimensions.

In [None]:
np.arange(24).reshape(2, 3, 4)
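
The size rule can be checked directly. In this sketch, one dimension is given as `-1`, which asks numpy to infer it from the total number of elements, and a shape with the wrong product is shown to raise a `ValueError`.

```python
import numpy as np

a = np.arange(12)

# -1 lets numpy infer this dimension: 12 elements / 3 rows = 4 columns.
b = a.reshape(3, -1)
print(b.shape)  # (3, 4)

# 5 * 3 = 15 does not match the 12 elements, so reshape fails.
try:
    a.reshape(5, 3)
except ValueError as err:
    print("reshape failed:", err)
```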

## Transpose

For matrix operations, you sometimes need to transpose a matrix. In numpy you can do this with the attribute `array.T` or by calling `array.transpose(axes)`, where `axes` gives the new order of the axes.

In [None]:
b = np.arange(6).reshape(2, 3)

print(b)
print(b.T)
b.transpose(1, 0)
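
For arrays with more than two axes the difference between the two forms becomes visible. A minimal sketch with an arbitrary three-dimensional array `t`:

```python
import numpy as np

t = np.arange(24).reshape(2, 3, 4)

print(t.T.shape)                   # (4, 3, 2)  .T reverses the order of all axes
print(t.transpose(1, 0, 2).shape)  # (3, 2, 4)  only the first two axes are swapped

# Element [i, j, k] of the swapped array is element [j, i, k] of the original.
swapped = t.transpose(1, 0, 2)
print(swapped[2, 1, 3] == t[1, 2, 3])  # True
```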

## Basic operators

Numpy supports many basic math operations like subtracting one array from the other or multiplying two arrays. 

In [None]:
n = np.array([20, 30, 40, 50])
p = np.array([0, 1, 2, 3])

In [None]:
n - p

In [None]:
n * p

The dot product is also supported, either as the method `array1.dot(array2)` or in a more functional way as `np.dot(array1, array2)`.

In [None]:
n.dot(p)  # np.dot(n, p)
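
For two vectors, the dot product is the sum of the elementwise products. The following sketch computes it by hand for the arrays `n` and `p` from above and compares the result with `dot` and with the `@` operator.

```python
import numpy as np

n = np.array([20, 30, 40, 50])
p = np.array([0, 1, 2, 3])

manual = (n * p).sum()      # 20*0 + 30*1 + 40*2 + 50*3
print(manual)               # 260
print(n.dot(p) == manual)   # True
print(n @ p == manual)      # True
```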

## Unary operators

Operations that involve only one array, such as comparing it with a number, raising it to a power, or applying a function like the exponential function `np.exp(array)`, are also handled elementwise. Comparisons return an array of boolean values.

In [None]:
n < 35

In [None]:
p ** 2

In [None]:
np.exp(p)

In [None]:
np.sqrt(p)

In [None]:
np.log(n)

## Sum, Maximum, Minimum

`array.sum(axis)` for the sum, `array.max(axis)` for the maximum, and `array.min(axis)` for the minimum are functions built into numpy. On multidimensional arrays you can specify along which axis the operation is performed: for a matrix, `axis=0` aggregates over the rows and returns one value per column, and `axis=1` returns one value per row. Without an axis, the operation covers all elements. The specified axis has to exist in the array or an error is raised.

In [None]:
m = np.arange(12).reshape(3,4)

In [None]:
m.sum(axis=0)

In [None]:
m.min(axis=1)
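
The effect of the `axis` argument is easiest to see on a small matrix. This sketch uses the same `m` as above and also shows the error for an axis that does not exist.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(m.sum())                 # 66  (all elements)
print(m.sum(axis=0).tolist())  # [12, 15, 18, 21]  (one value per column)
print(m.sum(axis=1).tolist())  # [6, 22, 38]       (one value per row)
print(m.max(axis=0).tolist())  # [8, 9, 10, 11]

# A 3x4 matrix only has the axes 0 and 1.
try:
    m.sum(axis=2)
except ValueError as err:
    print("invalid axis:", err)
```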

## Matrix-multiplication

If you want to perform a matrix multiplication, call `np.dot` or use the `@` operator. Be careful not to use the `*` operator, because `*` does not perform matrix multiplication; it works elementwise.

In [None]:
X = np.array([[2, -1], [0, 3], [1, 0]])
Y = np.array([[2, 0], [1, -1]])

In [None]:
A = X.dot(Y) # np.dot(X, Y)
print(A)
print(A.shape)
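
The sketch below contrasts the matrix product with the elementwise product for the matrices `X` and `Y` from above. Since `X` has the shape `(3, 2)` and `Y` the shape `(2, 2)`, the elementwise `X * Y` is not even defined and raises a `ValueError`.

```python
import numpy as np

X = np.array([[2, -1], [0, 3], [1, 0]])
Y = np.array([[2, 0], [1, -1]])

product = X @ Y               # same result as X.dot(Y) or np.dot(X, Y)
print(product.tolist())       # [[3, 1], [3, -3], [2, 0]]

print((X * X).tolist())       # [[4, 1], [0, 9], [1, 0]]  (elementwise squares)

try:
    X * Y                     # shapes (3, 2) and (2, 2) do not match elementwise
except ValueError as err:
    print("elementwise product failed:", err)
```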

## Indexing

With `array[row][column]` you can index a multidimensional array similar to nested Python lists. 

In [None]:
B = np.array([[[ 111, 112 ], [ 121, 122 ]],
              [[ 211, 212 ], [ 221, 222 ]],
              [[ 311, 312 ], [ 321, 322 ]]])

In [None]:
print(B[2][1][0])

Additionally, numpy offers the index syntax `array[row, column]`.

In [None]:
print(B[2, 1, 0])

By specifying only the first index you get the whole sub-array at that position; for a matrix this is the whole row.

In [None]:
print(B[1])

As with lists, negative indexing is supported: negative indices count from the end of an axis.

In [None]:
print(B[-1, -1])
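
As a short summary of the indexing variants, the following sketch repeats them on the same array `B` and prints what each one returns.

```python
import numpy as np

B = np.array([[[111, 112], [121, 122]],
              [[211, 212], [221, 222]],
              [[311, 312], [321, 322]]])

print(B[2][1][0] == B[2, 1, 0])  # True  (both syntaxes select the same element)
print(B[1].tolist())             # [[211, 212], [221, 222]]  (sub-array at index 1)
print(B[1].shape)                # (2, 2)
print(B[-1, -1].tolist())        # [321, 322]  (last row of the last sub-array)
print(B[-1, -1, -1])             # 322
```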

## Indexing with Index-Arrays/Lists

Furthermore, indexing with another array or list is supported. You can create an array of indices and use it to index another array with the index operator `array[]`. The result is an array containing the elements of the initial array at those indices.

In [None]:
s = np.arange(12) ** 2
i = np.array([1, 1, 3, 8, 5]) # i = [1, 1, 3, 8, 5]

In [None]:
s[i]
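
Two details are worth a quick check: the result takes the shape of the index array, and it is a new array, not a view. A sketch with the same `s` and `i` and a made-up two-dimensional index array `idx2d`:

```python
import numpy as np

s = np.arange(12) ** 2
i = np.array([1, 1, 3, 8, 5])

picked = s[i]
print(picked.tolist())    # [1, 1, 9, 64, 25]

# The result has the shape of the index array.
idx2d = np.array([[0, 1], [10, 11]])
print(s[idx2d].tolist())  # [[0, 1], [100, 121]]

# Changing the result does not change the original array.
picked[0] = -1
print(s[1])               # 1
```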

## Indexing with boolean values

If you create a numpy-array with boolean values and use it to index another numpy-array, you get an array of all the values where the corresponding entry in the boolean array is `True`. This is especially helpful when the goal is to filter the array.

In [None]:
g = np.arange(12).reshape(3, 4)

In [None]:
h = g > 4
print(h)

In [None]:
g[h]

Besides filtering, the boolean index array can be used to reassign all values where it is `True`.

In [None]:
g[h] = 0
print(g)
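
Conditions can also be combined. The sketch below joins two conditions with `&` (elementwise "and") and uses `np.where` to build a modified array without changing the original; the chosen conditions are arbitrary.

```python
import numpy as np

g = np.arange(12).reshape(3, 4)

# Parentheses are required because & binds tighter than the comparisons.
mask = (g > 4) & (g % 2 == 0)
print(g[mask].tolist())   # [6, 8, 10]

# np.where(condition, a, b) takes a where the condition holds, otherwise b.
replaced = np.where(g > 4, 0, g)
print(replaced.tolist())  # [[0, 1, 2, 3], [4, 0, 0, 0], [0, 0, 0, 0]]
print(g.max())            # 11  (g itself is unchanged)
```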

## Slicing

In analogy to lists, numpy supports slicing of arrays. The slicing operator works like `array[start:end:step]`, where the start value is included and the end value is excluded. All parameters are optional and can be left out to create different results.

In [None]:
S = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
S[3:6:2]

In [None]:
S[:4]

In [None]:
S[4:]

With no start, end, or step defined, slicing returns all elements of the original array.

In [None]:
S[:]
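
A few more slices on the same array `S`, as a sketch of how the bounds behave. Note that the element at the end index is never part of the result, and that negative values are allowed for all three parameters.

```python
import numpy as np

S = np.arange(10)

print(S[3:6].tolist())    # [3, 4, 5]  (index 6 is excluded)
print(S[3:6:2].tolist())  # [3, 5]
print(S[-3:].tolist())    # [7, 8, 9]  (the last three elements)
print(S[::-1].tolist())   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]  (reversed)
```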

## Slicing multiple axes

Slicing is also possible on two or more axes. To slice multiple dimensions, specify a slice for each dimension, separated by commas (`array[slicing-axis0, slicing-axis1]`). The axes are then sliced independently.

In [None]:
J = np.arange(25).reshape(5,5)

In [None]:
J[:3, 2:]

In [None]:
J[3:, :]

In [None]:
K = np.arange(28).reshape(4, 7)

In [None]:
K[::2, ::3]

In [None]:
K[:, ::3]

## View

Slicing does not copy the array's data; it only creates a view onto the same data. So watch out: if you edit the view, the initial array changes as well, which is sometimes unintended.

In [None]:
D = np.arange(10)

In [None]:
V = D[2:6]

In [None]:
V[0] = 22
V[1] = 23

In [None]:
print(D)

To create an independent copy of an array, numpy provides the function `np.copy(array)` and the method `array.copy()`. When you use `np.copy()`, changes to the copy do not affect the values of the initial array.

In [None]:
Q = D[2:6].copy()
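
Whether two arrays use the same data can be checked with `np.shares_memory`. The following sketch compares a view and a copy of the same slice.

```python
import numpy as np

D = np.arange(10)
V = D[2:6]          # view: uses the data of D
Q = D[2:6].copy()   # copy: has its own data

print(np.shares_memory(D, V))  # True
print(np.shares_memory(D, Q))  # False

Q[0] = -1   # only changes the copy
print(D[2]) # 2
V[0] = 22   # changes D as well
print(D[2]) # 22
```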

## Array containing ones/zeros

Numpy offers functions to create arrays containing only ones or zeros. `np.ones(shape)` creates an array of the specified shape filled with ones. `np.zeros(shape)` does the same, but the values are all zero.

In [None]:
np.ones((2,3))

The default data type of `np.zeros` and `np.ones` is float. This can be changed by specifying the `dtype` in the function call.

In [None]:
np.ones((3,4), dtype=int)

In [None]:
np.zeros((2,4))
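
The default data type can be verified through the `dtype` attribute. As an addition that is not covered in the text above, this sketch also shows the related functions `np.full`, which fills an array with any given value, and `np.zeros_like`, which takes shape and data type from an existing array.

```python
import numpy as np

print(np.zeros((2, 4)).dtype)                # float64
print(np.ones((3, 4), dtype=int).dtype.kind) # i  (integer)

sevens = np.full((2, 3), 7)                  # every element is 7
print(sevens.tolist())                       # [[7, 7, 7], [7, 7, 7]]

template = np.array([[1.5, 2.5], [3.5, 4.5]])
empty_like_template = np.zeros_like(template)
print(empty_like_template.shape)             # (2, 2)
```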

## Matrices with random values

For some applications, you need random numbers. Numpy supports initializing arrays with random values. `np.random.rand(d0, d1, ...)` takes the dimensions as separate arguments and creates an array of that shape with random values from a uniform distribution over the interval from zero (included) to one (excluded).

In [None]:
np.random.rand(2,3)

Calling `np.random.randn(d0, d1, ...)` works like `np.random.rand(d0, d1, ...)`, except that it takes its values from a univariate “normal” (Gaussian) distribution with mean zero and variance one.

In [None]:
np.random.randn(2,3)
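
Numpy also provides a generator object through `np.random.default_rng`, which can be seeded to get reproducible results. The sketch below draws the same two kinds of random values with it; the seed `0` is an arbitrary choice. Unlike `rand` and `randn`, these methods take the shape as a single tuple.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # the seed value is arbitrary

uniform = rng.random((2, 3))          # uniform values from [0, 1)
normal = rng.standard_normal((2, 3))  # mean zero, variance one

print(uniform.shape)  # (2, 3)
print(normal.shape)   # (2, 3)

# A generator with the same seed produces the same values again.
repeat = np.random.default_rng(seed=0).random((2, 3))
print(np.array_equal(uniform, repeat))  # True
```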

## Iterate

You can iterate over a numpy-array like over a standard Python list. If you iterate over a matrix, each iteration yields one row of the matrix.

In [None]:
for row in np.arange(12).reshape(3,4):
    print(row)
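
Because iteration always runs over the first axis, columns or single elements need a small detour. One possible sketch: iterate over the transposed matrix to get the columns, and over the attribute `flat` to visit every element.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

rows = [row.tolist() for row in m]
columns = [col.tolist() for col in m.T]   # rows of the transpose are the columns
elements = [int(value) for value in m.flat]

print(rows)      # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
print(columns)   # [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
print(elements)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```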

## Stacking of Arrays

Multiple numpy arrays can be combined into one array by stacking them. The corresponding numpy functions are `np.vstack(arrays)` and `np.hstack(arrays)`.

In [None]:
e = np.array([[1,2], [3,4]])
r = np.array([[11, 22], [33, 44]])

By using `np.vstack(arrays)` the arrays are stacked vertically. To stack two matrices vertically, their sizes along axis one (the number of columns) have to match.

In [None]:
np.vstack((e,r))

By using `np.hstack(arrays)` the arrays are stacked horizontally. To stack two matrices horizontally, their sizes along axis zero (the number of rows) have to match.

In [None]:
np.hstack((e,r))
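
The shape requirements can be tried out with a matrix that only fits in one direction. In this sketch, the made-up array `extra_row` has one row and two columns, so it can be stacked below `e` but not next to it.

```python
import numpy as np

e = np.array([[1, 2], [3, 4]])
r = np.array([[11, 22], [33, 44]])
extra_row = np.array([[5, 6]])           # shape (1, 2)

print(np.vstack((e, r)).shape)           # (4, 2)
print(np.hstack((e, r)).shape)           # (2, 4)
print(np.vstack((e, extra_row)).tolist())  # [[1, 2], [3, 4], [5, 6]]

# The number of rows differs (2 vs. 1), so horizontal stacking fails.
try:
    np.hstack((e, extra_row))
except ValueError as err:
    print("hstack failed:", err)
```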