# NumPy and SciPy

## What is NumPy?
* "MATLAB in python"
* provides mathematical functionality for Python
* array and matrix data types
* element-wise calculations
* usually much faster than plain Python lists for numerical work
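
The speed difference can be checked with a rough timing sketch. The array size and repetition count below are arbitrary choices, and the measured times depend on the machine, so no particular numbers are claimed here:

```python
import timeit
import numpy as np

n = 100_000
values_list = list(range(n))
values_array = np.arange(n)

# Double every element: list comprehension vs. vectorized array operation
doubled_list = [v * 2 for v in values_list]
doubled_array = values_array * 2

# Time 20 repetitions of each variant (results vary by machine)
t_list = timeit.timeit(lambda: [v * 2 for v in values_list], number=20)
t_array = timeit.timeit(lambda: values_array * 2, number=20)
print(f"list: {t_list:.4f} s, array: {t_array:.4f} s")
```

Both variants compute the same values; only the time taken differs.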

## Basics - Array creation

In [None]:
import numpy

# or, for convenience:
import numpy as np     # you could use any name, but "np" is standard

Arrays can be created from:
* python lists or sequences
* functions
* strings or files
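
A minimal sketch of the last two sources in the list. The input values are made up, and `io.StringIO` only stands in for a real text file:

```python
import io
import numpy as np

# From a function: each element is computed from its (row, col) indices
from_function = np.fromfunction(lambda i, j: i + j, (2, 3))

# From a string: whitespace-separated numbers
from_string = np.fromstring("1 2 3 4", sep=" ")

# From a file: io.StringIO stands in for a real text file here
fake_file = io.StringIO("1 2 3\n4 5 6")
from_file = np.loadtxt(fake_file)

print(from_function)
print(from_string)
print(from_file)
```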

They can only contain one data type!
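
What "one data type" means in practice can be seen when mixed values are passed in. The following sketch uses made-up values; NumPy picks a single type that can hold all of them:

```python
import numpy as np

mixed_numbers = np.array([1, 2.5, 3])   # ints are promoted to float
print(mixed_numbers, mixed_numbers.dtype)

mixed_text = np.array([1, "two", 3])    # everything becomes a string
print(mixed_text, mixed_text.dtype)

ints = np.array([1, 2, 3])
ints[0] = 9.7                           # a float stored in an int array loses its fraction
print(ints)
```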


In [None]:
a = np.array([1,2,3,4])         # data type is guessed from the values
b = np.array([5,6,7,8], float)  # but can also be specified explicitly
c = np.array([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6]]) # 2D-array (matrix)
d = np.arange(0,15)             # like range, but returns a NumPy array

In [None]:
print(a)
print(b)
print(c)
print(d)

In [None]:
np.linspace(0, 2, 9)   # (start, stop, num_points); stop is included
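
The difference between `np.linspace` and `np.arange` is easy to mix up: the first takes the number of points and includes the end value, the second takes a step size and excludes it. A small sketch:

```python
import numpy as np

points = np.linspace(0, 2, 9)    # 9 points, both 0 and 2 included
steps = np.arange(0, 2, 0.25)    # step size 0.25, end value 2 excluded

print(points)
print(steps)
```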

In [None]:
np.zeros((3,2))   # shape (rows, cols) is passed as ONE argument, hence the tuple

In [None]:
np.ones((3,2)) 

In [None]:
np.random.random((3,2)) 
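
NumPy 1.17 and later also offer a generator-based interface, `np.random.default_rng`, which makes results reproducible through a seed. A minimal sketch; the seed value and variable names are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(seed=42)    # the seed value 42 is arbitrary
samples = rng.random((3, 2))            # uniform floats in [0, 1)
rolls = rng.integers(1, 7, size=5)      # integers from 1 to 6 (upper bound excluded)

print(samples)
print(rolls)
```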

## Arrays work element-wise!

In [None]:
list_a = [1,2,3,4]
list_b = [8,7,6,5]
array_a = np.array(list_a)
array_b = np.array(list_b)

In [None]:
array_a+array_b

In [None]:
list_a + list_b

In [None]:
array_a + 4

In [None]:
array_a * 4

In [None]:
list_a * 4

In [None]:
array_a < 3

In [None]:
array_a * array_b   # element-wise

In [None]:
array_a @ array_b   # matrix product; for two 1D arrays this is the dot product
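
To make the difference between `*` and `@` concrete, here is a small sketch. The 2x2 matrix `m` is a made-up example:

```python
import numpy as np

array_a = np.array([1, 2, 3, 4])
array_b = np.array([8, 7, 6, 5])

# 1D @ 1D: sum of element-wise products, a single number
inner = array_a @ array_b
print(inner)    # 1*8 + 2*7 + 3*6 + 4*5 = 60

# 2D @ 2D: rows of the left matrix times columns of the right matrix
m = np.array([[1, 2], [3, 4]])
identity = np.eye(2, dtype=int)
print(m @ identity)    # multiplying by the identity returns m unchanged
print(m * identity)    # element-wise: only the diagonal survives
```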

In [None]:
np.sin(array_a)

## Indexing
Very similar to MATLAB, but indices start at 0 and use square brackets.

In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)

In [None]:
a[0,2]     # row, col

In [None]:
a[1,:]     # all elements in row with index 1

In [None]:
a[:,2]     # all elements in col with index 2
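
A few more indexing forms that often come in handy, sketched on the same array:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

print(a[-1, -1])    # negative indices count from the end
print(a[:, 1:])     # all rows, columns from index 1 onwards
print(a[a > 3])     # boolean mask: keeps elements where the condition holds
```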

## Array properties

In [None]:
a.shape    # again, (row, col)

In [None]:
a.dtype

In [None]:
a.size    # total number of elements

In [None]:
a.ndim    # number of dimensions

In [None]:
b = np.array((1.1, 2.2, 3.8))
b.dtype

## Working with arrays

In [None]:
a = np.array([1,2,3,4])
b = np.array([8,7,6,5])

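The data type can be changed with `astype`, which returns a new array:

In [None]:
c = b.astype(float)   # b holds integers; c holds the same values as floats
print(c)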
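
When converting floats to integers, `astype` drops the fractional part rather than rounding. A small sketch with made-up values:

```python
import numpy as np

measurements = np.array([1.1, 2.2, 3.8])
truncated = measurements.astype(int)           # fractional part is dropped
rounded = np.rint(measurements).astype(int)    # round to the nearest integer first

print(truncated)
print(rounded)
```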

Array concatenation

In [None]:
a = np.array([1,2,3,4])
b = np.array([8,7,6,5])

In [None]:
a+b   # this adds element-wise, it does NOT concatenate

In [None]:
np.append(a,b)

In [None]:
np.hstack((a,b))     # the arrays are passed as one tuple -> double parentheses

In [None]:
c = np.vstack((a,b))     # the arrays are passed as one tuple -> double parentheses
print(c)

In [None]:
c.transpose()

In [None]:
c.T

In [None]:
c.flatten()
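
It can help to follow the shapes through these operations. The sketch below repeats the steps above and adds `reshape`, which can undo a `flatten` when the original shape is known:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([8, 7, 6, 5])

c = np.vstack((a, b))
print(c.shape)      # (2, 4): two rows stacked on top of each other
print(c.T.shape)    # (4, 2): rows and columns swapped

flat = c.flatten()
print(flat.shape)   # (8,): one dimension with all elements

restored = flat.reshape(2, 4)   # back to two rows of four
print(restored)
```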

In [None]:
for i in b:
    print(i)

In [None]:
b.sum()

In [None]:
b.prod()

In [None]:
b.mean()

In [None]:
b.min()

In [None]:
b.max()
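
On arrays with more than one dimension, these methods accept an `axis` argument to reduce along rows or columns only. A small sketch using the two arrays from above stacked into a matrix:

```python
import numpy as np

c = np.array([[1, 2, 3, 4],
              [8, 7, 6, 5]])

print(c.sum())          # all elements: 36
print(c.sum(axis=0))    # down the columns: [9 9 9 9]
print(c.sum(axis=1))    # along the rows: [10 26]
print(c.max(axis=1))    # largest value in each row: [4 8]
```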

In [None]:
np.abs([-5, 4, -3, 2, -1])

## NumPy defines some mathematical constants

In [None]:
np.pi

In [None]:
np.e

## NumPy can fit polynomials

In [None]:
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 2, 1, 3, 7, 10, 11, 19]
fitted_curve = np.poly1d(np.polyfit(x, y, 2))    # x-values, f(x)-values, degree
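
To see what `np.polyfit` returns, it can be tried on data that follows a known polynomial exactly. The data below is made up for illustration; the fit should recover the coefficients up to floating-point error:

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4])
y = 2 * x**2 - 3 * x + 1             # exact quadratic, no noise

coefficients = np.polyfit(x, y, 2)   # highest power first
curve = np.poly1d(coefficients)      # callable polynomial object

print(coefficients)   # close to [2, -3, 1]
print(curve(5))       # value of the polynomial at x = 5, close to 36
```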

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(x, fitted_curve(x))
plt.scatter(x,y, c="r")

## SciPy

A library for scientific and technical computing, covering:
* optimization
* linear algebra
* integration
* interpolation
* FFT
* signal and image processing

### How to use it?

Import the function you need with
```python
from scipy.module import function
```

Don't know what the function or module is called?
Search for what you want to do and follow the link to the SciPy documentation.

For example, numerical minimization (finding a local minimum of an n-dimensional function, as gradient descent does):
https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.optimize.minimize.html

```python
from scipy.optimize import minimize
```
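
A minimal usage sketch, assuming a simple test function whose minimum is known in advance. The function `bowl` and the starting point are hypothetical choices for illustration, and `minimize` picks its default solver here:

```python
import numpy as np
from scipy.optimize import minimize

def bowl(p):
    # Hypothetical test function with its minimum at (1, -2)
    return (p[0] - 1) ** 2 + (p[1] + 2) ** 2

result = minimize(bowl, x0=np.array([0.0, 0.0]))   # x0 is the starting guess
print(result.x)        # location of the minimum found
print(result.fun)      # function value at that location
print(result.success)  # whether the optimizer reports convergence
```

The returned object bundles the solution with diagnostic information, so it is worth checking `result.success` before using `result.x`.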

### For the second assignment: Pearson correlation coefficient  

`scipy.stats.pearsonr(x, y)` returns:
* r (Pearson's correlation coefficient)
* the p-value

The Pearson correlation coefficient measures the linear relationship between two datasets. Strictly speaking, Pearson’s correlation requires that each dataset be normally distributed, and not necessarily zero-mean. Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply an exact linear relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.

The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Pearson correlation at least as extreme as the one computed from these datasets. The p-values are not entirely reliable but are probably reasonable for datasets larger than 500 or so.

source: https://docs.scipy.org/doc/scipy-0.19.1/reference/generated/scipy.stats.pearsonr.html

In [64]:
from scipy.stats import pearsonr

In [65]:
a = np.sin(np.linspace(0,10*np.pi, 1000))
b = -np.sin(np.linspace(0,10*np.pi, 1000))
r, _ = pearsonr(a, b)
print(r)

-1.0
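
The coefficient can also be computed directly from its definition (covariance divided by the product of the standard deviations) and compared with what `pearsonr` returns. A sketch with made-up, roughly linear data:

```python
import numpy as np
from scipy.stats import pearsonr

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])    # made-up, roughly linear data

# Deviations from the mean of each dataset
dx = x - x.mean()
dy = y - y.mean()

# Sum of products of deviations, normalized by both spreads
r_manual = (dx * dy).sum() / np.sqrt((dx**2).sum() * (dy**2).sum())

r_scipy, p_value = pearsonr(x, y)
print(r_manual, r_scipy, p_value)
```

Both values of r should agree up to floating-point error; the p-value is only available from `pearsonr`.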
