# Lecture 3: Scientific programming

## Content
 
* Summary of lecture 2
* `numpy`: 
    * the `array`
    * indexing
    * some standard operations/methods
    * linear algebra `<3`
* plotting with `matplotlib`
* `scipy`:
    * distributions
    * Nonlinear solvers
    * Nonlinear optimization

## Summary of lecture 2

* Collections: `tuple` & `dict`
* Mutability
* Loops: `for` & `while`
* Functions

In [1]:
t = (2,3)

In [3]:
t

(2, 3)

In [5]:
type(t)

tuple

In [7]:
t[0] = 1

TypeError: 'tuple' object does not support item assignment

In [8]:
l = list(t)

In [9]:
l[1] = 1

## Numpy

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays. If you are already familiar with MATLAB, you might find this [tutorial](http://wiki.scipy.org/NumPy_for_Matlab_Users) useful to get started with Numpy.

The `numpy` package (module) is used in almost all numerical computation in Python. It provides high-performance vector, matrix and higher-dimensional data structures. It is implemented in C and Fortran, so when calculations are vectorized (formulated with vectors and matrices), performance is very good.

To use `numpy` you need to import the module, using for example:

In [1]:
import numpy as np

In [2]:
np.array(1)

array(1)

In [11]:
import numpy

In [12]:
numpy.array(1)

array(1)

In the `numpy` package the terminology used for vectors, matrices and higher-dimensional data sets is *array*. 



In [17]:
[1,2] + [2,3]

[1, 2, 2, 3]

In [18]:
np.array((1,2)) + np.array((2,3))

array([3, 5])

## Creating `numpy` arrays

There are a number of ways to initialize new numpy arrays, for example from

* a Python list or tuples
* using functions that are dedicated to generating numpy arrays, such as `arange`, `linspace`, etc.
* reading data from files

### From lists/tuples

For example, to create new vector and matrix arrays from Python lists we can use the `numpy.array` function.

In [20]:
np.array([1,2,3])

array([1, 2, 3])

In [4]:
v = np.array((1,2,3))

In [3]:
M = np.array(((1,2),(3,4)))

The `v` and `M` objects are both of the type `ndarray` that the `numpy` module provides.

In [26]:
type(v)

numpy.ndarray

In [27]:
type(M)

numpy.ndarray

The difference between the `v` and `M` arrays is only their shapes. We can get information about the shape of an array by using the `ndarray.shape` property.

In [29]:
v.shape

(3,)

Equivalently, we could use the function `numpy.shape`:

In [31]:
np.shape(M)

(2, 2)

So far the `numpy.ndarray` looks a lot like a Python list (or nested list). Why not simply use Python lists for computations instead of creating a new array type?

There are several reasons:

* Python lists are very general. They can contain any kind of object and are dynamically typed. They do not support mathematical operations such as matrix and dot products. Implementing such operations for Python lists would not be very efficient because of the dynamic typing.
* Numpy arrays are **statically typed** and **homogeneous**. The type of the elements is determined when the array is created.
* Numpy arrays are memory efficient.
* Because of the static typing, mathematical functions such as multiplication and addition of `numpy` arrays can be implemented efficiently in a compiled language (C and Fortran are used).
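As a small sketch of what "homogeneous" means in practice (the variable names here are only illustrative): when the input mixes integers and floats, `np.array` picks one common element type for the whole array, and the type can also be requested explicitly with the `dtype` argument.

```python
import numpy as np

# Mixed ints and floats: every element is stored as float64
mixed = np.array([1, 2.5, 3])

# The element type can also be requested explicitly at creation
ints_as_float = np.array([1, 2, 3], dtype=float)

print(mixed.dtype)          # float64
print(ints_as_float.dtype)  # float64
print(mixed.itemsize)       # 8 (bytes used per element)
```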

Using the `dtype` (data type) property of an `ndarray`, we can see what type the data of an array has:

In [32]:
M.dtype

dtype('int64')

We get an error if we try to assign a value of the wrong type to an element in a numpy array:

In [34]:
M

array([[1, 2],
       [3, 4]])

In [38]:
M[0,0] = 0

In [40]:
M[1,1]

4

In [37]:
M[0,0] = 'hellooo'

ValueError: invalid literal for int() with base 10: 'hellooo'

### Using array-generating functions

For larger arrays it is impractical to initialize the data manually using explicit Python lists. Instead we can use one of the many functions in `numpy` that generate arrays of different forms. Some of the more common ones are:

#### `np.arange`

In [5]:
x = np.arange(0,10)

In [6]:
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [46]:
np.arange(-1,2)

array([-1,  0,  1])

In [48]:
np.arange(-1,1,0.1) # arange(start, stop, step)

array([-1.00000000e+00, -9.00000000e-01, -8.00000000e-01, -7.00000000e-01,
       -6.00000000e-01, -5.00000000e-01, -4.00000000e-01, -3.00000000e-01,
       -2.00000000e-01, -1.00000000e-01, -2.22044605e-16,  1.00000000e-01,
        2.00000000e-01,  3.00000000e-01,  4.00000000e-01,  5.00000000e-01,
        6.00000000e-01,  7.00000000e-01,  8.00000000e-01,  9.00000000e-01])

#### `np.linspace`

In [50]:
np.linspace(0,2,10) # linspace(start, stop, number)

array([0.        , 0.22222222, 0.44444444, 0.66666667, 0.88888889,
       1.11111111, 1.33333333, 1.55555556, 1.77777778, 2.        ])
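The outputs above show one difference worth remembering: `arange` excludes the stop value, while `linspace` includes it by default. A minimal sketch of the two conventions, using small made-up ranges:

```python
import numpy as np

a = np.arange(0, 1, 0.25)                 # stop value 1 is excluded
b = np.linspace(0, 1, 5)                  # stop value 1 is included
c = np.linspace(0, 1, 4, endpoint=False)  # stop excluded, like arange

print(a)  # 4 values: 0, 0.25, 0.5, 0.75
print(b)  # 5 values: 0, 0.25, 0.5, 0.75, 1
print(c)  # same 4 values as a
```

When the step is not an integer, `linspace` is usually the safer choice, because the number of points is stated explicitly and does not depend on floating-point rounding.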

#### random data

In [7]:
np.random.rand(3,3) # draw 3x3 array from uniform distribution 0 to 1

array([[0.5445439 , 0.58401834, 0.81614184],
       [0.78413513, 0.68252897, 0.12812195],
       [0.1385071 , 0.11698397, 0.43635175]])

In [61]:
np.random.randn(3,3) # draw 3x3 from standard normal

array([[-0.79758461,  1.77934297, -0.84400175],
       [ 0.33449603, -0.11293938, -0.29898686],
       [-1.50801561, -0.7851967 ,  0.10904272]])

In [63]:
stuff = [1,2,3,4,5,6]

In [65]:
np.random.choice(stuff)

1

#### zeros and ones

In [8]:
np.zeros(3)

array([0., 0., 0.])

In [59]:
np.zeros((2,2))

array([[0., 0.],
       [0., 0.]])

In [9]:
np.ones(4)

array([1., 1., 1., 1.])

## Manipulating arrays

### Indexing

We can index elements in an array using square brackets and indices:

In [11]:
v

array([1, 2, 3])

In [13]:
v[1]

2

If we omit an index of a multidimensional array, it returns the whole row (or, in general, an (N-1)-dimensional array):

In [21]:
M.shape

(2, 2)

In [22]:
M

array([[1, 2],
       [3, 4]])

In [25]:
M[0,1]

2

The same thing can be achieved by using `:` instead of an index:

In [27]:
M[:,0]

array([1, 3])

In [29]:
M[1,:]

array([3, 4])

In [30]:
M[1]

array([3, 4])

We can assign new values to elements in an array using indexing:

In [32]:
M[0,0] = 9

In [37]:
M[:,0] = [5,5]

In [38]:
M

array([[5, 2],
       [5, 4]])

In [42]:
M[:,0] = 6.

In [43]:
M

array([[6, 2],
       [6, 4]])
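Note that the float `6.` was stored as the integer `6`: the array keeps its integer `dtype`, and assigned values are cast to it. The sketch below (with illustrative names `M_int` and `M_float`) shows that a fractional part is silently dropped, and that `astype` gives a float copy if that is not what we want.

```python
import numpy as np

M_int = np.array([[1, 2], [3, 4]])
M_int[0, 0] = 6.7             # cast to the integer dtype: stored as 6

M_float = M_int.astype(float) # new array with float elements
M_float[0, 0] = 6.7           # now the value is kept as is

print(M_int[0, 0])    # 6
print(M_float[0, 0])  # 6.7
```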

### Index slicing

Index slicing is the technical name for the syntax `M[lower:upper:step]` to extract part of an array:

In [44]:
A = np.arange(5)

In [46]:
A

array([0, 1, 2, 3, 4])

In [47]:
A[1:4]

array([1, 2, 3])

Array slices are *mutable*: if they are assigned a new value the original array from which the slice was extracted is modified:

In [51]:
A[-3:-1] = [5,5]
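The reason is that a basic slice is a *view* on the original data, not a copy. A minimal sketch, assuming a fresh array and a hypothetical name `s` for the slice:

```python
import numpy as np

A = np.arange(5)
s = A[1:4]      # a view on elements 1, 2, 3 of A
s[0] = 100      # writing through the view changes A as well

print(A)                       # [  0 100   2   3   4]
print(np.shares_memory(A, s))  # True
```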

In [53]:
A = np.arange(9)

In [54]:
A[1:8:2]

array([1, 3, 5, 7])

We can omit any of the three parameters in `M[lower:upper:step]`:

In [56]:
A[::2]

array([0, 2, 4, 6, 8])

In [57]:
A[::-1]

array([8, 7, 6, 5, 4, 3, 2, 1, 0])

Index slicing works exactly the same way for multidimensional arrays:

In [58]:
A = np.array([[0,1,2],[3,4,5],[6,7,8]])

In [60]:
A[1:,1:]

array([[4, 5],
       [7, 8]])

### Fancy indexing

Fancy indexing is the name for when an array or list is used in place of an index:

In [61]:
row_inds = [1,2]

In [63]:
A[row_inds]

array([[3, 4, 5],
       [6, 7, 8]])

In [65]:
A[:,row_inds]

array([[1, 2],
       [4, 5],
       [7, 8]])

In [67]:
A[row_inds,row_inds]

array([4, 8])
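The last result may be surprising: with two index lists, the lists are paired up element by element, so we get `A[1,1]` and `A[2,2]` rather than a 2x2 block. One way to select the block instead is `np.ix_`, sketched here on the same `A`:

```python
import numpy as np

A = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
inds = [1, 2]

pairs = A[inds, inds]          # elements (1,1) and (2,2)
block = A[np.ix_(inds, inds)]  # rows 1,2 crossed with columns 1,2

print(pairs)  # [4 8]
print(block)  # rows [4 5] and [7 8]
```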

We can also use index masks: if the index mask is a NumPy array (or a list) of data type `bool`, then an element is selected (`True`) or not (`False`) depending on the value of the index mask at the position of each element:

In [68]:
mask = [True, False, True]

In [70]:
A[mask]

array([[0, 1, 2],
       [6, 7, 8]])

In [71]:
A[[0,2]]

array([[0, 1, 2],
       [6, 7, 8]])

This feature is very useful to conditionally select elements from an array, using for example comparison operators:

In [72]:
a = np.arange(4)

In [74]:
a

array([0, 1, 2, 3])

In [76]:
vob = a > 1

In [78]:
vob

array([False, False,  True,  True])

In [81]:
a[vob]

array([2, 3])

In [82]:
b = np.linspace(0,1,10)

In [83]:
b

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])

In [85]:
uu = (b > 0.1) & (b < 0.8)

In [86]:
b[uu]

array([0.11111111, 0.22222222, 0.33333333, 0.44444444, 0.55555556,
       0.66666667, 0.77777778])
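Masks are combined with the element-wise operators `&` (and), `|` (or) and `~` (not). The parentheses around each comparison are needed because these operators bind more tightly than `<` and `>`. A short sketch on the same `b`, with illustrative names `inside` and `outside`:

```python
import numpy as np

b = np.linspace(0, 1, 10)

inside = (b > 0.1) & (b < 0.8)     # both conditions hold
outside = (b <= 0.1) | (b >= 0.8)  # at least one condition holds

print(inside.sum())                       # 7 elements selected
print(np.array_equal(~inside, outside))   # True: ~ negates the mask
```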

`np.all` and `np.any` test whether all, or at least one, of the elements of a boolean array are `True`:

In [89]:
vob

array([False, False,  True,  True])

In [91]:
np.all(vob)

False

In [92]:
np.any(vob)

True

## Some standard operations

Let us create some very precious data that we want to learn more about.

In [94]:
data = np.random.randn(100,3)*2 + 1

In [96]:
data

array([[-0.19367477, -0.7292697 , -0.30796199],
       [ 0.38174476,  2.9034114 ,  4.07804743],
       [ 3.58802175, -3.16127487,  1.50013969],
       [-0.26199405, -0.17974927,  4.74553505],
       [ 0.12604224, -1.9707826 ,  1.89637518],
       [ 2.74102622,  0.2599855 ,  1.56831222],
       [ 0.98556606, -1.40974092,  2.87927519],
       [ 1.41893653, -1.8589449 ,  3.46379432],
       [ 1.66462936, -0.77093358, -1.09630283],
       [ 2.02926533,  2.52534557,  4.82592788],
       [ 2.69344699,  1.60930107,  0.32880827],
       [ 2.67563075,  0.17022113,  2.02656665],
       [ 2.52402415,  0.373657  , -0.29992512],
       [ 0.69897013, -0.08526874, -0.99221047],
       [ 2.88184367, -1.46310727,  1.13923243],
       [-1.99795728,  1.88472869,  4.04153881],
       [ 0.96689003,  1.5280281 , -0.48289774],
       [-0.78558659,  2.17778785, -1.669883  ],
       [ 2.95276926, -1.75534467,  0.9806363 ],
       [ 0.99439731,  0.80832255, -1.27285626],
       [ 1.04928037,  3.58374098,  0.627

In [97]:
data.shape

(100, 3)

#### sum

In [99]:
np.sum(data)

287.20294118112884

In [101]:
np.sum(data[0])

-1.2309064633488327

#### mean

In [105]:
np.mean(data)

0.9573431372704294

In [108]:
np.mean(data, 0)

array([1.08735245, 0.83350721, 0.95116975])

#### standard deviations and variance

In [113]:
np.std(data, 1)

array([0.23031951, 1.54204863, 2.82136771, 2.34142733, 1.58063498,
       1.01338487, 1.75497037, 2.19234788, 1.23200733, 1.21838217,
       0.96646839, 1.06167479, 1.20427393, 0.69102707, 1.785356  ,
       2.49894629, 0.84725591, 1.64546985, 1.93049323, 1.02774661,
       1.30569466, 2.10819265, 0.51953447, 1.61864218, 3.18675165,
       1.74465633, 2.38166476, 2.33333811, 1.08492007, 3.61349412,
       1.37444096, 1.02017875, 1.51534792, 1.47591883, 0.62891777,
       0.93069977, 1.75724936, 1.96177479, 1.09070202, 2.39703066,
       0.28876931, 1.01854131, 2.0445897 , 0.83100186, 0.89393775,
       1.81395366, 2.1156957 , 4.65324099, 2.21328588, 2.72347615,
       1.22340478, 1.5323173 , 1.16126708, 1.55948463, 1.37046742,
       2.5102498 , 1.11227394, 0.5467652 , 0.86745568, 2.98423307,
       2.03393221, 1.22883938, 2.33050805, 0.9574533 , 1.89538851,
       0.67977317, 2.37899018, 0.87152884, 0.27355081, 1.15791793,
       0.66965821, 1.41452544, 2.14494082, 1.33204414, 1.39279

In [114]:
np.var(data, 0)

array([4.61892114, 4.28217533, 4.3364137 ])

#### min and max

In [116]:
np.min(data)

-4.372132164405886

In [121]:
np.min(data, 0)

array([-3.59799792, -4.37213216, -3.3343821 ])

In [122]:
np.min(data, axis=0)

array([-3.59799792, -4.37213216, -3.3343821 ])
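The second argument in the calls above is the `axis` along which the operation is applied. A small sketch on a made-up 2x3 array `X` may make the convention easier to see: `axis=0` collapses the rows (one result per column), `axis=1` collapses the columns (one result per row).

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

col_sums = X.sum(axis=0)    # one value per column: 5, 7, 9
row_sums = X.sum(axis=1)    # one value per row: 6, 15
col_means = X.mean(axis=0)  # 2.5, 3.5, 4.5

print(col_sums.shape, row_sums.shape)  # (3,) (2,)
```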

In [120]:
np.max(data, 1)

array([-0.19367477,  4.07804743,  3.58802175,  4.74553505,  1.89637518,
        2.74102622,  2.87927519,  3.46379432,  1.66462936,  4.82592788,
        2.69344699,  2.67563075,  2.52402415,  0.69897013,  2.88184367,
        4.04153881,  1.5280281 ,  2.17778785,  2.95276926,  0.99439731,
        3.58374098,  4.20334714,  3.21588537,  1.65925019,  6.7565725 ,
        1.74993486,  3.84352308,  1.76867359,  2.21231802,  5.91787181,
        1.00350278,  1.60515678,  0.94056318,  2.12366324, -1.37331855,
        2.44871316,  1.86822974,  2.04295932,  2.00873073,  3.87311083,
        1.75554065,  0.7133915 ,  4.22780838,  3.39570808,  2.82552612,
        2.8189137 ,  3.0512197 ,  6.91575598,  3.64913905,  2.89734937,
        3.50467633,  2.48817413,  3.11946429,  3.94328027, -0.203553  ,
        5.56252684,  3.11606211,  1.7357665 ,  2.7374942 ,  4.32187145,
        2.51240101,  2.10386248,  4.14616536,  0.97334523,  4.6840102 ,
        0.37848791,  2.79696319,  1.95185052,  1.63279072,  1.70

### Reshaping and stacking arrays

The shape of a NumPy array can be modified without copying the underlying data, which makes it a fast operation even for large arrays. Let us start from a one-dimensional array `a`.

In [124]:
a = np.arange(4*4)

In [126]:
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [127]:
b = np.reshape(a, (4,4))

In [129]:
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [131]:
b = np.reshape(a, (4,5))

ValueError: cannot reshape array of size 16 into shape (4,5)

In [133]:
a.reshape(4,4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
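Two details are worth a quick sketch (the names `grid` and `two_rows` are only illustrative). First, the reshaped array here shares its data with the original, so a change in one is visible in the other. Second, one dimension can be given as `-1` and is then inferred from the size.

```python
import numpy as np

a = np.arange(16)
grid = a.reshape(4, 4)  # same data, new shape
grid[0, 0] = -1         # also changes a[0]

two_rows = a.reshape(2, -1)  # -1 is inferred as 8

print(a[0])                       # -1
print(np.shares_memory(a, grid))  # True
print(two_rows.shape)             # (2, 8)
```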

In [136]:
c = np.ones(3)

In [137]:
d = np.zeros(3)

In [141]:
np.hstack((c,d))

array([1., 1., 1., 0., 0., 0.])

In [142]:
np.vstack((c,d))

array([[1., 1., 1.],
       [0., 0., 0.]])

## Copy and "deep copy"

To achieve high performance, assignments in Python usually do not copy the underlying objects. This is important, for example, when objects are passed between functions, to avoid an excessive amount of memory copying when it is not necessary (technical term: pass by reference).

In [143]:
A = np.arange(4).reshape(2,2)

In [145]:
A 

array([[0, 1],
       [2, 3]])

In [146]:
B = A

In [147]:
B[0,0] = 9

In [149]:
B

array([[9, 1],
       [2, 3]])

In [150]:
A

array([[9, 1],
       [2, 3]])

If we want to avoid this behavior and get a new, completely independent object `C` copied from `A`, we need to do a so-called "deep copy" using the function `copy`:

In [151]:
C = np.copy(A)

In [152]:
C[0,0] = -99

In [154]:
C

array([[-99,   1],
       [  2,   3]])

In [155]:
A

array([[9, 1],
       [2, 3]])
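The same applies when arrays are passed to functions. The sketch below uses two hypothetical helper functions: one writes into the array it receives, the other works on a copy and leaves the caller's array untouched.

```python
import numpy as np

def zero_first_inplace(arr):
    """Set the first element to 0 in the caller's array."""
    arr[0] = 0
    return arr

def zero_first_copy(arr):
    """Return a modified copy; the caller's array is unchanged."""
    out = arr.copy()
    out[0] = 0
    return out

x = np.array([1, 2, 3])
y = zero_first_copy(x)      # x is still [1, 2, 3]
x_after_copy = x.copy()     # keep a snapshot for comparison
z = zero_first_inplace(x)   # x is now [0, 2, 3], and z is x

print(x_after_copy, y, x, z is x)
```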

## Vectorization

Vectorizing code is the key to writing efficient numerical calculations with Python/Numpy. That means that as much as possible of a program should be formulated in terms of matrix and vector operations, like matrix-matrix multiplication.
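As a minimal illustration, here is the same sum of squares written once as an explicit Python loop and once as a vectorized expression. Both give the same number; the vectorized form pushes the loop into compiled code, which is typically much faster for large arrays.

```python
import numpy as np

x = np.arange(1000)

# Explicit Python loop: one interpreted step per element
total_loop = 0
for xi in x:
    total_loop += xi * xi

# Vectorized: element-wise product, then a single sum
total_vec = np.sum(x * x)

print(total_loop, total_vec)  # 332833500 332833500
```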

### Scalar-array operations

We can use the usual arithmetic operators to multiply, add, subtract, and divide arrays with scalar numbers.

In [156]:
v1 = np.arange(5)

In [157]:
v1

array([0, 1, 2, 3, 4])

In [159]:
v1 * 2

array([0, 2, 4, 6, 8])

In [160]:
dont_use_list = [1,2,3]

In [162]:
dont_use_list * 2

[1, 2, 3, 1, 2, 3]

In [163]:
v1 + 1

array([1, 2, 3, 4, 5])

### Element-wise array-array operations

When we add, subtract, multiply and divide arrays with each other, the default behaviour is **element-wise** operations:

In [165]:
v1 * v1

array([ 0,  1,  4,  9, 16])

In [166]:
v1**2

array([ 0,  1,  4,  9, 16])

In [168]:
v2 = np.array((-1,2))

In [169]:
v1 * v2

ValueError: operands could not be broadcast together with shapes (5,) (2,) 
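The error mentions *broadcasting*, the rule NumPy uses to combine arrays of different shapes. Shapes `(5,)` and `(2,)` are incompatible, but a dimension of length 1 is stretched to match the other array. As a sketch, reshaping `v2` into a column makes the product work and gives one row per element of `v2`:

```python
import numpy as np

v1 = np.arange(5)        # shape (5,)
v2 = np.array((-1, 2))   # shape (2,)

# (2, 1) * (5,) broadcasts to (2, 5)
table = v2.reshape(2, 1) * v1

print(table.shape)  # (2, 5)
print(table[0])     # -1 * v1
print(table[1])     #  2 * v1
```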

What about matrix multiplication? There are two ways: the `@` operator, used below, or the `np.dot` function. For vectors and matrices, both apply a matrix-matrix, matrix-vector, or inner vector multiplication to their two arguments:

In [171]:
v1 @ v1

30
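For a two-dimensional array the difference between `*` and `@` is easy to see. A short sketch with a made-up 2x2 matrix `P`:

```python
import numpy as np

P = np.array([[1, 2],
              [3, 4]])

elementwise = P * P     # each entry squared
matprod = P @ P         # matrix product
with_dot = np.dot(P, P) # same matrix product as P @ P

print(elementwise)  # rows [1 4] and [9 16]
print(matprod)      # rows [7 10] and [15 22]
```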

See also the related functions: `inner`, `outer`, `cross`, `kron`, `tensordot`. Try for example `help(np.kron)`.

#### `np.diag`

In [174]:
M

array([[6, 2],
       [6, 4]])

In [175]:
np.diag(M)

array([6, 4])

Let us experiment with matrices

$$ 
  A = \begin{bmatrix} 2 & -1 \\ 3 & 0 \end{bmatrix} 
  \quad \text{and} \quad
  b = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
$$

In [176]:
A = [[2, -1],
     [3, 0]]
A = np.array(A) # Convert from list to NumPy array
b = np.ones((2, 1))  # Shape is 2 x 1

#### Inverse

In [177]:
np.linalg.inv(A)

array([[ 0.        ,  0.33333333],
       [-1.        ,  0.66666667]])

#### Determinant

In [178]:
np.linalg.det(A)

3.0000000000000004

#### Eigenvalues and eigenvectors

In [180]:
np.linalg.eig(A)

(array([1.+1.41421356j, 1.-1.41421356j]),
 array([[0.28867513+0.40824829j, 0.28867513-0.40824829j],
        [0.8660254 +0.j        , 0.8660254 -0.j        ]]))
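`eig` returns the eigenvalues and a matrix whose *columns* are the corresponding eigenvectors. A quick way to check the result, sketched here with illustrative names `w` and `V`, is to verify $A v_i = \lambda_i v_i$ for each pair:

```python
import numpy as np

A = np.array([[2, -1],
              [3, 0]])

w, V = np.linalg.eig(A)  # w: eigenvalues, V[:, i]: i-th eigenvector

# Check the defining relation for every eigenpair
checks = [np.allclose(A @ V[:, i], w[i] * V[:, i]) for i in range(len(w))]
print(checks)  # [True, True]
```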

In [183]:
A @ b

array([[1.],
       [3.]])

In [185]:
b.reshape(1,-1) @ A

array([[ 5., -1.]])

Let us solve (for $x$) the problem 
$$A x = b$$

In [188]:
x = np.linalg.inv(A) @ b
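Forming the inverse works for this small example. As an alternative sketch, `np.linalg.solve` solves the system directly without building the inverse, which is generally the preferred approach for larger systems. Both routes should agree here:

```python
import numpy as np

A = np.array([[2, -1],
              [3, 0]])
b = np.ones((2, 1))

x_inv = np.linalg.inv(A) @ b    # via the explicit inverse
x_solve = np.linalg.solve(A, b) # solves A x = b directly

print(np.allclose(x_inv, x_solve))  # True
print(np.allclose(A @ x_solve, b))  # True: x really solves the system
```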

## Exercises

#### 1 Indexing
a) create an array `a` from 0 to 9

In [192]:
a = np.arange(10)

In [193]:
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

b) get only the first 3 values of `a`

In [195]:
a[:3]

array([0, 1, 2])

c) create a random array `b` of size 9 with $b_i \sim N(0,1)$.

In [197]:
b = np.random.randn(9)

In [198]:
b

array([ 0.12431908, -1.07737412, -0.91037138,  0.70498443,  1.11958298,
       -0.40883085,  2.34907882, -0.24955598,  0.97070726])

d) get all the values of `b` that are larger than one.

In [199]:
b[b > 1]

array([1.11958298, 2.34907882])

#### 2 Standard operations
a) create a random array `b` of size 9 with $b_i \sim N(0,1)$.

In [200]:
b

array([ 0.12431908, -1.07737412, -0.91037138,  0.70498443,  1.11958298,
       -0.40883085,  2.34907882, -0.24955598,  0.97070726])

b) get the lowest and highest value of `b`.

In [202]:
low, high = np.min(b), np.max(b)
low, high

(-1.077374119334417, 2.3490788235830395)

c) create an array `c` of size 9 and stack it horizontally on `b`. Call the result `d`.

In [204]:
c = np.zeros(9)

In [206]:
d = np.hstack((c,b))

In [207]:
d

array([ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.12431908,
       -1.07737412, -0.91037138,  0.70498443,  1.11958298, -0.40883085,
        2.34907882, -0.24955598,  0.97070726])

d) get the mean of `d`.

In [208]:
np.mean(d)

0.14569668071224276

e) create a null vector of size 10, except for the fifth value, which is 1

In [213]:
vec = np.zeros(10)
vec[4] = 1  # indices start at 0, so the fifth value has index 4

In [214]:
vec

array([0., 0., 0., 0., 1., 0., 0., 0., 0., 0.])

f) create a vector with values ranging from 10 to 49


In [218]:
m = np.arange(10,50)

g) reverse the vector (first element becomes last)

In [225]:
m[0:-1:2]

array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42,
       44, 46, 48])

In [226]:
m[::-1]

array([49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33,
       32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
       15, 14, 13, 12, 11, 10])

In [221]:
# [start:stop:step]

h) create a 3x3 matrix with values ranging from 0 to 8 

In [227]:
np.arange(9).reshape(3,3)

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

i) find indices of non-zero elements from [1,2,0,0,4,0] 

In [235]:
n = np.array([1,2,0,0,4,0])
n != 0
inds = np.arange(len(n))
inds[n != 0]

array([0, 1, 4])
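The solution above builds the index array by hand. NumPy also has dedicated functions for this; a short sketch using `np.flatnonzero` and `np.nonzero`:

```python
import numpy as np

n = np.array([1, 2, 0, 0, 4, 0])

idx_flat = np.flatnonzero(n)  # indices of the non-zero entries
idx_tuple = np.nonzero(n)     # tuple with one index array per dimension

print(idx_flat)      # [0 1 4]
print(idx_tuple[0])  # [0 1 4]
```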

j) create a 3x3 identity matrix 

In [236]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])