# [1] Scientific computation

There are several packages that provide multidimensional data manipulation, optimization, regression, interpolation and visualization, among other possibilities.

## 0. Some arithmetic insights

### [Integers](https://docs.python.org/3/c-api/long.html)

In Python, integers have arbitrary precision, so we can represent an arbitrarily large range of integers (limited only by the available memory).

In [None]:
x = 7**273
print(x)
print(type(x))
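As a small sketch of what arbitrary precision means in practice, the following cell checks how many bits and decimal digits the value above needs. The name `x` simply mirrors the previous cell.

```python
x = 7**273

# A 64-bit machine integer could not hold this value.
print(x.bit_length())   # number of bits needed to represent x
print(len(str(x)))      # number of decimal digits of x

# Integer arithmetic stays exact no matter how large the operands are.
print((x + 1) - x == 1)
```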

### [Floats](https://docs.python.org/3/tutorial/floatingpoint.html)

Python uses the (hardware) [IEEE 754 double precision representation](https://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64) for floats. This means that some floats can only be represented approximately.

* Using [string format](https://docs.python.org/3.4/library/string.html#string-formatting) to see the precision limitation of **doubles** in Python. For example, it is impossible to represent exactly the number `0.1`:

In [None]:
format(0.1, '.80f')

* This can give us *surprises*:

In [None]:
.1 + .1 + .1 == .3

In [None]:
.1  + .1 == .2
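One common way to cope with these surprises is to compare floats within a tolerance instead of testing exact equality. The sketch below uses `math.isclose` from the standard library with its default tolerances, which may or may not suit a given application.

```python
import math

total = 0.1 + 0.1 + 0.1

# Exact comparison fails because of rounding in binary floating point.
print(total == 0.3)

# Comparing within a tolerance is usually what is intended.
print(math.isclose(total, 0.3))

# The absolute difference is tiny, but it is not zero.
print(abs(total - 0.3))
```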

* For arbitrary-precision arithmetic you can use [decimal](https://docs.python.org/3/library/decimal.html#module-decimal) or [mpmath](http://mpmath.org):

In [None]:
from decimal import Decimal, getcontext

* Getting 80 digits of 1/7:

In [None]:
getcontext().prec=80
format(Decimal(1)/Decimal(7), '.80f')

* We can see how many digits of 1/7 are correct when using doubles:

In [None]:
format(1/7, '.80f')

In [None]:
#12345678901234567 (17 digits)

* Decimal arithmetic produces decimal objects:

In [None]:
Decimal(1)/Decimal(7)

* Decimal objects can be printed with `format`:

In [None]:
print('{:.50f}'.format(Decimal(1)/Decimal(7)))
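A detail worth illustrating is that a `Decimal` built from a float inherits the float's rounding error, while one built from a string does not. The names `from_float` and `from_string` below are only illustrative.

```python
from decimal import Decimal

# Built from a float: carries the binary rounding error of the double 0.1.
from_float = Decimal(0.1)
# Built from a string: represents exactly one tenth.
from_string = Decimal('0.1')

print(from_float)
print(from_string)

# With exact decimal inputs, the earlier surprise disappears.
print(from_string + from_string + from_string == Decimal('0.3'))
```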

* A more complex example: let's compute 1000 digits of $\pi$ using the [Bailey–Borwein–Plouffe formula](https://en.wikipedia.org/wiki/Bailey%E2%80%93Borwein%E2%80%93Plouffe_formula):

$$
\pi = \sum_{k = 0}^{\infty}\Bigg[ \frac{1}{16^k} \left( \frac{4}{8k + 1} - \frac{2}{8k + 4} - \frac{1}{8k + 5} - \frac{1}{8k + 6} \right) \Bigg]
$$

In [None]:
# https://stackoverflow.com/questions/28284996/python-pi-calculation
from decimal import Decimal, getcontext
getcontext().prec=1000
my_pi= sum(1/Decimal(16)**k * 
          (Decimal(4)/(8*k+1) - 
           Decimal(2)/(8*k+4) -
           Decimal(1)/(8*k+5) -
           Decimal(1)/(8*k+6)) for k in range(1000))
'{:.1000f}'.format(my_pi)

You can visit [100,000 Digits of Pi](http://www.geom.uiuc.edu/~huberty/math5337/groupe/digits.html) or [One Million Digits of Pi](http://www.piday.org/million/) to check the correctness of this code.
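The check can also be automated for a short prefix. The sketch below is an assumption-laden illustration: `bbp_pi` is a hypothetical helper, `PI_40` is a hand-typed reference with the first 40 decimals of $\pi$, and `localcontext` is used so that the global decimal precision is left untouched.

```python
from decimal import Decimal, localcontext

# Assumed reference value: the first 40 decimals of pi, typed in by hand.
PI_40 = '3.1415926535897932384626433832795028841971'

def bbp_pi(terms, precision):
    """Sum the first `terms` terms of the BBP series using `precision` significant digits."""
    with localcontext() as ctx:  # leaves the global decimal context untouched
        ctx.prec = precision
        return sum(1 / Decimal(16)**k *
                   (Decimal(4) / (8*k + 1) -
                    Decimal(2) / (8*k + 4) -
                    Decimal(1) / (8*k + 5) -
                    Decimal(1) / (8*k + 6)) for k in range(terms))

approx = bbp_pi(terms=60, precision=60)

# Compare only the leading digits; the last few digits carry rounding error.
print(str(approx)[:len(PI_40)] == PI_40)
```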

## 1. SciPy.org's [Numpy](http://www.numpy.org/)

Numpy provides a high-performance multidimensional array object.

### 1.1. Installation

```
pip install numpy
```

### 1.2. Why numpy?

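Good running times: NumPy array operations run in compiled code, so they are typically much faster than equivalent pure-Python loops.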

In [None]:
import numpy as np

* Let's define a list and compute the sum of its elements, timing it:

In [None]:
l = list(range(0,100000)); print(type(l), l[:10])
%timeit sum(l)

* And now, let's create a NumPy array and time the sum of its elements:

In [None]:
a = np.arange(0, 100000); print(type(a), a[:10])
%timeit np.sum(a)

* And what about a *pure* C implementation of an equivalent computation: 

In [None]:
!cat sum_array.c
!gcc -O3 sum_array.c -o sum_array
%timeit !./sum_array

* Looking for information about *something* in NumPy (note that `np.lookfor` is available in NumPy 1.x but was removed in NumPy 2.0):

In [None]:
np.lookfor('invert')

* Remember that in IPython you can press the Tab key to complete a name, or use a wildcard followed by `?` to list the matching NumPy names:

In [None]:
np.*?

### 1.3. Creating (simple) [arrays](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html) in Numpy
A simple array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers.

### 1.3.1. 1D arrays

* Creating an array using a list:

In [None]:
a = np.array([1, 2, 3])
print(type([1, 2, 3]))
print(type(a))

* Getting the number of dimensions of an array:

In [None]:
print(a.ndim)

* Printing an array:

In [None]:
print(a)

* Printing the *shape* (which is always a tuple) of an array:

In [None]:
print(a.shape)

* Native Python's [`len()`](https://docs.python.org/3.6/library/functions.html#len) also works:

In [None]:
print(len(a))

* A more exotic definition using [`linspace()`](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.linspace.html):

In [None]:
np.linspace(1., 4., 6)

* Reading the data from a file:

In [None]:
np.genfromtxt('data.txt')

In [None]:
!cat data.txt

* Arrays can be created from different types of containers (which store complex numbers in this case):

In [None]:
c = [[1,1.0],(1+1j,.3)]
print(type(c), type(c[0]), type(c[1]))
x = np.array(c)
x
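The array above ends up with a complex dtype because NumPy picks one common type able to hold every element. A minimal sketch of that inference follows (the integer dtype shown by the first line depends on the platform):

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1, 2.0, 3])
complexes = np.array([1, 2.0, 3 + 0j])

print(ints.dtype)       # an integer type (platform dependent)
print(floats.dtype)     # one float is enough to make the whole array float64
print(complexes.dtype)  # one complex number makes the whole array complex128

# An explicit dtype overrides the inference.
forced = np.array([1, 2, 3], dtype=np.float32)
print(forced.dtype)
```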

* Accessing an element:

In [None]:
print(a, a[0], a[1])

In [None]:
a[0] = 0
print(a)

### 1.3.2. 2D arrays

* Creating a 2D array from a list of two lists:

In [None]:
b = np.array([[1,2,3],[4,5,6]])
print(b)
print(b.shape)
print(b[1, 1])

* With zeroes:

In [None]:
a = np.zeros((5,5))
print(a)

* The default dtype is `float64`:

In [None]:
print(type(a[0][0]))

* With ones:

In [None]:
a = np.ones((5,5))
print(a)

* With an arbitrary scalar:

In [None]:
a = np.full((5,5), 2)
print(a)

* The identity matrix:

In [None]:
a = np.eye(5)
print(a)

* With random data:

In [None]:
a = np.random.random((5,5))
print(a)

In [None]:
a = np.random.random((5,5))
print(a)

* Filled with [arbitrary](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.empty_like.html) data and with a previously defined shape:

In [None]:
b = np.empty_like(a)
print(b)

* With a 1D list comprehension:

In [None]:
a = np.array([i for i in range(5)])
print(a, a[1], a.shape)

* With a 2D list comprehension:

In [None]:
a = np.array([[j+i*5 for j in range(10)] for i in range(5)])
print(a, a.shape)

* Accessing a row of a matrix:

In [None]:
a[1] # Get row 2

* Accessing an element of a matrix:

In [None]:
a[1][2] # Get column 3 of row 2

In [None]:
a[1,2] # Get element of coordinates (1,2)

* Getting elements of a matrix using "integer array indexing":

In [None]:
print(a)
print(a[[0, 1, 2], [3, 2, 1]])

* The same integer array indexing using list comprehensions:

In [None]:
print(a[np.array([i for i in range(3)]), np.array([i for i in range(3,0,-1)])])

* The same using [`np.arange()`](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.arange.html):

In [None]:
print(np.arange(3))
print(np.arange(3,0,-1))
print(a[np.arange(3), np.arange(3,0,-1)])

### 1.4. Slicing

In [None]:
a = np.array([[j+i*5 for j in range(10)] for i in range(5)])
a

* Get all rows of a matrix (the whole matrix):

In [None]:
a[:]

In [None]:
print(a)
a[::] # Notation: [starting index : stopping index : step]
# By default, start = 0, stop = maximum, step = 1

In [None]:
print(a)
a[0:]

In [None]:
print(a)
a[0::]

In [None]:
a[:a.shape[0]] # shape[0] is the number of rows

In [None]:
a[:a.shape[0]:]

* Get all rows of a matrix, except the first one:

In [None]:
print(a)
a[1:]

In [None]:
print(a)
a[1::]

* Get the first two rows of a matrix:

In [None]:
print(a)
a[0:2]

* Get the even rows of a matrix:

In [None]:
print(a)
a[0::2]

* Get the odd columns of a matrix:

In [None]:
print(a)
a[:,1::2]

* Get the odd rows of a matrix:

In [None]:
print(a)
a[:][1::2]

In [None]:
a[1::2]

* Getting the second row:

In [None]:
print(a)
a[1,:]

* Getting the third column:

In [None]:
print(a)
a[:,2]

* Getting a top-left $2\times 2$ submatrix:

In [None]:
print(a)
a[:2,:2]

* Getting a bottom-right $2\times 2$ submatrix:

In [None]:
print(a)
a[a.shape[0]-2:,a.shape[1]-2:]

In [None]:
a[a.shape[0]-2::,a.shape[1]-2::]

* Sampling horizontally every 2 elements, starting at column 2:

In [None]:
print(a)
a[:,1::2]
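One consequence of basic slicing that is easy to miss is that a slice is a *view*: it shares memory with the original array. The sketch below illustrates this, using `.copy()` when an independent array is wanted. The names `view` and `copy` are illustrative.

```python
import numpy as np

a = np.array([[j + i*5 for j in range(10)] for i in range(5)])

view = a[:2, :2]          # basic slicing: shares memory with `a`
copy = a[:2, :2].copy()   # explicit, independent copy

view[0, 0] = -1           # also changes a[0, 0]
copy[1, 1] = -99          # does not touch `a`

print(a[0, 0], a[1, 1])
print(np.shares_memory(a, view), np.shares_memory(a, copy))
```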

### 1.5. Boolean array indexing

* Finding the elements bigger than 12:

In [None]:
bool_idx = (a>12)
print(bool_idx)

* Printing the elements bigger than 12:

In [None]:
print(a[bool_idx])
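Boolean masks can also be combined and used for assignment. The following is a minimal sketch, rebuilding the same matrix `a` as above so that the cell is self-contained. The thresholds are arbitrary.

```python
import numpy as np

a = np.array([[j + i*5 for j in range(10)] for i in range(5)])

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required.
mask = (a > 12) & (a % 2 == 0)
print(a[mask])

# A mask can also appear on the left-hand side of an assignment.
b = a.copy()
b[b > 12] = 0
print(b.max())
```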

### 1.6. Elementwise (vectorial-vectorial and vectorial-scalar) math

* Create a zeroed matrix:

In [None]:
a = np.zeros((5,5), np.int32)
print(a)

* Set to 1 the elements from coordinate (1,1) to coordinate (3,3):

In [None]:
a[1:4,1:4] = 1
print(a)

* Vectorial-scalar addition:

In [None]:
a[1:4, 1:4] += 1
print(a)

* A new matrix:

In [None]:
b = np.ones((5,5), np.int32)
print(b)

* Vectorial addition:

In [None]:
c = a + b
print(c)

* Vectorial subtraction:

In [None]:
d = c - b
print(d)

* Vectorial multiplication (not matrix multiplication!):

In [None]:
c = c * d
print(c)

* Floating-point vectorial division:

In [None]:
c = c / b
print(c)

* Integer (floor) vectorial division:

In [None]:
c = d // b
print(c)

### 1.7. Broadcasting
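In vectorized operations, NumPy "extends" scalars, and arrays that have a dimension of size 1, to match the size of the other array(s). Shapes are compared from the last dimension backwards: two dimensions are compatible when they are equal or when one of them is 1.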

In [None]:
a = np.ones((5,3))
a

In [None]:
b = np.arange(1)
b

In [None]:
b += 1
b

* Broadcasting of a $1\times 1$ matrix:

In [None]:
a+b # 'a' is 5x3 and 'b' has shape (1,), which is treated as 1x1

* Broadcasting of a $1\times 3$ matrix:

In [None]:
b = np.arange(3)
b

In [None]:
a+b # 'a' is 5x3 and 'b' has shape (3,), which is treated as 1x3

* Broadcasting of a $5\times 1$ matrix:

In [None]:
b = np.arange(5)
b

In [None]:
b = b.reshape((5,1)) # (Rows, Columns)
b

In [None]:
a+b

If the arrays have different shapes that cannot be broadcast together, a `ValueError` is raised (`operands could not be broadcast together with shapes (5,3) (4,1)` in the example below).

In [None]:
b = np.arange(4)[:, None]
b

In [None]:
a.shape

In [None]:
b.shape

In [None]:
a+b
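The failure above can be caught and inspected. As a sketch, assuming NumPy 1.20 or later, `np.broadcast_shapes` also reports the shape that a broadcast would produce without performing the operation:

```python
import numpy as np

a = np.ones((5, 3))
b = np.arange(4)[:, None]   # shape (4, 1): 4 does not match 5

try:
    a + b
except ValueError as err:
    message = str(err)
    print(message)

# Compatible shapes: the result shape is computed without touching any data.
print(np.broadcast_shapes((5, 3), (5, 1)))
print(np.broadcast_shapes((5, 3), (3,)))
```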

### 1.8. Matrix math
NumPy provides basic matrix computation.

* Let's define a "chessboard" matrix:

In [None]:
a = np.array([[(i+j)%2 for j in range(10)] for i in range(10)])
print(a, a.shape)

... and a 1-column matrix:

In [None]:
b = np.array([[1] for i in range(10)])
print(b, b.shape)

* Matrix-matrix product:

In [None]:
c = np.dot(a,b)
print(c)

* Sum of all elements of a matrix:

In [None]:
print(np.sum(c))

In [None]:
print(np.sum(a))

* Compute the maximum of a matrix:

In [None]:
print(np.max(c))

* Matrix transpose:

In [None]:
print(c.T, c.T.shape, c.shape)
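For reference, the matrix product can also be written with the `@` operator (Python 3.5+), which for 2D arrays gives the same result as `np.dot`. The sketch below rebuilds the chessboard matrix and contrasts `@` with the elementwise `*`.

```python
import numpy as np

a = np.array([[(i + j) % 2 for j in range(10)] for i in range(10)])
b = np.ones((10, 1), dtype=int)

c_dot = np.dot(a, b)
c_at = a @ b               # same product, operator syntax

print(np.array_equal(c_dot, c_at))
print(c_at.shape)

# Elementwise * is a different operation: here `b` is broadcast across the columns of `a`.
print((a * b).shape)
```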

### 1.9. How fast is array math?

In [None]:
a = np.array([[(i*10+j) for j in range(10)] for i in range(10)])
print(a, a.shape)

In [None]:
a[:1] # First row (a matrix)

In [None]:
a[:1].shape

In [None]:
a[:1][0] # First row of the one-row matrix (a vector)

In [None]:
a[:1][0].shape

In [None]:
b = a[:1][0]
b

* Add `b[]` to all the rows of `a[][]` using scalar arithmetic:

In [None]:
c = np.empty_like(a)
def add():
    for i in range(a.shape[0]):      # rows
        for j in range(a.shape[1]):  # columns
            c[i, j] = a[i, j] + b[j]
%timeit add()
print(c)

* Add `b[]` to all the rows of `a[][]` using vectorial arithmetic:

In [None]:
c = np.empty_like(a)
def add():
    for i in range(a.shape[0]):  # rows
        c[i, :] = a[i, :] + b
%timeit add()
print(c)

* Add `b[]` to all the rows of `a[][]` using fully vectorial arithmetic:

In [None]:
%timeit a + b # <- broadcasting is faster
c = a + b     # %timeit does not keep assignments, so compute c explicitly
print(c)
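Outside IPython, where `%timeit` is not available, a similar comparison can be sketched with the standard `timeit` module. The function names `add_loops` and `add_broadcast` are illustrative, and the absolute timings depend on the machine.

```python
import timeit
import numpy as np

a = np.array([[i*10 + j for j in range(10)] for i in range(10)])
b = a[0]

def add_loops():
    """Add b to every row of a, one scalar at a time."""
    c = np.empty_like(a)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            c[i, j] = a[i, j] + b[j]
    return c

def add_broadcast():
    """Add b to every row of a with a single broadcast operation."""
    return a + b

# Both versions must agree before their timings are worth comparing.
assert np.array_equal(add_loops(), add_broadcast())

t_loops = timeit.timeit(add_loops, number=1000)
t_broadcast = timeit.timeit(add_broadcast, number=1000)
print(f'loops: {t_loops:.4f} s, broadcast: {t_broadcast:.4f} s')
```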

### 1.10. Structured arrays

* Create a 1D array of (two) records, where each record has the structure (int, float, char[10]).

In [None]:
x = np.array([(1,2.,'Hello'), (3,4.,"World")],
             dtype=[('first', 'i4'),('second', 'f4'), ('third', 'S10')])
x

* Get the first element of every record:

In [None]:
x['first']

* Get the first record:

In [None]:
x[0]

* Get the second element of every record:

In [None]:
x['second']

* Third element of every record:

In [None]:
x['third']
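Fields can also drive filtering and sorting. A short sketch with the same two records follows. Note that the `'S10'` field stores bytes, so it has to be decoded to obtain a `str`.

```python
import numpy as np

x = np.array([(1, 2., 'Hello'), (3, 4., 'World')],
             dtype=[('first', 'i4'), ('second', 'f4'), ('third', 'S10')])

print(x.dtype.names)                  # the field names
print(x[x['second'] > 3.0])           # records whose 'second' field exceeds 3

# Sort the records by 'first', reverse the order, and keep the 'third' field.
print(np.sort(x, order='first')[::-1]['third'])

print(x['third'][0].decode())         # bytes -> str
```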

## 2. [Matplotlib](http://matplotlib.org)
A Python 2D plotting library.

### 2.1. Installation

```
pip install matplotlib
```

### 2.2. Configuring Matplotlib to render inline in an IPython (Jupyter) notebook

In [None]:
%matplotlib inline

### 2.3. Importing it

In [None]:
import matplotlib.pyplot as plt

### 2.4. Drawing data structures (matrices):

In [None]:
chess_board = np.zeros([8, 8], dtype=int)
chess_board[0::2, 1::2] = 1
chess_board[1::2, 0::2] = 1
plt.matshow(chess_board, cmap=plt.cm.gray)

### 2.5. Drawing 2D curves

In [None]:
resolution = 100
x = np.arange(0, 3*np.pi, np.pi/resolution)
si = np.sin(x)
co = np.cos(x)
plt.plot(x, si, c = 'r')
plt.plot(x, co, c = 'g')
plt.legend([r'$\sin(x)$', r'$\cos(x)$'])
plt.xlabel('radians')
plt.title('sine($x$) vs. cosine($x$)')
# One tick position per label (raw strings avoid invalid escape sequences)
plt.xticks([0, np.pi, 2*np.pi], ['0', r'$\pi$', r'$2\pi$'], rotation='horizontal')
plt.xlim(0,3*np.pi)
plt.show()

### 2.6. Drawing 3D curves

In [None]:
x = np.array([[(x+y)/25 for x in range(256)] for y in range(256)])
si = np.sin(x)
plt.imshow(si, cmap='hot', interpolation='nearest')
plt.show()

In [None]:
# https://github.com/AeroPython/Taller-Aeropython-PyConEs16
def funcion(x,y):
    return np.cos(x) + np.sin(y)

x_1d = np.linspace(0, 5, 100)
y_1d = np.linspace(-2, 4, 100)
X, Y = np.meshgrid(x_1d, y_1d)
Z = funcion(X,Y)
plt.contourf(X, Y, Z, np.linspace(-2, 2, 100),cmap=plt.cm.Spectral)
plt.colorbar()
cs = plt.contour(X, Y, Z, np.linspace(-2, 2, 9), colors='k')
plt.clabel(cs)

## 3. [SciPy](https://docs.scipy.org/doc/scipy/reference/)
[SciPy](http://cs231n.github.io/python-numpy-tutorial/#numpy-array-indexing) provides a large number of functions that operate on numpy arrays and are useful for different types of scientific and engineering applications such as:
1. [Clustering](https://docs.scipy.org/doc/scipy/reference/cluster.html).
2. [Discrete Fourier Analysis](https://docs.scipy.org/doc/scipy/reference/fftpack.html).
3. [Interpolation](https://docs.scipy.org/doc/scipy/reference/interpolate.html).
4. [Linear algebra](https://docs.scipy.org/doc/scipy/reference/linalg.html).
5. [Signal](https://docs.scipy.org/doc/scipy/reference/signal.html) and [Image processing](https://docs.scipy.org/doc/scipy/reference/ndimage.html).
6. [Optimization](https://docs.scipy.org/doc/scipy/reference/optimize.html).
7. [Sparse matrix](https://docs.scipy.org/doc/scipy/reference/sparse.html) and [sparse linear algebra](https://docs.scipy.org/doc/scipy/reference/sparse.linalg.html).



### 3.1. Installation

```
pip install scipy
```

### 3.2. Optimization example

In [None]:
# http://www.scipy-lectures.org/advanced/mathematical_optimization/
from scipy import optimize

In [None]:
def f(x):
    return -np.exp(-(x - .7)**2)

In [None]:
sol = optimize.brent(f)
print('x =', sol, '\nmin =', f(sol))

In [None]:
x = np.arange(-10, 10, 0.1)
plt.plot(x, f(x))
plt.plot([sol],[f(sol)], 'ro')
plt.show()
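As an alternative sketch, `optimize.minimize_scalar` wraps the same Brent method and returns a result object, which keeps the location of the minimum and the function value clearly separated:

```python
import numpy as np
from scipy import optimize

def f(x):
    return -np.exp(-(x - .7)**2)

res = optimize.minimize_scalar(f, method='brent')

print(res.x)        # location of the minimum
print(res.fun)      # value of f at the minimum
print(res.success)  # whether the optimizer reports success
```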

## 4. [Pandas](http://pandas.pydata.org/)
High-performance data structures and data analysis tools for the Python programming language (similar to [R](https://en.wikipedia.org/wiki/R_(programming_language))). Some tools are:
1. [Statistical functions (covariance, correlation)](http://pandas.pydata.org/pandas-docs/stable/computation.html#statistical-functions).
2. [Window functions](http://pandas.pydata.org/pandas-docs/stable/computation.html#window-functions).
3. [Time series](http://pandas.pydata.org/pandas-docs/stable/timeseries.html).
4. [Analysis of sparse data](http://pandas.pydata.org/pandas-docs/stable/sparse.html).

### 4.1. Installation

```
pip install pandas
```

### 4.2. Example

Create a table with data:

In [None]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'int_col' : [1, 2, 6, 8, -1],
                    'float_col' : [0.1, 0.2, 0.2, 10.1, None],
                    'str_col' : ['a', 'b', None, 'c', 'a']})
print(df)
df

Arithmetic average of a column:

In [None]:
df2 = df.copy()
mean = df2['float_col'].mean()
mean

Replace undefined elements:

In [None]:
df3 = df['float_col'].fillna(mean)
df3

Create a table by concatenating columns:

In [None]:
df4 = pd.concat([df3, df['int_col'], df['str_col']], axis=1)
df4
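The same result can be reached in one step, without reordering the columns. The sketch below passes a dict to `fillna`. The `'unknown'` placeholder for the string column is an arbitrary choice made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'int_col': [1, 2, 6, 8, -1],
                   'float_col': [0.1, 0.2, 0.2, 10.1, None],
                   'str_col': ['a', 'b', None, 'c', 'a']})

# Count the missing values per column before deciding how to fill them.
print(df.isna().sum())

# fillna accepts a dict mapping column names to fill values; df itself is not modified.
filled = df.fillna({'float_col': df['float_col'].mean(), 'str_col': 'unknown'})
print(filled)
```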

## 5. [SymPy](http://www.sympy.org/en/index.html)
A Python library for symbolic mathematics. Among other things, it provides:
1. [Symbolic simplification](http://docs.sympy.org/latest/tutorial/simplification.html).
2. [Calculus (derivatives, integrals, limits, and series expansions)](http://docs.sympy.org/latest/tutorial/calculus.html).
3. [Algebraic solver](http://docs.sympy.org/latest/tutorial/solvers.html).
4. [Matrix operations](http://docs.sympy.org/latest/tutorial/matrices.html).
5. [Combinatorics](http://docs.sympy.org/latest/modules/combinatorics/index.html).
6. [Cryptography](http://docs.sympy.org/latest/modules/crypto.html).

### 5.1. Installation
```
pip install sympy
```

### 5.2. Example

In [None]:
from sympy import init_session
init_session(use_latex='matplotlib')

In [None]:
# https://github.com/AeroPython/Taller-Aeropython-PyConEs16
expr = cos(x)**2 + sin(x)**2
expr

In [None]:
simplify(expr)

In [None]:
expr.subs(x, y**2)

In [None]:
expr = (x + y) ** 2
expr

In [None]:
expr = expr.expand()
expr

In [None]:
expr = expr.factor()
expr

In [None]:
expr = expr.integrate(x)
expr

In [None]:
expr = expr.diff(x)
expr
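`init_session` defines `x`, `y` and the SymPy functions implicitly. In a plain script the same steps can be sketched with explicit imports and symbols, as below. The names `expanded` and `antiderivative` are illustrative.

```python
from sympy import symbols, cos, sin, simplify, integrate, diff, expand, factor

x, y = symbols('x y')

# The trigonometric identity simplifies to 1.
identity = simplify(cos(x)**2 + sin(x)**2)
print(identity)

expr = (x + y)**2
expanded = expand(expr)
print(expanded)
print(factor(expanded))

# Differentiating an antiderivative recovers the original expression.
antiderivative = integrate(expr, x)
print(simplify(diff(antiderivative, x) - expr))
```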