# Python's Main Scientific Libraries

Material originally prepared by John Stachurski.

## Libraries

Anaconda comes with many libraries for scientific computing pre-installed.

The most important ones are

* NumPy
* SciPy
* Matplotlib
* Pandas
* Numba


Libraries, and the functions and attributes they contain, are loaded into memory using `import`:

In [None]:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn import datasets
from sklearn.decomposition import PCA

%matplotlib inline

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  
y = iris.target
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5


fig = plt.figure(1, figsize=(8, 6))
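# Create the 3D axes through the figure so they are registered with it;
# calling Axes3D(fig) directly no longer does this in recent Matplotlib.
ax = fig.add_subplot(111, projection="3d", elev=-150, azim=110)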
X_reduced = PCA(n_components=3).fit_transform(iris.data)
ax.scatter(X_reduced[:, 0], X_reduced[:, 1], X_reduced[:, 2], c=y,
           cmap=plt.cm.Set1, edgecolor='k', s=40)
ax.set_title("First three PCA directions")
ax.set_xlabel("1st eigenvector")
ax.xaxis.set_ticklabels([])  # w_xaxis was removed in recent Matplotlib
ax.set_ylabel("2nd eigenvector")
ax.yaxis.set_ticklabels([])
ax.set_zlabel("3rd eigenvector")
ax.zaxis.set_ticklabels([])

plt.show()

### NumPy

A library for fast array/vector/matrix processing in Python.

In [None]:
import numpy as np

#### Elementary functions

With NumPy we can access standard functions like $\exp$, $\sin$, $\cos$, etc.

In [None]:
x = 0
np.exp(x)

In [None]:
np.cos(x)

In [None]:
np.sin(x)

#### Arrays

We can make an "array" of evenly spaced numbers:

In [None]:
x = np.linspace(-3, 3, 5)

In [None]:
x

The functions listed above work directly on arrays:

In [None]:
np.exp(x)

In [None]:
np.sin(x)

Basic arithmetic operators are "vectorized", meaning they act on every element of the array:

In [None]:
x

In [None]:
2 * x

In [None]:
2 * x - 1
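
To make "vectorized" concrete, here is a minimal sketch comparing the array expression with an explicit loop over the elements. The names `vectorized` and `looped` are illustrative and not part of the original notebook.

```python
import numpy as np

x = np.linspace(-3, 3, 5)   # array([-3. , -1.5,  0. ,  1.5,  3. ])

# Vectorized form: the expression is applied to every element at once.
vectorized = 2 * x - 1

# Equivalent explicit loop over the elements, shown only for comparison.
looped = np.array([2 * val - 1 for val in x])

print(vectorized)
print(np.array_equal(vectorized, looped))
```

Both forms give the same numbers; the vectorized one is shorter and, for large arrays, much faster.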

#### Reductions

In [None]:
np.sum(x)

In [None]:
np.mean(x)

In [None]:
np.std(x)

In [None]:
np.max(x)

In [None]:
np.min(x)
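
As a rough check on what these reductions compute, the sketch below rebuilds the mean and standard deviation by hand. The variable names are illustrative. Note that `np.std` uses the population formula by default (`ddof=0`), dividing by $n$ rather than $n - 1$.

```python
import numpy as np

x = np.linspace(-3, 3, 5)

n = x.size
mean_by_hand = x.sum() / n

# Population standard deviation: divide the sum of squared deviations by n.
std_by_hand = np.sqrt(((x - mean_by_hand) ** 2).sum() / n)

print(np.isclose(np.mean(x), mean_by_hand))
print(np.isclose(np.std(x), std_by_hand))
```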

#### Matrix algebra

In [None]:
A = np.random.randn(2, 2)
B = np.random.randn(2, 3)

In [None]:
A

In [None]:
B

In [None]:
A @ B  # matrix multiplication
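
Since the random matrices above change on every run, here is a small sketch with fixed, made-up entries that shows how the shapes combine and how `@` differs from the elementwise `*`.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # shape (2, 2)
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0]])   # shape (2, 3)

C = A @ B                          # matrix product, shape (2, 3)
print(C.shape)

# Each entry of C is a row of A dotted with a column of B.
print(C[0, 2] == A[0, :] @ B[:, 2])

# By contrast, * multiplies element by element.
elementwise = A * A
print(elementwise)
```

The inner dimensions must agree for `@`: a `(2, 2)` matrix times a `(2, 3)` matrix gives a `(2, 3)` result, while `B @ A` would raise an error.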

#### Types and speed

Arrays must be homogeneous in data type: every element shares a single `dtype`.

In [None]:
x

In [None]:
x[0] = "foobar"  # raises ValueError: a string cannot be stored in a float array
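
The sketch below, written for illustration, inspects the array's `dtype` and shows the two possible outcomes of an assignment: a value that can be converted is cast to the array's type, and one that cannot is rejected. The flag `error_raised` is a hypothetical helper name.

```python
import numpy as np

x = np.linspace(-3, 3, 5)
print(x.dtype)               # float64

x[0] = 7                     # the int is converted to a float to match the array
print(x[0], type(x[0]))

error_raised = False
try:
    x[1] = "foobar"          # no float value corresponds to this string
except ValueError as err:
    error_raised = True
    print("ValueError:", err)
```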

Homogeneity makes them fast and efficient.

In [None]:
x = np.random.randn(1_000_000)

In [None]:
np.sum(2 * x - x**2)

In [None]:
%%timeit 

np.sum(2 * x - x**2)

In [None]:
%%timeit

y = 0.0
for val in x:
    y = y + 2 * val - val**2
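
The `%%timeit` magic only works inside IPython or Jupyter. As a rough equivalent for a plain Python script, the sketch below uses the standard library's `timeit` module. The function names, the seed, and the smaller array size are assumptions made to keep the example quick and reproducible.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)      # seeded generator (an assumption, for reproducibility)
x = rng.standard_normal(100_000)    # smaller than the 1_000_000 above so the loop stays quick

def vectorized(v):
    return np.sum(2 * v - v**2)

def looped(v):
    y = 0.0
    for val in v:
        y = y + 2 * val - val**2
    return y

# Total time in seconds for 5 calls of each version.
t_vec = timeit.timeit(lambda: vectorized(x), number=5)
t_loop = timeit.timeit(lambda: looped(x), number=5)

print(f"vectorized: {t_vec:.4f} s, loop: {t_loop:.4f} s")
print(np.isclose(vectorized(x), looped(x)))
```

The two versions agree up to floating-point rounding; exact timings depend on the machine, so none are quoted here.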

### JIT compilation via Numba

In [None]:
from numba import jit

@jit
def sum_vec(v):
    y = 0.0
    for val in v:
        y = y + 2 * val - val**2
    return y

In [None]:
sum_vec(x)

In [None]:
%%timeit 

sum_vec(x)

### Matplotlib

The next line tells Jupyter to display all figures inline, inside the notebook in the browser.

In [None]:
%matplotlib inline

Now let's import the main Python plotting library, called Matplotlib.

In [None]:
import matplotlib.pyplot as plt

#### Our first plot

In [None]:
fig, ax = plt.subplots()

x = np.linspace(-np.pi, np.pi, 100)
y = np.sin(x)
ax.plot(x, y)


A plot with two lines and a legend:

In [None]:
fig, ax = plt.subplots()

y1 = np.sin(x)
y2 = np.cos(x)
ax.plot(x, y1, label='sine')
ax.plot(x, y2, label='cosine')
ax.legend()

### An Example

In [None]:
def g(x, β=0.5):
    return x * np.exp(-β * x)

In [None]:
g(1)

In [None]:
g(10)

In [None]:
fig, ax = plt.subplots()

x = np.linspace(0, 10, 100)
y = g(x)
ax.plot(x, y)
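
The plot of $g$ rises to a peak and then decays. Setting $g'(x) = (1 - \beta x) e^{-\beta x} = 0$ gives the peak at $x = 1/\beta$, which is $2$ for the default $\beta = 0.5$. The sketch below checks this numerically on a grid; the finer grid and the name `x_peak` are illustrative choices.

```python
import numpy as np

def g(x, β=0.5):
    return x * np.exp(-β * x)

β = 0.5
x = np.linspace(0, 10, 1001)   # finer grid than in the plot, step 0.01
y = g(x, β)

# Grid point where g is largest, compared with the analytical peak 1/β.
x_peak = x[np.argmax(y)]
print(x_peak, 1 / β)
print(y.max(), g(1 / β, β))
```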

In [None]:
def h(x):
    return np.abs(np.sin(x))

fig, ax = plt.subplots()

x = np.linspace(0, 10, 100)
y = h(x)
ax.plot(x, y)

### SciPy

A useful collection of subpackages for numerical methods.

* linear algebra
* numerical optimization and root finding
* statistics and probability
* interpolation and approximation
* etc.

In [None]:
from scipy.linalg import eigvals

In [None]:
eigvals(np.random.randn(2, 2))
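
One way to sanity-check the output of `eigvals` is to use two standard identities: the eigenvalues sum to the trace of the matrix and multiply to its determinant. The sketch below does this for a seeded random matrix; the seed and the name `lam` are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import eigvals

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))

lam = eigvals(A)   # eigenvalues, typically returned as a complex array

# Sum of eigenvalues equals the trace; product equals the determinant.
print(np.isclose(lam.sum(), np.trace(A)))
print(np.isclose(lam.prod(), np.linalg.det(A)))
```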

In [None]:
def f(x):
    return x**3

In [None]:
fig, ax = plt.subplots()
x = np.linspace(-1, 1, 100)
ax.plot(x, f(x))
ax.plot(x, 0 * x)

In [None]:
from scipy.optimize import brentq

Find the root of $f$ on the interval $[-1, 1]$:

In [None]:
brentq(f, -1, 1)
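
`brentq` needs a bracketing interval, meaning `f(a)` and `f(b)` must have opposite signs. The sketch below checks that condition for $[-1, 1]$ and shows what happens on an interval that does not bracket a root. The interval $[0.5, 1]$ and the flag `no_bracket_error` are chosen here purely for illustration.

```python
from scipy.optimize import brentq

def f(x):
    return x**3

a, b = -1, 1
# Opposite signs at the endpoints guarantee a root of a continuous f in between.
print(f(a) * f(b) < 0)

root = brentq(f, a, b)
print(root, f(root))

# Without a sign change, brentq raises ValueError.
no_bracket_error = False
try:
    brentq(f, 0.5, 1)
except ValueError as err:
    no_bracket_error = True
    print("ValueError:", err)
```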

### Exercises

Plot the function

$$ f(x) = \sin(2x) - 2 \sin(x) $$

on the interval $[-10, 10]$.