# Introduction to `matplotlib`

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

The first line makes the plots display inline in the Jupyter notebook. We then import `numpy` and `matplotlib.pyplot`, which by convention are named `np` and `plt`!

## Random Numbers with `numpy`

Plotting gets very boring when we have to type in all the data by hand. Later on we'll look at real data, but for now let's generate some. Distributions that I use all the time are:

### Uniform Random Numbers

Fill an array of the given shape with numbers uniformly distributed on [0,1).

In [None]:
uni = np.random.rand(10)
print(uni)
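
The shape does not have to be one-dimensional. As a small sketch (the names `grid`, `low`, `high`, and `scaled` are illustrative and not part of the original notebook), passing several integers to `np.random.rand` gives a multi-dimensional array, and a uniform sample on [0,1) can be rescaled to any interval [low, high):

```python
import numpy as np

# Two integers give a 2-D array: 3 rows, 4 columns, values in [0, 1)
grid = np.random.rand(3, 4)
print(grid.shape)

# Rescale uniform [0, 1) samples to the interval [low, high)
low, high = -2.0, 5.0  # illustrative bounds, chosen arbitrarily
scaled = low + (high - low) * np.random.rand(10)
print(scaled)
```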

### Poisson Random Numbers

The first argument is $\lambda$ (both the mean and the variance of the distribution); the second argument is the shape of the array to return.

In [None]:
lamb = 3
pois = np.random.poisson(lamb,10)
print(pois)
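
Since $\lambda$ is both the mean and the variance, a large sample should reproduce both numbers. A quick check, assuming a sample size of 100000 (an arbitrary choice for this sketch; `big_pois` is an illustrative name):

```python
import numpy as np

lamb = 3
# A large sample, so the estimates settle close to the true values
big_pois = np.random.poisson(lamb, 100000)

# Both of these should come out close to lamb
print(big_pois.mean())
print(big_pois.var())
```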

### Gaussian Random Numbers

The first argument is the mean of the distribution, $\mu$, the second argument is the standard deviation, $\sigma$, and the third argument is the shape of the array to return.

In [None]:
mu = 2.0
sigma = 1.0
gaus = np.random.normal(mu,sigma,10)
print(gaus)
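
These numbers change every time the cell is run. If repeatable output is needed, one option is to seed the global generator with `np.random.seed` before drawing. A minimal sketch (the seed value 42 and the names `first` and `second` are arbitrary choices for illustration):

```python
import numpy as np

mu = 2.0
sigma = 1.0

# Seeding the global generator makes the sequence repeatable
np.random.seed(42)  # 42 is an arbitrary seed
first = np.random.normal(mu, sigma, 10)

# Resetting to the same seed restarts the same sequence
np.random.seed(42)
second = np.random.normal(mu, sigma, 10)

# Same seed, same numbers
print(np.array_equal(first, second))
```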

For more distributions check out the documentation here: https://docs.scipy.org/doc/numpy/reference/routines.random.html

## Histograms

Histograms are one of the most frequently used visualisations. Let's get 100000 random numbers from a normal distribution (mean 0, sigma 1) and plot them with the default binning.

In [None]:
rands = np.random.normal(0,1,100000)

In [None]:
plt.xlabel("Relative Banana Length (International Banana Units)")
plt.ylabel("Bananas")
plt.hist(rands)

The default binning here is not great. We have enough statistics to justify a few more bins, and since we know the mean of the distribution is at zero, it would be better not to have a bin boundary there!  We can customise the binning as follows:

In [None]:
bins = np.linspace(-5, 5, 20)
print(bins)

In [None]:
plt.xlabel("Relative Banana Length (International Banana Units)")
plt.ylabel("Bananas")
plt.hist(rands,bins=bins)
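
To see why these edges suit the distribution, note that `np.linspace(-5, 5, 20)` returns 20 edges, which define 19 bins. An odd number of bins placed symmetrically about zero puts zero at the centre of the middle bin rather than on a boundary. The sketch below checks this with `np.histogram`, which computes the counts without drawing anything (the names `counts`, `edges`, and `centres` are illustrative):

```python
import numpy as np

rands = np.random.normal(0, 1, 100000)
bins = np.linspace(-5, 5, 20)

# np.histogram returns the counts per bin and the bin edges
counts, edges = np.histogram(rands, bins=bins)
print(len(edges), len(counts))

# Bin centres are the midpoints of neighbouring edges
centres = 0.5 * (edges[:-1] + edges[1:])

# The middle bin is centred on zero (up to rounding) and holds the peak
middle = len(centres) // 2
print(centres[middle])
print(counts.argmax() == middle)
```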

## Line Plots

Another frequently used visualisation, especially for time series data.

In [None]:
x = np.linspace(0,10,100)
y = 100*x*x

In [None]:
plt.xlabel("Time (months since January 2017)")
plt.ylabel("Company names containing \"Blockchain\"")
plt.plot(x,y)

## Log Scales

Sometimes, although obviously not here, it is more useful to display one or both of the axes with a log scale.  Here we'll demonstrate with the y-axis:

In [None]:
plt.semilogy()
plt.xlabel("Time (months since January 2017)")
plt.ylabel("Company names containing \"Blockchain\"")
plt.plot(x,y)
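
A log axis earns its keep when the data grow exponentially, because an exponential becomes a straight line. As a sketch, assume a hypothetical series `y_exp` that doubles every month (this data is invented for illustration); `plt.yscale("log")` is another way to switch the y-axis to a log scale:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
# Hypothetical exponential growth: starts at 100 and doubles every month
y_exp = 100 * 2.0**x

# On a log y-axis the exponential appears as a straight line
plt.plot(x, y_exp)
plt.yscale("log")
plt.xlabel("Time (months since January 2017)")
plt.ylabel("Company names containing \"Blockchain\"")

# The same fact without plotting: log10(y) has a constant slope in x
slopes = np.diff(np.log10(y_exp)) / np.diff(x)
print(np.allclose(slopes, np.log10(2.0)))
```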

## Scatter Plots

When investigating possible relationships between variables, you may want to look at a scatter plot.

In [None]:
x = np.linspace(0,10,20)
y = 10*np.random.rand(20)
plt.xlabel("Number of times Elon Musk says AI will kill us (/week)")
plt.ylabel("Drones confiscated by Norwegian birds (/week)")
plt.scatter(x,y)
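
The `y` values above are drawn independently of `x`, so no trend should appear. For contrast, here is a sketch with a hypothetical linear relationship plus Gaussian noise (the slope 0.8, the noise level, and the names `y_trend` and `r` are assumptions made for illustration). `np.corrcoef` gives the Pearson correlation coefficient as a rough numerical companion to the picture:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 20)
# Hypothetical linear trend (slope 0.8) with Gaussian noise of sigma 1
y_trend = 0.8 * x + np.random.normal(0, 1, 20)

plt.scatter(x, y_trend)
plt.xlabel("x")
plt.ylabel("y")

# Pearson correlation coefficient between the two variables
r = np.corrcoef(x, y_trend)[0, 1]
print(r)
```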

## Images

We can also display images.  If we are too lazy to find one, we can create one as follows, using two spatial dimensions and one colour dimension:

In [None]:
image = np.zeros([200,200,3])
plt.imshow(image)

In [None]:
image[50:150,50:150,:] = 1

In [None]:
plt.imshow(image)
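
The last axis holds the red, green, and blue channels, and for float arrays `plt.imshow` expects values between 0 and 1. Setting all three channels to 1 gives white, as above, while setting a single channel gives a pure colour. A short sketch (the array name `coloured` is illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

# Start from a black 200x200 RGB image
coloured = np.zeros([200, 200, 3])

# Channel 0 is red: draw a red square in the centre
coloured[50:150, 50:150, 0] = 1
# Channel 2 is blue: draw a blue band along the top rows
coloured[0:20, :, 2] = 1

plt.imshow(coloured)
```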

## Contour Plots

Sometimes we have a quantity that has a relationship to two different variables.  In this case you might think of making a 3-D plot - but humans are notoriously bad at reading these.  It is better to use contours and/or colour levels to make your point!

In [None]:
x = np.linspace(0,10,100)
y = np.linspace(0,10,100)
z = np.zeros([100,100])
# Rows of z follow y and columns follow x, which is the layout plt.contour expects
for i in range(len(x)):
    for j in range(len(y)):
        z[j, i] = (x[i]-5)**2 + 0.5*(y[j]-3)**2

print(z.shape)
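
The double loop can also be written without explicit indexing. One possible sketch uses `np.meshgrid`, whose default `indexing='xy'` returns arrays of shape `(len(y), len(x))`, so the rows follow `y` and the columns follow `x`, the same layout the loop fills in (the names `X`, `Y`, and `z_vec` are illustrative):

```python
import numpy as np

x = np.linspace(0, 10, 100)
y = np.linspace(0, 10, 100)

# X varies along the columns, Y varies along the rows
X, Y = np.meshgrid(x, y)
z_vec = (X - 5)**2 + 0.5 * (Y - 3)**2
print(z_vec.shape)

# Compare against the loop version
z = np.zeros([100, 100])
for i in range(len(x)):
    for j in range(len(y)):
        z[j, i] = (x[i] - 5)**2 + 0.5 * (y[j] - 3)**2
print(np.allclose(z, z_vec))
```

For grids of this size the difference in speed is small, but the vectorised form scales much better as the grid grows.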

In [None]:
plt.contour(x,y,-z,20)
plt.xlabel("Some important thing (MJ/kg)")
plt.ylabel("Coverage of Dyson Sphere (%)")
plt.colorbar()

In [None]:
plt.pcolormesh(x,y,-z,cmap='gray')
plt.xlabel("Some important thing (MJ/kg)")
plt.ylabel("Coverage of Dyson Sphere (%)")
plt.colorbar()