# Numpy


Numpy is a high-performance numerical library for Python. It provides a large number of utility functions for scientific computation. Before using it, however, we need to import it.



In [None]:
import numpy

Now all the juicy possibilities of Numpy are at our disposal.


## Creating and using an array


The basic data structure in Numpy is the `array`. There are several ways to create an `array` (more on this later). Here we will use the simplest one: converting a list to an array.

In [None]:
l = [1.0, 2.0, 3.0]
a = numpy.array(l)
print(a)

Now we can directly perform several operations on the array that are simply not possible with the list:

In [None]:
print(a*2.0)

In [None]:
#print(l*2.0) # Uncomment to see the TypeError: a list cannot be multiplied by a float

In [None]:
b = 2.0*a
c = b + a
print(c)

In [None]:
k = 2.0*l # Raises a TypeError: a list cannot be multiplied by a float
print(k)
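
The difference between a list and an array under multiplication can be sketched as follows. This is only an illustration; the variable names (`values`, `arr`, `repeated`, `scaled`) are our own and do not appear in the cells above.

```python
import numpy

values = [1.0, 2.0, 3.0]       # a plain Python list
arr = numpy.array(values)      # the same numbers as a Numpy array

# Multiplying a list by an integer repeats the list.
repeated = values * 2

# Multiplying an array by a number scales every element.
scaled = arr * 2.0

# Multiplying a list by a float is not defined and raises a TypeError.
try:
    values * 2.0
    list_times_float_failed = False
except TypeError:
    list_times_float_failed = True

print(repeated)
print(scaled)
print(list_times_float_failed)
```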

It is worthwhile to browse the Numpy documentation. We can, for instance, sum the whole array or calculate trigonometric functions with ease:

In [None]:
print(numpy.sum(a))

In [None]:
print(numpy.cos(a*numpy.pi))

## Qualified imports

As we can see from the cell above, writing `numpy` everywhere quickly becomes tedious and clutters the code. We can mitigate this by giving Numpy a nickname. The *de facto* standard is to call it `np`.

In [None]:
import numpy as np

In [None]:
print(np.cos(a*np.pi))

## Other array creations

Often we need a special type of array, say, one filled with ones or zeros, or one containing a sequence. Numpy has this covered.

In [None]:
print(np.ones(5)) # An array filled with ones

In [None]:
print(np.zeros(5)) # An array filled with zeros

In [None]:
print(np.linspace(start=1, stop=5, num=20)) # An array with 20 equally spaced points, starting at 1
                                            # and ending at 5 (both included)

In [None]:
print(np.arange(start=1, stop=5, step=0.5)) # An array from 1 up to (but not including) 5, in steps of 0.5
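
The two sequence functions differ in how they treat the end point, which is easy to overlook. A minimal sketch, where the names `by_count` and `by_step` are our own and `num=9` is chosen so that both calls use the same spacing:

```python
import numpy as np

# linspace: ask for a number of points; both end points are included.
by_count = np.linspace(start=1, stop=5, num=9)

# arange: ask for a step size; the stop value itself is left out.
by_step = np.arange(start=1, stop=5, step=0.5)

print(by_count)        # 9 values, from 1.0 up to and including 5.0
print(by_step)         # 8 values, from 1.0 up to 4.5
print(len(by_count), len(by_step))
```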

In [None]:
print(np.random.rand(6)) # An array with uniformly distributed random numbers between 0 and 1

## Multidimensional arrays

Numpy also supports multidimensional arrays. From a usage point of view, this is largely transparent: all the functions that work on a one-dimensional array also work on a multidimensional array. Some, however, might require a few extra arguments.

In [None]:
print(np.ones((5,5))) # An array filled with ones

In [None]:
print(np.zeros((5,5))) # An array filled with zeros

In [None]:
print(np.eye(5)) # An array like the identity matrix

In [None]:
print(np.random.rand(5,5)) # An array with uniformly distributed random numbers between 0 and 1

As an example of a function that might need extra arguments when dealing with multidimensional arrays, consider `sum`. If we use the `sum` function with no extra arguments, it sums over the whole array.

In [None]:
r = np.random.rand(3,4)
print(np.sum(r))

The shape of an array is available through its `shape` attribute:

In [None]:
r.shape

We can change the behaviour of `sum` with the `axis` argument, which selects the axis to sum over.

In [None]:
print(np.sum(r, axis=0))
print(np.sum(r, axis=1))
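
Because `r` is random, it is hard to see at a glance what each `axis` value does. A small sketch with a fixed array makes the result checkable by hand; the array `m` and the result names are our own.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])         # shape (2, 3): 2 rows, 3 columns

total = np.sum(m)                 # sums every element
per_column = np.sum(m, axis=0)    # collapses axis 0 (the rows), one value per column
per_row = np.sum(m, axis=1)       # collapses axis 1 (the columns), one value per row

print(total)        # 21
print(per_column)   # [5 7 9]
print(per_row)      # [ 6 15]
```

The axis passed to `sum` is the one that disappears from the shape: here `per_column` has shape `(3,)` and `per_row` has shape `(2,)`.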

Several reduction functions work in a similar manner, where you can use the `axis` keyword to choose the axis to be reduced.

It is possible to get slices of your array using the `:` character.

In [None]:
print(r)
print()
print(r[:, 0]) # Print first column of r
print(r[0, :]) # Print first row of r
print(r[0, 1:4]) # Print the 2nd to 4th entry of the first row
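
The same slices are easier to verify on an array with known contents. The sketch below builds one with `np.arange` and `reshape` (the latter is not introduced in this notebook); the name `grid` and the extra slices are our own additions.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # rows: [0 1 2 3], [4 5 6 7], [8 9 10 11]

first_column = grid[:, 0]    # every row, column 0
first_row = grid[0, :]       # row 0, every column
middle = grid[0, 1:4]        # row 0, columns 1, 2 and 3 (the stop index is excluded)
every_other = grid[0, ::2]   # row 0, every second column
last_row = grid[-1, :]       # negative indices count from the end

print(first_column)   # [0 4 8]
print(first_row)      # [0 1 2 3]
print(middle)         # [1 2 3]
print(every_other)    # [0 2]
print(last_row)       # [ 8  9 10 11]
```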

Numpy functions and their clever combination are a powerful tool for data analysis. They help to avoid slow and complex loops, which leads to faster and often more readable code. Since we cannot introduce all `numpy` functions here, we encourage you to take a brief look at the official [numpy documentation](https://numpy.org/doc/). Many functions work in a similar way to the ones introduced here.
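
To illustrate what avoiding a loop looks like, here is a minimal sketch that evaluates the same expression once with an explicit Python loop and once with array operations. The expression and the variable names are made up for this example.

```python
import numpy as np

x = np.linspace(0, 1, num=5)

# Loop version: handle one element at a time and collect the results.
result_loop = []
for value in x:
    result_loop.append(value**2 + 1.0)

# Array version: the same arithmetic applied to all elements at once.
result_array = x**2 + 1.0

print(result_loop)
print(result_array)
print(np.allclose(result_loop, result_array))   # True
```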

# Matplotlib

This is the go-to plotting library for Python. We will generate a few plots with it, just to give you a brief idea of how to do simple visualizations. First, let's plot a simple function $y(x)$.

In [None]:
import matplotlib.pyplot as plt

In [None]:
x = np.linspace(0, 10, num=200)
y = 3*np.sin(2*np.pi*x) * np.exp(-x*0.3)

Matplotlib has the concept of a `figure`, which can contain several coordinate systems called `axes`. Here, we will restrict ourselves to figures with only one main axis.

In [None]:
fig, ax = plt.subplots() # Initialize a new figure with one axis
ax.plot(x, y) # Plot a line with coordinates x, y in this axis
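
A plot is usually easier to read with labelled axes. The following sketch repeats the plot above and adds labels, a title and a legend through the `ax` object; the label and title texts are our own choice.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, num=200)
y = 3*np.sin(2*np.pi*x) * np.exp(-x*0.3)

fig, ax = plt.subplots()
ax.plot(x, y, label="damped sine")    # the label is picked up by the legend
ax.set_xlabel("x")                    # text below the horizontal axis
ax.set_ylabel("y")                    # text next to the vertical axis
ax.set_title("A damped oscillation")  # text above the axis
ax.legend()                           # draw a legend from the labelled lines
```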

We can also create histograms with matplotlib.

In [None]:
fig, ax = plt.subplots()
y = np.random.randn(10000)
_ = ax.hist(y, bins=80, density=True)
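
To see what `density=True` does to the bar heights, the sketch below uses `np.histogram`, which computes the bins without drawing anything. It assumes a seeded generator from `np.random.default_rng` instead of `np.random.randn`, purely so that the run is repeatable; all variable names are our own.

```python
import numpy as np

rng = np.random.default_rng(seed=0)      # seeded so the numbers are repeatable
samples = rng.standard_normal(10000)

counts, edges = np.histogram(samples, bins=80)               # raw counts per bin
density, _ = np.histogram(samples, bins=80, density=True)    # normalised heights

widths = np.diff(edges)                  # width of every bin
area = np.sum(density * widths)          # total area under the normalised histogram

print(counts.sum())   # 10000, every sample falls into some bin
print(area)           # 1.0 up to rounding
```

With `density=True`, each height is the bin count divided by the total number of samples and by the bin width, so the histogram integrates to one.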

And even a 2D histogram.

In [None]:
fig, ax = plt.subplots()
x = np.random.randn(100000)
y = np.random.randn(100000)
hist, xedges, yedges, image = ax.hist2d(x, y, bins=80, density=True) # `density` replaces the removed `normed` argument
plt.colorbar(image)

Again, we encourage you to have a look at the official [matplotlib plotting documentation](https://matplotlib.org/api/pyplot_summary.html) and the [example gallery](https://matplotlib.org/gallery/index.html).

## Cool Numpy tricks

## Masking/bit list

Frequently, we want to get the parts of an array that fulfill certain conditions. In that case, we can use what is called _masking_. We can pass an array of booleans (i.e., True or False elements) as the index and get only the elements where the mask is True.

This concept is easier to show than to explain, so let's jump right into it.

In [None]:
a = np.array([1,2,3])
mask = np.array([True, False, False])
print(a) #Print the whole array
print(a[mask]) #Print only the parts that have True in mask

One caveat is that the array and the mask _must have the same size_. If the sizes are incompatible, we will get an error

In [None]:
a = np.array([1,2,3])
mask = np.array([True, False, False, True])
print(a) #Print the whole array
print(a[mask]) #Raises an IndexError: the mask has 4 entries but the array has only 3
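
If you want to handle this case instead of letting the cell fail, one option is to compare the shapes first or to catch the exception. This is only a sketch; the names `shapes_match` and `selected` are our own.

```python
import numpy as np

a = np.array([1, 2, 3])
mask = np.array([True, False, False, True])

# Comparing the shapes beforehand tells us whether the mask fits the array.
shapes_match = a.shape == mask.shape

# Indexing with a mask of the wrong length raises an IndexError.
try:
    selected = a[mask]
except IndexError as error:
    selected = None
    print("Indexing failed:", error)

print(shapes_match)   # False
```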

Numpy evaluates comparisons element by element and returns an array of booleans, so we can even make masks out of other arrays:

In [None]:
a = np.array([3, 9, -1, 20, -100, 300])
mask = a > 0
print(mask)
print(a[mask])
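
Masks can also be combined. A minimal sketch, using the element-wise operators `&` (and), `|` (or) and `~` (not); the parentheses around each comparison are required because these operators bind more tightly than `>` and `<`. The mask names are our own.

```python
import numpy as np

a = np.array([3, 9, -1, 20, -100, 300])

in_range = (a > 0) & (a < 100)    # positive AND smaller than 100
outside = (a < 0) | (a > 100)     # negative OR larger than 100
not_positive = ~(a > 0)           # the inverse of the mask a > 0

print(a[in_range])       # [ 3  9 20]
print(a[outside])        # [  -1 -100  300]
print(a[not_positive])   # [  -1 -100]

# A mask can also be used to assign to the selected elements.
clipped = a.copy()
clipped[clipped < 0] = 0
print(clipped)           # [  3   9   0  20   0 300]
```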

A more complex use might be, for instance, to plot only the positive parts of a given function:

In [None]:
x = np.linspace(0, 10, 1000) # create an array with 1000 points between 0 and 10
y = 3*np.sin(2*np.pi*x) * np.exp(-x*0.3) # A funky looking function

mask = y > 0 # Useful to select only the positive parts of the function

In [None]:
fig, ax = plt.subplots() # Initialize a new figure with one axis
ax.plot(x[mask], y[mask], ".") # Plot the selected points as dots in this axis

Another cool trick is that we can pass an array or list of indices, and automatically select these indices in such order

In [None]:
idx = [3, 1] # We will select the fourth and second places of the array, in that order
a = np.array([-12, -11, 0, 11, 22, 33, 44])
print(a[idx])
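
Index lists become particularly handy together with `np.argsort`, which returns the indices that would sort an array. The sketch below is an illustration with made-up data; `values` and `labels` are hypothetical arrays, not part of this notebook.

```python
import numpy as np

a = np.array([-12, -11, 0, 11, 22, 33, 44])

picked = a[[3, 1]]         # fourth and second entry, in that order
repeated = a[[0, 0, -1]]   # indices may repeat, and negative indices count from the end

values = np.array([30, 10, 20])
labels = np.array(["c", "a", "b"])

order = np.argsort(values)   # indices that sort `values` in ascending order

print(picked)          # [ 11 -11]
print(repeated)        # [-12 -12  44]
print(order)           # [1 2 0]
print(values[order])   # [10 20 30]
print(labels[order])   # ['a' 'b' 'c']
```

Applying the same `order` to both arrays keeps the labels aligned with their values.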

# Power Jupyter notebook user tip of the day

Adding a `?` to the end of any function name will provide you with a short help window.  
Task: Try to change the normalisation of the histogram above.

In [None]:
ax.hist2d?