<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/80x15.png" /></a><div align="center">This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.</div>

# Python basics

This course assumes some familiarity with computer programming,
and basic knowledge of the Python programming language and NumPy,
such as you received in the **BIO 134** course.

For a quick refresher, you can have a look at:

* [Learn _X_ in _Y_ minutes, where _X=Python3_](https://learnxinyminutes.com/docs/python3/), <https://learnxinyminutes.com/docs/python3/>
* [NumPy for MATLAB users](http://mathesaurus.sourceforge.net/matlab-numpy.html), <http://mathesaurus.sourceforge.net/matlab-numpy.html>

## Lists

Lists are a built-in Python type that provides an _ordered collection of items, possibly of different types._

To create a list, enclose the items in square brackets:

In [None]:
l = [1, 2.0, 'three']

You can access individual items of a list by using Python's `[]` operator:

In [None]:
l[0]

In [None]:
l[1]

**Note:** In Python, indices always start from 0.

You can also *modify* a list by assigning to specific places:

In [None]:
# replace 2nd item in `l`
l[1] = 2

# replace 3rd item in `l`
l[2] = 3

In [None]:
# `l` is now a fully numeric list
print(l)

Python provides a few built-in functions for computing features of lists of numbers:

In [None]:
# return number of items in list
len(l)

In [None]:
# return sum of numbers in list
sum(l)

In [None]:
# return maximum item in list
max(l)

In [None]:
# return minimum item in list
min(l)

A typical pattern is to build a list one item at a time, by starting with an empty list and appending items one at a time:

In [None]:
square5 = []
for n in range(5):
    square5.append(n * n)
    
print(square5)

In addition to `.append()` for appending items to a list, Python provides [many list operations](https://www.programiz.com/python-programming/methods/list), like `.sort()` for sorting a list *in-place*, `.count(x)` for counting the occurrences of an item `x`, etc.
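
As a small illustration of the two methods just mentioned, here is a sketch on a made-up list (the name `grades` and its values are invented for this example):

```python
# hypothetical example data, not part of the course material
grades = [4, 6, 5, 6, 3]

# `.count(x)` returns how many times `x` occurs in the list
n_sixes = grades.count(6)

# `.sort()` reorders the list in-place and returns `None`
grades.sort()

print(n_sixes)
print(grades)
```

If the original order must be kept, the built-in function `sorted(grades)` returns a new sorted list and leaves `grades` untouched.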

## Functions

Functions are defined with the `def` keyword; after definition, they can be called anywhere by using the standard notation *fn(params)*.

Example:

In [None]:
def squares(nums):
    "Given a list `nums` of numbers, return list of their squares."
    result = []
    for item in nums:
        result.append(item * item)
    return result

s = squares([1,2,3])
print(s)

## Using external libraries

Python comes with little functionality built-in.  Most functions must be _imported_ from _packages_.

For instance, the [`random`](https://docs.python.org/3/library/random.html) package provides functions for generating (pseudo) random numbers. After the following instruction, we can use all functions in the `random` package by prefixing their name with `random.`:

In [None]:
import random

In [None]:
# this yields a different result each time it's evaluated!
random.random()

-----

## Exercise 1.

Generate a list of 10 random numbers.

In [None]:
# insert your code here and evaluate the cell

## Exercise 2.

Write a function `randlist(N)` that generates and returns a list of *N* random numbers.

In [None]:
# insert your code here and evaluate the cell

## Exercise 3.

Write a function `avg(L)` which, given a list `L` of numbers, returns their mean value.

In [None]:
# insert your code here and evaluate the cell

## Exercise 4. 

Write a function `median(L)` which, given a list `L` of numbers, returns their median value.

In [None]:
# insert your code here and evaluate the cell

## Exercise 5. _(difficult)_ 

Write a function `mode(L)` which, given a list `L` of numbers, returns its *mode*, i.e., the number that occurs most frequently in the list or `None` if the distribution of values is multi-modal. _(Hint: use dictionaries.)_

In [None]:
# insert your code here and evaluate the cell


-----

## Plotting

[Matplotlib](http://matplotlib.org/gallery.html) is the most-used plotting library in the Python community: it provides a large array of (mostly low-level) facilities for making plots, and a higher-level interface largely inspired by MATLAB's plotting system.

[Seaborn](http://seaborn.pydata.org/index.html) is an add-on library that provides:

* better default visual styles
* easier plotting functions for many commonly-used types of plots

In [None]:
%matplotlib inline

import matplotlib.pyplot as plt
import seaborn as sns

Function `sns.barplot` can be used to make a bar plot:

In [None]:
sns.barplot(x=[1,2,3,4], y=[1,4,9,16])

Function `plt.plot()` is used to make a line plot instead (*note:* no explicit `x=` and `y=` here):

In [None]:
plt.plot([1,2,3,4], [1,4,9,16])

NumPy arrays can (and should!) be used in place of lists when doing any serious plotting.

Placing plots side-by-side or arranging them in a grid takes a bit more work:

1. Initialize a grid of plots, through function `fig, axes = plt.subplots(rows, columns)`.
2. Select the position of a plot in the grid by extracting its "canvas" from the `axes` array: the first index is the row (0 = top), the second index is the column (0 = leftmost); e.g., `canvas = axes[1][2]` is the plot in the third column of the second row.
3. Place a plot: for the `sns.*plot()` functions the canvas is just an additional parameter `ax=...`; for line plots one must replace `plt.` with the canvas object.

In [None]:
# initialize a 2x2 grid of plots
fig, axes = plt.subplots(2, 2, figsize=[10, 7])

# `axes` is the grid of plotting canvases: axes[row][col]
ul = axes[0][0]  # upper left
ur = axes[0][1]  # upper right
ll = axes[1][0]  # lower left
lr = axes[1][1]  # lower right

# the `sns.*plot()` functions require the drawing canvas as additional parameter `ax=`
sns.barplot(x=[1,2,3], y=[1,4,9], ax=ul)

# in the `plt.plot()` function one must substitute `plt.` with the canvas object
lr.plot([1,2,3], [1,0,-1], color='red')

## NumPy

NumPy is a Python package that provides:

* a multi-dimensional *array* and whole-array operations
* fast matrix and vector operations _(not covered here)_
* a library of mathematical functions

To use NumPy in your code, you first need to import it:

In [None]:
import numpy as np

After this `import ... as` statement, you can use all NumPy functions by prefixing them with `np.`

## NumPy arrays

NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.

NumPy arrays (also called *ndarray*) are [constructed](https://docs.scipy.org/doc/numpy/user/basics.creation.html) with function `np.array`:

In [None]:
a = np.array([1, 2, 3, 4, 5])

They behave a lot like Python lists:

In [None]:
# get items using the [] operator
a[0]

In [None]:
# get a slice using [:]
a[0:3]

Note that a slice of an *ndarray* is again an *ndarray*.

In [None]:
# set an item using [] =
a[4] = 0

In [None]:
print(a)

Adding elements to an *ndarray* does not work as it does for *lists*:

In [None]:
# raises AttributeError: an ndarray has no `.append()` method
a.append(4)

Note that *ndarrays* are **homogeneous** -- you cannot mix numbers and strings, nor even integers and floats:

In [None]:
# raises ValueError: the string 'five' cannot be converted to an integer
a[4] = 'five'

In [None]:
# here, `5.5` is silently truncated to the integer `5`
a[4] = 5.5

In [None]:
a

### Setting the type of array items

Arrays are homogeneous (all elements must have the same type) and the [type](https://docs.scipy.org/doc/numpy/user/basics.types.html) is set at array creation time:

In [None]:
a = np.array([1, 2, 3, 4, 5], dtype=np.float64)

a[4] = 5.5

# show a
print(repr(a))
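
When no `dtype=` is given, NumPy picks a type from the items themselves. A minimal sketch of how to check which type was chosen, using the `.dtype` attribute (the array names `ints` and `floats` are made up for this illustration):

```python
import numpy as np

ints = np.array([1, 2, 3])      # all integers: an integer type is chosen
floats = np.array([1, 2.5, 3])  # one float is enough to make every item a float

print(ints.dtype)
print(floats.dtype)
```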

### Shortcuts for creating special arrays

Functions are available to create special arrays and matrices (see [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.eye.html#numpy.eye) for a complete reference):
    
* [`np.zeros(n)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.diag) returns a 1D array of zeros with `n` elements;
* [`np.ones(n)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.diag) returns a 1D array of 1's with `n` elements;
* [`np.random.rand(n)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.diag) returns a 1D array of `n` random values, sampled from a uniform distribution over the real interval `[0, 1)`;
* [`np.random.randn(n)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.diag) returns a 1D array of `n` random values, sampled from a univariate “normal” (Gaussian) distribution of mean 0 and variance 1.
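
The four functions listed above can be tried out as follows; this is only a sketch, and the length `n = 4` and the variable names are arbitrary choices made for the illustration:

```python
import numpy as np

n = 4  # arbitrary length chosen for this illustration

zeros = np.zeros(n)          # n items, all equal to 0.0
ones = np.ones(n)            # n items, all equal to 1.0
uniform = np.random.rand(n)  # n values in [0, 1), different at each run
normal = np.random.randn(n)  # n values from the standard normal distribution

print(zeros, ones)
print(uniform)
print(normal)
```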

The `np.arange()` function can be used to generate an *ndarray* containing equally-spaced points from an interval on the real line:

In [None]:
# points in the interval [0,4) spaced 0.1 apart
x = np.arange(0, 4, 0.1)

In [None]:
x

We can easily create an array with constant value:

In [None]:
y1 = np.ones(len(x))

In [None]:
y1

A line plot of y1 over x will just show a flat horizontal line:

In [None]:
plt.plot(x, y1)

The `np.linspace()` function is similar to `np.arange()`, but its third parameter is the *number of points to generate* (as opposed to the difference between two consecutive points), and the end of the interval is included by default:

In [None]:
y2 = np.linspace(0.5, 3, len(x))

In [None]:
y2

In [None]:
plt.plot(x, y2)
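
The difference between `np.arange()` and `np.linspace()` is easier to see on a small interval. The following sketch uses made-up names (`by_step`, `by_count`) and the interval from 0 to 1:

```python
import numpy as np

# `np.arange(start, stop, step)`: the third argument is the step,
# and `stop` itself is excluded
by_step = np.arange(0, 1, 0.25)

# `np.linspace(start, stop, num)`: the third argument is the number of points,
# and `stop` is included by default
by_count = np.linspace(0, 1, 5)

print(by_step)   # 4 points, the last one is 0.75
print(by_count)  # 5 points, the last one is 1.0
```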

-----

## Exercise 6.

How can you modify the plotting code above to make the line red?  How can you make it thicker?

In [None]:
# hint: look at the docs for `matplotlib.pyplot.plot`

-----

Plotting different series of data in the same figure requires a bit more work:

1. First use the [`plt.subplots`](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.subplots) function to create a *figure* and an *axes* object.
2. An [*axes* object](http://matplotlib.org/api/axes_api.html#matplotlib.axes.Axes) is a "frame" for a single plot -- use its [`.plot()`](http://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.plot.html#matplotlib.axes.Axes.plot) method to lay a graph onto the canvas.  Each invocation of [`.plot()`](http://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.plot.html#matplotlib.axes.Axes.plot) *adds* a plot onto the canvas.
3. The [*figure* object](http://matplotlib.org/api/figure_api.html#matplotlib.figure.Figure) contains all the axes and can be used for saving the final output with [`.savefig()`](http://matplotlib.org/api/figure_api.html#matplotlib.figure.Figure.savefig).

In [None]:
fig, ax = plt.subplots(1, 1, figsize=[10, 7])

ax.plot(x, y1)
ax.plot(x, y2)
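
Saving a figure with `.savefig()` could look like the sketch below. The data, the variable names (`t`, `demo_fig`, `demo_ax`) and the output file name are all invented for this example; the file is written to the system's temporary directory so that nothing in your working folder is overwritten:

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

# made-up data, just to have something to draw
t = np.linspace(0, 1, 20)

demo_fig, demo_ax = plt.subplots(1, 1, figsize=[10, 7])
demo_ax.plot(t, t)      # first line
demo_ax.plot(t, t * t)  # second line, added onto the same axes

# hypothetical output location; the file format is inferred from the extension
out_path = os.path.join(tempfile.gettempdir(), "two_lines.png")
demo_fig.savefig(out_path)
```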

## Whole-array operations

NumPy arrays implement operators `+`, `-`, `*`, `/` as *element-wise operations*.

For instance, the array sum `a+b` is the array whose `i`-th element is `a[i]+b[i]`:

In [None]:
a = np.array([1, 2, 3])
b = np.array([0.1, 0.2, 0.3])
c = a + b

In [None]:
c

Operations involving an array and a scalar value promote the scalar to a constant array having the same *shape*:

In [None]:
y_a = y2 + 1.0
y_b = y2 - 1.0

In [None]:
fig, ax = plt.subplots(1, 1, figsize=[10, 7])

ax.plot(x, y_a, linestyle='-.')
ax.plot(x, y2)
ax.plot(x, y_b, linestyle='-.')

-----

## Exercise 7.

Plot the graph of function $f(x) = 1/x$

In [None]:
# insert your code here and evaluate the cell

-----

## Mathematical functions

NumPy provides a [vast choice of mathematical functions](https://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs) that *operate element-wise on arrays:* among them `np.sin`, `np.cos`, `np.tan`,
`np.log`, `np.log2`, `np.log10`, `np.exp`.
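
A quick way to see what "element-wise" means is to apply one of these functions to a small array. The array `v` and its values are chosen only for this illustration:

```python
import numpy as np

v = np.array([1.0, 2.0, 4.0, 8.0])  # sample values chosen for this illustration

log2_v = np.log2(v)          # base-2 logarithm of each item
roundtrip = np.exp(np.log(v))  # `exp` undoes `log`, item by item

print(log2_v)
print(roundtrip)
```

Each result is again an array with the same number of items as `v`.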

-----

## Exercise 8.

Plot the functions $\sin(x)$ and $2 \cdot \cos(x) - 1$ on the interval $[-\pi, +\pi]$.

In [None]:
# insert code here and evaluate

-----

## Saving and loading

NumPy provides functions [`array = np.loadtxt(filename, dtype)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html#numpy.loadtxt) and [`np.savetxt(filename, array)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html) that should be used to do *efficient* I/O of large arrays.

SciPy additionally provides functions [`scipy.io.loadmat()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.loadmat.html) and [`scipy.io.savemat()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.savemat.html) to load/save matrices in MATLAB `.mat` format.
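
A save-and-reload round trip with the two NumPy functions might look like the following sketch. The array contents, the variable names and the file name are made up; the file is placed in the system's temporary directory:

```python
import os
import tempfile

import numpy as np

data = np.arange(6, dtype=np.float64).reshape(2, 3)  # made-up 2x3 array

# hypothetical file name in the system's temporary directory
path = os.path.join(tempfile.gettempdir(), "example_array.txt")

np.savetxt(path, data)                       # write as plain text, one row per line
loaded = np.loadtxt(path, dtype=np.float64)  # read it back

print(np.array_equal(data, loaded))
```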

## Further reading

Nicolas Rougier has written excellent tutorials on the use of Matplotlib and NumPy:

* [Matplotlib tutorial](https://www.labri.fr/perso/nrougier/teaching/matplotlib/)
* [NumPy tutorial](http://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html)

The Seaborn library comes with a good tutorial written by its author (note that, since Seaborn is an add-on to Matplotlib, some knowledge of Matplotlib is assumed):

* [Seaborn tutorial](http://seaborn.pydata.org/tutorial.html)