# Tutorial
## Preliminaries
Before following this tutorial, we need to set up the tools and load the data. The notebook imports several packages, so before running it you should create an environment (conda or virtualenv) with jupyter, matplotlib, numpy, scipy, and scikit-image.

You can use the Anaconda Navigator to do this; see:

https://docs.continuum.io/anaconda/navigator/getting-started.html

Or from the terminal (command window), for example:

`conda create -n BIO399E jupyter matplotlib numpy scipy scikit-image`

and then activate it, e.g. on Mac/Linux:

`source activate BIO399E`

or on Windows:

`activate BIO399E`

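Alternatively, select the conda environment from within Jupyter. On recent conda versions (4.4 and later), `conda activate BIO399E` works on all platforms.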

## Modules
First import the standard tools, numpy and matplotlib. Both packages are very well documented; more information can be found here:

http://www.matplotlib.org

http://www.numpy.org

In [None]:
import numpy as np
import matplotlib
np.__version__

## Numpy arrays
Lists are a simple way to store collections of data, but they are not very flexible. To deal with numerical data it is better to use a package called numpy, which stores data in n-dimensional arrays. The simplest array is a lot like a list, and we can make it from a list:


In [None]:
days = [31,28,31,30,31,30,31,31,30,31,30,31]
adays = np.array(days)
print(adays)
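
To see why an array is more convenient than a list for numerical work, the following sketch compares the same arithmetic on both. The names `doubled_list` and `doubled_array` are only illustrative and are not used elsewhere in this notebook.

```python
import numpy as np

days = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
adays = np.array(days)

# Multiplying a list repeats it; multiplying an array scales every element.
doubled_list = days * 2      # 24 entries: the list twice over
doubled_array = adays * 2    # 12 entries: each value doubled

print(len(doubled_list))
print(doubled_array)

# Arrays also carry a shape and a data type.
print(adays.shape, adays.dtype)
```

The elementwise behaviour of arrays is what lets the later cells compute means, deviations and variances without writing loops.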

We can also make an array from scratch and fill it with zeros, ones, random values, etc., and combine arrays to compute functions:

In [None]:
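a = np.zeros((12,))            # 12 zeros
x = np.ones((100,))            # 100 ones
y = np.random.random((12,))    # 12 random values in [0, 1)

z = np.arange(1,13)            # integers 1 to 12 (the stop value is excluded)
w = z*5 + 0.5                  # arithmetic is applied element by element

print(x)
print(y)
print(z)
print(w)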

Now that we have the days in a numpy array, we can use numpy functions to do the calculations easily:

In [None]:
# Calculate mean using numpy
average_days = adays.mean()
print(adays)
print(average_days)

In [None]:
# Calculate variance using numpy
var_days = adays.var()
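print(var_days)

# Calculate variance step-by-step (same result)
# Squared deviation of each value from the mean
devs = (adays-average_days)**2
print(devs)

# Sum of the squared deviations
sum_devs = devs.sum()
print(sum_devs)

# Divide by the number of values
var_days = sum_devs/len(days)
print(var_days)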
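
The step-by-step calculation above divides by the number of values N, which matches numpy's default (the population variance). If the sample variance (dividing by N - 1) is wanted instead, `var` accepts a `ddof` argument. A minimal sketch, with `pop_var` and `sample_var` as illustrative names:

```python
import numpy as np

adays = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])

# Population variance: divide the summed squared deviations by N (numpy's default).
pop_var = adays.var()

# Sample variance: divide by N - 1 instead, selected with ddof=1.
sample_var = adays.var(ddof=1)

# The two differ by the factor N / (N - 1).
n = len(adays)
print(pop_var, sample_var)
print(np.isclose(sample_var, pop_var * n / (n - 1)))
```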

### 2-dimensional arrays
Numpy can handle arrays with any number of dimensions. For example, for images we will use 2-dimensional arrays (in a later class). Here is how to make a 2-d array:

In [None]:
# A 2-dimensional array with random values (10 rows, 5 columns)
twod = np.random.random((10,5))
print(twod)

# Means (variance, etc.) are computed along specified axes
print(np.mean(twod, axis=1))        # one mean per row
print(np.mean(twod, axis=(0,1)))    # mean over all elements
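
Because the random array above changes on every run, it can be hard to see what each `axis` value does. The sketch below uses a small array with known values (the array `small` is made up for illustration) so the results can be checked by hand.

```python
import numpy as np

# A small 2 x 3 array with known values (2 rows, 3 columns).
small = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

col_means = np.mean(small, axis=0)         # collapse the rows: one mean per column
row_means = np.mean(small, axis=1)         # collapse the columns: one mean per row
total_mean = np.mean(small, axis=(0, 1))   # collapse both: a single number

print(col_means)   # [2.5 3.5 4.5]
print(row_means)   # [2. 5.]
print(total_mean)  # 3.5
```

The axis named in the call is the one that disappears from the result, so `axis=0` on a 2 x 3 array leaves 3 values and `axis=1` leaves 2.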

## Exercise 1
As we saw above, numpy has many functions that perform calculations on arrays; see here:

https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#array-methods

Numpy also provides functions to load data from a text file.
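
As a rough illustration of text loading, the sketch below reads comma-separated numbers with `np.loadtxt`. The values are invented and held in an in-memory buffer (`fake_file`, a hypothetical stand-in for a file on disk), so it does not touch the data files used in the exercise.

```python
import io
import numpy as np

# Hypothetical file contents; a real file name could be passed to
# np.loadtxt in place of the StringIO object.
fake_file = io.StringIO("0.0,0.5,1.0,1.5\n")

# delimiter="," tells loadtxt that the numbers are separated by commas.
values = np.loadtxt(fake_file, delimiter=",")
print(values)
print(values.shape)
```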

In this folder you will find 3 text files: 'fluo.csv', 'od.csv', and 't.csv' (time). These files contain comma-separated lists of numbers corresponding to fluorescence and optical density (OD) measurements at times t.

1.1 Write code to load these files into 3 numpy arrays `fluo`, `od` and `t`:

1.2 Now calculate the mean and variance of each data set. What else can you calculate with numpy that might be useful?

## Matplotlib, making graphs
Matplotlib is a module that works well with numpy arrays and can make many kinds of graphs, charts, heatmaps, etc. The part of the module that does plotting is called pyplot. We import it like this, along with a magic command that displays plots inside this notebook:

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

For example, here is a simple plot of the fluorescence:

In [None]:
# Simple plot
plt.plot(fluo)
# We can also choose the color, point shape etc.
plt.plot(fluo, 'g+')

Make use of the matplotlib documentation to do the following:

## Exercise 2
2.1 Make a plot of each data set, `fluo` and `od`, against time `t`. Label the axes and give each plot a title.

2.2 Do the same as above, but plot the log of the data:

2.3 Plot `fluo` and `od` in the same plot. Try to give each data set its own axis scale so that `od` is clearly visible:

2.4 Plot `fluo/od` for all times:

## Exercise 3
3.1 Plot histograms of `fluo` and `od`:

3.2 Plot `fluo` against `od`:

3.3 Calculate the correlation between `fluo` and `od`:

## Data analysis, calculating gene expression

Here is a simple model for fluorescent gene expression from a single cell:

\begin{equation}
\frac{dF}{dt} = k(t) - \mu(t) F(t)
\end{equation}

where $\mu(t)$ is the growth rate. If we have $N \approx OD$ cells, then we measure the sum of their gene expression $I(t) = F(t)OD(t)$ and we can show that:

\begin{equation}
k(t) = \frac{1}{OD}\frac{dI}{dt}
\end{equation}
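
The step from the single-cell model to this result can be sketched as follows, under the additional assumption (not stated above) that the culture grows as $\frac{dOD}{dt} = \mu(t)\,OD(t)$, that is, $\mu$ is the specific growth rate. Applying the product rule to $I(t) = F(t)OD(t)$ and substituting the model for $\frac{dF}{dt}$:

\begin{equation}
\frac{dI}{dt} = \frac{dF}{dt}OD + F\frac{dOD}{dt} = \left(k - \mu F\right)OD + F\mu\,OD = k\,OD
\end{equation}

The two terms containing $\mu$ cancel, and dividing both sides by $OD$ gives the expression for $k(t)$ above.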

## Exercise 4 (Advanced)
Given the data above with $I(t)$ in the variable `fluo`, how would you calculate the gene expression rate $k(t)$ at each time `t`?