## Advanced Data Analysis - Getting to know Jupyter
|Learning objectives of this class|Data tasks you will perform|
|:-------|:---|
|1) Become familiar with Jupyter notebooks|Automating analysis|
|2) Basic Python data structures|Plot raw data from an intracellular recording, automatically detect spikes and plot a histogram of instantaneous firing rate|
|3) Loading and using modules/libraries|Plot the raw data from a voltage clamp experiment of Kv currents, plot the IV and activation curves|
|4) Importing data: text, csv and image files|Load images and movies, display multiple contours on images|
|5) NumPy arrays, indexing and exploring| |
|6) Visualise data| |

# Basic functions of Jupyter and python objects

In [None]:
# the hash symbol comments out your code, i.e. it is ignored when you run the cell
# execute code
print('ctrl + enter will execute this cell') # everything after the # symbol on each line is "commented"
print('whereas shift + enter will execute this cell and move to the next')

In [None]:
print('When a cell is green it is in the "edit mode" and you can type in it \n')
print('ctrl+enter to execute this cell\n') 
print('it is now blue and in the "command mode", you can use the arrow keys to move up and down \nto different cells\n')
print('The b key will add a new cell below the selected cell in command mode')
print('Press b and in the new cell type a="hello ", on a new line type b=1 and again on \nanother line type c=0.1')

**a, b, and c now exist in memory; you can see them in the variable inspector**

In [None]:
print("a is a ",type(a))
print("b is a ",type(b))
print("c is a ",type(c))


**Strings**
- addition
- converting numbers
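A minimal sketch of the two string operations listed above. The variable names are illustrative and are not part of the class notebook.

```python
# Illustrative names only; these variables are not from the class notebook.
greeting = "hello "
name = "world"

# Adding two strings joins (concatenates) them
message = greeting + name
print(message)

# A number must be converted with str() before it can be added to a string
n_spikes = 12
print("number of spikes: " + str(n_spikes))

# A string that contains a number can be converted back with int() or float()
as_int = int("42")
as_float = float("0.1")
print(as_int + 1, as_float * 2)
```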

**Variables**
- int (integers)
- float (floating point number, i.e. has a decimal point)

**Bools (boolean)** 
- used when data is either/or, 0 or 1 
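The short sketch below shows how ints, floats and bools behave. The values are made up for illustration.

```python
# Illustrative values only; the names below are not from the class data.
n_cells = 5                      # int: a whole number
resting_mV = -65.5               # float: has a decimal point
is_spiking = resting_mV > -50    # bool: the result of a comparison

print(type(n_cells), type(resting_mV), type(is_spiking))

# Mixing an int and a float in arithmetic gives a float
total = n_cells + resting_mV
print(type(total))

# Bools behave like the integers 0 and 1 in arithmetic
n_true = True + True + False
print(n_true)
```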

**Lists (containers of other things including lists)** 

In [None]:


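e = []  # assumed starting point: an empty list (e must exist before you can append to it)
f = ['lists are just lists of things', 'that can contain any data type']
e.append(f)  # a list can be an item inside another list
e.append('the last item in this list')

print(e)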


**Lists are useful for appending things to; you can query how many items are in a list with len()**

In [None]:
print('there are ',len(e), 'items in e')

**Indexing/slicing**
- how to address/get different parts of a list
- watch [this video](https://youtu.be/KAXvMbD1Zac) if you are unsure about how indexing works 
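As a quick reference, the sketch below shows the common ways of indexing and slicing a list. The list itself is a made-up example.

```python
letters = ['a', 'b', 'c', 'd', 'e']   # made-up example list

first = letters[0]          # indexing starts at 0
last = letters[-1]          # negative indices count back from the end
middle = letters[1:4]       # slice: start is included, stop is excluded
every_other = letters[::2]  # start:stop:step, here every second item

print(first, last, middle, every_other)
```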

## For loops

**The first line of a for loop should be:**

<code> for 'item' in 'iterable':
        the content of the loop
        is everything at one indent after the
        first line</code>
    
**Loops iterate over data sets and repeat laborious tasks for you, i.e. they automate analysis**

In [None]:
alist=['each','of','these','is','the','contents','of','the','list','at','the','specified','index','of','the','list']
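One way the list above could be looped over is sketched here. The names `word`, `index` and `wordLengths` are illustrative choices.

```python
alist = ['each', 'of', 'these', 'is', 'the', 'contents', 'of', 'the', 'list',
         'at', 'the', 'specified', 'index', 'of', 'the', 'list']

# The loop variable (here called word) takes the value of each item in turn
for word in alist:
    print(word)

# enumerate() also gives the index of each item
for index, word in enumerate(alist):
    print(index, word)

# Loops can build up results, e.g. the length of every word
wordLengths = []
for word in alist:
    wordLengths.append(len(word))
print(wordLengths)
```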

**Or you can use the more general purpose "range()"; it has similar syntax to np.arange**
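A small sketch of looping with `range()`, using a made-up list:

```python
alist = ['a', 'short', 'list']   # made-up example list

# range(stop) counts from 0 up to, but not including, stop
for i in range(len(alist)):
    print(i, alist[i])

# range(start, stop, step) takes the same arguments as np.arange,
# but range only accepts whole numbers
evens = list(range(0, 10, 2))
print(evens)
```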

**Doing maths with lists....**

In [None]:
numbersInList=[0,1,2,3,4,5,6,7,8,9]
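The sketch below shows what happens when you try arithmetic on a plain list. Multiplying repeats the list rather than scaling the numbers, and adding a number raises an error.

```python
numbersInList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Multiplying a list by 2 repeats the list; it does not double each number
repeated = numbersInList * 2
print(len(repeated))

# Adding a number to a list is not allowed and raises a TypeError
try:
    numbersInList + 1
    additionWorked = True
except TypeError as err:
    additionWorked = False
    print('TypeError:', err)

# With plain lists, maths has to be done one item at a time in a loop
doubled = []
for n in numbersInList:
    doubled.append(n * 2)
print(doubled)
```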


**The basics**  
**You have learnt about:**
   - jupyter edit and command modes
   - code **cells** and *markdown* ***cells***
   - *strings* and **variables: ints and floats**
   - lists: **learn more of their very useful functions here**: https://docs.python.org/3/tutorial/datastructures.html
   - how to query the data type of an object: "**type(object)**"
   - how to query the length of an object: "**len(object)**"
   - how to query an index of a list: "**list[index]**"
   - how to use for loops
   - you can't do element-wise maths directly on lists of numbers
   
   - There are other basic data classes, such as dictionaries {}, that we will not cover because we will not be using them. You can learn more about them here: https://docs.python.org/3/tutorial/datastructures.html#dictionaries 

**Break**

# Using Python modules/libraries/packages
**Python has been extended with modules developed by a massive community to enable you to do pretty much any kind of analysis or data visualisation that you need**


**To use any of these modules you first need to install the package/module:**
- via the Anaconda environment package manager
- or via pip install in the terminal or in a cell

##### Note on terminology: Libraries are collections of packages which are collections of modules

In [None]:
import numpy as np      # this imports all of numpy, which you will now refer to as np
from matplotlib import pyplot as plt # this imports the pyplot module from the matplotlib
# library, which you will now refer to as plt

plt.rcParams.update({'font.size': 14})  # sets the font size of all plots to 14

**NumPy (np)**
- Converting lists to arrays
- Maths
- Indexing (watch [this video](https://youtu.be/KAXvMbD1Zac) if you are unsure about how indexing works) 
- Place Holders / generating arrays    

**_See the NumPy cheat sheet on Minerva_**

In [None]:
a=np.asarray(numbersInList)   #we have now converted the list of numbers to an array
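Once the list is an array, maths applies to every element at once. A minimal sketch (the result names are illustrative):

```python
import numpy as np

numbersInList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
a = np.asarray(numbersInList)

# Arithmetic is applied to every element of the array
doubled = a * 2
shifted = a + 1
print(doubled, shifted)

# Arrays also have built-in summary methods
print(a.mean(), a.max())

# Indexing and slicing work as for lists ...
firstThree = a[:3]
# ... and a comparison can be used to pick out elements (boolean indexing)
bigValues = a[a > 5]
print(firstThree, bigValues)
```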

**Place holders / generating data**
- np.arange https://numpy.org/doc/stable/reference/generated/numpy.arange.html
- np.linspace https://numpy.org/doc/stable/reference/generated/numpy.linspace.html
- np.zeros https://numpy.org/doc/stable/reference/generated/numpy.zeros.html?highlight=zeros#numpy.zeros
- np.ones https://numpy.org/doc/stable/reference/generated/numpy.ones.html
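A short sketch of the four functions listed above. The argument values are arbitrary examples.

```python
import numpy as np

# np.arange(start, stop, step): evenly spaced values, stop is not included
steps = np.arange(0, 10, 2)

# np.linspace(start, stop, num): num evenly spaced values, stop is included
points = np.linspace(0, 1, 5)

# np.zeros and np.ones make arrays of a given shape filled with 0 or 1,
# useful as place holders to be filled in later
emptyTrace = np.zeros(5)
baseline = np.ones((2, 3))   # 2 rows, 3 columns

print(steps)
print(points)
print(emptyTrace.shape, baseline.shape)
```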

## Loading data
- The data is a current clamp recording from a mitral cell in the olfactory bulb; a current injection is given to evoke action potential firing
- We are going to plot the raw data, use event detection to automatically identify spikes and then calculate the average firing rate of the cell
- See np.genfromtxt for how to load different formats or sections of files https://numpy.org/doc/stable/reference/generated/numpy.genfromtxt.html
- If you are dealing with Excel-type data, the pandas library is very good for loading and displaying Excel files
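To illustrate how `np.genfromtxt` handles different layouts without needing a file on disk, the sketch below reads from in-memory text. The contents are made up and do not represent the class data files.

```python
import io
import numpy as np

# A single column of numbers, one value per line (made-up contents)
textFile = io.StringIO("1.0\n2.0\n3.0\n")
trace = np.genfromtxt(textFile)
print(trace.shape)

# Comma separated columns with a header row that should be skipped
csvFile = io.StringIO("time,voltage\n0.0,-65.0\n0.1,-64.5\n")
table = np.genfromtxt(csvFile, delimiter=',', skip_header=1)
print(table.shape)
print(table[:, 1])   # the second column
```

With a real file you would pass the file name instead of the `io.StringIO` object.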

In [None]:
data = np.genfromtxt('mitralSpikes.txt');


**Now we will visualise it**

In [None]:
plt.plot(data); # matplotlib enables very quick and simple display of data


**To make the plot more informative we need a time axis (x scale) and axis labels**

In [None]:
sampleRate = 15000
xScale = np.arange(len(data)) / sampleRate  # time axis: sample number divided by the sample rate
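A sketch of building the time axis and labelling the plot. It uses a synthetic trace in place of the recording, and it assumes the sample rate is in samples per second and the signal is in mV; check both against your own data.

```python
import numpy as np
from matplotlib import pyplot as plt

sampleRate = 15000   # assumed to be samples per second

# Synthetic stand-in for the recording: 1 second of a slow sine wave
data = -60 + 5 * np.sin(2 * np.pi * 3 * np.arange(sampleRate) / sampleRate)

# Time of each sample: sample number divided by the sample rate
xScale = np.arange(len(data)) / sampleRate

fig, ax = plt.subplots()
ax.plot(xScale, data)
ax.set_xlabel('Time (s)')
ax.set_ylabel('Membrane potential (mV)')   # assumed units
```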

**Now let's get the time of each action potential so that we can calculate spike rate**
- The SciPy library provides lots of fancy functionality
- We will use it to find the peaks, i.e. spikes, in this signal

In [None]:
import scipy.signal as signal

In [None]:
peaks, props = signal.find_peaks(data,height=0) # this function returns two things: 1) peaks,
# the location of each spike in sample points, and 2) props, which contains other details about the peaks


**We now have the location of every spike, so we can make a histogram of the instantaneous spike rates**
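One way to get from peak locations to instantaneous rates is sketched below. It uses a synthetic trace with spikes at known sample points, and the names `spikeTimes`, `isi` and `instRate` are illustrative. The instantaneous rate is taken as 1 divided by the interval between consecutive spikes.

```python
import numpy as np
import scipy.signal as signal
from matplotlib import pyplot as plt

sampleRate = 15000   # assumed to be samples per second

# Synthetic stand-in trace: flat at -60 with one-sample "spikes" up to +20
data = np.full(sampleRate, -60.0)
data[[1500, 2250, 3750, 6000, 9000]] = 20.0

peaks, props = signal.find_peaks(data, height=0)

spikeTimes = peaks / sampleRate   # sample points converted to seconds
isi = np.diff(spikeTimes)         # interval between consecutive spikes (s)
instRate = 1 / isi                # instantaneous firing rate (Hz)

fig, ax = plt.subplots()
ax.hist(instRate, bins=10)
ax.set_xlabel('Instantaneous firing rate (Hz)')
ax.set_ylabel('Count')

print('mean instantaneous rate:', instRate.mean(), 'Hz')
```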

**Now we will show both together**
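A possible layout for showing the trace and the histogram side by side, again with a synthetic trace standing in for the recording. The axis units are assumptions.

```python
import numpy as np
import scipy.signal as signal
from matplotlib import pyplot as plt

sampleRate = 15000
data = np.full(sampleRate, -60.0)              # synthetic stand-in trace
data[[1500, 2250, 3750, 6000, 9000]] = 20.0
xScale = np.arange(len(data)) / sampleRate

peaks, props = signal.find_peaks(data, height=0)
instRate = sampleRate / np.diff(peaks)         # same as 1 / interval in seconds

# One row, two columns: raw trace on the left, histogram on the right
fig, (axTrace, axHist) = plt.subplots(1, 2, figsize=(12, 4))

axTrace.plot(xScale, data)
axTrace.plot(xScale[peaks], data[peaks], 'rx')   # mark each detected spike
axTrace.set_xlabel('Time (s)')
axTrace.set_ylabel('Membrane potential (mV)')    # assumed units

axHist.hist(instRate, bins=10)
axHist.set_xlabel('Instantaneous firing rate (Hz)')
axHist.set_ylabel('Count')

fig.tight_layout()
```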

## A more complex figure 

- This data is a voltage clamp recording of a Kv2 potassium current taken from: [Johnston, et al (2008)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2538803/)

- We are going to plot the raw data with the command voltage and make plots of the current voltage relationship and the activation kinetics of the channel

In [None]:
#Load the data
measuredI=np.loadtxt('VClampI.txt', delimiter=','); # load the measured currents, this file is comma separated
commandV=np.loadtxt('VClampV.txt', delimiter=','); # load the command voltages, also comma separated
VCsampleRate = 10000
xScaleVC = np.arange(len(measuredI))/VCsampleRate # time axis: sample number divided by the sample rate
# the units of measuredI are pA
# the units of commandV are mV


In [None]:
# check what its dimensions look like
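The array attributes below are a quick way to check dimensions. The stand-in array assumes one row per sample and one column per trace, which is a guess at the layout of the real file.

```python
import numpy as np

# Stand-in for the loaded data: 2000 samples (rows) of 5 traces (columns).
# The real file may be laid out differently.
measuredI = np.zeros((2000, 5))

print(measuredI.shape)   # size along each dimension
print(measuredI.ndim)    # number of dimensions
print(measuredI.size)    # total number of values
print(measuredI.dtype)   # data type of the values

nSamples, nTraces = measuredI.shape
print(nSamples, 'samples in each of', nTraces, 'traces')
```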


In [None]:
# have a quick look at it
plt.plot(measuredI);


**All of the current traces are in pA; let's convert them to nA by dividing by 1000**
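Because the data is a NumPy array, a single division converts every value. A sketch with made-up numbers (the name `measuredI_nA` is illustrative):

```python
import numpy as np

# Made-up currents in pA standing in for the loaded data
measuredI = np.array([[0.0, 500.0],
                      [1000.0, 2500.0]])

# Dividing the array divides every element: pA -> nA
measuredI_nA = measuredI / 1000
print(measuredI_nA)
```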

**Building a more complex figure**
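The sketch below shows one possible multipanel layout: current traces, command voltage, an IV curve and a normalised conductance (activation) curve. Everything here is an assumption made for illustration: the data are synthetic, rows are taken to be samples and columns traces, and the step timing, the reversal potential `E_K` and the Boltzmann parameters are invented. It is not the analysis from the paper.

```python
import numpy as np
from matplotlib import pyplot as plt

# --- Synthetic stand-in data (all values invented for illustration) ---
VCsampleRate = 10000
nSamples, on, off = 2000, 500, 1500      # voltage step from sample 500 to 1500
t = np.arange(nSamples) / VCsampleRate
steps = np.arange(-60, 50, 10)           # command steps in mV
E_K = -90.0                              # assumed K+ reversal potential (mV)

commandV = np.full((nSamples, len(steps)), -80.0)   # holding potential
commandV[on:off, :] = steps

activation = 1 / (1 + np.exp(-steps / 10))          # made-up Boltzmann curve
rise = 1 - np.exp(-(t[on:off] - t[on]) / 0.01)      # made-up activation time course
measuredI = np.zeros((nSamples, len(steps)))        # pA
measuredI[on:off, :] = rise[:, None] * (50 * activation * (steps - E_K))

# --- Analysis using array indexing ---
measuredI_nA = measuredI / 1000
window = slice(off - 200, off)                      # last 20 ms of the step
ssI = measuredI_nA[window, :].mean(axis=0)          # steady-state current per step
stepV = commandV[window, :].mean(axis=0)            # voltage of each step
G = ssI / (stepV - E_K)                             # conductance from driving force
Gnorm = G / G.max()

# --- Figure: 2 x 2 panels ---
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
for i in range(measuredI_nA.shape[1]):              # one line per trace
    axes[0, 0].plot(t, measuredI_nA[:, i])
    axes[1, 0].plot(t, commandV[:, i])
axes[0, 0].set_ylabel('Current (nA)')
axes[1, 0].set_xlabel('Time (s)')
axes[1, 0].set_ylabel('Command voltage (mV)')

axes[0, 1].plot(stepV, ssI, 'o-')
axes[0, 1].set_xlabel('Voltage (mV)')
axes[0, 1].set_ylabel('Steady-state current (nA)')
axes[0, 1].set_title('IV relationship')

axes[1, 1].plot(stepV, Gnorm, 'o-')
axes[1, 1].set_xlabel('Voltage (mV)')
axes[1, 1].set_ylabel('Normalised conductance')
axes[1, 1].set_title('Activation curve')

fig.tight_layout()
```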

**numpy arrays and plotting**  
**You have learnt about:**  
   - how to load extra modules into python and import them for use in your analysis
   - how to load text files and csv data into numpy arrays
   - numpy arrays allow maths on lists of numbers
   - how to query the dimensions of numpy arrays
   - how to quickly plot data to examine it using matplotlib
   - you used a module of scipy to detect action potentials and calculated firing rate
   - how to generate multipanel layouts with annotated figures
   - how to loop through arrays to efficiently add multiple traces to a graph
   - how to use numpy indexing to quickly generate analysis plots 
   


# Break time

# Images: importing, properties, indexing and display

- 2-photon imaging of GCaMP6-expressing inhibitory neurons of the olfactory bulb
- We are going to plot the segmented cells as contours on top of the average image of the field of view

In [None]:
import tifffile as tf

In [None]:
cellMaps = tf.imread('PGcell_Maps.tif')
aveProjection = tf.imread('aveProjection.tif')

In [None]:
## inspect what you have loaded
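Images load as NumPy arrays, so the same attributes can be used to inspect them. The stand-in arrays below assume a single 2D average image and a stack with one map per cell; the real files may have different shapes and data types.

```python
import numpy as np

# Stand-in arrays with assumed shapes; the real files may differ
aveProjection = np.zeros((256, 256), dtype=np.uint16)   # one 2D image
cellMaps = np.zeros((3, 256, 256), dtype=np.uint8)      # assumed: one map per cell

for name, image in [('aveProjection', aveProjection), ('cellMaps', cellMaps)]:
    print(name, image.shape, image.dtype, image.min(), image.max())

# Indexing the first dimension of a stack gives a single 2D image
firstMap = cellMaps[0]
print(firstMap.shape)
```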


**Show all the ROIs on top of the average image as contours**
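A sketch of overlaying contours on an image with matplotlib. The image and the two square ROIs are synthetic, and the stack layout (one binary map per cell) is an assumption about the real data.

```python
import numpy as np
from matplotlib import pyplot as plt

# Synthetic stand-ins: a noisy image and two square ROI maps
rng = np.random.default_rng(0)
aveProjection = rng.random((64, 64))
cellMaps = np.zeros((2, 64, 64))        # assumed layout: (cell, row, column)
cellMaps[0, 10:20, 10:20] = 1
cellMaps[1, 40:55, 30:45] = 1

fig, ax = plt.subplots()
ax.imshow(aveProjection, cmap='gray')

# Loop over the stack and draw the outline of each ROI
for roi in cellMaps:
    ax.contour(roi, levels=[0.5], colors='r', linewidths=1)

ax.axis('off')
nRois = cellMaps.shape[0]
print(nRois, 'contours drawn')
```

Using `levels=[0.5]` draws a single line halfway between 0 (outside) and 1 (inside), i.e. the outline of each binary map.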

## Defining and fitting functions
**An experimenter is measuring the ammonia on the breath of a patient having kidney dialysis and wants to determine the time constant of clearance of urea/ammonia from the body**
- Samples were taken every minute
- each data point is the concentration of ammonia in mg/dl 

In [None]:
import scipy.optimize as fit  ## imports the optimization tool box as fit
breath = np.load('decay.npy')

> 1) Plot the data as a scatter plot i.e. points for markers rather than lines between points.  
> 2) Define the appropriate equation to describe clearance from the body.  
> 3) Fit this equation to your data.    
> 4) Determine the time constant of urea clearance and print this with some descriptive text
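One possible approach to these four steps is sketched below, assuming clearance follows a single exponential decay with an offset. That model is an assumption, not something given in the task. The data here are synthetic, generated with a known time constant so the fit can be checked, and `exp_decay`, `tMin` and the starting guesses are illustrative.

```python
import numpy as np
import scipy.optimize as fit
from matplotlib import pyplot as plt

def exp_decay(t, amplitude, tau, offset):
    """Single exponential decay (assumed model for clearance)."""
    return amplitude * np.exp(-t / tau) + offset

# Synthetic stand-in for decay.npy: one sample per minute, true tau = 12 min
rng = np.random.default_rng(1)
tMin = np.arange(60)
breath = exp_decay(tMin, 2.0, 12.0, 0.5) + rng.normal(0, 0.02, tMin.size)

# Rough starting guesses help the fit converge
p0 = [breath[0] - breath[-1], 10.0, breath[-1]]
params, cov = fit.curve_fit(exp_decay, tMin, breath, p0=p0)
amplitude, tau, offset = params
tauError = np.sqrt(np.diag(cov))[1]   # 1 standard deviation estimate for tau

fig, ax = plt.subplots()
ax.scatter(tMin, breath, label='data')                        # points, not lines
ax.plot(tMin, exp_decay(tMin, *params), 'r', label='fit')
ax.set_xlabel('Time (min)')
ax.set_ylabel('Ammonia (mg/dl)')
ax.legend()

print(f'The time constant of clearance is {tau:.1f} +/- {tauError:.1f} minutes')
```

With the real data, replace the synthetic `breath` with the loaded array and build `tMin` from its length.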
