<a href="#Overview"></a>
# Overview
* <a href="#Introduction">Introduction</a>
* <a href="#Load-the-data">Load the data</a>
  * <a href="#Exercise-1">Exercise 1</a>
  * <a href="#Exercise-2">Exercise 2</a>
  * <a href="#Exercise-3">Exercise 3</a>
  * <a href="#Exercise-4">Exercise 4</a>
* <a href="#Slicing-data">Slicing data</a>
  * <a href="#Exercise-5">Exercise 5</a>
  * <a href="#Exercise-6">Exercise 6</a>
  * <a href="#Exercise-7">Exercise 7</a>
  * <a href="#Exercise-8">Exercise 8</a>
* <a href="#Plot-the-data">Plot the data</a>
  * <a href="#Exercise-9">Exercise 9</a>
  * <a href="#Exercise-10">Exercise 10</a>
  * <a href="#Exercise-11">Exercise 11</a>
* <a href="#Filter-the-data-trace">Filter the data trace</a>
* <a href="#Plot-the-Power-Spectral-Density-(PSD)">Plot the Power Spectral Density (PSD)</a>
  * <a href="#Exercise-12">Exercise 12</a>
  * <a href="#Exercise-13">Exercise 13</a>
  * <a href="#Exercise-14">Exercise 14</a>
* <a href="#Find-the-location-of-EPSC-peak">Find the location of EPSC peak</a>
  * <a href="#Exercise-15">Exercise 15</a>
  * <a href="#Exercise-16">Exercise 16</a>
  * <a href="#Exercise-17">Exercise 17</a>
  * <a href="#Exercise-18">Exercise 18</a>
  * <a href="#Exercise-19">Exercise 19</a>
  * <a href="#Exercise-20">Exercise 20</a>
  * <a href="#Exercise-21">Exercise 21</a>
* <a href="#Calculate-summary-statistic">Calculate summary statistic</a>
  * <a href="#Exercise-22">Exercise 22</a>
* <a href="#Summary">Summary</a>

<a name="#Introduction"></a>
## Introduction
<a href="#Overview">Return to overview</a>

* What is an EPSC?

<img src="pc.jpg" >

Patch clamp is a technique for studying ionic currents across the cell membrane.
With the patch-clamp method, it is possible to record the electrical events that occur in a postsynaptic cell when the presynaptic terminal releases neurotransmitter and ionotropic receptors open in response. In “voltage clamp” mode, the voltage is held constant, so we can record the current passing through the open ion channels, called the “postsynaptic current” (PSC).

For an excitatory synapse, the binding of neurotransmitters opens cationic channels, which depolarizes the cell. The induced electrical events are called “excitatory postsynaptic currents” (EPSCs).

Recording and analysing EPSCs is a powerful and popular technique for comparing the excitability of neurons in disease models and controls. For example, in my past lab in Marseille we observed excitability in neurons from TSC knock-out mice and we were able to find the receptor target to treat this abnormality.

<img src="TSC_EPSCs.jpg" >

Here, we detected EPSCs, quantified their frequency, aligned them by peak, averaged them, and found a difference in the shape of TSC-KO EPSCs.


<a name="#Load-the-data"></a>
## Load the data
<a href="#Overview">Return to overview</a>


Now I am a postdoc in the Physiology and Pharmacology Department, in Mike Andresen's lab. We are all electrophysiologists there and we use the patch-clamp technique. We use Axon Instruments MultiClamp amplifiers to measure currents and potentials of neurons in brain slices. This means we always deal with ABF (Axon Binary Format) files. In this class I want to show you one way to identify the peaks of excitatory postsynaptic currents (EPSCs).
So, the first step is to open the data files in ABF format. For this we will use the [pyabf module](https://pypi.org/project/pyabf).

Some sample data, `learn024.abf`, will be used in this class. Let's import the `pyabf` module and use it to open the ABF file:


In [None]:
import pyabf as pa               # package for loading abf data
import scipy.signal as ss        # Scientific package for signal processing
import numpy as np               # for numpy array objects
import matplotlib.pyplot as plt  # Plotting package to visualize data


<a name="#Exercise-1"></a>
### Exercise 1
<a href="#Overview">Return to overview</a>

Please create the string variable `abf_file` and assign it the name of the file we need to open, `learn024.abf`.

In [None]:
%load "answers/answer_001.txt"

Pass this filename as an argument to the loading function of pyabf. Syntax: `pa.ABF("your_file_name")`. Save the result in the object `data_file`.

In [None]:
%load "answers/answer_002.txt"

Let's use `print` and `.format` to see what this data looks like.
The result should be something like:
`data_file is an ....`

In [None]:
%load "answers/answer_003.txt"

Check what type of data we've loaded by calling the `type` function.

In [None]:
%load "answers/answer_004.txt"

`data_file` is an object of class `ABF`. The `ABF` class is provided by the `pyabf` module (abbreviated `pa` in our code). Let's see what methods and attributes are provided by the `ABF` class. Remember that you can look up these methods and attributes by typing the name of the variable followed by a period and then hitting `TAB`. 

<a name="#Exercise-2"></a>
### Exercise 2
<a href="#Overview">Return to overview</a>
Print the date when the file was created

In [None]:
# Answer 
print(data_file.abfDateTimeString)

You'll notice two names, `sweepLabelX` and `sweepUnitsX` in the list of methods and attributes on `data_file`. What are these? Are they methods or attributes? How can you learn a little more about them?

In [None]:
%load "answers/answer_005.txt"

<a name="#Exercise-3"></a>
### Exercise 3
<a href="#Overview">Return to overview</a>

Create two new variables, `time` and `current`, and assign them the arrays of time and current values from `data_file`.

In [None]:
%load "answers/answer_006.txt"

<a name="#Exercise-4"></a>
### Exercise 4
<a href="#Overview">Return to overview</a>
Print the types of the `time` and `current` variables. Use `.format` to make your output more readable.

In [None]:
%load "answers/answer_007.txt"

Cool! The data returned here are `numpy` arrays. We've worked with these already in previous classes.

<a name="#Slicing-data"></a>
## Slicing data
<a href="#Overview">Return to overview</a>


<a name="#Exercise-5"></a>
### Exercise 5
<a href="#Overview">Return to overview</a>
What are the dimensions of `time` and `current`? In other words, what are their shapes?

In [None]:
%load "answers/answer_008.txt"


<a name="#Exercise-6"></a>
### Exercise 6
<a href="#Overview">Return to overview</a>

As a review from last week, how would we select only the current at the 34th time sample?

Once you've extracted the current at the 34th time sample, please print the following string, `The current at sample 34 is XX YY` where XX is the current and YY is the unit (e.g., picoamps, nanoamps, etc.). *As a bonus, format the current value so it only prints the first two decimal places. Not sure how to do this? Use Google!*

In [None]:
# Answer 
c_34 = current[34]
print("The current at sample 34 is {0} {1}".format(c_34, data_file.sweepUnitsY))

# using f-strings:
i = 34
string = f"The current at sample {i} is {current[i]:.2f} {data_file.sweepUnitsY}"
print(string)

# using format
i = 34
string = "The current at sample {i} is {value:.2f} {units}".format(i=i, value=current[i], units=data_file.sweepUnitsY)
print(string)

<a name="#Exercise-7"></a>
### Exercise 7
<a href="#Overview">Return to overview</a>
* Great, we know that the 34th sample recorded a current of -33 pA, but what if we wanted to know what was going on at a particular time, in seconds? To figure this out, we need to calculate the rate at which the data was sampled. As a hint, the difference between consecutive timepoints gives us the sampling period, and the sample rate is its reciprocal. Save the calculated value in a variable, `sample_rate`.
* After you've successfully done this, use `sample_rate` to figure out what the current was at 0.5 seconds and print the following string `The current at XX sec is YY ZZ` where XX is the timepoint, YY is the current and ZZ is the unit of the current.



In [None]:
%load "answers/answer_009.txt"


<a name="#Exercise-8"></a>
### Exercise 8
<a href="#Overview">Return to overview</a>
Let's extract the current from 0.1 to 0.2 sec and save it in a new variable, `data_slice`.


In [None]:
# Answer:
t1_index = int(round(sample_rate * 0.1))
t2_index = int(round(sample_rate * 0.2))

data_slice = current[t1_index:t2_index]

print(data_slice)

**Let's briefly summarize what we've done so far** 
* loaded data, 
* figured out data types, 
* calculated the sampling rate and 
* did a refresher on how to slice data from a numpy array.
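
The time-to-index conversion used above can be sketched on synthetic data. This is a minimal illustration, not the recording itself: the 20 kHz rate and the names `fs_assumed`, `t`, `rate` and `index_at` are assumptions made up for the example.

```python
import numpy as np

fs_assumed = 20000.0                       # hypothetical sampling rate in Hz
t = np.arange(0, 1.0, 1.0 / fs_assumed)    # synthetic time axis, about 1 s long

period = t[1] - t[0]                       # time between consecutive samples
rate = 1.0 / period                        # samples per second


def index_at(seconds):
    """Convert a time in seconds to the nearest sample index."""
    return int(round(rate * seconds))


i_half = index_at(0.5)
# samples from 0.1 s up to (but not including) 0.2 s
window = t[index_at(0.1):index_at(0.2)]
print(i_half, t[i_half], window.size)
```

Rounding before converting to `int` matters here: floating-point products such as `rate * 0.1` can land slightly below a whole number, and plain `int()` would then truncate to the previous sample.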

<a name="#Plot-the-data"></a>
## Plot the data
<a href="#Overview">Return to overview</a>
To visualize this data, we will use the package `matplotlib.pyplot`, which we imported at the beginning as `plt`.


In [None]:
# Create a matplotlib figure and plot the trace. Note how we label the axes
plt.figure()
plt.plot(time, current, color='grey')
plt.xlabel('Time (sec)')
plt.ylabel('Current (pA)')
plt.show()

 <img src="slice1.jpg" width=300 >

In our lab, we study synaptic transmission in the solitary nucleus (NTS), which is innervated by the solitary tract and is the first receiver of information from the peripheral nervous system.
This trace is a patch-clamp recording from an NTS neuron in a slice that includes the solitary tract. The "artifacts" at the beginning are caused by electrical stimulation of the solitary tract. This stimulation evokes synchronous synaptic responses in NTS neurons, which we call *synchronous EPSCs*. By increasing the stimulus intensity, we can see the thresholds for the monosynaptic input and for higher-order connections onto the neuron. You can get a lot of information about the network from the stimulus intensity thresholds and from the response delay, shape and amplitude.
Let's take a closer look at them.

<a name="#Exercise-9"></a>
### Exercise 9
<a href="#Overview">Return to overview</a>
* Using `plt.ylim((bottom, top))`, limit the y-axis to the range from -500 to 100 pA.
* Also limit the x-axis in order to zoom in on the artifacts between 0.03 and 0.07 sec. See if you can figure out what function can perform this action.
* Be sure to label the x and y axes.

In [None]:
%load "answers/answer_010.txt"

<a name="#Exercise-10"></a>
### Exercise 10
<a href="#Overview">Return to overview</a>
* That looks good. But more often electrophysiologists need to analyse spontaneous rather than evoked EPSCs. So let's move the focus to the interval from 0.3 to 0.4 seconds of my recording, where we can see them.
* To do this, we first need to locate the correct indexes. Please do this and save these indexes as two new variables: `t_03` and `t_04`


In [None]:
%load "answers/answer_011.txt"


<a name="#Exercise-11"></a>
### Exercise 11
<a href="#Overview">Return to overview</a>
* Use these indexes to slice out the current trace between 0.3 and 0.4 seconds.
* Save this data into two new arrays `short_time` and `short_data`.
* Finally, plot this data


In [None]:
%load "answers/answer_012.txt"

<a name="#Filter-the-data-trace"></a>
## Filter the data trace
<a href="#Overview">Return to overview</a>


Without filtering, EPSC recordings are usually quite noisy, making them difficult to read. Therefore, before analysis it is common to first filter out everything above 1 kHz. We will do this by designing a low-pass Butterworth filter using scipy's `signal` module.

Filtering data requires two steps:

* Designing the filter. A filter consists of two sets of coefficients, conventionally known as `b` and `a`. To calculate these coefficients, we use the `iirfilter` function available via `scipy`. Take a look at the documentation for the function. How many arguments are required? Hint: you can tell based on the number of parameters that do not have default values set.
* Applying the filter. This can be done using `filtfilt`, which takes the filter coefficients and the data to be filtered.
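
The two steps can be sketched on a synthetic trace. This is only an illustration under stated assumptions: the 20 kHz sampling rate, the 50 Hz sine standing in for the signal, the noise level, and the names `fs_assumed`, `clean`, `noisy` and `smoothed` are all made up for the example.

```python
import numpy as np
import scipy.signal as ss

fs_assumed = 20000.0                            # hypothetical sampling rate in Hz
t = np.arange(0, 0.1, 1.0 / fs_assumed)
rng = np.random.default_rng(0)                  # fixed seed so the run is repeatable

clean = -50.0 * np.sin(2 * np.pi * 50.0 * t)    # slow synthetic "signal" (50 Hz)
noisy = clean + rng.normal(0.0, 10.0, t.size)   # add broadband noise

# Step 1: design a first-order low-pass Butterworth filter (1 kHz cutoff)
b, a = ss.iirfilter(1, 1000.0 / (fs_assumed / 2), ftype='butter', btype='low')

# Step 2: apply it forwards and backwards, so the result is not shifted in time
smoothed = ss.filtfilt(b, a, noisy)

# Residual noise before and after filtering
print(np.std(noisy - clean), np.std(smoothed - clean))
```

Because `filtfilt` runs the filter in both directions, peaks stay where they were in time, which is useful when the goal is to locate events.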

Let's start with a first-order Butterworth filter with a cutoff frequency of 1000 Hz. We have to convert the cutoff frequency to a normalized frequency, `Wn`. Normalized frequency is defined as the number of half-cycles per sample (as indicated by the docstring for `iirfilter`); it can also be thought of as a fraction of the Nyquist frequency, which is half the sample rate:

    f_cutoff = 1000
    Wn = f_cutoff / (sample_rate/2)
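
As a worked example of this arithmetic, assume a sampling rate of 20 kHz (a made-up value; use the rate of your own recording). The Nyquist frequency is then 10 kHz and a 1 kHz cutoff corresponds to a normalized frequency of 0.1. The sketch below also checks the gain of the designed filter at the cutoff; the names `fs_assumed`, `f_cut`, `wn` and `gain_at_cutoff` are illustrative.

```python
import numpy as np
import scipy.signal as ss

fs_assumed = 20000.0          # hypothetical sampling rate in Hz
f_cut = 1000.0                # cutoff frequency in Hz
nyquist = fs_assumed / 2      # 10000 Hz
wn = f_cut / nyquist          # 0.1, i.e. 10% of the Nyquist frequency

b, a = ss.iirfilter(1, wn, ftype='butter', btype='low')

# Evaluate the filter response exactly at the cutoff (frequency in radians/sample)
w, h = ss.freqz(b, a, worN=np.array([np.pi * wn]))
gain_at_cutoff = abs(h[0])    # a Butterworth filter passes about 0.707 here (-3 dB)
print(wn, gain_at_cutoff)
```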

In [None]:
# Design the filter
f_cutoff = 1000
Wn = f_cutoff / (sample_rate/2)
b, a = ss.iirfilter(1, Wn, ftype='butter', btype='low')

Next, we will apply the filter to our "short" data.

In [None]:
# Apply the filter
filtered_data = ss.filtfilt(b, a, short_data) #filtering

# plot the filtered data
plt.figure()
plt.plot(short_time, short_data, lw=.5)  # non-filtered data
plt.plot(short_time, filtered_data)      # filtered data on top
plt.xlabel('Time (s)')
plt.ylabel('Current (pA)')
plt.legend(['unfiltered data', 'filtered data'])

<a name="#Plot-the-Power-Spectral-Density-(PSD)"></a>
## Plot the Power Spectral Density (PSD)
<a href="#Overview">Return to overview</a>

The Power Spectral Density (PSD) is very useful for checking whether the filter worked properly.
The `psd` function is in the `matplotlib.pyplot` module, which we imported as `plt`.
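
For a numerical check that does not depend on a plot, the PSD can also be estimated with `scipy.signal.welch`. Note that this swaps in a different function from the `psd` function named above, and that the white-noise input, the 20 kHz rate and the variable names are assumptions made for this sketch.

```python
import numpy as np
import scipy.signal as ss

fs_assumed = 20000.0                       # hypothetical sampling rate in Hz
rng = np.random.default_rng(1)             # fixed seed so the run is repeatable
noise = rng.normal(0.0, 1.0, 8192)         # synthetic white noise

# Low-pass filter the noise at 1 kHz, as in the filtering section
b, a = ss.iirfilter(1, 1000.0 / (fs_assumed / 2), ftype='butter', btype='low')
filtered = ss.filtfilt(b, a, noise)

# Welch estimate of the PSD before and after filtering
freqs, psd_raw = ss.welch(noise, fs=fs_assumed, nperseg=1024)
_, psd_filtered = ss.welch(filtered, fs=fs_assumed, nperseg=1024)

# Compare the average power above 5 kHz, well past the cutoff
high = freqs > 5000.0
ratio = psd_filtered[high].mean() / psd_raw[high].mean()
print(ratio)
```

A ratio far below 1 means the filter removed most of the power above the cutoff, which is the same conclusion the PSD plot should show.
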
<a name="#Exercise-12"></a>
### Exercise 12
<a href="#Overview">Return to overview</a>
* Using the documentation, figure out how to use the `psd` function
* Plot the PSD of both `short_data` and `filtered_data`
* Make the frequency axis logarithmic. Hint: `xscale('log')`

In [None]:
%load "answers/answer_013.txt"

Let's low-pass filter the signal with a cut-off frequency of 200 Hz and see what happens to the signal and its PSD.

<a name="#Exercise-13"></a>
### Exercise 13
<a href="#Overview">Return to overview</a>
* Design a filter with a cut-off frequency of 200 Hz.
* Filter `short_data` and save the result in a new `filtered_data200` array
* Plot `filtered_data` and `filtered_data200` on top of each other
* Label the axes and add a legend

In [None]:
%load "answers/answer_014.txt"

Let's see how the PSD changed.

<a name="#Exercise-14"></a>
### Exercise 14
<a href="#Overview">Return to overview</a>
* Plot the PSD of all three signals: `short_data`, `filtered_data` and `filtered_data200`.
* Make the frequency axis logarithmic
* Add a legend

In [None]:
%load "answers/answer_015.txt"

<a name="#Find-the-location-of-EPSC-peak"></a>
## Find the location of EPSC peak
<a href="#Overview">Return to overview</a>

We have reviewed a big topic in e-phys analysis: data filtering.
Let's come back to the main topic, finding EPSC peaks.
Knowing the EPSC peak times, we can do many different analyses. We can average EPSCs, quantify their frequency, amplitude, decay time, etc.
We will try the `argrelmin` function from the `scipy.signal` module for this.

 **scipy.signal.argrelmin(data, axis=0, order=1, mode='clip')**
 
    Calculate the relative minima of data.
    Parameters:
            data : ndarray         #Array in which to find the relative minima.
            axis : int, optional   #Axis over which to select from data. Default is 0.
            order : int, optional  #How many points on each side to use
                                   #for the comparison to consider comparator(n, n+x) to be True.
            mode : str, optional   #How the edges of the vector are treated. Available options are ‘wrap’ 
                                   #(wrap around) or ‘clip’ 
                                   #(treat overflow as the same as the last (or first) element). 

    Returns:	
            extrema : tuple of ndarrays
                                   #Indices of the minima in arrays of integers. 
                                   #extrema[k] is the array of indices of axis k of data.
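
How `order` changes the result can be sketched on a toy array. The values below are made up for illustration; the `[0]` picks the index array out of the tuple that `argrelmin` returns.

```python
import numpy as np
import scipy.signal as ss

toy = np.array([0, -1, 0, -3, 0, -2, 0, -5, 0])

# order=1: a point only has to be lower than its immediate neighbours
minima_order1 = ss.argrelmin(toy, order=1)[0]

# order=2: a point has to be lower than two neighbours on each side
minima_order2 = ss.argrelmin(toy, order=2)[0]

print(minima_order1)   # [1 3 5 7]
print(minima_order2)   # [3 7]
```

With the larger `order`, the shallow dips at indices 1 and 5 are dropped because a deeper dip lies within two samples of each.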


<a name="#Exercise-15"></a>
### Exercise 15
<a href="#Overview">Return to overview</a>
We have already imported the `scipy.signal` module as `ss`.

Using the `argrelmin` function, find the relative minima of `filtered_data` and save them to a new tuple, `argrelmin_res`.
Print the result.

In [None]:
%load "answers/answer_016.txt"

That's definitely not what we expected. Let's change the parameters of the function. `order` (how many points on each side must be greater than the point of interest for it to be considered a minimum) sounds right. Let's change it to 1000 and run the function one more time.

In [None]:
argrelmin_res = ss.argrelmin(filtered_data, order=1000)
print(argrelmin_res)

This looks better, but still not exactly what we need.
Let's say we don't need to identify any events smaller than 50 pA in amplitude.
We can create a logical (`True`/`False`) mask. In other words, let's create a vector that is `True` when the current is less than -50 pA and `False` when it is greater than -50 pA.

In [None]:
threshold = -50   # EPSCs are negative-going: ignore events smaller than 50 pA in amplitude

# create a "mask" over all the data that meets this criteria
mask = filtered_data < threshold

In [None]:
# see how mask is just an array of true and false?
print(mask)
plt.plot(mask)

The `argrelmin` function returns a tuple of arrays, even when the data is one-dimensional. Thus, the array of peak indices is `argrelmin_res[0]`.

In [None]:
all_peaks = ss.argrelmin(filtered_data, order=1000)[0]
print(all_peaks)

That gave us 5 peaks, as we saw above. We only actually want three. So now we use the mask to keep
only the peaks that occur where the trace is below the threshold (that is, where the mask is `True`).
Below is a small loop that discards the peaks that do not overlap with the mask.

In [None]:
EPSC_peaks = []  # empty list for saving the right peaks in

for peak in all_peaks:
    if mask[peak] == True:         # overlapping with mask
        EPSC_peaks.append(peak)    # add this peak to the end of the EPSC_peaks list
    else:
        pass

print("These were all peaks detected by argrelmin: {0}".format(all_peaks))

print("Only these peaks: {0} exceeded the threshold of -50 pA".format(EPSC_peaks))
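
The same selection can also be written without a loop, using NumPy boolean indexing. The sketch below uses a made-up toy trace and hypothetical names (`trace`, `candidate_peaks`, `kept`) rather than the recording, and compares the result with the loop approach.

```python
import numpy as np

trace = np.array([-10.0, -60.0, -20.0, -45.0, -30.0, -80.0, -5.0])  # toy currents in pA
candidate_peaks = np.array([1, 3, 5])     # toy indices of local minima
threshold = -50

mask = trace < threshold                  # True where the trace is below the threshold

# Look up the mask at each candidate index and keep only the True ones
kept = candidate_peaks[mask[candidate_peaks]]

# Equivalent loop, for comparison
kept_loop = [peak for peak in candidate_peaks if mask[peak]]

print(kept)        # [1 5]
```

The loop returns a Python list while the indexing version returns a NumPy array; both can be used to index `trace` afterwards.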

Now, let's plot these peaks overlaid on the trace to confirm we identified the right ones.

<a name="#Exercise-16"></a>
### Exercise 16
<a href="#Overview">Return to overview</a>
* Plot `filtered_data` currents vs time. Make it grey (color='grey') 
* Plot EPSC peaks as black stars ("*", color='k', markersize=10) on top of the current trace. 

Hint: the `EPSC_peaks` list stores the indexes of the EPSC peaks. The times of the EPSC peaks are `short_time[EPSC_peaks]` and the current values at these points are `filtered_data[EPSC_peaks]`.


In [None]:
%load "answers/answer_017.txt"

Now, we are going to combine everything we've done above to analyse a bigger piece of data.
We still have the entire recording of current values saved as `current = data_file.sweepY` and time as `time = data_file.sweepX`

<a name="#Exercise-17"></a>
### Exercise 17
<a href="#Overview">Return to overview</a>

Extract the data from 0.08 to 0.4 sec. Save the time and current arrays as `time1` and `current1`.

In [None]:
%load "answers/answer_018.txt"

<a name="#Exercise-18"></a>
### Exercise 18
<a href="#Overview">Return to overview</a>

Low-pass filter `current1` with a Butterworth filter using a cutoff of 1000 Hz.
Save the result to `current1_filtered`.

In [None]:
%load "answers/answer_019.txt"

Create the variable `thresh` and set it equal to -47.

In [None]:
%load "answers/answer_020.txt"

Finally, create the array `mask` of logical values that is `True` wherever `current1_filtered` is less than `thresh`.

In [None]:
%load "answers/answer_021.txt"

Now let's find the EPSC peaks.

<a name="#Exercise-19"></a>
### Exercise 19
<a href="#Overview">Return to overview</a>

* Using `argrelmin` with `order=25`, create the array `all_peaks` with the minima of `current1_filtered`
* Print `all_peaks`


In [None]:
%load "answers/answer_022.txt"

<a name="#Exercise-20"></a>
### Exercise 20
<a href="#Overview">Return to overview</a>

This code has a mistake and won't run. Please fix the mistake and run it:

    # Use mask to screen for real peaks
    EPSC_peaks = []

    for peak in all_peaks:
        if mask[peak] = True:
            EPSC_peaks.append(peak)
        else:
            pass


In [None]:
%load "answers/answer_023.txt"

<a name="#Exercise-21"></a>
### Exercise 21
<a href="#Overview">Return to overview</a>

Plot these results to see if they make sense:
* Plot `current1_filtered` against its time axis in grey
* Plot `EPSC_peaks` as big red stars on top of it
* Plot `all_peaks` as black dots on top of everything

In [None]:
%load "answers/answer_024.txt"

<a name="#Calculate-summary-statistic"></a>
## Calculate summary statistic
<a href="#Overview">Return to overview</a>


<a name="#Exercise-22"></a>
### Exercise 22
<a href="#Overview">Return to overview</a>
* Using the **numpy.mean** and **numpy.std** functions, calculate and print the mean and standard deviation of the EPSC peak amplitude.

In [None]:
%load "answers/answer_025.txt"

* Calculate the EPSC frequency as the number of events per second. Hint: `EPSC_peaks` is a list; you can get its length using `len(EPSC_peaks)`.
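
As a worked example of the arithmetic only, here is a sketch with made-up numbers. The amplitudes, the window length and the names `amplitudes_pa` and `duration_s` are hypothetical and are not taken from the recording.

```python
import numpy as np

amplitudes_pa = np.array([-62.0, -71.0, -55.0, -80.0])  # made-up peak amplitudes in pA
duration_s = 0.32                                       # made-up window length in seconds

mean_amp = np.mean(amplitudes_pa)                 # average peak amplitude
std_amp = np.std(amplitudes_pa)                   # population standard deviation (ddof=0)
frequency_hz = len(amplitudes_pa) / duration_s    # events per second

print(mean_amp, std_amp, frequency_hz)
```

Note that `numpy.std` divides by N by default; pass `ddof=1` if you want the sample standard deviation instead.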

In [None]:
%load "answers/answer_026.txt"

<a name="#Summary"></a>
## Summary
<a href="#Overview">Return to overview</a>

   * **pyabf** loads ABF files and presents the data as `numpy` arrays
   * Low-pass filtering with a Butterworth filter takes two steps: designing the filter and applying it
   * `argrelmin` from **scipy.signal** works well but may need some tuning (for example the `order` parameter and an amplitude threshold) to get the result you want
   
**Thank you!** 
   