# Analysis of Ephys-AFM data

The following notebook presents an example of a typical analysis pipeline. The analysis is performed in bulk on all data files within a given folder. Versions of the functions for analyzing single files are also present in the relevant modules if needed.

## Set Path

We begin by importing the relevant modules: preprocess, summarize, and plotData. All other dependencies are included in the modules. Set the `path` variable to the specific folder containing the data you want to analyze.

In [1]:
#%matplotlib qt
from preprocess import *
from summarize import *
from plotData import *

## Change these if needed.

headers = ['index', 'ti', 'i', 'tv', 'v', 'tin0', 'in0', 'tz', 'z', 'tlat', 'lat']
path = "E:/Research/Thesis/modulators/cytod/"

## Preprocess Data

Next, we preprocess the data. The `preprocessDirectory` function finds all files in the path whose names end in the string passed as the second argument; in the example below, any file that ends with "scan" will be preprocessed. The `window` argument sets the time interval used for baseline subtraction and should span a period in each sweep during which no stimulus is applied. Each data file should have an associated "params" file and "sensitivity" file containing the experimental parameters and the sensitivity calibration for that experiment, respectively. Preprocessing each file involves the following steps:

- Convert units to mV, ms, and pA for the columns v, ti, and i, respectively.
- Baseline subtract the current (i) and photodetector voltage (in0) creating new columns "i_blsub" and "in0_blsub", respectively.
- Calculate the deflection, position, force, and the cumulative work done over the course of the sweep and append these as new columns in the dataframe.
- Save the dataframe into a feather file with the same name as the original file and the word "_preprocessed" appended to the end of the filename.
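
The baseline-subtraction and force/work steps above can be sketched as follows. This is a minimal illustration, not the code in the `preprocess` module: the function names `baseline_subtract` and `force_and_work` are hypothetical, and the relations used (deflection = sensitivity × photodetector voltage, position = z minus deflection, force = kcant × deflection, work = cumulative trapezoidal integral of force over position) are assumptions about how such quantities are commonly derived. Units are left unspecified on purpose.

```python
import numpy as np


def baseline_subtract(t_ms, y, window):
    """Subtract the mean of y over window = [start_ms, end_ms] from y."""
    t_ms = np.asarray(t_ms, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = (t_ms >= window[0]) & (t_ms <= window[1])
    if not mask.any():
        raise ValueError("baseline window contains no samples")
    return y - y[mask].mean()


def force_and_work(in0_blsub, z, sensitivity, kcant):
    """Derive deflection, position, force, and cumulative work (assumed relations)."""
    in0_blsub = np.asarray(in0_blsub, dtype=float)
    z = np.asarray(z, dtype=float)
    # Assumption: deflection is the photodetector voltage scaled by the sensitivity.
    deflection = in0_blsub * sensitivity
    # Assumption: position is the piezo displacement corrected for deflection.
    position = z - deflection
    # Hooke's law for the cantilever.
    force = kcant * deflection
    # Cumulative trapezoidal integral of force over position, starting at zero.
    steps = 0.5 * (force[1:] + force[:-1]) * np.diff(position)
    work = np.concatenate(([0.0], np.cumsum(steps)))
    return deflection, position, force, work


# Toy sweep: a constant 5.0 offset with a 3.0 step after the baseline window.
t = np.arange(0.0, 200.0, 1.0)
in0 = np.full(t.size, 5.0)
in0[160:] += 3.0
in0_blsub = baseline_subtract(t, in0, [50, 150])

# Toy force/work calculation with made-up calibration values.
deflection, position, force, work = force_and_work(
    in0_blsub=[0.0, 1.0, 2.0], z=[0.0, 10.0, 20.0], sensitivity=2.0, kcant=0.5
)
```

The window `[50, 150]` mirrors the value passed to `preprocessDirectory` in the cell below; the calibration values are placeholders only.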

In [None]:
preprocessDirectory(path, 'scan', headers=headers, window=[50,150])

## Summarize Data

We next summarize each sweep and create a table. The `summarizeDirectory` function will process all files in a given directory. Its arguments are the path to the directory of interest, the suffix of the filenames containing the data to be analyzed, an ROI window within which the stimulus is applied, and a baseline subtraction window. A new dataframe will be generated with the following parameters:
 - **path**: The path to the file being summarized.
 - **uniqueID**: A cell-specific unique identifier.
 - **date**: The date the experiment was performed.
 - **construct**: The construct transfected for a given experiment or other experimental condition.
 - **cell**: The number of the cell for a given experimental session.
 - **protocol**: The protocol applied.
 - **sweep**: The sweep number on that cell.
 - **velocity**: The velocity of the cantilever for a given experiment.
 - **kcant**: The stiffness of the cantilever for a given experiment.
 - **dkcant**: An estimate of the uncertainty in cantilever calibration.
 - **osm**: The osmolality of the solution measured after a given experiment (not done for all experiments).
 - **Rs**: Series resistance for a given recording.
 - **Rscomp**: Percent compensation of series resistance.
 - **Cm**: Slow (whole-cell) capacitance of a given recording.
 - **seal**: Estimated seal resistance for a given recording.
 - **vhold**: Holding potential for a given recording.
 - **vstep**: Voltage step during a given recording if present.
 - **peaki**: Peak current over the course of a particular sweep.
 - **tpeaki**: The timing of the peak current in a given sweep.
 - **peakf**: The peak force of a particular sweep.
 - **tpeakf**: The timing of the peak force in a given sweep.
 - **f95**: The force 95 ms into the static phase of a given sweep.
 - **i95**: The current 95 ms into the static phase of a given sweep.
 - **peakw**: The peak work of a particular sweep.
 - **tpeakw**: The timing of the peak work of a given sweep.
 - **wpeakf**: The work done at the point where a given sweep reaches the peak force.
 - **leak**: The amount of current at a holding potential of -80 mV used for triaging.
 - **offset**: The mean current over a 10 ms window just prior to stimulus onset used to correct for any offset if present.
 - **stdev**: The standard deviation of the current over a 50 ms window prior to stimulus onset.
 - **delay**: The time of peak current minus the time of peak force (tpeaki - tpeakf).
 - **thresh**: The current threshold taken to be offset + 2 * stdev.
 - **threshind**: The index of the last point in a sweep below the threshold current.
 - **thresht**: The timing of the threshold crossing for a given sweep.
 - **fthresh**: The force applied at the time of threshold crossing.
 - **wthresh**: The cumulative work done at the time of threshold crossing.
 
These parameters are saved into a separate file that shares its filename with the original data, followed by the tag "_summarized", and is used for further downstream analysis.
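
The threshold-related parameters (offset, stdev, thresh, threshind, thresht) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the `summarize` module: the function name `threshold_crossing` and its arguments are hypothetical, the current is assumed to be positive-going, and the search for the last sub-threshold point is assumed to stop at the peak current so that the return to baseline after the stimulus is not picked up.

```python
import numpy as np


def threshold_crossing(t_ms, i_blsub, stim_onset_ms, n_sd=2.0):
    """Estimate the current threshold and its crossing time for one sweep."""
    t = np.asarray(t_ms, dtype=float)
    i = np.asarray(i_blsub, dtype=float)
    # Mean current over the 10 ms just before stimulus onset.
    pre10 = (t >= stim_onset_ms - 10.0) & (t < stim_onset_ms)
    # Standard deviation of the current over the 50 ms before stimulus onset.
    pre50 = (t >= stim_onset_ms - 50.0) & (t < stim_onset_ms)
    offset = i[pre10].mean()
    stdev = i[pre50].std(ddof=1)
    thresh = offset + n_sd * stdev
    # Assumption: only samples up to the peak current are searched.
    peak_ind = int(np.argmax(i))
    below = np.flatnonzero(i[: peak_ind + 1] < thresh)
    threshind = int(below[-1]) if below.size else 0
    return {
        "offset": float(offset),
        "stdev": float(stdev),
        "thresh": float(thresh),
        "threshind": threshind,
        "thresht": float(t[threshind]),
    }


# Toy sweep: alternating +/-0.1 noise with a linear ramp from 800 to 899 ms.
t = np.arange(0.0, 1000.0, 1.0)
current = np.where(np.arange(t.size) % 2 == 0, 0.1, -0.1)
current[800:900] += np.arange(100.0)
result = threshold_crossing(t, current, stim_onset_ms=750.0)
```

From the returned index, `fthresh` and `wthresh` would simply be the force and work columns read out at `threshind`.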

In [1]:
roi = [750,1200]
blsub = [50,150]

summarizeDirectory(path, 'scan', roi=roi, window=blsub)


## Plot Data

Next, we plot the preprocessed traces for all files in a given directory. Some of the summary parameters, such as the threshold, time of peak current, and time of peak force, are overlaid on the data. This is an important quality-control step to ensure that the analysis is producing the expected results. At this point you should check whether a particular cell was responsive, whether recording artifacts are being picked up in a particular trace (in which case some of the analysis parameters may need adjusting), and whether the detected peaks and thresholds are sensible given the data.


In [None]:
import os

protocol = 'scan'
path_list = []
# Walk the data folder and collect every preprocessed feather file for this protocol.
for root, dirs, files in os.walk(path):
    for file in files:
        if protocol + '_preprocessed.feather' in file:
            path_list.append(os.path.join(root, file).replace("\\", "/"))

In [None]:
for i in path_list:
    dat = PlotData(i)
    dat.plot_all_sweeps(['position', 'force', 'work', 'i_blsub'], roi=[500,1150], scalebars=True, checksum=True)

## Make approaches file

Next, we will concatenate the traces corresponding to the compression phase of our stimulus for each uniqueID and sweep number we have in our summary file. Be sure to update the path to the summary file you created earlier.

In [None]:
summaryPath = "E:/Research/Thesis/modulators/cytod/k2p/k2pcd_summary.csv"
make_sweepfile(summaryPath)

## Measure Slopes

We must also determine the slopes in order to calculate the work resolution. This step requires interacting with the plots, so the plotting backend must be switched from `matplotlib inline` to `matplotlib qt`: restart the kernel, uncomment the first line of the first code cell (`#%matplotlib qt`), and run that cell. Then come down to the last cell, set the paths to the summary file and the approaches file you created earlier, and run it. The code goes through each individual sweep and plots the approach phase of the data, with absolute current as a function of work done. You can manually select the region to be fit with a line by mousing over the region of interest on the plot. The code returns the slope, which can then be added to the summary file for later analysis.

In [None]:
summaryPath = "E:/Research/Thesis/modulators/cytod/k2p/k2pcd_summary.csv"
sweepPath = "E:/Research/Thesis/modulators/cytod/k2p/k2pcd_approaches.csv"
v = find_slopes(summaryPath, sweepPath, 'work', 'absi_blsub_mech')

20210326-01
20210326-10
20210326-11
20210326-02
20210326-03
20210326-04
20210326-05
20210326-06
20210326-07
20210326-08
20210326-09
20210527-c1-scan
20210527-c8-scan
20210527-c7-scan
20210527-c2-scan
20210527-c3-scan
20210527-c4-scan
20210527-c5-scan
20210331-c1-scan
20210331-c2-scan
20210331-c3-scan
20210610-c1-scan
20210610-c2-scan
20210610-c8-scan
20210610-c4-scan
20210610-c5-scan
20210610-c6-scan
20210610-c7-scan
20210610-c9-scan
20210914-c1-scan
20210914-c2-scan
20210914-c3-scan
20210914-c4-scan
20210914-c5-scan
20210914-c6-scan
20210914-c7-scan
20210914-c8-scan
20210914-c9-scan
20210607-c1-scan
20210607-c2-scan
20210607-c3-scan
20210607-c4-scan
20210607-c5-scan
20210607-c6-scan
20210607-c7-scan
20210522-01
20210522-02
20210522-03
20210522-04
20210522-05
20210522-06
20210522-07
20210522-09
20210323-02
20210323-03
20210323-04
20200311-8
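
The line fit applied to a selected region of the current-versus-work curve can be sketched non-interactively as follows. This is a minimal illustration, not the internals of `find_slopes`: the function name `fit_slope` and the bounds `w_min` and `w_max` are hypothetical stand-ins for the region selected with the mouse, and an ordinary least-squares fit via `numpy.polyfit` is assumed.

```python
import numpy as np


def fit_slope(work, abs_current, w_min, w_max):
    """Least-squares line fit of absolute current vs. work within [w_min, w_max]."""
    work = np.asarray(work, dtype=float)
    abs_current = np.asarray(abs_current, dtype=float)
    mask = (work >= w_min) & (work <= w_max)
    if mask.sum() < 2:
        raise ValueError("need at least two points in the selected region")
    # polyfit returns coefficients from highest degree down: slope, then intercept.
    slope, intercept = np.polyfit(work[mask], abs_current[mask], 1)
    return float(slope), float(intercept)


# Toy approach curve: no current below work = 4, then a linear rise with slope 3.
work = np.linspace(0.0, 10.0, 101)
abs_current = np.where(work < 4.0, 0.0, 3.0 * (work - 4.0))
slope, intercept = fit_slope(work, abs_current, w_min=5.0, w_max=9.0)
```

Restricting the fit to a hand-picked window keeps the flat sub-threshold part of the curve from biasing the slope.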
