# Extraction of electrical features (eFeatures) from experimental data

____

## Overview (from last week)

____

In this tutorial we will see how to extract electrical features (eFeatures), such as spike amplitude, firing frequency, etc... from experimental traces. The eFeatures describe the electrical behavior our neuron model should reproduce.

The steps we will follow are:

* Select and visualize the data.

* Electrophysiological features will be extracted from the voltage traces, thanks to the ** Electrophys Feature Extraction Library ** [eFEL](https://github.com/BlueBrain/eFEL).

* We will use experimental current traces to create protocols that we will use to simulate our neuron model.

* In weeks 10 and 11 we will use the **Blue Brain Python Optimisation Library** [BluePyOpt](https://github.com/BlueBrain/BluePyOpt) to create a model template for the [NEURON simulator](https://www.neuron.yale.edu/neuron/). There you'll see how the morphology you've chosen, the eFeatures and the stimuli will be combined in setting up the optimization of your neuron model.

___
### You will implement part of the code in order to perform the steps above (follow the **TODOs** in this notebook).
___

We first import some useful Python modules.

In [1]:
%load_ext autoreload
%autoreload

import numpy, IPython
import json, os

import matplotlib.pyplot as plt
%matplotlib notebook
plt.rcParams['figure.figsize'] = 10, 10

import collections

from json2html import *

# 1. Electrophysiology data

The folder "data" should contain all the traces you have downloaded for the neuron you want to model. If not, exctract now the traces in the "data" folder.

In this section we will process the electrophysiological data recorded with patch clamp (current clamp) experiments.

We store the data in a Python dictionary.

In [2]:
# Define the directory containing the traces
data_dir = 'data/'
!ls data

DA07_APWaveform_ch2_1056.dat  DA07_IDRest_ch2_3107.dat	DA07_IV_ch2_2029.dat
DA07_APWaveform_ch2_1057.dat  DA07_IDRest_ch2_3108.dat	DA07_IV_ch2_2030.dat
DA07_APWaveform_ch2_1058.dat  DA07_IDRest_ch2_3109.dat	DA07_IV_ch2_2031.dat
DA07_APWaveform_ch2_1059.dat  DA07_IDRest_ch2_3110.dat	DA07_IV_ch2_2032.dat
DA07_APWaveform_ch2_1060.dat  DA07_IDRest_ch2_3111.dat	DA07_IV_ch2_2033.dat
DA07_APWaveform_ch2_1061.dat  DA07_IDRest_ch2_3112.dat	DA07_IV_ch2_2034.dat
DA07_APWaveform_ch2_2037.dat  DA07_IDRest_ch2_4099.dat	DA07_IV_ch2_2035.dat
DA07_APWaveform_ch2_2038.dat  DA07_IDRest_ch2_4100.dat	DA07_IV_ch2_2036.dat
DA07_APWaveform_ch2_2039.dat  DA07_IDRest_ch2_4101.dat	DA07_IV_ch2_3018.dat
DA07_APWaveform_ch2_2040.dat  DA07_IDRest_ch2_4102.dat	DA07_IV_ch2_3019.dat
DA07_APWaveform_ch2_2041.dat  DA07_IDRest_ch2_4103.dat	DA07_IV_ch2_3020.dat
DA07_APWaveform_ch2_2042.dat  DA07_IDRest_ch2_4104.dat	DA07_IV_ch2_3021.dat
DA07_APWaveform_ch2_3029.dat  DA07_IDRest_ch2_4105.dat	DA07_IV_ch2_3022.dat

___
### Traces description

* All the recordings you see above represent different **stimuli** (e.g. "APWaveform", "IDRest", "IV"). 
* Each stimulus comprises different **sweeps** (e.g. "APWaveform*46-51"), of increasing/decreasing amplitudes.
* Each stimulus is repeated multiple times (e.g. APWaveform 46-51, 1042-1047, 2042-2047, 3042-3047 ). In this example above we have four **repetitions** of each stimulus.

Any individual recording has a trace number (e.g. "_1046"). Note that we have pairs of recordings with the same trace number (e.g. "exp_APWaveform_ch7_51.dat" and "exp_APWaveform_ch6_51.dat"). One of them contains the current stimulus (in this case "*ch7*") and the other the voltage response (in this case "*ch6*").
___

**TODO** With the code below we select traces based on trace number and store them in Python dictionaries. Last week we chose three stimuli and three repetitions. You are free to choose less stimuli, if you think that the firing of your neuron can be well described with a smaller stimuli subset. However, you should choose more than one repetition, as we've seen that the same cell may respond a bit differently although the stimulus is the same.

If you choose different stimuli, look carefully at the code below the "TODO", to have the appropriate number of entries and names in the "steps_v_dict" and "stepv_i_dict".

In [3]:
# TODO modify the line below to write the trace numbers of your choice in the list "selected traces"
selected_traces = [1129,1130,1131,1132,4090, 1045, 1056,1057,1058,1059,4099,4100,4101] #better to choose the last (bigger amplitude) for 1000, 2000,3000,4000

# Store voltage data in a dictionary step_name : [list of repetitions]
steps_v_dict = collections.OrderedDict({'LongStepNeg': [], 'ShortStepPos': [], 'LongStepPos': []})

# Store current data in a dictionary step_name : [list of repetitions]
steps_i_dict = collections.OrderedDict({'LongStepNeg': [], 'ShortStepPos': [], 'LongStepPos': []})

# Import the glob Python module to interact with the data directory
import glob

files_list = glob.glob1(data_dir, "*.dat")

for file_name in files_list:
    # Get channel and trace number from the file_name
    channel = int(file_name[:-4].split('_')[2][2:])
    tracenum = int(file_name[:-4].split('_')[-1])
    
    # Even channel numbers are voltage traces in this case
    if channel % 2 == 0:
        if "APWaveform" in file_name and tracenum in selected_traces:
            steps_v_dict['ShortStepPos'].append(numpy.fromfile(os.path.join(data_dir,file_name)))
        if "IDRest" in file_name and tracenum in selected_traces:
            steps_v_dict['LongStepPos'].append(numpy.fromfile(os.path.join(data_dir,file_name)))
        if "IV" in file_name and tracenum in selected_traces:
            steps_v_dict['LongStepNeg'].append(numpy.fromfile(os.path.join(data_dir,file_name)))
            
    # Odd channel numbers are voltage traces in this case        
    elif channel % 2 == 1:
        if "APWaveform" in file_name and tracenum in selected_traces:
            steps_i_dict['ShortStepPos'].append(numpy.fromfile(os.path.join(data_dir,file_name)))
        if "IDRest" in file_name and tracenum in selected_traces:
            steps_i_dict['LongStepPos'].append(numpy.fromfile(os.path.join(data_dir,file_name)))
        if "IV" in file_name and tracenum in selected_traces:
            steps_i_dict['LongStepNeg'].append(numpy.fromfile(os.path.join(data_dir,file_name)))
        

**TODO** We can now plot these traces.

In [4]:
# TODO: if you run this cell, without any modification, you should see the traces. 
# Each subpart of the figure shows the repetitions of one stimulus

# Initialize a figure
fig1, axes = plt.subplots(len(steps_v_dict), sharey = True)

# Plot the voltage traces
for idx, step_name in enumerate(steps_v_dict.keys()):
    for rep, trace in enumerate(steps_v_dict[step_name]):
        data = trace.reshape(len(trace)/2,2)
        axes[idx].plot(data[:,0],data[:,1], label = 'Rep. ' + str(rep+1))
        axes[idx].set_ylabel('Voltage (mV)')
        axes[idx].legend(loc = 'best')
        axes[idx].set_title(step_name)
    axes[-1].set_xlabel('Time (ms)')

<IPython.core.display.Javascript object>

# 2. Electrophysiological features
To build a detailed neuron model, we need to quantify the electrical behavior we want to reproduce. The metrics we use are the eFeatures, that measure parameters describing for instance the shape of the action potential or the firing rate of a neuron.

The eFeatures extracted from the data and later from the model will be used to compare the model's responses with the experimental data. The mean features values, along with the standard deviations will be stored in a .json file.

**TODO** You will define the information on the stimulus start and end times, along with the eFeatures you want to extract. Look at [here](http://bluebrain.github.io/eFEL/eFeatures.html) to have an idea on the eFeatures that you can extract or use the function "efel.getFeatureNames()"

In [8]:
# Extract features
import efel

# TODO: Look at the plots above to find the stimulus start and end time for each stimulus (in ms),
# Replace "0" and "10000' with stimulus start and end times
steps_info = {'LongStepNeg': [250, 3800], 'ShortStepPos': [200, 900], 'LongStepPos': [50, 10000]}


# TODO: write here the feature names of your choice, for each stimulus you've chosen
LongStepNeg_feat = ['voltage_base', 'voltage_deflection_begin']
LongStepPos_feat = ['AP_amplitude']
ShortStepPos_feat = ['AHP_depth_abs']

# Prepare the traces for eFEL
def get_features(data):
    # All the traces converted in eFEL format
    efel_traces = {'LongStepNeg': [], 'ShortStepPos': [], 'LongStepPos': []}
    for step_name, step_traces in data.items():
        for rep in step_traces:            
            data = rep.reshape(len(rep)/2,2)
            # A single eFEL trace 
            trace = {}
            trace['T'] = data[:,0]
            trace['V'] = data[:,1] 
            trace['stim_start'] = [steps_info[step_name][0]]
            trace['stim_end'] = [steps_info[step_name][1]]
            trace['name'] = step_name
            
            efel_traces[step_name].append(trace)
    
    features_values = collections.defaultdict(dict)       
    
    features_values['LongStepNeg'] = efel.getMeanFeatureValues(efel_traces['LongStepNeg'], LongStepNeg_feat
                                                                )
    
    features_values['LongStepPos'] = efel.getMeanFeatureValues(efel_traces['LongStepPos'], LongStepPos_feat)
    
    features_values['ShortStepPos'] = efel.getMeanFeatureValues(efel_traces['ShortStepPos'], ShortStepPos_feat)    

    return features_values

#efel.getFeatureNames()

**TODO** We can now visualise the feature values we computed, each row in the table corresponds to a repetition of the same step.

In [9]:
# TODO: run the code below to visualize the features extracted from each repetition, each stimulus. 
# Do these values make sense?

efel_features = dict(get_features(steps_v_dict))
IPython.display.HTML(json2html.convert(json=efel_features))

0,1
LongStepPos,AP_amplitude58.364583331462.848958331267.343750002359.031249996557.0781249953
ShortStepPos,AHP_depth_abs-70.8125-73.75-76.3124999995
LongStepNeg,voltage_base-73.3308017928voltage_deflection_begin9.60735

AP_amplitude
58.3645833314
62.8489583312
67.3437500023
59.0312499965
57.0781249953

AHP_depth_abs
-70.8125
-73.75
-76.3124999995

0,1
voltage_base,-73.3308017928
voltage_deflection_begin,9.60735


We compute features mean and standard deviation.

In [10]:
#TODO: run the code below to compute the mean and standard deviations from the repetitions of each stimulus

features_dict = collections.OrderedDict()
for step_name, reps in efel_features.items():
    feature_values = collections.defaultdict(list)
    for rep in reps: 
        for feature_name, value in rep.iteritems():
            feature_values[feature_name].append(value)
   
    features_dict[step_name] = {"soma":{}}
    for name, values in feature_values.items():
        features_dict[step_name]["soma"][name] = [numpy.mean(values), numpy.std(values)]
        
IPython.display.HTML(json2html.convert(json=dict(features_dict)))

TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'

We write the eFeatures in a .json file that we will use later in the exercise.

In [9]:
# TODO: run the code below to save the efeatures in a .json file. You can also open it with a text editor and compare
# with the same file we obtained last week.
with open('features.json', 'w') as fp:
    json.dump(features_dict, fp, indent = 4)

## 3. Write out the stimulation protocols

Now it's time to process the current stimuli that were used to record the voltage responses seen above.

We will estimate the stimuli amplitude from the trace and save them in a file "protocols.json". They will be used later on in the project to stimulate your neuron model.

**TODO** Plot the current traces

In [12]:
# TODO: run this cell to visualize the current stimuli. The graph should appear similar 
# to the one with the voltage responses (multiple repetitions grouped by stimulus name)

# Plot the current traces
# Initialize a figure
fig1, axes = plt.subplots(len(steps_i_dict), sharey = True)

for idx, step_name in enumerate(steps_i_dict.keys()):
    for rep, trace in enumerate(steps_i_dict[step_name]):
        data = trace.reshape(len(trace)/2,2)
        axes[idx].plot(data[:,0],data[:,1], label = 'Rep. ' + str(rep+1))
        axes[idx].set_ylabel('Current (nA)')
        axes[idx].legend(loc = 'best')
        axes[idx].set_title(step_name)
    axes[-1].set_xlabel('Time (ms)')

<IPython.core.display.Javascript object>

**TODO** copy the "steps_info" dictionary.

In [None]:
protocols_dict = collections.OrderedDict()

# TODO: Replace the line below to copy the "steps_info" dictionary. 
# We will use the stimuli start and end to write the current protocol to simulate the response in our neuron
steps_info = {'LongStepNeg': [0, 10000], 'ShortStepPos': [0, 10000], 'LongStepPos': [0, 10000]}

# Stimuli holding current and step current amplitudes in nA
amps_info = collections.defaultdict(list)
for step_name in steps_i_dict.keys():
    
    iholds = []
    isteps = []
    for trace in steps_i_dict[step_name]:
        data = trace.reshape(len(trace)/2,2)
        tot_duration = steps_info[step_name][1]+steps_info[step_name][0]
   
        dt = float(tot_duration)/len(data)
        ihold = numpy.mean(data[:,1][0:int(steps_info[step_name][0]/dt)])

        istep = numpy.mean(data[:,1][int(steps_info[step_name][0]/dt):int(steps_info[step_name][1]/dt)])-ihold
        iholds.append(ihold)
        isteps.append(istep)
       
    amps_info[step_name].append(round(numpy.mean(isteps), 4))
    amps_info[step_name].append(round(numpy.mean(iholds), 4)) 
    
#amps_info  = {'LongStepNeg': [-0.01, 0.05], 'ShortStepPos': [0.18,0.05],'LongStepPos': [0.15 ,0.05]}

for step_name, reps in efel_features.items():   
    protocols_dict[step_name] = {"stimuli":[]}
    protocols_dict[step_name]["stimuli"].append({"delay":steps_info[step_name][0],
                                               "amp":amps_info[step_name][0],
                                               "duration":steps_info[step_name][1]-steps_info[step_name][0],
                                               "totduration":steps_info[step_name][1]+steps_info[step_name][0]})
    protocols_dict[step_name]["stimuli"].append({"delay":0,
                                               "amp":amps_info[step_name][1],
                                               "duration":steps_info[step_name][1]+steps_info[step_name][0],
                                               "totduration":steps_info[step_name][1]+steps_info[step_name][0]})
 

In [11]:
# TODO: run the line below to visualize the protocols that we have computed. 
# For each stimulus you should have two lines, representing the step current parameters and the holding current parameters
IPython.display.HTML(json2html.convert(json=dict(protocols_dict)))

0,1
LongStepPos,stimulidelayampdurationtotduration2500.08273600410000.060241004100
ShortStepPos,stimulidelayampdurationtotduration2500.107922572500.06725725
LongStepNeg,stimulidelayampdurationtotduration250-0.05833000350000.059935003500

0,1
stimuli,delayampdurationtotduration2500.08273600410000.060241004100

delay,amp,duration,totduration
250,0.0827,3600,4100
0,0.0602,4100,4100

0,1
stimuli,delayampdurationtotduration2500.107922572500.06725725

delay,amp,duration,totduration
250,0.1079,225,725
0,0.06,725,725

0,1
stimuli,delayampdurationtotduration250-0.05833000350000.059935003500

delay,amp,duration,totduration
250,-0.0583,3000,3500
0,0.0599,3500,3500


In [12]:
# TODO: save the protocols in the "protocols.json" information
# Save the protocols in a .json file
with open('protocols.json', 'w') as fp:
    json.dump(protocols_dict, fp, indent = 4)