## Extraction of features to be used by the somatic validation

In [1]:
import json
import logging
from pathlib import Path

In [2]:
import pandas as pd

In [3]:
from extract_features import translate_legacy_targets, get_files_metadata, extract_efeatures

In [4]:
logger = logging.getLogger()
logger.setLevel(logging.ERROR)

In [5]:
traces_dir = Path("..") / "feature_extraction" / "input-traces"

# get all folder names in traces_dir
cell_ids = [x.name for x in traces_dir.iterdir() if x.is_dir()]
print(f"Found {len(cell_ids)} cells")

with open("experiments.json", "r") as f:
    experiments = json.load(f)

Found 44 cells


#### The protocols to be used in extraction

IDRest and IDthresh protocols will only be used in the computation of the rheobase.

In [6]:
experiments.keys()

dict_keys(['IDthresh', 'IDhyperpol', 'sAHP', 'APThreshold'])

In [7]:
experiments["sAHP"]

{'location': 'soma.v',
 'tolerances': [20.0],
 'targets': [150, 170, 200, 220, 250, 270, 300, 350],
 'efeatures': ['Spikecount',
  'AP_amplitude',
  'inv_time_to_first_spike',
  'AHP_depth_abs',
  'sag_ratio1',
  'decay_time_constant_after_stim',
  'steady_state_voltage',
  'minimum_voltage',
  'steady_state_voltage_stimend']}

### Translating the experiments into a format that bluepyefe2 understands

In [8]:
targets = translate_legacy_targets(experiments)
files_metadata = get_files_metadata(traces_dir, cell_ids, experiments)

Cells used 44/44


### Running the extraction

Set `plot=True` to write more detailed plots to the e-type directory.

In [9]:
etype = "L5TPC"
protocols_rheobase = ["IDthresh", "IDRest"]

extract_efeatures(
    etype, files_metadata, targets, protocols_rheobase, plot=False, per_cell=True
)

  self.efeatures[efeature_name] = numpy.nanmean(value)


extracting features for per cell...


## Features extracted from a group of cells

In this section we look at the features extracted from a group of cells with the cADpyr e-type.

In [10]:
with open(Path(etype) / "features.json", "r") as features_file:
    etype_features = json.load(features_file)

All protocols applied to the cells are listed below.

The part of each key before the underscore is the protocol name, such as `sAHP` or `IDhyperpol`.

The number after the underscore is the amplitude of the injected current, expressed as a percentage of the cell's rheobase.

* `sAHP_150`, for example, applies a current equal to 1.5 times the cell's rheobase.

* `sAHP_300` applies a current equal to 3 times the cell's rheobase.
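
The naming convention can also be unpacked programmatically. The following is a minimal sketch and not part of the extraction code: `split_protocol_key` is a hypothetical helper, and it assumes every key has the form `<protocol>_<percentage>` with an integer percentage.

```python
def split_protocol_key(key: str) -> tuple[str, float]:
    """Split a key such as 'sAHP_150' into (protocol name, rheobase multiple)."""
    # Split on the last underscore so that protocol names which themselves
    # contain underscores are still handled.
    name, _, percent = key.rpartition("_")
    if not name or not percent.isdigit():
        raise ValueError(f"Unexpected protocol key: {key!r}")
    # The suffix is a percentage of the rheobase, so 150 means 1.5 times.
    return name, int(percent) / 100


for key in ["sAHP_150", "sAHP_300", "IDhyperpol_220"]:
    name, multiple = split_protocol_key(key)
    print(f"{key}: {name} at {multiple:g}x rheobase")
```

Splitting on the last underscore rather than the first is a defensive choice; none of the keys shown here needs it, but it costs nothing.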

In [11]:
etype_features.keys()

dict_keys(['IDthresh_120', 'IDhyperpol_150', 'IDhyperpol_170', 'IDhyperpol_220', 'IDhyperpol_250', 'IDhyperpol_270', 'sAHP_150', 'sAHP_170', 'sAHP_200', 'sAHP_220', 'sAHP_250', 'sAHP_270', 'sAHP_300', 'sAHP_350', 'APThreshold_300', 'APThreshold_330'])

We are going to use the following function to display the features.

In [12]:
def features_df(features_config: dict, protocol: str) -> pd.DataFrame:
    """Returns the dataframe containing features for the given protocol."""
    df = pd.DataFrame(features_config[protocol]["soma"])
    df["mean"] = df["val"].apply(lambda x : x[0])
    df["variance"] = df["val"].apply(lambda x : x[1])
    df["relative_variance"] = df["variance"] / abs(df["mean"])
    df = df.drop(['val', "efeature_name"], axis=1)
    return df

### Features extracted from the sAHP_250 protocol

The table below shows the features extracted from the recordings made with the sAHP_250 protocol.

`n` is the number of traces used to compute the feature.

In [13]:
sahp_250 = "sAHP_250"
etype_sahp_df = features_df(etype_features, sahp_250)
etype_sahp_df.head(10)

| | feature | n | efel_settings | mean | variance | relative_variance |
|---|---|---|---|---|---|---|
| 0 | Spikecount | 40 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 5.775 | 1.234656 | 0.213793 |
| 1 | AP_amplitude | 40 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 71.731232 | 6.512071 | 0.090784 |
| 2 | inv_time_to_first_spike | 40 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 192.329675 | 59.675707 | 0.310278 |
| 3 | AHP_depth_abs | 40 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | -57.956532 | 4.580531 | 0.079034 |
| 4 | decay_time_constant_after_stim | 40 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 17.32212 | 55.046015 | 3.177787 |
| 5 | steady_state_voltage | 40 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | -75.432364 | 2.144763 | 0.028433 |
| 6 | minimum_voltage | 40 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | -86.296875 | 2.316065 | 0.026838 |
| 7 | steady_state_voltage_stimend | 40 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | -69.69006 | 2.585147 | 0.037095 |


The table above contains the mean, variance and relative variance of each feature, computed from the responses to the sAHP_250 protocol.

The relative variance is the variance divided by the absolute value of the mean. See https://en.wikipedia.org/wiki/Index_of_dispersion for background.
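
As a quick check of that definition, the sketch below recomputes the relative variance for two rows of the sAHP_250 table. `relative_variance` is a hypothetical standalone helper that mirrors the `variance / abs(mean)` expression in `features_df`; the zero-mean guard is an addition that the notebook code does not have.

```python
def relative_variance(mean: float, variance: float) -> float:
    """Return the variance divided by the absolute value of the mean."""
    if mean == 0:
        # Assumption: treat a zero mean as an error rather than returning inf.
        raise ZeroDivisionError("relative variance is undefined for a zero mean")
    return variance / abs(mean)


# Values copied from the Spikecount and AHP_depth_abs rows of the table.
spikecount = relative_variance(5.775, 1.234656)
ahp_depth = relative_variance(-57.956532, 4.580531)

print(f"Spikecount:    {spikecount:.6f}")
print(f"AHP_depth_abs: {ahp_depth:.6f}")
```

Dividing by the absolute value keeps the result positive for features with a negative mean, such as voltages, so the values stay comparable across features.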

### Features extracted from the APThreshold_300 protocol

Similarly, the features extracted from the responses to the APThreshold_300 protocol are as follows.

In [14]:
apthreshold_300 = "APThreshold_300"
apthreshold_df = features_df(etype_features, apthreshold_300)
apthreshold_df.head(10)

| | feature | n | efel_settings | mean | variance | relative_variance |
|---|---|---|---|---|---|---|
| 0 | Spikecount | 69 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 15.710145 | 2.548665 | 0.162231 |
| 1 | AP_amplitude | 69 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 70.697421 | 7.405067 | 0.104743 |
| 2 | inv_first_ISI | 69 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 6.157991 | 0.893994 | 0.145176 |
| 3 | AP1_amp | 69 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 76.23442 | 8.142366 | 0.106807 |
| 4 | APlast_amp | 69 | {'Threshold': -30.0, 'interp_step': 0.1, 'stri... | 66.707065 | 7.761749 | 0.116356 |
