# Smestern ABF to NWB conversion pipeline


---



This tutorial shows how to convert ABF files to NWB files in a 'cell-dependent' manner, meaning that all recordings from one cell end up in a single NWB file. First, we have to install some dependencies.

In [None]:
!git clone https://github.com/smestern/ipfx.git
!git clone https://github.com/smestern/example-abf-files.git

Cloning into 'ipfx'...
remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 5067 (delta 3), reused 9 (delta 3), pack-reused 5054
Receiving objects: 100% (5067/5067), 32.88 MiB | 25.72 MiB/s, done.
Resolving deltas: 100% (3650/3650), done.
Cloning into 'example-abf-files'...
remote: Enumerating objects: 28, done.
remote: Total 28 (delta 0), reused 0 (delta 0), pack-reused 28
Unpacking objects: 100% (28/28), done.


In [None]:
!apt-get install -qq /content/ipfx
!pip uninstall statsmodels -y
!pip uninstall tables -y
!pip install statsmodels==0.9.0
!pip install tables==3.5.1
!pip install /content/ipfx --log /content/log.txt


E: Unsupported file /content/ipfx given on commandline
Uninstalling statsmodels-0.10.2:
  Successfully uninstalled statsmodels-0.10.2
Uninstalling tables-3.4.4:
  Successfully uninstalled tables-3.4.4
Collecting statsmodels==0.9.0
  Downloading https://files.pythonhosted.org/packages/85/d1/69ee7e757f657e7f527cbf500ec2d295396e5bcec873cf4eb68962c41024/statsmodels-0.9.0-cp36-cp36m-manylinux1_x86_64.whl (7.4MB)
Installing collected packages: statsmodels
Successfully installed statsmodels-0.9.0
Collecting tables==3.5.1
  Downloading https://files.pythonhosted.org/packages/ab/79/4e1301a87f3b7f27aa6c9cb1aeba4875ff3edb62a6fe3872dc8f04983db4/tables-3.5.1-cp36-cp36m-manylinux1_x86_64.whl (4.3MB)
Collecting mock>=2.0
  Downloading https://files.pythonhosted.org/packages/30/6a/9bde648117ec7087c89a45de0a8b25aba21d54d3defd08cb24eacded875f/mock-4.0.1-py3-none-any.whl
Install

The commands above download a few example ABF files to be used in this tutorial. They also download and install my (slightly modified) version of the Allen Institute's IPFX, which is the package used for the conversion.

## Step 1 Generate your JSON files

IPFX's primary input files are JavaScript Object Notation (.json) files. To convert ABF files into NWB files for use with IPFX, we need to create or modify three JSON files:

1.   mcc-settings.json
2.   stimulus_ontology.json
3.   conversion_input.json



### mcc-settings.json

ABF files provide only a small amount of data about the experiment. As a result, some information needs to be extracted from the programs used to acquire the data. Specifically, we need to pull information from MultiClamp Commander (MCC). This can be achieved using '[mcc_get_settings.py](https://github.com/smestern/ipfx/blob/master/ipfx/bin/mcc_get_settings.py)' from the ipfx package. (NOTE: if you attempt this, it requires a 32-bit installation of Python.)

Ideally, this would be done at the time of the experiment. However, since these experiments were done in the past, the file has to be composed after the fact. I generated this file on my lab computer using the '[mcc_get_settings.py](https://github.com/smestern/ipfx/blob/master/ipfx/bin/mcc_get_settings.py)' script, and it is provided in the example data.

Please see the documented example below (terms are defined in the Axon Guide: [https://mdc.custhelp.com/euf/assets/content/Axon%20Guide%203rd%20edition.pdf](https://mdc.custhelp.com/euf/assets/content/Axon%20Guide%203rd%20edition.pdf)):



```
{
    "00830749_1": { //this section refers to the settings pulled from Multi-clamp commander
        "GetFastCompCap": 7.385518412117431e-12, //These are auto calculated by MCC. The current value is about average for our experiments.
        "GetFastCompTau": 9.270073064726603e-07, //These are auto calculated by MCC. The current value is about average for our experiments.
        "GetHolding": -0.0699935033917427,
        "GetHoldingEnable": false,
        "GetLeakSubEnable": false,
        "GetLeakSubResist": 10000000.0,
        "GetMeterIrmsEnable": false,
        "GetMeterResistEnable": false,
        "GetMode": 0, //For most cases mode '0' = current clamp, mode '1' = voltage clamp. However this does not really matter as the ABF file contains this information
        "GetModeSwitchEnable": false,
        "GetOscKillerEnable": false,
        "GetOutputZeroEnable": false,
        "GetPipetteOffset": 35.03935899958014488, // for our data, the pipette offset hovers at approximately 30-40 mV. This will vary depending on the lab.
        "GetPrimarySignal": 0,
        "GetPrimarySignalGain": 5.0, //this is experiment specific, for our 'NHP' recording, the current clamp gain was set to 5. 
        "GetPrimarySignalHPF": 0.0,
        "GetPrimarySignalLPF": 1600.0,
        "GetPulseAmplitude": 5.009999999776482582,
        "GetPulseDuration": 0.009999999776482582,
        "GetRsCompBandwidth": 1020.47998046875,
        "GetRsCompCorrection": 0.0,
        "GetRsCompEnable": false,
        "GetRsCompPrediction": 0.0,
        "GetScopeSignalLPF": 0.0,
        "GetSecondarySignal": 1,
        "GetSecondarySignalGain": 1.0,
        "GetSecondarySignalLPF": 10000.0,
        "GetSlowCompCap": 1.4285714538403438e-12, //These are auto calculated by MCC. The current value is about average for our experiments.
        "GetSlowCompTau": 0.00010256410314468667, //These are auto calculated by MCC. The current value is about average for our experiments.
        "GetSlowCompTauX20Enable": false,
        "GetWholeCellCompCap": 3.300633377723017e-11,
        "GetWholeCellCompEnable": false,
        "GetWholeCellCompResist": 9997999.0,
        "GetZapDuration": 0.0005000000237487257
    },
    "ScaleFactors": { //Refers to the scale applied to each stimulus.
        "C1NSD1SHORT": 1.05,
        "IC1": 5,
        "H:\\Monkey\\Ephys Protocols\\Monkey #3-4 Protocols\\Monkey Gap free": 5, //Enter your custom stimuli here
        "H:\\Monkey\\Ephys Protocols\\Monkey #3-4 Protocols\\Monkey_1000 ms step": 5,
        "H:\\Monkey\\Ephys Protocols\\Monkey #3-4 Protocols\\Monkey_3 ms step": 5,
        "LSFINEST": 1.05,
        "SSFINEST": 7,
        "TRIPPLE": 7
    },
    "timestamp": "2019-09-30T12:52:31.387664Z",
    "uids": {
        "IN0": "00830749_1" //this links the 'input' (recording) electrode to the above MCC settings. 
        //In our case, the input electrode is labeled as IN0 in our ABF files. Therefore we must state the IN0 is the 'same' 
        // as 00830749_1
    }
}
```
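Note that the `//` comments in the example above are annotations for this tutorial; a strict JSON parser such as Python's `json` module rejects them, so they must not appear in the real file. As a rough sanity check before conversion, the sketch below confirms that every electrode listed under `uids` points at a settings section that actually exists. The helper name `check_mcc_settings` is hypothetical (it is not part of IPFX), and the keys it checks are simply the ones shown in the example above.

```python
import json


def check_mcc_settings(settings):
    """Return a list of problems found in a parsed mcc-settings dictionary.

    Hypothetical helper for this tutorial, not an IPFX function.
    """
    problems = []
    # Assumption: these top-level keys are expected because the example has them.
    for key in ("ScaleFactors", "uids"):
        if key not in settings:
            problems.append(f"missing top-level key: {key}")
    # Every electrode listed under "uids" must point at a settings section.
    for electrode, uid in settings.get("uids", {}).items():
        if uid not in settings:
            problems.append(f"{electrode} -> {uid}: no matching settings section")
    return problems


# Trimmed-down example; a real file carries the full set of MCC fields.
example = json.loads("""
{
    "00830749_1": {"GetMode": 0, "GetPrimarySignalGain": 5.0},
    "ScaleFactors": {"IC1": 5},
    "timestamp": "2019-09-30T12:52:31.387664Z",
    "uids": {"IN0": "00830749_1"}
}
""")
print(check_mcc_settings(example))
```

An empty list means the electrode-to-settings links are consistent; it says nothing about whether the individual amplifier values are sensible for your rig.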



### stimulus_ontology.json

Most ABF files have the stimulus waveform built into them, so no extra steps are needed to convert the stimulus into the NWB file (this is done automatically).

However, the stimulus ontology allows IPFX to link specific stimuli to the equivalent stimulus used at the Allen Institute. This is crucial when using the resulting NWB file with the Allen Institute's suite of tools.

To generate this file, examine the stimulus of each ABF file (this can be done in *Clampfit* by navigating to edit->Create Stimulus Waveform Signal; the stimulus protocol name can be found in file->properties). Then find the matching stimulus in the Allen Institute's list of protocols (found here: [http://help.brain-map.org/download/attachments/8323525/CellTypes_Ephys_Overview.pdf?version=2&modificationDate=1508180425883&api=v2](http://help.brain-map.org/download/attachments/8323525/CellTypes_Ephys_Overview.pdf?version=2&modificationDate=1508180425883&api=v2)). For example, our protocol "Monkey_1000 ms step.pro" matches the Allen Institute's "long pulse":

  **Monkey_1000 ms step.pro**
  ![Monkey_1000 ms step.pro](https://i.imgur.com/YJ7WVQp.png)
  **AI Long pulse**
  ![LONG PULSE](https://i.imgur.com/pHBuy8W.png)


Therefore, I have to create an entry in 'stimulus_ontology.json' to link these two protocols:

```
[
        [
            "code", //The name of the protocol file used to create the ABF
            "Monkey_1000 ms step"
        ],
        [
            "name", //The name of the AI protocol you want to link it to
            "Long Square"
        ]
]
```

The finished stimulus_ontology.json should be placed in /ipfx/ipfx/defaults/
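If you have many protocols to link, the entries can be generated rather than typed by hand. The sketch below derives the "code" value from a protocol path by dropping the folder and the `.pro` extension, then pairs it with the Allen Institute stimulus name. The helper `ontology_entry` is hypothetical, and the full path used here is only an illustration assembled from the protocol names that appear in this tutorial.

```python
import json
import ntpath


def ontology_entry(protocol_path, allen_name):
    """Build one ontology entry linking an ABF protocol to an Allen Institute stimulus name.

    Hypothetical helper for this tutorial, not an IPFX function.
    """
    # ntpath handles Windows-style paths regardless of the OS running this code.
    code = ntpath.splitext(ntpath.basename(protocol_path))[0]
    return [["code", code], ["name", allen_name]]


# Illustrative path; substitute the protocol reported for your own ABF file.
entry = ontology_entry(
    "H:\\Monkey\\Ephys Protocols\\Monkey #3-4 Protocols\\Monkey_1000 ms step.pro",
    "Long Square",
)
print(json.dumps(entry, indent=4))
```

The printed entry has the same shape as the example above and can be pasted into stimulus_ontology.json.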


### conversion_input.json

This file is not needed for my script. However, if you are using an unmodified version of IPFX, you will need to create an input.json for the x_to_NWB script.

## Step 2 Organize your files

For our experiments, each cell had several recordings (ABF files) associated with it. For my purposes, I needed each NWB file to represent a single cell, so I set out to convert a collection of ABF files into a single NWB file. My script assumes that all ABF files found in a single folder belong to a single cell (and builds one NWB file from that folder). Therefore, it is handy to organize your ABF files like so:



```
|- Main folder
|-|----Cell_1
|-|----|---Cell_1_file1.abf
|-|----|---Cell_1_file2.abf
|-|----Cell_2
|-|----|---Cell_2_file1.abf
|-|----|---Cell_2_file2.abf
```



Note two things:

1.   Each folder contains only ABF files recorded from a specific cell.
2.   Each folder is named after that cell.

Now we can point the script at the 'main folder' and it will build two distinct NWB files:


1.   Cell_1.nwb (containing only:  Cell_1_file1.abf , Cell_1_file2.abf)
2.   Cell_2.nwb (containing only:  Cell_2_file1.abf , Cell_2_file2.abf)
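Before running the conversion, it can help to preview which ABF files will end up in which NWB file. The sketch below is a minimal illustration using only the standard library; the helper `plan_conversion` is hypothetical, and the demo builds a throwaway copy of the layout above with empty placeholder files rather than real recordings.

```python
import os
import tempfile


def plan_conversion(main_folder):
    """Map each expected NWB file name to the ABF files in its cell folder."""
    plan = {}
    for name in sorted(os.listdir(main_folder)):
        cell_dir = os.path.join(main_folder, name)
        if not os.path.isdir(cell_dir):
            continue  # only folders count as cells
        abfs = sorted(f for f in os.listdir(cell_dir) if f.lower().endswith(".abf"))
        plan[name + ".nwb"] = abfs
    return plan


# Build a temporary copy of the folder layout with empty placeholder files.
with tempfile.TemporaryDirectory() as main_folder:
    for cell in ("Cell_1", "Cell_2"):
        os.makedirs(os.path.join(main_folder, cell))
        for i in (1, 2):
            open(os.path.join(main_folder, cell, f"{cell}_file{i}.abf"), "w").close()
    plan = plan_conversion(main_folder)

for nwb_name, abfs in plan.items():
    print(nwb_name, "<-", abfs)
```

Pointing `plan_conversion` at your own main folder gives a quick check that no recording has been filed under the wrong cell.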





## Step 3 Run run_NHP_to_nwb_conversion.py

run_NHP_to_nwb_conversion.py is a customized version of "run_x_to_nwb_conversion.py" that lets the user batch convert ABF files into NWB files more efficiently (based on the folder structure above).

Below I will walk you through the code.

To begin with, we need to define the main imports and primary function.
These are unmodified from Allen Institute's "run_x_to_nwb_conversion.py":


In [None]:
#!/bin/env python
import shutil
import os
import argparse
import logging
log = logging.getLogger(__name__)
import pyabf
from ipfx.x_to_nwb.ABFConverter import ABFConverter
from ipfx.x_to_nwb.DatConverter import DatConverter
import numpy as np

def convert(inFileOrFolder, overwrite=False, fileType=None, outputMetadata=False, outputFeedbackChannel=False, multipleGroupsPerFile=False, compression=True):
    """
    Convert the given file to a NeuroDataWithoutBorders file using pynwb
    Supported fileformats:
        - ABF v2 files created by Clampex
        - DAT files created by Patchmaster v2x90
    :param inFileOrFolder: path to a file or folder
    :param overwrite: overwrite output file, defaults to `False`
    :param fileType: file type to be converted, must be passed iff `inFileOrFolder` refers to a folder
    :param outputMetadata: output metadata of the file, helpful for debugging
    :param outputFeedbackChannel: Output ADC data which stems from stimulus feedback channels (ignored for DAT files)
    :param multipleGroupsPerFile: Write all Groups in the DAT file into one NWB
                                  file. By default we create one NWB per Group (ignored for ABF files).
    :param compression: Toggle compression for HDF5 datasets
    :return: path of the created NWB file
    """

    if not os.path.exists(inFileOrFolder):
        raise ValueError(f"The file {inFileOrFolder} does not exist.")

    if os.path.isfile(inFileOrFolder):
        root, ext = os.path.splitext(inFileOrFolder)
    if os.path.isdir(inFileOrFolder):
        if not fileType:
            raise ValueError("Missing fileType when passing a folder")

        inFileOrFolder = os.path.normpath(inFileOrFolder)
        inFileOrFolder = os.path.realpath(inFileOrFolder)

        ext = fileType
        root = os.path.join(inFileOrFolder, "..",
                            os.path.basename(inFileOrFolder))

    outFile = root + ".nwb"

    if not outputMetadata and os.path.exists(outFile):
        if overwrite:
            os.remove(outFile)
        else:
            raise ValueError(f"The output file {outFile} does already exist.")

    if ext == ".abf":
        if outputMetadata:
            ABFConverter.outputMetadata(inFileOrFolder)
        else:
            ABFConverter(inFileOrFolder, outFile, outputFeedbackChannel=outputFeedbackChannel, compression=compression)
    elif ext == ".dat":
        if outputMetadata:
            DatConverter.outputMetadata(inFileOrFolder)
        else:
            DatConverter(inFileOrFolder, outFile, multipleGroupsPerFile=multipleGroupsPerFile, compression=compression)

    else:
        raise ValueError(f"The extension {ext} is currently not supported.")

    return outFile

This defines the 'convert' function, which we call repeatedly in order to generate our NWB files.
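When `convert` is given a folder, it names the output after that folder and writes it one level up, next to the cell folder. The sketch below mirrors that naming logic so you can predict where each NWB file will land. The helper `expected_nwb_path` is hypothetical; unlike `convert`, it also normalizes the result so the `..` segment is collapsed for readability.

```python
import os


def expected_nwb_path(cell_folder):
    """Mirror how convert() names its output when given a folder."""
    folder = os.path.realpath(os.path.normpath(cell_folder))
    root = os.path.join(folder, "..", os.path.basename(folder))
    # normpath collapses the ".." so the result is easier to read.
    return os.path.normpath(root + ".nwb")


print(expected_nwb_path("/content/example-abf-files/Example Files/Cell 1"))
```

This is why the NWB files appear in the root folder rather than inside the individual cell folders.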

Next we will define the location of our files:

In [None]:
NHPPath = "//content//example-abf-files//Example Files//"
os.chdir("//content//")
os.getcwd()
!ls

example-abf-files  ipfx  log.txt  sample_data


This should point to your root folder, which contains all of the cell folders (as outlined above). In our case, we downloaded some example ABF files earlier in this guide, so we point the path to the example folders. (If this notebook is loaded in Google Colab, click the small arrow on the left to view the files.)

Now we are ready to convert:

In [None]:
print("Converting" + NHPPath)
# os.walk steps through every folder and file under NHPPath, including subfolders.
for r, celldir, f in os.walk(NHPPath):
    for c in celldir:  # each subfolder is treated as one cell folder
        c = os.path.join(r, c)  # full path to the cell folder

        # Copy the mcc-settings.json we created earlier into the cell folder;
        # the conversion throws an error if it is missing.
        shutil.copy("/content/example-abf-files/mcc-settings.json", c)
        print(f"Converting {c}")
        try:
            # Convert every ABF file found in the cell folder into one NWB file.
            convert(c,
                    overwrite=True,
                    fileType='.abf',
                    outputMetadata=False,
                    outputFeedbackChannel=False,
                    multipleGroupsPerFile=True,
                    compression=True)
        except Exception as err:
            # Report the failure and move on to the next cell folder.
            print(f"Failed to convert {c}: {err}")

Converting//content//example-abf-files//Example Files//
Converting //content//example-abf-files//Example Files//Cell 2


  warn("Date is missing timezone information. Updating to local timezone.")


Converting //content//example-abf-files//Example Files//Cell 1


Despite the warning, the conversion was a success! As you can now see, the script copied mcc-settings.json into each cell folder. Additionally, two new NWB files exist in our root folder. These files are ready to be loaded into other programs!

A quick look in HDFview reveals that everything is as it should be:
![hdfview](https://i.imgur.com/kdIcmiY.png)
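Alongside a visual check in HDFView, a scripted check can confirm that every cell folder produced an output. The sketch below uses only the standard library and looks at file names, not file contents; the helper `conversion_report` is hypothetical, and the root path is the example location used in this tutorial.

```python
import os


def conversion_report(root_folder):
    """For each cell folder, report whether its NWB file and settings copy exist."""
    report = {}
    for name in sorted(os.listdir(root_folder)):
        cell_dir = os.path.join(root_folder, name)
        if not os.path.isdir(cell_dir):
            continue  # skip the NWB files and anything else that is not a folder
        report[name] = {
            "nwb": os.path.isfile(os.path.join(root_folder, name + ".nwb")),
            "mcc_settings": os.path.isfile(os.path.join(cell_dir, "mcc-settings.json")),
        }
    return report


# Example location from this tutorial; adjust to your own root folder.
root_folder = "/content/example-abf-files/Example Files"
if os.path.isdir(root_folder):
    for cell, status in conversion_report(root_folder).items():
        print(cell, status)
```

A cell whose `nwb` entry is `False` is one to look for in the conversion output above.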