SONATA for Arbor

noraabiakar edited this page Feb 18, 2019 · 5 revisions

This page simplifies and summarizes the SONATA Developer Guide. It lists the elements needed to build a complete description of a simulation network.

Morphologies

Morphologies are described in the SWC format, which has the following structure:

Column Interpretation
1 ID (Sample Number)
2 Type
3 X Position (um)
4 Y Position (um)
5 Z Position (um)
6 Radius (um)
7 Parent ID (Parent Sample Number)

Types are mapped as follows: 1 - soma, 2 - axon, 3 - basal dendrite, 4 - apical dendrite.
IDs must start at 1 and be contiguous.
Other rules are listed in the Developer Guide.
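
The column layout and the ID rule above can be checked with a few lines of Python. This is a minimal sketch, not part of SONATA or Arbor: the names SwcSample and parse_swc are hypothetical, and it assumes whitespace-separated columns, "#" comment lines, and a parent ID of -1 for the root sample, as is conventional in SWC files.

```python
from typing import List, NamedTuple


class SwcSample(NamedTuple):
    """One SWC row, in the column order listed above (hypothetical helper type)."""
    sample_id: int
    type_id: int
    x: float
    y: float
    z: float
    radius: float
    parent_id: int


# Type codes as listed on this page.
SWC_TYPES = {1: "soma", 2: "axon", 3: "basal dendrite", 4: "apical dendrite"}


def parse_swc(text: str) -> List[SwcSample]:
    """Parse SWC text and check that IDs start at 1 and are contiguous."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank and comment lines
            continue
        fields = line.split()
        if len(fields) != 7:
            raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
        samples.append(SwcSample(
            int(fields[0]), int(fields[1]),
            float(fields[2]), float(fields[3]), float(fields[4]),
            float(fields[5]), int(fields[6]),
        ))
    # Enforce the rule stated above: IDs are 1, 2, 3, ... in file order.
    for expected, sample in enumerate(samples, start=1):
        if sample.sample_id != expected:
            raise ValueError(
                f"sample IDs must start at 1 and be contiguous: "
                f"expected {expected}, got {sample.sample_id}")
    return samples
```

The other rules from the Developer Guide are not checked here.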

Ion channels and synapse models

For NEURON, NMODL (.mod) files are used. Potentially, the same could be specified for Arbor.

Biophysical neuron channel distribution and composition

Represents the parameterization and distribution of ion channels and passive properties of neurons**.
Supported file formats:

  1. XML file using a NeuroML "biophysicalProperties" and "concentrationModel". (example).
  2. JSON file using Allen Cell Types Data Base schema (example).
  3. HOC template (example).

** AFAICT, the main point of this file is to set the parameters of density mechanisms and ion channels (specified in mod files). I think JSON would be a good place to start.

Network

Networks are represented as nodes and edges.

Nodes

Two files are needed: a CSV file that describes node types, and an HDF5 file that lists all nodes with their types and properties.

1. CSV file: node types

Has to contain at least the following columns:

  • node_type_id: unique node type id per node population (refer to developer guide for more info on node populations)
  • pop_name: to identify the node population

Other columns that must be included either in the CSV file or per node in the HDF5 file:

  • model_type: one of: biophysical, virtual, single_compartment, and point_neuron
  • model_template: used to reference a template or class describing the electrophysical properties and mechanisms of the node. Specified using the following syntax: schema:resource.
    • schema: type of template being specified. Reserved options include: nml, nrn, nest, hoc, ctdb, pynn.
    • resource: reference to the template file-name or class.
      For Arbor: we could add arb as a new schema for predefined arbor cells such as spiking_cell, lif_cell, and benchmark_cell.
      To add mechanisms to a regular cell we probably have to use nml or hoc (or maybe just use a json file?)
  • dynamics_params: AFAICT refers to the Biophysical neuron channel distribution and composition files discussed above. Sets the parameters of a cell's mechanisms.
  • morphology: refers to the morphology files discussed above.

Other reserved columns that don't need to be added can be found in the Developer Guide. Extra undefined columns can be added for convenience.
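
As an illustration of the node types file, the sketch below reads a small table and splits a model_template value into its schema and resource parts. It is only a sketch under assumptions: the function names, the example rows, and the resource name Cell_A are hypothetical, and the single-space delimiter is an assumption, since this page does not state which delimiter the CSV file uses.

```python
import csv
import io

# Columns this page lists as mandatory in the node types file.
REQUIRED_COLUMNS = ("node_type_id", "pop_name")

# Schemas this page lists as reserved.
RESERVED_SCHEMAS = {"nml", "nrn", "nest", "hoc", "ctdb", "pynn"}


def load_node_types(text, delimiter=" "):
    """Return {node_type_id: row dict}. The delimiter is an assumption."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return {int(row["node_type_id"]): row for row in reader}


def parse_model_template(value):
    """Split 'schema:resource' at the first colon; tolerate spaces around it."""
    schema, sep, resource = value.partition(":")
    schema, resource = schema.strip(), resource.strip()
    if not sep or not schema or not resource:
        raise ValueError(f"expected 'schema:resource', got {value!r}")
    return schema, resource


# Hypothetical example content, not taken from a real network.
example = (
    "node_type_id pop_name model_type model_template\n"
    "100 exc biophysical hoc:Cell_A\n"
    "101 inh point_neuron arb:lif_cell\n"
)
node_types = load_node_types(example)
schema, resource = parse_model_template(node_types[101]["model_template"])
# 'arb' is the schema proposed on this page, so it is not among the reserved ones.
is_reserved = schema in RESERVED_SCHEMAS
```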

2. HDF5 file: full list of nodes

Structured as follows:

nodes  
|_______ population0
|                  |_________ node_group_id
|                  |_________ node_group_index  
|                  |_________ node_id
|                  |_________ node_type_id
|                  |_________ 0
|                  |          |_________ dynamics_params
|                  |          |_________ morphology
|                  |          |_________ etc
|                  |_________ 1
|                  |          |_________ model_template
|                  |_________ 2
|                  |          |_________ ...
|                  |_________ ...
|                  |_________ ...
|                  |_________ _last group id_
|                             |_________ ...
|______ population1  
|                  |_________ ...
|______ population2  
|                  |_________ ...
|______ etc.  
  • node_group_id : Assigns each node to a specific group of nodes
  • node_group_index : Indicates the index within a node_group that contains all the attributes for a particular node under consideration.
  • node_id : Assigns a key to uniquely identify and lookup a node within a population. It is primarily used to specify the source and target of an edge (connection)
  • node_type_id : refers to node_type_id from the CSV file previously discussed.
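
The way node_group_id and node_group_index work together can be sketched in plain Python. Nested dictionaries and lists stand in for the HDF5 groups and datasets, so no HDF5 library is needed to run it. The function name node_attributes and all example values are hypothetical.

```python
# Stand-in for nodes/population0: lists play the role of HDF5 datasets,
# nested dicts play the role of HDF5 groups. All values are made up.
population0 = {
    "node_id":          [10, 11, 12, 13],
    "node_type_id":     [100, 100, 101, 100],
    "node_group_id":    [0, 0, 1, 0],
    "node_group_index": [0, 1, 0, 2],
    "0": {"morphology": ["m0.swc", "m1.swc", "m2.swc"]},
    "1": {"model_template": ["arb:lif_cell"]},
}


def node_attributes(population, node_id):
    """Collect the per-node attributes stored in the node's group."""
    # Find the row of this node in the population-level datasets.
    row = population["node_id"].index(node_id)
    # node_group_id selects the group, node_group_index the entry inside it.
    group = population[str(population["node_group_id"][row])]
    index = population["node_group_index"][row]
    return {name: values[index] for name, values in group.items()}
```

For example, node 13 is the third member of group 0, so node_attributes(population0, 13) picks the entry at index 2 of every dataset in that group.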

Edges

Two files are needed: a CSV file that describes edge types, and an HDF5 file that lists all edges with their types and properties.

1. CSV file: edge types

Has to contain at least the following columns:

  • edge_type_id: unique edge type id per edge population (refer to developer guide for more info on edge populations)
  • pop_name: to identify the edge population

Optional reserved columns:

  • model_template: String name of the template to create an object from parameters in dynamics_params. Can be NULL. (e.g. expsyn in Arbor)
  • dynamics_params: AFAICT refers to the Biophysical neuron channel distribution and composition files discussed above.
  • delay: Axonal delay when the synaptic event begins relative to a spike from the presynaptic source. Units depend on the application (ms in NEURON).
  • syn_weight: Strength of the connection between the source and target nodes. The units depend on the requirements of the target mechanism.
  • source_pop_name: population name of the sender node id of a connection.
  • target_pop_name: population name of the receiver node id of a connection.
  • afferent_section_id: The specific section on the target node where a synapse is placed
  • afferent_section_pos: The position of the synapse along the length of its section on the target node, normalized to the range [0, 1], where 0 is the start of the section and 1 is the end.
  • efferent_section_id: Same as afferent_section_id, but for source node.
  • efferent_section_pos: Same as afferent_section_pos, but for source node.

Other reserved columns that don't need to be added can be found in the Developer Guide. Extra undefined columns can be added for convenience.

2. HDF5 file: full list of edges

Structured as follows:

edges  
|_______ population0
|                  |_________ edge_group_id
|                  |_________ edge_group_index  
|                  |_________ edge_id
|                  |_________ edge_type_id
|                  |_________ source_node_id
|                  |_________ target_node_id
|                  |_________ 0
|                  |          |_________ dynamics_params
|                  |          |_________ delay
|                  |          |_________ etc
|                  |_________ 1
|                  |          |_________ model_template
|                  |_________ 2
|                  |          |_________ ...
|                  |_________ ...
|                  |_________ ...
|                  |_________ _last group id_
|                             |_________ ...
|______ population1  
|                  |_________ ...
|______ population2  
|                  |_________ ...
|______ etc.  
  • edge_group_id : Assigns each edge to a specific group of edges
  • edge_group_index : Indicates the index within an edge_group that contains all the attributes for a particular edge under consideration.
  • edge_id : Assigns a key to uniquely identify and look up an edge within a population.
  • edge_type_id : refers to edge_type_id from the CSV file previously discussed.
  • source_node_id : Specifies the sender node_id of the connection. The "source_pop_name" attribute of this dataset specifies the name of the source node population in which the node is valid.
  • target_node_id : Specifies the receiver node_id of the connection. The "target_pop_name" attribute of this dataset specifies the name of the target node population in which the node is valid.
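
With only these datasets, finding the edges that arrive at a given node means scanning the whole population. A minimal sketch of that scan, using plain lists as stand-ins for the HDF5 datasets (the function name and the example values are hypothetical):

```python
# Stand-in for edges/population0; each list plays the role of one dataset.
edge_population0 = {
    "edge_id":        [0, 1, 2, 3],
    "source_node_id": [0, 0, 1, 2],
    "target_node_id": [1, 1, 2, 0],
}


def incoming_edges(edge_population, target):
    """Return (edge_id, source_node_id) for every edge ending at `target`.

    This is a linear scan over all edges in the population.
    """
    rows = zip(edge_population["edge_id"],
               edge_population["source_node_id"],
               edge_population["target_node_id"])
    return [(edge, source) for edge, source, tgt in rows if tgt == target]
```

The cost grows with the total number of edges, not with the number of edges that touch the node.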

Optional edge indexing

This way of defining edges is designed to make it easy to connect nodes from one population to another, but it is not ideal for finding the source node of a given target node or vice versa. This optional extension to the HDF5 edge file provides an indexing scheme that makes it more efficient to query the source or target nodes of a given node.
This indexing is located in a single group called indices inside an edge population: (edges/population_name/indices/)
For convenience the following diagram doesn't include the full edges directory:

populationX  
|_______ indices  
|              |_________ source_to_target  
|              |                         |__________ node_id_to_ranges  
|              |                         |__________ range_to_edge_id  
|              |_________ target_to_source     
|              |                         |__________ node_id_to_ranges  
|              |                         |__________ range_to_edge_id  
  • source_to_target/node_id_to_ranges:

    • Indexed by source node_id.
    • Each node_id points to a range of elements in the source_to_target/range_to_edge_id dataset.
  • source_to_target/range_to_edge_id:

    • Indexed with the indices from node_id_to_ranges.
    • Defines ranges of edges in the edge population where each range contains edges whose source node is the same (the target node should also be the same for compactness).

The datasets from the target_to_source group are defined symmetrically.

This indexing has been designed for networks where, if two nodes are connected, they tend to have multiple edges connecting them (multi-synapse connections in detailed morphology networks). For single-edge networks this index has excessive overhead.
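
The two-level lookup can be sketched as follows. The layout is an assumption, since this page does not spell it out: each row of node_id_to_ranges is taken to be a half-open (start, end) range of rows in range_to_edge_id, each row of range_to_edge_id a half-open (start, end) range of edge IDs, and edge IDs are taken to be the row positions in the edge population. The function names are hypothetical, and ranges are grouped by the indexed node only.

```python
def build_index(node_of_edge, num_nodes):
    """Build (node_id_to_ranges, range_to_edge_id) for one direction.

    node_of_edge[e] is the node that edge e is indexed by: the source node
    for source_to_target, the target node for target_to_source.
    """
    runs_per_node = [[] for _ in range(num_nodes)]
    for edge_id, node in enumerate(node_of_edge):
        runs = runs_per_node[node]
        if runs and runs[-1][1] == edge_id:
            runs[-1][1] = edge_id + 1              # extend the current run
        else:
            runs.append([edge_id, edge_id + 1])    # start a new run
    node_id_to_ranges, range_to_edge_id = [], []
    for runs in runs_per_node:
        start = len(range_to_edge_id)
        range_to_edge_id.extend(tuple(run) for run in runs)
        node_id_to_ranges.append((start, len(range_to_edge_id)))
    return node_id_to_ranges, range_to_edge_id


def edges_of(node_id, node_id_to_ranges, range_to_edge_id):
    """List the edge IDs indexed under node_id using the two-level lookup."""
    first, last = node_id_to_ranges[node_id]
    return [edge
            for start, end in range_to_edge_id[first:last]
            for edge in range(start, end)]


# Hypothetical source_node_id dataset for five edges between three nodes.
source_node_id = [0, 0, 1, 0, 2]
node_id_to_ranges, range_to_edge_id = build_index(source_node_id, num_nodes=3)
```

When connected nodes share many consecutive edges, each run collapses into a single row of range_to_edge_id. With one edge per connection, every edge needs its own row, which is the overhead mentioned above.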

Network/Circuit and Simulation configuration files are not summarized in this document. Descriptions can be found in the Developer Guide.