GitHub - flyconnectome/hnf: Documentation for the hierarchical neuron format

Hierarchical Neuron Format

The Hierarchical Neuron Format (HNF) is a schema for storing neuron morphologies and meta data in Hdf5 files.

We provide read/write implementations for R and Python:

for R see nat.hdf5
for Python see navis

Preamble

There are a few file formats that can store neuron morphology. To name but a few:

SWC for simple skeletons
NeuroML is an XML-based format primarily used for modelling but can store compartment models (i.e. skeletons) of neurons and meta data
NWB (neurodata without borders) is an HDF5-based format focused on physiological data
NRRD files can be used to store dotprops

Why then start a new format?

Because none of the existing formats tick all the boxes! We need a file format that can hold:

thousands of neurons
multiple representations (mesh, skeleton, dotprops) of a given neuron
annotations (e.g. synapses) associated with each neuron
meta data such as names, soma positions, etc.

Enter HDF5: basically a filesystem-in-a-file. The important thing is that we don't have to worry about how data is en-/decoded because other libraries (like h5py for Python or hdf5r for R) take care of that. All we have to do is come up with a schema.

Schema

HDF5 knows "groups" (=folders), "datasets" and "attributes". The basic idea for our schema is this:

the root contains info about the format as attributes
each group in root represents a neuron and the group's name is the neuron's ID
a neuron group holds and meta data, and the neuron's representations (mesh, skeleton and/or dotprops) and annotations in separate sub-groups

To illustrate the basic principle:

.
├── attrs: format-related meta data
├── group: neuron1
│   ├── attrs: neuron-related meta data
│   ├── group: skeleton
│   |    ├── attrs: skeleton-related meta data
|   |    └── datasets: node table, etc
│   ├── group: dotprops
│   |    ├── attrs: dotprops-related meta data
|   |    └── datasets: points, tangents, alpha, etc
│   ├── group: mesh
│   |    ├── attrs: mesh-related meta data
|   |    └── datasets: vertices, faces, etc
|   └── group: annotations
|       └── group: e.g. connectors
|           ├── attrs: connector-related meta data
|           └── datasets: connector data
├── group: neuron2
|   ├── ...  
...

Root attributes

The root meta data must contain two attributes:

format_spec specifies format and version
format_url points to a library or format specifications

.
├── attr['format_spec']: str = 'hnf_v1'   
├── attr['format_url']: str = 'https://github.com/schlegelp/navis'
...

Neuron base groups

Each neuron group contains properties that apply to all the neuron's potential representations - for example a neuron_name. Note that if an attribute is defined at the neuron level and again at a deeper level (i.e. the skeleton, mesh or dotprops), the more proximal attribute takes precedence for a given representation.

.
└── group['123456']  # note that numeric IDs will be "stringified"
    ├── attr["neuron_name"]: str = "some name"
...

Skeletons

Attributes:

units_nm (float | int | tuple, optional): specifies the units in nanometer space - can be a tuple of (x, y, z) if units are non-isotropic
soma (int, optional): the node ID of the soma

Datasets:

node_id (int): IDs for the nodes
parent_id (int): for each node, the ID of it's parent; nodes with out parents (i.e. roots) have parent_id of -1
x, y, z (float | int): node coordinates
radius (float | int, optional): radius for each node

└── group['123456']
    ├── attr['neuron_name'] = "example neuron with a skeleton"
    ├── attr['units_nm'] = (4, 4, 40)
    └── grp['skeleton']
         ├── attr['soma']: 1
         ├── ds['node_id']: (N, ) array
         ├── ds['parent_id']: (N, ) array
         ├── ds['x']: (N, ) array
         ├── ds['y']: (N, ) array
         ├── ds['z']: (N, ) array
         └── ds['radius']: (N, ) array, optional

Meshes

Meshes are principally represented as vertices + triangular faces (navis is using trimesh under the hood).

Attributes:

units_nm (float | int | tuple, optional): specifies the units in nanometer space - can be a tuple of (x, y, z) if units are non-isotropic
soma (tuple, optional): tuple of (x, y, z) coordinates of the soma

Datasets:

vertices (int | float): (N, 3) array of vertex positions
faces (int): (M, 3) array of vertex indices forming the faces (indices start at 0)
skeleton_map (int, optional): (N, ) array mapping each vertex to a node ID in the skeleton

└── group['4353421']
    ├── attr['neuron_name'] = "example neuron with a mesh"
    ├── attr['units_nm'] = (4, 4, 40)
    └── grp['mesh']
         ├── attr['soma']: (1242, 6533, 400)
         ├── ds['vertices']: (N, 3) array
         ├── ds['faces']: (M, 3) array
         └── ds['skeleton_map']: (N, ) array, optional

Dotprops

Attributes:

k (int): number of k-nearest neighbours used to calculate the tangent vectors from the point cloud
units_nm (float | int | tuple, optional): specifies the units in nanometer space - can be a tuple of (x, y, z) if units are non-isotropic
soma (tuple, optional): tuple of (x, y, z) coordinates of the soma

Datasets:

points (int | float): (N, 3) array of x/y/z positions
vect (int | float, optional): (N, 3) array of tangent vectors -
generated if not provided
alpha (int | float, optional): (N, ) array of alpha values for each point in points generated if not provided

└── group['65432']    
    ├── attr['neuron_name'] = "example neuron with dotprops"    
    └── grp['dotprops']
        ├── attr['k'] = 5
        ├── attr['units_nm'] = (4, 4, 40)
        ├── attr['soma']: (1242, 6533, 400)
        ├── ds['points']: (N, 3) array
        ├── ds['vect']: (N, 3) array
        └── ds['alpha']: (N, ) array

Annotations

Annotations are meant to be flexible and are principally parsed into pandas DataFrames. Because they won't follow a common format, it is good practice to leave some (optional) meta data pointing to columns containing data relevant for e.g. plotting:

Attributes:

point_col (str | list thereof): pointer to the column(s) containing x/y/z positions
type_col (str): pointer to a column specifying types
skeleton_map (str): pointer to a column associating the row with a node ID in the skeleton

Let's illustrate this with a mock synapse table:

└── group['32434566']
    ├── attr['neuron_name'] = "example neuron with synapse annotations"
    ├── attr['units_nm'] = 1
    └── grp['annotations']
         └── grp['synapses']
             ├── attr['points']: ['x', 'y', 'z']
             ├── attr['types']: 'prepost'
             ├── attr['skeleton_map']: 'node_id'
             ├── ds['x']: (N, ) array
             ├── ds['x']: (N, ) array
             ├── ds['z']: (N, ) array
             ├── ds['prepost']: (N, ) array of [0, 1, 2, 3, 4]
             └── ds['node_id']: (N, )

"Hidden" attributes & datasets

It can be useful to have attributes and datasets that contain information that's only pertinent for the reader/writer but does not directly relate to the neuron.

For this, we prefix the attribute/dataset with a .:

└── group['4353421']
    ├── attr['neuron_name'] = "example neuron with a mesh"
    ├── attr['units_nm'] = (4, 4, 40)
    ├── attr['.hidden_attribute'] = "typically ignored when reading"
    └── grp['mesh']
         ├── attr['soma']: (1242, 6533, 400)

We use hidden attributes to e.g. store a serialized version of a neuron instead/ in addition to the raw data to speed up reading the data.

A final remark

The above schema describes a "minimal" layout - i.e. we expect no less data than that. However, e.g. the navis implementations for reading/writing the schema are flexible: you can add more attributes or datasets and navis will by default try to read and attach them to the neuron.

Is this stable?

Ish? The format is versioned and I will maintain readers/writers for past versions in navis. In other good news: the HDF5 backend is stable - so even if navis acts up when parsing your file, you can always read it manually using h5py.

Changelog

The current version of the format is 1.0.

Changes:

2021/02/01: Version 1.0

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Hierarchical Neuron Format

Preamble

Schema

Root attributes

Neuron base groups

Skeletons

Meshes

Dotprops

Annotations

"Hidden" attributes & datasets

A final remark

Is this stable?

Changelog

About

License

flyconnectome/hnf

Folders and files

Latest commit

History

Repository files navigation

Hierarchical Neuron Format

Preamble

Schema

Root attributes

Neuron base groups

Skeletons

Meshes

Dotprops

Annotations

"Hidden" attributes & datasets

A final remark

Is this stable?

Changelog

About

Topics

Resources

License

Stars

Watchers

Forks