M. Sonntag edited this page May 22, 2017 · 49 revisions

General Description

Common to all entities is that they have an id, a name (freely assignable), a definition, and a type (except for the Dimension, which has neither a type nor a name field). The type is particularly important because it provides the context needed to understand the stored data. It also plays a central role in linking data and metadata (see below).

Data Model NIX v1.3.2

id (obligatory)

Unique string used to identify and reference data and metadata objects. To avoid collisions even in large collections of data from different sources, the id consists of a domain prefix followed by a random 64-digit hexadecimal number, separated by an underscore. Using a registered DNS domain as the prefix (e.g. g-node.org) is recommended but not required.
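
The id layout described above can be sketched as follows. This is only an illustration: the helper names make_id and split_id are hypothetical and not part of any NIX library, and the sketch assumes lowercase hexadecimal digits.

```python
import re
import secrets
from typing import Tuple

# Assumed layout: "<domain>_<64 hexadecimal digits>".
ID_PATTERN = re.compile(r"^(?P<domain>.+)_(?P<number>[0-9a-f]{64})$")


def make_id(domain: str = "g-node.org") -> str:
    """Build an id from a domain prefix and a random 64-digit hex number."""
    # token_hex(32) renders 32 random bytes as 64 hexadecimal digits.
    return f"{domain}_{secrets.token_hex(32)}"


def split_id(entity_id: str) -> Tuple[str, str]:
    """Split an id into its domain prefix and its hexadecimal part."""
    match = ID_PATTERN.match(entity_id)
    if match is None:
        raise ValueError(f"not a valid id: {entity_id!r}")
    return match["domain"], match["number"]
```

For instance, split_id(make_id("example.org")) returns the prefix "example.org" together with the 64-digit random part.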

type (obligatory)

The type of a data object defines its actual nature. This allows us to introduce domain specificity into a general model.

name (obligatory)

A user-provided, human-readable specifier for the entity. The name is not required to be unique. However, unique names are strongly recommended in order to avoid confusion.

definition (optional)

A freely assignable textual definition of the entity.

date (obligatory)

Every entity has the date fields createdAt and updatedAt which are handled automatically and cannot be manually set.
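
As a rough sketch of the common fields listed above, the following dataclass keeps the two date fields out of the constructor so that they are set automatically. The class name Entity, the snake_case attribute names, and the touch method are assumptions made for this example; they do not describe the actual NIX implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Entity:
    id: str
    type: str
    name: str
    definition: Optional[str] = None
    # init=False: callers cannot pass these to the constructor.
    created_at: datetime = field(init=False, default_factory=_now)
    updated_at: datetime = field(init=False, default_factory=_now)

    def touch(self) -> None:
        """Record a modification (hypothetical helper)."""
        self.updated_at = _now()
```

Passing created_at to the constructor raises a TypeError, which approximates the rule that the date fields cannot be set manually.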

DataArray

The core of the data model is the so-called DataArray. Its main purpose is to store arbitrary raw data inside an n-dimensional array. Furthermore, the DataArray (see the table below) provides means to describe the physical nature of the stored data.

Name Type Optional
id String false
name String false
definition String true
type String false
metadata Section true
data NDArray true
data_type DataType false
unit String true
label String true
sources Source[] true
dimensions Dimension[] true
polynom_coefficients double[] true
expansion_origin double true

type (obligatory)

The type entry is a string representing the type of data contained in the DataArray. Types should be defined in a terminology or ontology to allow for standardization. The type of a DataArray can also be a MIME type. In this case, data should be a 1-dimensional array containing the raw data of a file.

metadata (optional)

This field allows linking the DataArray to arbitrary metadata described in a Section.

data (optional)

This field stores the data as an n-dimensional array. Thus, it can be a simple vector or an image stack represented as a 3-dimensional matrix. The type of the array is specified in the data_type field described below.

data_type (obligatory)

A String value providing the data type of the n-dimensional data array stored in the data field. The following identifiers for data types are specified: byte, uint16, uint32 (uint), int16, int32 (int), int64 (long), float, double and string.

unit (optional)

A String providing the unit of the stored data. Please note that only SI units are supported.

label (optional)

A String defining the label of the axis that represents the data entries when plotting the data.

sources (optional)

Array of Source entities (see below). This provides the opportunity to link the data to a certain Source (e.g. a recording channel). One DataArray can be assigned to various sources.

polynom_coefficients and expansion_origin (optional)

In some cases it is much more efficient or convenient to store data not as floating point numbers but rather as (16-bit) integer values as, for example, read from a data acquisition board. In order to convert such data to the correct values, we follow the approach taken by the Comedi project (http://www.comedi.org) and provide polynom_coefficients and an expansion_origin.
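
A minimal sketch of such a conversion is given here. It assumes the Comedi-style convention that the physical value is the polynomial sum of c_i * (raw - expansion_origin)^i, with the coefficients ordered from the constant term upwards. The function name calibrate is hypothetical.

```python
from typing import Sequence


def calibrate(raw: float,
              polynom_coefficients: Sequence[float],
              expansion_origin: float = 0.0) -> float:
    """Evaluate sum_i c_i * (raw - expansion_origin)**i."""
    x = raw - expansion_origin
    result = 0.0
    # Horner's scheme: start with the highest-order coefficient.
    for coefficient in reversed(polynom_coefficients):
        result = result * x + coefficient
    return result
```

With coefficients [1.0, 2.0] and an origin of 0, a raw value of 10 maps to 1 + 2 * 10 = 21.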

dimensions (optional)

An array of Dimension entities with exactly one entry per dimension of the stored data. These define how the different dimensions of the data within the DataArray have to be interpreted.

Dimension

The Dimension entities are used to define what the dimensions of the data represent. There are three different Dimension subtypes (Set, Range, and Sample), which are described in the following paragraphs. All realizations of Dimension have a label field providing a textual feature for the axis (e.g. 'time relative to stimulus onset' for the dimension representing time in the data).

Set Dimension

The Set entity describes a list of values or datasets, where the index of the values in the array doesn't represent a (real) dimension. This can be, for example, a histogram of categorical data, a set of signals, or a list of spike times. A Set can have a label for each data index it defines, e.g. the categories of a histogram. A list of spike times would not define any labels (unless you want to give each individual spike a name). Remember that all entries in the data must be of the same type, i.e. the same data_type and unit!

Name Type Optional
index size_t false
type Enum {Sample, Set, Range} false
labels String[] true

labels (optional)

An array of Strings that name the categories.

Range Dimension

The Range entity is used to specify an axis that has been sampled at irregular intervals.

Name Type Optional
index size_t false
type Enum {Sample, Set, Range} false
label String true
unit String true
ticks Double[] false

label (optional)

A String representing the axis label.

unit (optional)

A String representing the unit of the data. Only SI units should be used.

ticks (obligatory)

An array of doubles specifying the sampling points.
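
For illustration, a position on an irregularly sampled axis can be mapped to an array index by searching the ticks. The sketch assumes that ticks are sorted in ascending order and picks the last tick at or before the position; both the convention and the function name range_index are assumptions of this example.

```python
from bisect import bisect_right
from typing import Sequence


def range_index(position: float, ticks: Sequence[float]) -> int:
    """Return the index of the last tick that is <= position."""
    index = bisect_right(ticks, position) - 1
    if index < 0:
        raise ValueError("position lies before the first tick")
    return index
```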

Sample Dimension

The Sample entity is used to define a dimension which contains regularly sampled data, for example the time axis of a membrane potential recording that is sampled at a certain rate.

Name Type Optional
index size_t false
type Enum {Sample, Set, Range} false
label String true
unit String true
sampling_interval Double false
offset Double true

label (optional)

A String representing the axis label.

unit (optional)

A String representing the unit of the data. Only SI units should be used.

sampling_interval (obligatory)

A double entry specifying the sampling interval in terms of the associated unit. If omitted a sampling interval of 1 is assumed.

offset (optional)

This entry may be used to define a starting point different from 0. The offset is given in terms of the specified unit.
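
The relation between array index and axis position for a regularly sampled dimension can be sketched as position = offset + index * sampling_interval. The function names below are hypothetical, and rounding to the nearest sample is an assumption of this example.

```python
from typing import List


def sample_axis(count: int, sampling_interval: float,
                offset: float = 0.0) -> List[float]:
    """Axis positions of the first `count` samples."""
    return [offset + i * sampling_interval for i in range(count)]


def sample_index(position: float, sampling_interval: float,
                 offset: float = 0.0) -> int:
    """Index of the sample nearest to `position` (same unit as the axis)."""
    return int(round((position - offset) / sampling_interval))
```

For a sampling interval of 0.5 and an offset of 1.0, the first four samples lie at 1.0, 1.5, 2.0 and 2.5, and position 2.0 maps back to index 2.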

Tag

Tag entities are used to annotate the data stored in one or more DataArrays. A Tag can mark a variety of things, such as an event of interest or the presentation of a certain stimulus. For this purpose, a Tag has a starting position and, optionally, an extent.

Name Type Optional
id String false
type String false
name String false
definition String true
metadata Section true
position Double[] false
extent Double[] true
units String[] true
sources Source[] true
references DataArray[] true
features Feature[] true

metadata (optional)

This field allows linking the Tag to arbitrary metadata.

position (obligatory)

An array of double values pointing to a position in all associated DataArray objects defined by the field references. In order to define a specific coordinate in the referenced DataArray objects the length of the position array must correspond to the number of dimensions in the referenced DataArrays.

extent (optional)

If the Tag defines not only a position but an interval, this array of double values defines the length of this interval. The number of entries must match the number of dimensions in the referenced DataArrays.

references (optional)

Tag objects can reference one or many DataArray entities. As described above, position and extent are considered to mark a point or a segment, respectively, in all referenced DataArrays.

units (optional)

Array of SI units specifying the units of position and extent. In combination with the dimensions definition of each referenced DataArray and the values defined in position and extent, this set of units can be used to calculate the array indices for the position and extent inside the referenced data. To achieve this, the units defined in a Tag must be convertible to the units of the referenced DataArray (e.g. seconds to milliseconds). If a unit for one or more dimensions of the DataArray is unset or has the value 'none', the corresponding entries for position or extent are interpreted directly as array indices.
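
The index calculation described above might look like the following sketch for a single, regularly sampled dimension. The small TIME_UNITS table and the function name position_to_index are assumptions made for this example; a real implementation would need a complete SI unit conversion.

```python
from typing import Optional

# Hypothetical lookup: scale of each unit relative to seconds.
TIME_UNITS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}


def position_to_index(position: float,
                      tag_unit: Optional[str],
                      dim_unit: Optional[str],
                      sampling_interval: float = 1.0,
                      offset: float = 0.0) -> int:
    """Map a Tag position onto an index along one sampled dimension."""
    if dim_unit is None or dim_unit == "none":
        # No unit on the dimension: the position already is an index.
        return int(position)
    # Convert the position from the Tag's unit to the dimension's unit.
    factor = TIME_UNITS[tag_unit] / TIME_UNITS[dim_unit]
    return int(round((position * factor - offset) / sampling_interval))
```

A position of 0.25 s on an axis sampled every 0.5 ms resolves to index 500, whereas a position of 7 on a dimension without a unit is used as index 7 directly.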

features (optional)

Array of Feature entities. Each Tag itself may contain some data. For example: A detected spike in a referenced signal may not only have a defined position, but also a waveform. This kind of data associated with a Tag can be defined by another DataArray referenced in the field features. Each position can have a set of representations (e.g. the waveform of a detected spike and further spike characteristics like spike width and height).

MultiTag

MultiTag entities are the second type of tag used to annotate data. In contrast to the Tag, the MultiTag is used to annotate data at multiple positions and with multiple extents.

Name Type Optional
id String false
type String false
name String false
definition String true
metadata Section true
positions DataArray false
units String[] true
extents DataArray true
sources Source[] true
references DataArray[] true
features Feature[] true

metadata (optional)

This field allows linking the MultiTag to arbitrary metadata described in a Section.

positions (obligatory)

A DataArray holding the positions in all referenced DataArray objects (field references). The dimensionality of the positions DataArray must match that of the referenced DataArrays.

extents (optional)

A DataArray defining the extents of a series of intervals. The dimensionality of the extents must match that of the referenced DataArrays, and the number of entries in the first dimension must match that of the positions DataArray.

references (optional)

MultiTag objects can reference one or many DataArray entities for each position entry. As described above, positions and extents are considered to mark points or segments, respectively, in all referenced DataArrays.

features (optional)

Array of Feature entities. Each MultiTag itself may contain some data. For example: A detected spike in a referenced signal may not only have a defined position, but also a waveform. This kind of data associated with a MultiTag can be defined by another DataArray referenced in the field features. Each position can have a set of representations (e.g. the waveform of a detected spike and further spike characteristics like spike width and height).

Feature

Feature objects are used to attach data to Tags or MultiTags beyond the references into the raw data. Such features can be more detailed descriptions of the Tag/MultiTag. The descriptive data is stored in DataArray entities. Note: there is no name field in the Feature itself. A name is provided by the linked DataArray.

Name Type Optional
id String false
link_type Enum {tagged, untagged, indexed} false
data DataArray false

data (obligatory)

This field links to the DataArray containing the feature data.

link_type (obligatory)

The link type assumes one of the following values:

  • tagged indicates that position and extent of the Tag entity are also applied to the data used as a feature.
  • untagged indicates that the complete data referenced by Feature is to be taken into account.
  • indexed indicates that the feature has to be accessed according to the index of the respective position entry.
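
One way to picture the three link types is the following sketch, which selects feature data from a plain Python sequence. It assumes that the tagged range has already been resolved to array indices (start and stop) and that, for indexed features, the first dimension of the feature data enumerates the positions. The function name feature_data and its parameters are hypothetical.

```python
from enum import Enum
from typing import Optional, Sequence


class LinkType(Enum):
    TAGGED = "tagged"
    UNTAGGED = "untagged"
    INDEXED = "indexed"


def feature_data(link_type: LinkType, data: Sequence,
                 position_index: int = 0,
                 start: int = 0, stop: Optional[int] = None):
    """Select the part of `data` that belongs to one tagged position."""
    if link_type is LinkType.UNTAGGED:
        return data                    # the complete feature data
    if link_type is LinkType.INDEXED:
        return data[position_index]    # one entry per position
    if link_type is LinkType.TAGGED:
        return data[start:stop]        # same range as the tag itself
    raise ValueError(f"unknown link type: {link_type!r}")
```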

Source

A Source entity describes the provenance of a DataArray or Tag. Although many scientists may consider such information to be metadata, we decided to include it in the data model in order to provide compatibility with other formats like NEO or Neuroshare without forcing people to use odML.

Name Type Optional
id String false
type String false
name String false
definition String true
metadata Section true
sources Source[] true

metadata (optional)

This field allows linking the Source to arbitrary metadata described in a Section.

sources (optional)

This field allows linking to further Source entities thus building up a tree of sources. This can, for example, be used to specify that a source electrode array contains multiple electrodes as its child sources.
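
The resulting tree can be walked recursively. The sketch below represents each Source as a plain dictionary with a name and an optional sources list, which is a simplification chosen for this example; iter_sources is a hypothetical helper.

```python
from typing import Dict, Iterator, Tuple


def iter_sources(source: Dict, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, name) for a source and all of its child sources."""
    yield depth, source["name"]
    for child in source.get("sources", []):
        yield from iter_sources(child, depth + 1)


# An electrode array with two electrodes as child sources.
electrode_array = {
    "name": "array",
    "sources": [{"name": "electrode 1"}, {"name": "electrode 2"}],
}
```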

Section

Section entities hold the metadata that other entities link to via their metadata field. Sections are the main containers within the metadata tree structure.

Name Type Optional
id String false
type String false
name String false
definition String true
repository String true
link Section true
mapping String true
properties Property[] true
sections Section[] true

repository (optional)

This field describes the repository in which a section of this type is defined. Usually this information is provided in the form of a URL.

link (optional)

Establish a link to another Section. The linking section inherits the properties defined in the linked section. Properties of the same name are overridden.
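
A possible reading of this inheritance rule is sketched below. It assumes that a property defined in the linking section takes precedence over a property of the same name in the linked section; the text above does not state the direction explicitly, so treat this as an assumption. Sections are modelled as plain dictionaries and effective_properties is a hypothetical helper.

```python
from typing import Any, Dict


def effective_properties(section: Dict[str, Any]) -> Dict[str, Any]:
    """Merge inherited and own properties of a section."""
    merged: Dict[str, Any] = {}
    linked = section.get("link")
    if linked is not None:
        # Start from everything the linked section provides.
        merged.update(effective_properties(linked))
    # Assumption: own properties override inherited ones of the same name.
    merged.update(section.get("properties", {}))
    return merged


template = {"properties": {"species": "mouse", "sex": "unknown"}}
subject = {"link": template, "properties": {"sex": "female"}}
```

Here effective_properties(subject) yields species "mouse" from the linked section and sex "female" from the linking section.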

mapping (optional)

The mapping is provided as a path or URL to another section.

sections (optional)

Each Section can contain further Sections. This allows building a tree of Sections.

properties (optional)

Each Section can contain an Array of Properties.

Property

In the odML model information is stored in the form of extended key-value pairs which correspond to pairs of `Property` and `Values`. A Property contains information that is valid for all Values stored in it.

Name Type Optional
id String false
name String false
definition String true
unit String true
values Value[] false
mapping String true

unit (optional)

This field is the unit of all child Values.

mapping (optional)

Mapping defines how this Property should be treated in a mapping procedure. The mapping is provided in the form of a URL pointing to the definition of a section into which this property should be mapped.

values (obligatory)

Array of Values associated with this Property.

Block

A Block is the top-level grouping element for data objects that are related to each other. Every data object has to be associated with one Block object. In this way, a Block can be seen as something that represents a dataset or the combined results of an experiment.

Name Type Optional
id String false
type String false
name String false
definition String true
metadata Section true
sources Source[] true
data_arrays DataArray[] true
tags Tag[] true
multi_tags MultiTag[] true
groups Group[] true

Group

The Group establishes subgroups below the Block. An entity can be part of multiple Groups. A Group can contain DataArrays, Tags and MultiTags. It further allows linking to Sources and attaching metadata.

Name Type Optional
id String false
type String false
name String false
definition String true
metadata Section true
sources Source[] true
data_arrays DataArray[] true
tags Tag[] true
multi_tags MultiTag[] true