Model Definition

General Description

All entities have an id, a freely assignable name, a definition, and a type; the exception is the Dimension, which has no type and name fields. The type is of particular importance because it provides the context needed to understand the stored data. It also plays a central role in linking data to metadata (see below).

Data Model NIX v1.3.2

id (obligatory)

Unique string which is used to identify and to reference data and metadata objects. To avoid collisions even in large collections of data from different sources, the id consists of a domain prefix followed by a random 64-digit hexadecimal number, separated by an underscore. It is recommended to use a registered DNS domain as the prefix (e.g. g-node.org), but this is not a requirement.
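As a minimal sketch of the scheme described above, an id could be assembled as follows. The helper name make_id and the use of Python's secrets module are assumptions made for illustration; they are not part of the NIX specification.

```python
import secrets


def make_id(domain: str = "g-node.org") -> str:
    """Build an id of the form '<domain>_<64 hex digits>' (hypothetical helper)."""
    # 32 random bytes are rendered as 64 hexadecimal digits.
    random_part = secrets.token_hex(32)
    return f"{domain}_{random_part}"


example_id = make_id()
```

The domain prefix keeps ids from different labs apart, while the random part makes collisions within one domain very unlikely.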

type (obligatory)

The type of a data object defines its actual nature. This allows us to introduce domain specificity into a general model.

name (obligatory)

A user-provided, human-readable specifier for the entity. The name is not required to be unique; however, unique names are strongly recommended to avoid confusion.

definition (optional)

A freely assignable textual definition of the entity.

date (obligatory)

Every entity has the date fields createdAt and updatedAt which are handled automatically and cannot be manually set.

DataArray

The core of the data model is the so-called DataArray. Its main purpose is to store arbitrary raw data inside an n-dimensional array. Furthermore, the DataArray (see Table) provides means to describe the physical nature of the stored data.

Name	Type	Optional
id	String	false
name	String	false
definition	String	true
type	String	false
metadata	Section	true
data	NDArray	true
data_type	DataType	false
unit	String	true
label	String	true
sources	Source[]	true
dimensions	Dimension[]	true
polynom_coefficients	double[]	true
expansion_origin	double	true

type (obligatory)

The type entry can be a string representing the type of data contained in the DataArray. Types should be defined in a terminology or ontology to allow for standardization. The type of a DataArray object can also be a MIME type. In this case data should be a 1-dimensional array containing the raw data of a file.

metadata (optional)

This field allows linking the DataArray to arbitrary metadata described in a Section.

data (optional)

This field stores the data as an n-dimensional array. Thus, it can be a simple vector or an image stack represented as a 3-dimensional matrix. The type of the array is specified in the data type field described below.

data_type (obligatory)

A String value providing the data type of the n-dimensional data array stored in the data field. The following identifiers for data types are specified: byte, uint16, uint32 (uint), int16, int32 (int), int64 (long), float, double and string.

unit (optional)

A String providing the unit of the stored data. Please note that only SI units are supported.

label (optional)

A String defining the label of the axis that represents the data entries when plotting the data.

sources (optional)

Array of Source entities (see below). This provides the opportunity to link the data to a certain Source (e.g. a recording channel). One DataArray can be assigned to various sources.

polynom_coefficients and expansion_origin (optional)

In some cases it is more efficient or convenient to store data not as floating point numbers but as (16-bit) integer values, for example as read from a data acquisition board. To convert such data to the correct values, we follow the approach taken by Comedi (http://www.comedi.org) and provide polynom_coefficients and an expansion_origin.
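The conversion can be sketched as a polynomial evaluated around the expansion origin, y = sum_i c_i * (x - x0)^i, which is the form Comedi uses for its calibration polynomials. The function name apply_calibration and the example coefficients below are assumptions chosen for illustration.

```python
from typing import Sequence


def apply_calibration(raw: float,
                      polynom_coefficients: Sequence[float],
                      expansion_origin: float = 0.0) -> float:
    """Evaluate sum(c[i] * (raw - expansion_origin)**i) with Horner's scheme."""
    shifted = raw - expansion_origin
    result = 0.0
    # Horner's scheme: start with the highest-order coefficient.
    for coefficient in reversed(polynom_coefficients):
        result = result * shifted + coefficient
    return result


# Hypothetical 16-bit converter: 65536 counts spread linearly over 20 V,
# centred on count 32768.
coefficients = [0.0, 20.0 / 65536]
volts = apply_calibration(49152, coefficients, expansion_origin=32768)
```

Storing the raw integers together with the coefficients keeps the file compact and still allows the calibrated values to be recovered on read.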

dimensions (optional)

An array of exactly num_dimensions entries of the Dimension type, where num_dimensions is the number of dimensions of the stored data. These entries define how the different dimensions of the data within the DataArray are to be interpreted. There is one entry per dimension.

Dimension

The Dimension entities are used to define what the dimensions of the data represent. There are three different Dimension subtypes (Set, Range and Sample), which are described in the following paragraphs. All realizations of Dimension have a label field providing a textual description of the axis (e.g. 'time relative to stimulus onset' for the dimension representing time in the data).

Set Dimension

The Set entity describes a list of values or datasets, where the index of the values in the array doesn't represent a (real) dimension. This can be, for example, a histogram of categorical data, a set of signals, or a list of spike times. A Set can have a label for each data index it defines, e.g. the categories of a histogram. A list of spike times would not define any labels (unless you want to give each individual spike a name). Remember that all entries in the data must be of the same type, i.e. the same data_type and unit!

Name	Type	Optional
index	size_t	false
type	Enum {Sample, Set, Range}	false
labels	String[]	true

labels (optional)

An array of Strings that name the categories.

Range Dimension

The Range entity is used to specify an axis that has been sampled at irregular intervals.

Name	Type	Optional
index	size_t	false
type	Enum {Sample, Set, Range}	false
label	String	true
unit	String	true
ticks	Double[]	false

label (optional)

A String representing the axis label.

unit (optional)

A String representing the unit of the data. Only SI units should be used.

ticks (obligatory)

An array of doubles specifying the sampling points.
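Because the ticks of a Range dimension are irregular, finding the array index that belongs to a position requires a search rather than a division. The following sketch assumes that the ticks are sorted in ascending order and that a position maps to the last tick at or before it; both the helper name index_of and this rounding policy are assumptions.

```python
from bisect import bisect_right
from typing import Sequence


def index_of(position: float, ticks: Sequence[float]) -> int:
    """Return the index of the last tick that is <= position.

    Assumes ticks are sorted in ascending order. Raises ValueError if the
    position lies before the first tick.
    """
    index = bisect_right(ticks, position) - 1
    if index < 0:
        raise ValueError("position lies before the first tick")
    return index


# Hypothetical irregular sampling points.
ticks = [0.0, 1.2, 2.3, 4.0, 7.5]
```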

Sample Dimension

The Sample entity is used to define a dimension which contains regularly sampled data, for example the time axis of a membrane potential recording that is sampled at a certain rate.

Name	Type	Optional
index	size_t	false
type	Enum {Sample, Set, Range}	false
label	String	true
unit	String	true
sampling_interval	Double	false
offset	Double	true

label (optional)

A String representing the axis label.

unit (optional)

A String representing the unit of the data. Only SI units should be used.

sampling_interval (obligatory)

A double entry specifying the sampling interval in terms of the associated unit. If omitted a sampling interval of 1 is assumed.

offset (optional)

This entry may be used to define a starting point different from 0. The offset is given in terms of the specified unit.
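For a Sample dimension, the axis position of an array index follows directly from the two fields: position = offset + index * sampling_interval. A minimal sketch, with hypothetical helper names and rounding to the nearest sample as an assumed policy for the inverse direction:

```python
def position_at(index: int,
                sampling_interval: float = 1.0,
                offset: float = 0.0) -> float:
    """Axis position (in the dimension's unit) of the given array index."""
    return offset + index * sampling_interval


def index_at(position: float,
             sampling_interval: float = 1.0,
             offset: float = 0.0) -> int:
    """Array index of the sample nearest to the given axis position."""
    return round((position - offset) / sampling_interval)
```

For example, with a sampling_interval of 0.5 and an offset of 10.0, index 3 lies at position 11.5 on the axis.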

Tag

Tag entities are used to annotate the data stored in one or more DataArrays. A Tag can represent a variety of things, such as an event one wants to mark or the presentation of a certain stimulus. Accordingly, a Tag has a starting position and an extent.

Name	Type	Optional
id	String	false
type	String	false
name	String	false
definition	String	true
metadata	Section	true
position	Double[]	false
extent	Double[]	true
units	String[]	true
sources	Source[]	true
references	DataArray[]	true
features	Feature[]	true

metadata (optional)

This field allows linking the Tag to arbitrary metadata.

position (obligatory)

An array of double values pointing to a position in all associated DataArray objects defined by the field references. In order to define a specific coordinate in the referenced DataArray objects the length of the position array must correspond to the number of dimensions in the referenced DataArrays.

extent (optional)

If the Tag defines not only a position but an interval, this array of double values defines the length of this interval. The number of entries must match the number of dimensions in the referenced DataArrays.

references (optional)

Tag objects can reference one or many DataArray entities. As described above, position and extent are considered to mark a point or a segment, respectively, in all referenced DataArrays.

units (optional)

Array of SI units specifying the units of position and extent. In combination with the dimensions definition of each referenced DataArray and the values defined in position and extent, this set of units can be used to calculate the array indices for the position and extent inside the referenced data. To achieve this, the units defined in a Tag must be convertible to the units of the referenced DataArray (e.g. seconds to milliseconds). If the unit for one or more dimensions of the DataArray is unset or has the value 'none', the corresponding entries for position or extent are interpreted directly as array indices.
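The index calculation can be sketched for a single, regularly sampled dimension as follows. The function name, the small table of time units, and rounding to the nearest sample are assumptions made for illustration; a full implementation would need a complete SI unit conversion.

```python
from typing import Optional

# Scale factors relative to the second (illustrative subset only).
SCALE_TO_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}


def tag_position_to_index(position: float,
                          tag_unit: str,
                          dim_unit: Optional[str],
                          sampling_interval: float = 1.0,
                          offset: float = 0.0) -> int:
    """Map one Tag position entry to an array index along one dimension."""
    # Without a unit on the dimension, the entry is used directly as an index.
    if dim_unit is None or dim_unit == "none":
        return int(position)
    # Express the tag position in the unit of the dimension.
    in_dim_unit = position * SCALE_TO_SECONDS[tag_unit] / SCALE_TO_SECONDS[dim_unit]
    # Convert the axis position to the nearest sample index.
    return round((in_dim_unit - offset) / sampling_interval)
```

A position of 1.5 given in seconds, applied to a dimension in milliseconds with a sampling interval of 0.5, corresponds to an axis position of 1500 ms.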

features (optional)

Array of Feature entities. Each Tag itself may contain some data. For example: A detected spike in a referenced signal may not only have a defined position, but also a waveform. This kind of data associated with a Tag can be defined by another DataArray referenced in the field features. Each position can have a set of representations (e.g. the waveform of a detected spike and further spike characteristics like spike width and height).

MultiTag

MultiTag entities are the second type of tag used to annotate data. In contrast to the Tag, the MultiTag is used to annotate data at multiple positions and with multiple extents.

Name	Type	Optional
id	String	false
type	String	false
name	String	false
definition	String	true
metadata	Section	true
positions	DataArray	false
units	String[]	true
extents	DataArray	true
sources	Source[]	true
references	DataArray[]	true
features	Feature[]	true

metadata (optional)

This field allows linking the MultiTag to arbitrary metadata described in a Section.

positions (obligatory)

A DataArray pointing to positions in all referenced DataArray objects (field references). The dimensionality of the positions DataArray must match that of the referenced DataArrays.

extents (optional)

This DataArray defines the extents of a series of intervals. The dimensionality of the extents must match that of the referenced DataArray, and the number of entries in the first dimension must match that of the positions DataArray.
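One reading of these constraints is that positions holds one row per tagged point with one entry per dimension of the referenced data, and that extents, if given, has the same shape. The check below sketches this reading with plain nested lists; the function name and the row-per-position layout are assumptions.

```python
from typing import Optional, Sequence


def check_multitag_shapes(positions: Sequence[Sequence[float]],
                          extents: Optional[Sequence[Sequence[float]]],
                          data_rank: int) -> None:
    """Raise ValueError if positions/extents do not fit data of rank data_rank."""
    for row in positions:
        if len(row) != data_rank:
            raise ValueError("each position needs one entry per data dimension")
    if extents is None:
        return  # extents are optional
    if len(extents) != len(positions):
        raise ValueError("extents must have as many rows as positions")
    for row in extents:
        if len(row) != data_rank:
            raise ValueError("each extent needs one entry per data dimension")


# Three tagged regions in 2-dimensional data (hypothetical values).
positions = [[0.0, 1.0], [5.0, 1.0], [9.0, 2.0]]
extents = [[2.0, 0.0], [2.0, 0.0], [1.5, 0.0]]
check_multitag_shapes(positions, extents, data_rank=2)
```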

references (optional)

MultiTag objects can reference one or many DataArray entities for each position entry. As said before, positions and extents are considered to mark points or segments in all referenced DataArrays, respectively.

features (optional)

Array of Feature entities. Each MultiTag itself may contain some data. For example: A detected spike in a referenced signal may not only have a defined position, but also a waveform. This kind of data associated with a MultiTag can be defined by another DataArray referenced in the field features. Each position can have a set of representations (e.g. the waveform of a detected spike and further spike characteristics like spike width and height).

Feature

Feature objects are used to attach data to Tags or MultiTags that goes beyond mere references into the raw data. Such features can be more detailed descriptions of the Tag/MultiTag. The descriptive data is stored in DataArray entities. Note: there is no name field in the Feature itself. A name is provided by the linked DataArray.

Name	Type	Optional
id	String	false
link_type	Enum {tagged, untagged, indexed}	false
data	DataArray	false

data (obligatory)

This field links to the DataArray containing the feature data.

link_type (obligatory)

The link type assumes one of the following values:

tagged indicates that position and extent of the Tag entity are also applied to the data used as a feature.
untagged indicates that the complete data referenced by Feature is to be taken into account.
indexed indicates that the feature has to be accessed according to the index of the respective position entry.
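The three link types can be sketched for 1-dimensional feature data as follows. The sketch assumes that position and extent have already been converted to array indices (start and count); the function name and parameters are hypothetical.

```python
from typing import Any, Sequence


def feature_data(link_type: str,
                 data: Sequence[Any],
                 position_index: int = 0,
                 start: int = 0,
                 count: int = 1) -> Any:
    """Select the part of the feature data that belongs to one tagged position."""
    if link_type == "untagged":
        # The complete feature data applies.
        return list(data)
    if link_type == "tagged":
        # Position (start) and extent (count) are applied to the feature data.
        return list(data[start:start + count])
    if link_type == "indexed":
        # The entry with the same index as the position entry applies.
        return data[position_index]
    raise ValueError(f"unknown link_type: {link_type!r}")


# Hypothetical spike waveforms, one per tagged position.
waveforms = [[0.1, 0.9, 0.2], [0.0, 1.1, 0.3]]
second_waveform = feature_data("indexed", waveforms, position_index=1)
```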

Source

A Source entity describes the provenance of a DataArray or Tag. Although many scientists may consider such information to be metadata, we decided to include it in the data model in order to provide compatibility with other formats like NEO or Neuroshare without forcing people to use odML.

Name	Type	Optional
id	String	false
type	String	false
name	String	false
definition	String	true
metadata	Section	true
sources	Source[]	true

metadata (optional)

This field allows linking the Source to arbitrary metadata described in a Section.

sources (optional)

This field allows linking to further Source entities thus building up a tree of sources. This can, for example, be used to specify that a source electrode array contains multiple electrodes as its child sources.
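The electrode array example can be sketched with a small recursive structure. The class below mirrors only the name, type and sources fields of the table above; the class layout, the walk helper and the type strings are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Source:
    name: str
    type: str
    sources: List["Source"] = field(default_factory=list)


def walk(source: Source, depth: int = 0) -> Iterator[Tuple[int, Source]]:
    """Yield (depth, source) pairs for a source and all of its descendants."""
    yield depth, source
    for child in source.sources:
        yield from walk(child, depth + 1)


# Hypothetical electrode array with two electrodes as child sources.
array = Source("array 1", "electrode_array", [
    Source("electrode 1", "electrode"),
    Source("electrode 2", "electrode"),
])
tree = [(depth, source.name) for depth, source in walk(array)]
```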

Section

A Section entity holds the metadata that other entities can link to through their metadata field. Sections are the main containers within the metadata tree structure.

Name	Type	Optional
id	String	false
type	String	false
name	String	false
definition	String	true
repository	String	true
link	Section	true
mapping	String	true
properties	Property[]	true
sections	Section[]	true

repository (optional)

This field describes the repository in which a section of this type is defined. Usually this information is provided in the form of a URL.

link (optional)

Establish a link to another Section. The linking section inherits the properties defined in the linked section. Properties of the same name are overridden.
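The inheritance rule can be sketched as a dictionary merge. The sketch assumes that the linking section's own properties take precedence over inherited properties of the same name; the function name and the example properties are hypothetical.

```python
from typing import Any, Dict


def effective_properties(own: Dict[str, Any],
                         linked: Dict[str, Any]) -> Dict[str, Any]:
    """Combine inherited properties with the linking section's own properties."""
    merged = dict(linked)   # start with everything defined in the linked section
    merged.update(own)      # same-named properties of the linking section win
    return merged


# Hypothetical property sets keyed by property name.
template = {"SamplingRate": 20000, "Gain": 10}
recording = {"Gain": 100, "Experimenter": "A. Person"}
combined = effective_properties(recording, template)
```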

mapping (optional)

The mapping is provided as a path or URL to another section.

sections (optional)

Each Section object can contain further Sections. This facilitates the creation of a Section tree.

properties (optional)

Each Section can contain an Array of Properties.

Property

In the odML model, information is stored in the form of extended key-value pairs, which correspond to pairs of Property and Values. A Property contains information that is valid for all Values stored in it.

Name	Type	Optional
id	String	false
name	String	false
definition	String	true
unit	String	true
values	Value[]	false
mapping	String	true

unit (optional)

This field is the unit of all child Values.

mapping (optional)

Mapping defines how this Property should be treated in a mapping procedure. The mapping is provided in the form of a URL pointing to the definition of a section into which this property should be mapped.

values (obligatory)

Array of Values associated with this Property.

Block

A Block is the top-level grouping element for data objects that are related to each other in some way. Every data object has to be associated with one Block object. A Block can thus be seen as something that represents a dataset or the combined results of an experiment.

Name	Type	Optional
id	String	false
type	String	false
name	String	false
definition	String	true
metadata	Section	true
sources	Source[]	true
data_arrays	DataArray[]	true
tags	Tag[]	true
multi_tags	MultiTag[]	true
groups	Group[]	true

Group

The Group establishes subgroups below the Block; entities can be part of multiple Groups. A Group can contain DataArrays, Tags and MultiTags. It also allows linking to Sources and attaching metadata.

Name	Type	Optional
id	String	false
type	String	false
name	String	false
definition	String	true
metadata	Section	true
sources	Source[]	true
data_arrays	DataArray[]	true
tags	Tag[]	true
multi_tags	MultiTag[]	true
