Merge pull request #285 from NeurodataWithoutBorders/doc/object_id
Document object_id
rly committed Sep 17, 2019
2 parents ddbb568 + f39446c commit da3bcd7
Showing 4 changed files with 44 additions and 26 deletions.
18 changes: 17 additions & 1 deletion docs/format/source/format_description.rst
@@ -203,6 +203,23 @@ Extending *NWBContainer* works in the same way, e.g., to create more specific ty
data processing.


Common attributes
-----------------

All NWB:N Groups and Datasets with an assigned neurodata_type have three required attributes: ``neurodata_type``,
``namespace``, and ``object_id``.

- ``neurodata_type`` (variable-length string) is the name of the NWB:N primitive that this group or dataset maps onto
- ``namespace`` (variable-length string) is the namespace where ``neurodata_type`` is defined, e.g. "core" or the
namespace of an extension
- ``object_id`` (variable-length string) is a universally unique identifier for this object within its hierarchy.
It should be set to the string representation of a random UUID version 4 value
(see `RFC 4122 <https://tools.ietf.org/html/rfc4122>`_) upon first creation. It is **not** a hash of the data. Files
that contain the exact same data but were generated in different instances will have different ``object_id`` values.
Currently, modification of an object does not require its ``object_id`` to be changed.
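
As a minimal sketch of the ``object_id`` rule described above, the value can be produced with Python's standard ``uuid`` module. The helper name ``new_object_id`` is hypothetical and only illustrates the rule; it is not taken from any NWB library.

```python
import uuid


def new_object_id() -> str:
    """Return the string representation of a random (version 4) UUID.

    Hypothetical helper: the name is an assumption made for illustration.
    """
    return str(uuid.uuid4())


# The value is random and not derived from the data, so two objects that
# hold identical data still receive different identifiers.
first_id = new_object_id()
second_id = new_object_id()

# Parsing the string back confirms that it is a version 4 UUID.
parsed = uuid.UUID(first_id)
print(first_id, parsed.version)
```

Because the identifier is assigned once at creation and is not a hash, it stays the same when the object's contents are later modified.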



Comments and Definitions
========================

@@ -313,4 +330,3 @@ The timestamps\_link and data\_link fields refer to links made between
time series, such as if timeseries A and timeseries B, each having
different data (or time) share time (or data). This is much more
important information as it shows structural associations in the data.

2 changes: 1 addition & 1 deletion docs/storage/source/storage_description.rst
@@ -15,7 +15,7 @@ basic primitives, e.g., Files, Groups, Datasets, Attributes, and Links to descri
The role of the data storage then is to store large collections of neuroscience data. In other words,
the role of the storage is to map NWB:N primitives (and types, i.e., neurodata_types) to persistent storage.
For an overview of the various components of the NWB:N project
see `here <http://nwb-overview.readthedocs.io/en/latest/nwbintro.html>`_ .
see `here <https://neurodatawithoutborders.github.io/overview>`_ .

How are NWB:N files stored?
===========================
39 changes: 20 additions & 19 deletions docs/storage/source/storage_hdf5.rst
@@ -5,7 +5,7 @@ HDF5
====

The NWB:N format currently uses the `Hierarchical Data Format (HDF5) <https://www.hdfgroup.org/HDF5/>`_
as primary mechanism for data storage. HDF5 was selected for the
as the primary mechanism for data storage. HDF5 was selected for the
NWB format because it met several of the project's
requirements. First, it is a mature data format standard with libraries
available in multiple programming languages. Second, the format's
@@ -27,7 +27,7 @@ technology.
Format Mapping
==============

Here we describe the mapping of NWB primitives (e.g,. Groups, Datasets, Attributes, Links etc.) used by
Here we describe the mapping of NWB primitives (e.g., Groups, Datasets, Attributes, Links, etc.) used by
the NWB format and specification to HDF5 storage primitives. As the NWB:N format was designed with HDF5
in mind, the high-level mapping between the format specification and HDF5 is quite simple:

@@ -42,20 +42,21 @@ in mind, the high-level mapping between the format specification and HDF5 is qui
Group Group
Dataset Dataset
Attribute Attribute
Link Soft Link (or External Link)
Link Soft Link or External Link
============= ===============================================


.. note::

In HDF5 Links are stored as HDF5 Soft Links (or External Links). Hard Links are not used in NWB as the primary location
and, hence, primary ownership and link path for secondary locations, cannot be determined for Hard Links.
Using HDF5, NWB links are stored as HDF5 Soft Links or External Links. Hard Links are not used in NWB because
the primary location and, hence, primary ownership and link path for secondary locations, cannot be determined
for Hard Links.


Key Mapping
===========

Here we describe the mapping of keys from the specifcation language to HDF5 storage objects:
Here we describe the mapping of keys from the specification language to HDF5 storage objects:

Groups
------
@@ -77,7 +78,8 @@ Groups
linkable Not mapped; Stored in schema only
quantity Not mapped; Number of appearances of the dataset.
neurodata_type Attribute ``neurodata_type``
namespace ID Attribute ``neurodata_namespace``
namespace ID Attribute ``namespace``
object ID Attribute ``object_id``
============================ ======================================================================================
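
The attribute rows above (``neurodata_type``, ``namespace``, ``object_id``) can be checked with a small validator. This is a sketch under stated assumptions: the function ``check_common_attributes`` is hypothetical, it works on any plain mapping of attribute names to values, and the example values (``"TimeSeries"``, ``"core"``) are illustrative only.

```python
import uuid

# Attribute names taken from the mapping table above.
REQUIRED_ATTRIBUTES = ("neurodata_type", "namespace", "object_id")


def check_common_attributes(attrs):
    """Return a list of problems found in a mapping of attribute names to values.

    Hypothetical helper for illustration; an empty list means no problem was found.
    """
    problems = [
        f"missing attribute: {name}"
        for name in REQUIRED_ATTRIBUTES
        if name not in attrs
    ]
    if "object_id" in attrs:
        try:
            # object_id must be the string form of a UUID.
            uuid.UUID(str(attrs["object_id"]))
        except ValueError:
            problems.append("object_id is not a valid UUID string")
    return problems


# Attributes written with the current names pass the check.
current = {
    "neurodata_type": "TimeSeries",
    "namespace": "core",
    "object_id": str(uuid.uuid4()),
}
print(check_common_attributes(current))

# Attributes using the older ``neurodata_namespace`` name are reported.
legacy = {"neurodata_type": "TimeSeries", "neurodata_namespace": "core"}
print(check_common_attributes(legacy))
```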


@@ -89,7 +91,6 @@ Datasets
.. table:: Mapping of datasets
:class: longtable


============================ ======================================================================================
NWB Key HDF5
============================ ======================================================================================
@@ -102,14 +103,15 @@ Datasets
linkable Not mapped; Stored in schema only
quantity Not mapped; Number of appearances of the dataset.
neurodata_type Attribute ``neurodata_type``
namespace ID Attribute ``neurodata_namespace``
namespace ID Attribute ``namespace``
object ID Attribute ``object_id``
============================ ======================================================================================

.. note::

* TODO Update mapping of namespace ID
* TODO Update mapping of dims


Attributes
----------

@@ -127,7 +129,6 @@ Attributes
shape Shape of the HDF5 dataset if the shape is fixed, otherwise shape defines the maxshape
dims Not mapped; Reflected by the shape of the attribute data
required Not mapped; Stored in schema only
parent Not mapped; In HDF5 all attributes are explicitly tied to the parent.
value Data value of the attribute
============================ ======================================================================================

@@ -157,7 +158,7 @@ The mappings of data types is as follows
+--------------------------+----------------------------------+----------------+
| ``dtype`` **spec value** | **storage type** | **size** |
+--------------------------+----------------------------------+----------------+
| * "float" | single precision floating point | 32 bit |
| * "float" | single precision floating point | 32 bit |
| * "float32" | | |
+--------------------------+----------------------------------+----------------+
| * "double" | double precision floating point | 64 bit |
@@ -173,13 +174,13 @@ The mappings of data types is as follows
+--------------------------+----------------------------------+----------------+
| * "int8" | signed 8 bit integer | 8 bit |
+--------------------------+----------------------------------+----------------+
| * "uint32" | unsigned 32 bit integer | 32 bit |
| * "uint32" | unsigned 32 bit integer | 32 bit |
+--------------------------+----------------------------------+----------------+
| * "uint16" | unsigned 16 bit integer | 16 bit |
| * "uint16" | unsigned 16 bit integer | 16 bit |
+--------------------------+----------------------------------+----------------+
| * "uint8" | unsigned 8 bit integer | 8 bit |
| * "uint8" | unsigned 8 bit integer | 8 bit |
+--------------------------+----------------------------------+----------------+
| * "bool" | boolean | 8 bit |
| * "bool" | boolean | 8 bit |
+--------------------------+----------------------------------+----------------+
| * "text" | unicode | variable |
| * "utf" | | |
@@ -196,9 +197,9 @@ The mappings of data types is as follows
| * region | Reference to a region | |
| | of another dataset | |
+--------------------------+----------------------------------+----------------+
| compound dtype + HDF5 compound data type | |
| * compound dtype | HDF5 compound data type | |
+--------------------------+----------------------------------+----------------+
| * "isodatetime" | ASCII ISO8061 datetime string. | variable |
| * "isodatetime" | ASCII ISO 8601 datetime string. | variable |
| | For example | |
| | ``2018-09-28T14:43:54.123+02:00``| |
+--------------------------+----------------------------------+----------------+
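
The fixed-size rows visible in the table above can be cross-checked against Python's standard ``struct`` module. This is only a sketch: the ``struct`` format codes are an illustrative choice and are not part of the NWB specification, and only rows shown in this excerpt are included.

```python
import struct

# Spec dtype -> (struct format code, documented size in bits).
# The "<" prefix selects standard sizes, independent of the platform.
SPEC_DTYPE_SIZES = {
    "float": ("<f", 32),
    "float32": ("<f", 32),
    "double": ("<d", 64),
    "int8": ("<b", 8),
    "uint32": ("<I", 32),
    "uint16": ("<H", 16),
    "uint8": ("<B", 8),
    "bool": ("<?", 8),
}


def storage_bits(spec_dtype: str) -> int:
    """Return the storage size in bits implied by the struct format code."""
    fmt, _documented_bits = SPEC_DTYPE_SIZES[spec_dtype]
    return struct.calcsize(fmt) * 8


# Every documented size agrees with the size of the chosen format code.
for name, (_fmt, documented_bits) in SPEC_DTYPE_SIZES.items():
    assert storage_bits(name) == documented_bits
```

Variable-length types such as ``"text"`` and ``"isodatetime"`` have no fixed size and are therefore left out of this check.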
@@ -222,4 +223,4 @@ data type (e.g., ``dtype=special_dtype(vlen=binary_type)`` in Python). The speci
``/specifications/<namespace-name>/<version>/namespace`` while additional source files are stored in
``/specifications/<namespace-name>/<version>/<source-filename>``. Here ``<source-filename>`` refers to the main name
of the source file without file extension (e.g., the core namespace defines ``nwb.ecephys.yaml`` as a source, which would
be stored in ``/specifications/core/2.0.1/nwb.ecephys``).
be stored in ``/specifications/core/2.0.1/nwb.ecephys``).
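
The path rule described above can be sketched as a small helper. The function name ``cached_spec_path`` is hypothetical; it only builds the path strings and does not read or write an HDF5 file.

```python
import posixpath


def cached_spec_path(namespace_name, version, source_filename=None):
    """Build the location of a cached specification inside the file.

    Hypothetical helper for illustration. With no source file, the path of
    the namespace itself is returned; otherwise the file extension is
    dropped from the source filename, as described in the text above.
    """
    base = posixpath.join("/specifications", namespace_name, version)
    if source_filename is None:
        return posixpath.join(base, "namespace")
    stem, _extension = posixpath.splitext(source_filename)
    return posixpath.join(base, stem)


print(cached_spec_path("core", "2.0.1"))
print(cached_spec_path("core", "2.0.1", "nwb.ecephys.yaml"))
```

Only the final extension is removed, so a dotted name such as ``nwb.ecephys.yaml`` keeps its ``nwb.ecephys`` stem.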
11 changes: 6 additions & 5 deletions docs/storage/source/storage_release_notes.rst
@@ -2,23 +2,24 @@
Release Notes
=============

NWB:N - v2.1.0
--------------
Added documentation for new NWB key 'object_id' (see also format release notes for NWB 2.1.0: https://nwb-schema.readthedocs.io/en/latest/format_release_notes.html#september-2019).

NWB:N - v2.0.1
--------------
Added missing documentation on how format specifications are cached in HDF5.

NWB:N - v2.0.0
---------------

Created seperate reStructuredText documentation (i.e., this document) discuss and govern
Created separate reStructuredText documentation (i.e., this document) to discuss and govern
storage-related concerns. In particular, this document describes how primitives and keys
described via the specification language are mapped to storage, specifically HDF5.

NWB:N - v1.0.x and earlier
--------------------------

For version 1.0.x and earlier, there was no official seperate document governing NWB:N storage concerns as
For version 1.0.x and earlier, there was no official separate document governing NWB:N storage concerns as
HDF5 was the only supported storage backend with implicit mapping between HDF5 types and NWB:N
language primitives.


