Skip to content

Latest commit

 

History

History
127 lines (83 loc) · 6.21 KB

dataset.rst

File metadata and controls

127 lines (83 loc) · 6.21 KB

Data Containers: Data sets and Meta data

Table of Contents

Rationale

Data products produced by data processing procedures are meant to be read, underatood, and used by others. Many people tend to store data with no note of meaning attached to those data. Without attached explanation, it is difficult for other people to fully understand or use correctly a collection of numbers. It could be difficult for even the data producer to recall the exact meaning of the numbers after a while. When someone receives a data 'product', besides dataets, one would expect explanation informaion associated with the product.

FDI implements a data product container scheme so that not only description and other metadata (data about data) are always attached to the "payload" data, but also that your data can have its context data attached as light weight references. One can organize scalar, vector, array, table types of data in the form of sequences, mappings, with nesting and referencing.

FDI is meant to be a small open-source package. Data stored in FDI ojects are easily accessible with Python API and are exported (serialized and stored by default) in cross-platform, human-readable JSON format. There are heavier formats (e..g. HDF5) and packages (e.g. iRODS) for similar goals. FDI's data model was originally inspired by Herschel Common Software System (v15) products, taking other requirements of scientific observation and data processing into account. The APIs are kept as compatible with HCSS (written in Java, and in Jython for scripting) as possible.

Data Containers

Dataset

Three types of datasets are implemented to store potentially any hierarchical data as a dataset. Like a product, all datasets may have meta data, with the distinction that the meta data of a dataset is related to that particular dataset only.

array dataset

a dataset containing array data (say a data vector, array, cube etc...) and may have a unit and a typecode for efficient storing.

Examples (from quickstart page):

table dataset

a dataset containing a collection of columns with column header as the key. Each column contains array dataset. All columns have the same number of rows.

Examples (from quickstart page):

composite dataset

a dataset containing a collection of datasets. This allows arbitrary complex structures, as a child dataset within a composite dataset may be a composite dataset itself and so on...

Meta data and Parameters

FDI datasets and products not only contain data, but also their metadata -- data about the "payload" data. Metadata is defined as a collection of named Parameters.

Often a parameter shows a property. So a parameter in the metadata of a dataset or product is often called a property.

Parameter

scalar or vector variable with attributes. There are the following parameter types:

  • Parameter: Types are defined in metadata.ParameterTypes. If requested, a Parameter can check its value or a given value with the validity specification, which can be a combination of descrete values, ranges, and bit-masked values.
  • NumericParameter
  • DateParameter
  • StringParameter
Parameter class

parameter value

parameter attributes

Parameter typed objects description, type, validity descriptor, and default value
NumericParameter a number (scalar), a Vector2D (2D), a Vector (3D), or a Quaternion (4D) all above plus a unit and a typecode
DateParameter FineTime date-time Same as Parameter, type is 'finetime', Python datetime.format string as the default typecode.
StringParameter String Same as Parameter, type is 'string', 'B' (for byte unsigned) as the default typecode

Examples (from quickstart page):

Metadata

class manages parameters for datasets and products.

Examples (from quickstart page):

run tests

You can test sub-package dataset and utils with test1 and test5 respectively.

In the install directory:

make test1
make test5

test3 is for pns server self-test only

Design

Packages

image

Classes

image

fdi