This document explains the Elucidation Metadata Standard and our own implementation of the standard as a set of libraries and tools. The Elucidation Metadata Standard provides a way to create implementation-independent metadata standards using text, so that you can create arbitrary byte-based metadata and store it using technologies like databases without writing your own libraries, being locked in to a particular library, or knowing schemas in advance. Elucidator itself is a set of libraries and tools intended to provide real-world use of this standard. Other implementors are free to use this standard to implement other tools and libraries as needed.
This standard specifies how to describe byte-based metadata using text, and how to interpret it on reading.
From here on, "The Standard" is an abbreviation of "The Elucidation Metadata Standard".
- Metadata is a collection of bytes that can be interpreted based on some Specification (singular Metadatum).
- Group is a set of Metadata Specifications, often particular to a project or domain.
- Specification is the association of some Identifier with some set of rules for interpretation.
- Identifier is the string which is associated with some set of rules about how something should be interpreted. Identifiers must be UTF-8 encoded alphanumerical or underscore characters, beginning with an alphabetical character.
- Interpreter is a routine which can convert an individual Metadatum into the correct associated types.
- Metadata Specification is the Specification of a Metadata Designation and its associated, ordered Members.
- Designation is the Identifier associated with a particular class of Metadata.
- Member is a component of an individual Metadatum, which has an associated Identifier, Data Type, and Value. For a member, the identifier must be unique within an individual Metadata Specification, but need not be unique across all Metadata Specifications.
- Data Type indicates how a particular Value should be extracted from a collection of bytes. The Standard specifies a discrete set of possible types. This is often abbreviated as Dtype.
- Value is the contents of some Member interpreted through its Data Type.
- Member Specification defines the association of some Identifier with a Data Type for a particular Member.
- Array is an ordered set of values with homogeneous Data Type.
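The Identifier rule above can be checked mechanically. The following is a minimal sketch, not part of The Standard: the function name `is_valid_identifier` is hypothetical, and it assumes that "alphabetical" and "alphanumerical" mean ASCII letters and digits, which the text does not state explicitly.

```python
import re

# Assumption: letters and digits are restricted to ASCII. The Standard only
# says "UTF-8 encoded alphanumerical or underscore characters".
_IDENTIFIER_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_identifier(text: str) -> bool:
    """Return True if text starts with a letter and otherwise contains
    only letters, digits, or underscores."""
    return _IDENTIFIER_RE.fullmatch(text) is not None
```

Under this reading, `temperature_2m` is accepted, while `2fast`, `_hidden`, and the empty string are rejected.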
Group
├── Metadata Specification
│   ├── Designation
│   ├── Member 0
│   │   ├── Identifier
│   │   └── Data Type
│   ├── Member 1
│   │   ├── Identifier
│   │   └── Data Type
│   ├── ...
│   └── Member n
│       ├── Identifier
│       └── Data Type
├── Metadata Specification
│   └── ...
├── ...
└── Metadata Specification
    └── ...
A Metadata Specification consists of a Designation and an ordered set of Members.
The Designation should be unique for a given set of related Metadata.
Implementors are allowed to make any link between a Designation and the ordered set of Members that they please; for example, using two columns in a SQL database, one for the Designation and one for the Metadata Specification, which contains a textual representation of the Members.
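As an illustration of such an implementation-defined linkage, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `specifications`, the column names, and the helper functions are all hypothetical; they are not taken from Elucidator.

```python
import sqlite3


def create_store() -> sqlite3.Connection:
    """Create an in-memory database with one row per Designation."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the designation is the key, and the members
    # column holds a textual representation of the ordered members.
    conn.execute(
        "CREATE TABLE specifications ("
        "designation TEXT PRIMARY KEY, members TEXT NOT NULL)"
    )
    return conn


def register(conn: sqlite3.Connection, designation: str, members: str) -> None:
    """Associate a designation with the text describing its members."""
    conn.execute(
        "INSERT INTO specifications (designation, members) VALUES (?, ?)",
        (designation, members),
    )


def lookup(conn: sqlite3.Connection, designation: str):
    """Return the member text for a designation, or None if unknown."""
    row = conn.execute(
        "SELECT members FROM specifications WHERE designation = ?",
        (designation,),
    ).fetchone()
    return None if row is None else row[0]
```

Making the designation a primary key is one way to enforce that it stays unique within a set of related Metadata.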
In the absence of an implementation-defined linkage, the following grammar should be used to indicate the mapping of designation to ordered member sets:
specification: Designation(member, member, member, ...)(context);
with Designation the designation for this specification, context an optional string with additional descriptive information, and members specified by the grammar
member: Identifier: Dtype
.
Compliant implementations may NOT use a context to perform any processing; this field is intended for human readability and information only, much like comments in source code.
Whitespace is ignored everywhere except inside the context string, since Identifiers and Dtypes are not allowed to contain whitespace.
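To make the grammar concrete, the following is a rough parsing sketch. The function name `parse_specification` is hypothetical, and the sketch makes assumptions that the text leaves open: the trailing semicolon is treated as optional, the member list may not contain parentheses, and Identifiers and Dtypes are not validated here.

```python
import re

# Designation, a parenthesized member list, an optional parenthesized
# context, and an optional trailing semicolon (assumed optional here).
_SPEC_RE = re.compile(
    r"\s*([^\s()]+)\s*\(([^()]*)\)\s*(?:\((.*)\))?\s*;?\s*",
    re.DOTALL,
)


def parse_specification(text):
    """Return (designation, [(identifier, dtype), ...], context_or_None)."""
    match = _SPEC_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid specification: {text!r}")
    designation, member_text, context = match.groups()
    members = []
    if member_text.strip():
        for chunk in member_text.split(","):
            identifier, sep, dtype = chunk.partition(":")
            # Whitespace is ignored outside of the context string.
            identifier = "".join(identifier.split())
            dtype = "".join(dtype.split())
            if not sep or not identifier or not dtype:
                raise ValueError(f"malformed member: {chunk!r}")
            members.append((identifier, dtype))
    # The context is returned untouched and never used for processing.
    return designation, members, context
```

For example, `parse_specification("point(x: f64, y: f64)(a 2D point);")` yields the designation `point`, two members, and the context `a 2D point`.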
The following table indicates all allowable data types. Compliant implementations must implement all data types.
| Name | String Representation |
|---|---|
| Byte | u8 |
| Unsigned 16-bit integer | u16 |
| Unsigned 32-bit integer | u32 |
| Unsigned 64-bit integer | u64 |
| Signed 8-bit integer | i8 |
| Signed 16-bit integer | i16 |
| Signed 32-bit integer | i32 |
| Signed 64-bit integer | i64 |
| IEEE 32-bit floating point | f32 |
| IEEE 64-bit floating point | f64 |
| String | string |
All Data Types which are not String may be constructed as an Array.
An Array may be of fixed size in the Member Specification, or of dynamic size.
NOTE: signed integers used for dynamic sizing are NOT compliant with The Standard.
Arrays are specified using the following grammar for fixed size:
Dtype[literal]
and the following grammar for dynamic size:
Dtype[]
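A Dtype string can be split into its base type and array form with a small helper. This is a sketch only: `parse_dtype` and its return shape are hypothetical, and it assumes the fixed-size `literal` is a non-negative decimal integer.

```python
import re

# Scalar types from the table of allowable data types.
SCALAR_DTYPES = {
    "u8", "u16", "u32", "u64",
    "i8", "i16", "i32", "i64",
    "f32", "f64",
}

# A base name, optionally followed by [] or [digits].
_DTYPE_RE = re.compile(r"([a-z0-9]+)(?:\[([0-9]*)\])?")


def parse_dtype(text):
    """Return (base, kind, length) where kind is 'scalar', 'fixed',
    or 'dynamic', and length is an int only for fixed arrays."""
    match = _DTYPE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"malformed dtype: {text!r}")
    base, size = match.groups()
    if base != "string" and base not in SCALAR_DTYPES:
        raise ValueError(f"unknown dtype: {base!r}")
    if size is None:
        return (base, "scalar", None)
    if base == "string":
        # Only non-String types may be constructed as an Array.
        raise ValueError("string cannot be used as an array element")
    if size == "":
        return (base, "dynamic", None)
    return (base, "fixed", int(size))
```

With this helper, `f32[3]` is reported as a fixed array of three `f32` values, `i64[]` as a dynamic array, and `string[]` is rejected.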
For all types, little endian byte ordering is required.
The String type consists of one unsigned 64-bit integer, followed by that number of bytes to represent the string.
NOTE: The String type is NOT nul-terminated.
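The String layout can be sketched with Python's `struct` module. The helper names are hypothetical, and the sketch assumes the string bytes are UTF-8 encoded, which this section does not state for String values.

```python
import struct


def encode_string(text: str) -> bytes:
    """Encode as a little-endian u64 byte count followed by the bytes."""
    payload = text.encode("utf-8")  # assumption: UTF-8 payload
    return struct.pack("<Q", len(payload)) + payload


def decode_string(buffer: bytes, offset: int = 0):
    """Return (text, next_offset) for a String starting at offset."""
    (length,) = struct.unpack_from("<Q", buffer, offset)
    start = offset + 8
    end = start + length
    if end > len(buffer):
        raise ValueError("buffer too short for declared string length")
    # No nul terminator is expected or consumed.
    return buffer[start:end].decode("utf-8"), end
```

Note that the length prefix counts bytes, not characters, so a string containing multi-byte characters has a prefix larger than its character count.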
For fixed arrays, the underlying data type is repeated for the size of the array with no padding.
For dynamic arrays, like Strings, the array begins with one unsigned 64-bit integer, followed by that number of elements of the designated type in byte representation.
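The array layouts described above can be sketched in the same way. The function names and the `FORMATS` table are hypothetical; the `<` prefix in each `struct` format selects little-endian byte order with standard sizes and no padding.

```python
import struct

# struct format character for each non-String type in The Standard.
FORMATS = {
    "u8": "B", "u16": "H", "u32": "I", "u64": "Q",
    "i8": "b", "i16": "h", "i32": "i", "i64": "q",
    "f32": "f", "f64": "d",
}


def encode_fixed_array(base, values):
    """Pack the values back to back, with no length prefix or padding."""
    return struct.pack(f"<{len(values)}{FORMATS[base]}", *values)


def encode_dynamic_array(base, values):
    """Pack a little-endian u64 element count, then the elements."""
    return struct.pack("<Q", len(values)) + encode_fixed_array(base, values)


def decode_fixed_array(base, count, buffer, offset=0):
    """Return (values, next_offset) for count elements at offset."""
    fmt = f"<{count}{FORMATS[base]}"
    values = list(struct.unpack_from(fmt, buffer, offset))
    return values, offset + struct.calcsize(fmt)


def decode_dynamic_array(base, buffer, offset=0):
    """Read the u64 element count, then decode that many elements."""
    (count,) = struct.unpack_from("<Q", buffer, offset)
    return decode_fixed_array(base, count, buffer, offset + 8)
```

Unlike the String type, the prefix of a dynamic array counts elements rather than bytes, so the payload size is the count multiplied by the element width.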
Elucidator contains the following components:
- A Rust-based library which implements manipulations of metadata based on The Standard
- A Rust-based library which adds database storage of metadata with spatiotemporal bounding boxes associated with each metadatum
- Python bindings for the Rust libraries
- C bindings for the Rust libraries
- A small set of utility tools