# Design goals

1. Handle different types of data
    * Components defined in a generic and flexible way
2. Preserve relationships between datasets, and avoid data duplication
    * Components hashed to produce unique IDs
3. Promote construction of organized datasets
    * Components allow users to group information in useful and meaningful ways
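
The hashing idea in goal 2 can be sketched as follows. This is a minimal illustration, not the project's actual scheme: the choice of SHA-256 over a canonical JSON serialization, the `component_id` and `insert` helpers, and the in-memory `store` are all assumptions made for the example.

```python
import hashlib
import json

def component_id(component: dict) -> str:
    """Return a deterministic ID derived from the component's content."""
    # sort_keys makes the serialization independent of key order,
    # so equal content always yields the same hash (assumed scheme).
    payload = json.dumps(component, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Hypothetical in-memory stand-in for the database.
store = {}

def insert(component: dict) -> str:
    """Store the component only if its ID is new, and return the ID."""
    cid = component_id(component)
    store.setdefault(cid, component)
    return cid

# Same content with a different key order maps to the same ID,
# so the second insert does not create a duplicate entry.
a = insert({"symbols": ["Si", "Si"], "pbc": [True, True, True]})
b = insert({"pbc": [True, True, True], "symbols": ["Si", "Si"]})
```

Because the ID depends only on content, two datasets that contain the same component end up pointing at one stored copy, which is how relationships are preserved without duplication.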

# Database

* **Definition:** the backend tool that supports data I/O and querying.
* **Example:** a MongoDB instance running on a local machine

# Configuration (CO)

* **Definition:** the inputs to a calculation of a material property
* **Example:** atomic types/positions, cell vectors, periodic boundary conditions (PBCs), constraints
* **Additional information:** user-provided names/labels, chemical formula, element concentrations, ...
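
A configuration can be pictured as a small record holding the inputs, with the additional information derived from them. The sketch below is hypothetical: the `Configuration` class, its field names, and the coordinate values are illustrative assumptions rather than the project's API.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Configuration:
    # Field names are assumptions chosen for this example.
    symbols: list                 # atomic types
    positions: list               # one [x, y, z] per atom
    cell: list                    # three cell vectors
    pbc: tuple = (True, True, True)
    names: list = field(default_factory=list)  # user-provided names/labels

    def chemical_formula(self) -> str:
        """Alphabetically ordered formula, e.g. 'H2O'."""
        counts = Counter(self.symbols)
        return "".join(f"{el}{n if n > 1 else ''}" for el, n in sorted(counts.items()))

    def element_concentrations(self) -> dict:
        """Fraction of atoms belonging to each element."""
        counts = Counter(self.symbols)
        total = len(self.symbols)
        return {el: n / total for el, n in counts.items()}

# Illustrative values only, not taken from any dataset.
co = Configuration(
    symbols=["H", "H", "O"],
    positions=[[0.0, 0.76, 0.59], [0.0, -0.76, 0.59], [0.0, 0.0, 0.0]],
    cell=[[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]],
    pbc=(False, False, False),
    names=["water_monomer"],
)
```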

# Property Instance (PI)

* **Definition:** the outputs of a calculation; a computed material property
* **Example:** configuration energy, atomic forces
* **Additional information:** units, pointers to configurations/definitions/settings
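
As a rough sketch, a property instance might store each computed value alongside its units, plus pointers (here, plain ID strings) to the related components. The `PropertyInstance` class, its field names, and the IDs below are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class PropertyInstance:
    # All names here are hypothetical.
    definition_name: str   # pointer to a Property Definition
    configuration_id: str  # pointer to a Configuration
    settings_id: str       # pointer to a PropertySettings entry
    values: dict           # field name -> {"value": ..., "units": ...}

# Illustrative numbers, not real calculation results.
pi = PropertyInstance(
    definition_name="energy-forces",
    configuration_id="co_0001",
    settings_id="ps_0001",
    values={
        "energy": {"value": -10.5, "units": "eV"},
        "forces": {
            "value": [[0.0, 0.0, 0.1], [0.0, 0.0, -0.1]],
            "units": "eV/Ang",
        },
    },
)
```

Storing pointers rather than embedded copies keeps the property small and lets several properties share one configuration or settings entry.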

# Property Definition (PD)

* **Definition:** explanation of the contents of a property
* **Example:** property name, data types/shapes, human-readable descriptions of fields
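
One way to picture a property definition is as a schema that property data can be checked against. The dictionary layout and the `missing_fields` helper below are assumptions for illustration, not the project's actual format.

```python
# Hypothetical definition: a name plus, per field, a data type,
# a shape, and a human-readable description.
definition = {
    "name": "energy-forces",
    "fields": {
        "energy": {
            "type": "float",
            "shape": (),
            "description": "Total energy of the configuration",
        },
        "forces": {
            "type": "float",
            "shape": ("N", 3),  # N atoms, 3 Cartesian components
            "description": "Force on each atom",
        },
    },
}

def missing_fields(definition: dict, data: dict) -> list:
    """Return the defined fields that are absent from the data."""
    return sorted(set(definition["fields"]) - set(data))

# A property that provides only the energy is missing 'forces'.
incomplete = missing_fields(definition, {"energy": -1.0})
complete = missing_fields(definition, {"energy": -1.0, "forces": []})
```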

# PropertySettings (PS)

* **Definition:** additional metadata for setting up a calculation
* **Example:** software package/version, xc-functional, k-point mesh, full input file(s)

# ConfigurationSet (CS)

* **Definition:** a group of configurations
* **Example:** "Snapshots from a molecular dynamics run at 1000 K"
* **Additional information:**
    * Aggregates configuration information (e.g., atom counts, labels, chemical systems, ...)
    * Useful for improving discoverability and interpretability
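
The aggregation described above could look roughly like this. The `aggregate` function, the dictionary keys, and the sample configurations are hypothetical and chosen only to show the idea.

```python
from collections import Counter

def aggregate(configurations: list) -> dict:
    """Summarize a group of configurations (assumed dict layout)."""
    atom_counts = Counter()
    labels = set()
    systems = set()
    for co in configurations:
        atom_counts.update(co["symbols"])
        labels.update(co.get("labels", []))
        # A chemical system is the sorted set of elements present.
        systems.add("-".join(sorted(set(co["symbols"]))))
    return {
        "n_configurations": len(configurations),
        "n_atoms": sum(atom_counts.values()),
        "atom_counts": dict(atom_counts),
        "labels": sorted(labels),
        "chemical_systems": sorted(systems),
    }

# Illustrative input, not real data.
cs_summary = aggregate([
    {"symbols": ["Si", "Si"], "labels": ["md", "1000K"]},
    {"symbols": ["Si", "O", "O"], "labels": ["md"]},
])
```

A summary like this lets a user search for, say, every configuration set containing the O-Si chemical system without loading the individual configurations.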

# Dataset (DS)

* **Definition:** a group of computed properties and their associated configurations
* **Example:** QM9, Si PRX GAP, user-contributed datasets
* **Additional information:**
    * Aggregates property and configuration information (e.g., property types, labels, configuration set info, ...)
    * Pointers to CSs (instead of COs) to help keep data organized
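
The sketch below illustrates a dataset that points to configuration sets rather than to individual configurations. All IDs, keys, and the `dataset_configuration_ids` helper are assumptions made for the example.

```python
# Hypothetical configuration sets, keyed by ID.
configuration_sets = {
    "cs_md": {"configuration_ids": ["co_1", "co_2"]},
    "cs_relaxed": {"configuration_ids": ["co_2", "co_3"]},
}

# The dataset stores CS IDs and property IDs, not CO IDs.
dataset = {
    "name": "example-dataset",
    "configuration_set_ids": ["cs_md", "cs_relaxed"],
    "property_ids": ["pi_1", "pi_2", "pi_3"],
}

def dataset_configuration_ids(dataset: dict, configuration_sets: dict) -> list:
    """Resolve the dataset's configurations through its CSs."""
    ids = set()
    for cs_id in dataset["configuration_set_ids"]:
        ids.update(configuration_sets[cs_id]["configuration_ids"])
    return sorted(ids)

all_ids = dataset_configuration_ids(dataset, configuration_sets)
```

Going through the configuration sets keeps the grouping information attached to the dataset, and a configuration shared by two sets (`co_2` here) is still counted once.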