OPTIMADE API specification v1.2.0-rc.1

Introduction

As researchers create independent materials databases, much can be gained from retrieving data from multiple databases. However, automating the retrieval of data is difficult if each database has a different application programming interface (API). This document specifies a standard API for retrieving data from materials databases. This API specification has been developed over a series of workshops entitled "Open Databases Integration for Materials Design", held at the Lorentz Center in Leiden, Netherlands and the CECAM headquarters in Lausanne, Switzerland.

The API specification described in this document builds on top of the JSON API v1.0 specification. In particular, the JSON API specification is assumed to apply wherever it is stricter than what is formulated in this document. Exceptions to this rule are stated explicitly (e.g. non-compliant responses are tolerated if a non-standard response format is explicitly requested).

Definition of Terms

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Database provider

A service that provides one or more databases with data desired to be made available using the OPTIMADE API.

Database-provider-specific prefix

Every database provider is designated a unique prefix. The prefix is used to separate the namespaces used by provider-specific extensions. The list of presently defined prefixes is maintained externally from this specification. For more information, see section Database-Provider-Specific Namespace Prefixes.

API implementation

A realization of the OPTIMADE API that a database provider uses to serve data from one or more databases.

Identifier

Names that MUST start with a lowercase letter ([a-z]) or an underscore ("_") followed by any number of lowercase alphanumerics ([a-z0-9]) and underscores ("_").
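The identifier rule maps directly onto a regular expression. The following is a minimal sketch of a check, not part of the specification; the helper name `is_valid_identifier` is hypothetical.

```python
import re

# Pattern transcribed from the definition above: a lowercase letter or an
# underscore, followed by any number of lowercase alphanumerics or underscores.
IDENTIFIER_PATTERN = re.compile(r"[a-z_][a-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if `name` satisfies the identifier rule (hypothetical helper)."""
    # fullmatch requires the whole string to match, so trailing characters
    # outside the allowed set are rejected.
    return IDENTIFIER_PATTERN.fullmatch(name) is not None
```

With this sketch, `structures` and `_exmpl_custom_field` are accepted, while `Structures` and `2d_materials` are rejected.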

Base URL

The topmost URL under which the API is served. See section Base URL.

Versioned base URL

A URL formed by the base URL plus a path segment indicating a version of the API. See section Base URL.

Entry

A single instance of a specific type of resource served by the API implementation. For example, a structures entry comprises the data that belong to a single structure.

Entry type

Entries are categorized into types, e.g., structures, calculations, references. Entry types MUST be named according to the rules for identifiers.

Entry property

One data item which belongs to an entry, e.g., the chemical formula of a structure.

Entry property name

The name of an entry property. Entry property names MUST follow the rules for identifiers and MUST NOT have the same name as any of the entry types.

Relationship

Any entry can have one or more relationships with other entries. These are described in section Relationships. Relationships describe links between entries rather than data that belong to a single entry, and are thus regarded as distinct from the entry properties.

Query filter

An expression used to influence the entries returned in the response to a URL query. The filter is specified using the URL query parameter filter using a format described in the section API Filtering Format Specification.

Queryable property

An entry property that can be referred to in the filtering of results. See section API Filtering Format Specification for more information on formulating filters on properties. The section Entry List specifies the REQUIRED level of query support for different properties. If nothing is specified, any support for queries is OPTIONAL.

ID

The ID entry property is a unique string referencing a specific entry in the database. The following constraints and conventions apply to IDs:

  • Taken together, the ID and entry type MUST uniquely identify the entry.
  • Reasonably short IDs are encouraged and SHOULD NOT be longer than 255 characters.
  • IDs MAY change over time.

Immutable ID

A unique string that specifies a specific resource in a database. The string MUST NOT change over time.

Response format

The data format for the HTTP response, which can be selected using the response_format URL query parameter. For more info, see section Response Format.

Field

The key used in response formats that return data in associative-array-type data structures. This is particularly relevant for the default JSON-based response format. In this case, field refers to the name part of the name-value pairs of JSON objects.

Data types

An API implementation handles data types and their representations in three different contexts:

  • In the HTTP URL query filter, see section API Filtering Format Specification.
  • In the HTTP response. The default response format is JSON-based and thus uses JSON data types. However, other response formats can use different data types. For more info, see section Responses.
  • The underlying database backend(s) from which the implementation serves data.

Hence, entry properties are described in this proposal using context-independent types that are assumed to have some form of representation in all contexts. They are as follows:

  • Basic types: string, integer, float, boolean, timestamp.
  • list: an ordered collection of items, where all items are of the same type, unless they are unknown. A list can be empty, i.e., contain no items.
  • dictionary: an associative array of keys and values, where keys are pre-determined strings, i.e., for the same entry property, the keys remain the same among different entries whereas the values change. The values of a dictionary can be any basic type, list, dictionary, or unknown.

An entry property value that is not present in the database is unknown. This is equivalently expressed by the statement that the value of that entry property is null. For more information, see section Properties with an unknown value.

The definition of a property of an entry type specifies a type. The value of that property MUST either have a value of that type, or be unknown.

General API Requirements and Conventions

Versioning of this standard

This standard describes a communication protocol that, when implemented by a server, provides clients with an API for data access.

Released versions of the standard are versioned using semantic versioning v2 in reference to changes in that API (i.e., not in the server-side implementation of the protocol).

To clarify: semantic versioning mandates version numbers of the form MAJOR.MINOR.PATCH, where a "backwards incompatible API change" requires incrementing the MAJOR version number. A future version of the OPTIMADE standard can mandate servers to change their behavior to be compliant with the newer version. However, such changes are only considered "backwards incompatible API changes" if they have the potential to break clients that correctly use the API according to the earlier version.

Furthermore, the addition of new keys in key-value-formatted responses of the OPTIMADE API is not regarded as a "backwards incompatible API change". Hence, a client MUST disregard unrecognized keys when interpreting responses (but MAY issue warnings about them). On the other hand, a change of the OPTIMADE standard that fundamentally alters the interpretation of a response due to the presence of a new key will be regarded as a "backwards incompatible API change", since a client interpreting the response according to a prior version of the standard would misinterpret that response.

Working copies distributed as part of the development of the standard are marked with the version number for the release they are based on with an additional "~develop" suffix. These "versions" do not refer to a single specific instance of the text (i.e., the same "~develop" version string is retained until a release), nor is it clear to what degree they contain backwards incompatible API changes. Hence, the suffix is intentionally designed so that these version strings do not conform to semantic versioning, which prevents incorrect comparisons to released versions using the scheme prescribed by semantic versioning. Version strings with a "~develop" suffix MAY be used by implementations during testing. However, a client that encounters them unexpectedly SHOULD NOT make any assumptions about the level of API compatibility.

In conclusion, the versioning policy of this standard is designed to allow clients using the OPTIMADE API according to a specific version of the standard to assume compatibility with servers implementing any future (non-development) version of the standard sharing the same MAJOR version number.
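The policy above can be illustrated with a small client-side check. This is a sketch under stated assumptions rather than a normative algorithm: the function names are hypothetical, "future" is read as "the same or a later MINOR.PATCH", and pre-release suffixes such as `-rc.1` are accepted but ignored when comparing.

```python
import re

# MAJOR.MINOR.PATCH with an optional pre-release suffix such as "-rc.1".
# Strings with a "~develop" suffix deliberately do not match.
_RELEASE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:-[0-9A-Za-z.-]+)?")


def parse_version(version: str):
    """Return (major, minor, patch), or None if the string is not a release."""
    match = _RELEASE.fullmatch(version)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def client_can_assume_compatibility(client_version: str, server_version: str) -> bool:
    """True if a client written against client_version may assume compatibility."""
    client = parse_version(client_version)
    server = parse_version(server_version)
    if client is None or server is None:
        # Development or malformed version strings: make no assumptions.
        return False
    # Assumption: "future" means the same or a later (MINOR, PATCH) pair.
    return client[0] == server[0] and server[1:] >= client[1:]
```

For example, a client written against 1.0.0 may assume compatibility with a 1.2.0 server, but not with a 2.0.0 server or with one reporting 1.1.0~develop.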

Base URL

Each database provider will publish one or more base URLs that serve the API, for example: http://example.com/optimade/. Every URL path segment that follows the base URL MUST behave as standardized in this API specification.

Versioned base URLs

Access to the API is primarily provided under versioned base URLs. An implementation MUST provide access to the API under a URL where the first path segment appended to the base URL is /vMAJOR, where MAJOR is one of the major version numbers of the API that the implementation supports. This URL MUST serve the latest minor/patch version supported by the implementation. For example, the latest minor and patch version of major version 1 of the API is served under /v1.

An implementation MAY also provide versioned base URLs on the forms /vMAJOR.MINOR and /vMAJOR.MINOR.PATCH. Here, MINOR is the minor version number and PATCH is the patch version number of the API. A URL on the form /vMAJOR.MINOR MUST serve the latest patch version supported by the implementation of this minor version.

API versions that are published with a suffix, e.g., -rc<number> to indicate a release candidate version, SHOULD be served on versioned base URLs without this suffix.

If a request is made to a versioned base URL that begins with /v and an integer followed by any other characters, indicating a version that the implementation does not recognize or support, the implementation SHOULD respond with the custom HTTP server error status code 553 Version Not Supported, preferably along with a user-friendly error message that directs the client to adapt the request to a version it provides.

It is the intent that future versions of this standard will not assign different meanings to URLs that begin with /v and an integer followed by other characters. Hence, a client can safely attempt to access a specific version of the API via the corresponding versioned base URL. For other forms of version negotiation, see section Version Negotiation.
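As an illustration of the rules in this subsection, the following sketch maps the first path segment after the base URL to one of the versions an implementation supports. The function name, the tuple representation of versions, and the return convention are assumptions made for this example only.

```python
import re

# /vMAJOR, /vMAJOR.MINOR or /vMAJOR.MINOR.PATCH (without the leading slash).
VERSION_SEGMENT = re.compile(r"v(\d+)(?:\.(\d+))?(?:\.(\d+))?")
# "v" and an integer followed by any other characters.
LOOKS_LIKE_VERSION = re.compile(r"v\d+.*")


def resolve_version_segment(segment, supported):
    """Map a path segment such as "v1" or "v1.0" to a supported version.

    `supported` is an iterable of (major, minor, patch) tuples.
    Returns (status, version): (200, tuple) when a version is served,
    (553, None) when the segment looks like an unsupported version, and
    (None, None) when the segment is not a version segment at all.
    """
    if LOOKS_LIKE_VERSION.fullmatch(segment) is None:
        return None, None
    match = VERSION_SEGMENT.fullmatch(segment)
    if match is None:
        return 553, None
    requested = tuple(int(g) for g in match.groups() if g is not None)
    # Keep the versions whose leading components equal the requested ones,
    # then serve the latest of them.
    candidates = [v for v in supported if v[: len(requested)] == requested]
    if not candidates:
        return 553, None
    return 200, max(candidates)
```

With supported versions 1.0.0, 1.0.1 and 1.1.0, the segment `v1` resolves to 1.1.0, `v1.0` resolves to 1.0.1, and `v2` yields the 553 status.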

Examples of valid versioned base URLs:

Examples of invalid versioned base URLs:

Database providers SHOULD strive to implement the latest released version of this standard, as well as the latest patch version of any major and minor version they support.

Note: The base URLs and versioned base URLs themselves are not considered part of the API, and the standard does not specify the response for a request to them. However, it is RECOMMENDED that implementations serve a human-readable HTML document on base URLs and versioned base URLs, which explains that the URL is an OPTIMADE URL meant to be queried by an OPTIMADE client.

Unversioned base URL

Implementations MAY also provide access to the API on the unversioned base URL as described in this subsection.

Access via the unversioned URL is primarily intended for (i) convenience when manually interacting with the API, and (ii) to provide version agnostic permanent links to resource objects. Clients that perform automated processing of responses SHOULD access the API via versioned base URLs.

Implementations serving the API on the unversioned base URL have a few alternative options:

  1. Direct access MAY be provided to the full API.
  2. Requests to endpoints under the unversioned base URL MAY be redirected using an HTTP 307 temporary redirect to the corresponding endpoints under a versioned base URL.
  3. Direct access MAY be limited to only single entry endpoints (see section Single Entry Endpoints), i.e., so that this form of access is only available for permanent links to resource objects.

Implementations MAY combine direct access to single entry endpoints with redirects for other API queries.

The client MAY provide a query parameter api_hint to hint the server about a preferred API version. When this parameter is provided, the request is to be handled as described in section Version Negotiation, which allows a "best suitable" version of the API to be selected to serve the request (or forward the request to). However, if api_hint is not provided, the implementation SHOULD serve (or redirect to) its preferred version of the API (i.e., the latest, most mature, and stable version). In this case, that version MUST also be the first version in the response of the versions endpoint (see section Versions Endpoint).

For implementers: Before enabling access to the API on unversioned base URLs, implementers are advised to consider that an upgrade of the major version of the API served this way can change the behaviors of associated endpoints in ways that are not backward compatible.

Version Negotiation

The OPTIMADE API provides three concurrent mechanisms for version negotiation between client and server.

  1. The versions endpoint served directly under the unversioned base URL allows a client to discover all major API versions supported by a server in the order of preference (see section Versions Endpoint).
  2. A client can access the API under versioned base URLs. In this case, the server MUST respond according to the specified version or return an error if the version is not supported (see section Versioned Base URLs).
  3. When accessing the API under the unversioned base URL, clients are encouraged to append the OPTIONAL query parameter api_hint to hint the server about a preferred API version for the request. This parameter is described in more detail below.

The api_hint query parameter MUST be accepted by all API endpoints. However, for endpoints under a versioned base URL the request MUST be served as usual according to the version specified in the URL path segment regardless of the value of api_hint. In this case, the server MAY issue a warning if the value of api_hint suggests that the query may not be properly supported. If the client provides the parameter, the value SHOULD have the format vMAJOR or vMAJOR.MINOR, where MAJOR is a major version and MINOR is a minor version of the API. For example, if a client appends api_hint=v1.0 to the query string, the hint provided is for major version 1 and minor version 0.

If the server supports the major version indicated by the api_hint parameter at the same or a higher minor version (if provided), it SHOULD serve the request using this version. If the server does not support the major version hinted, or if it supports the major version but only at a minor version below the one hinted, it MAY use the provided values to make a best-effort attempt at still serving the request, e.g., by invoking the closest supported version of the API. If the hinted version is not supported by the server and the request is not served using an alternative version, the server SHOULD respond with the custom HTTP server error status code 553 Version Not Supported. Note that the above protocol means that clients MUST NOT expect that a returned response is served according to the version that is hinted.
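The handling of api_hint on the unversioned base URL might be sketched as follows. This is one possible reading, not a reference implementation: the function name and return convention are hypothetical, a malformed hint is assumed to be ignored, and the sketch takes the stricter option of returning 553 where the specification would also permit a best-effort fallback.

```python
import re

# vMAJOR or vMAJOR.MINOR
API_HINT = re.compile(r"v(\d+)(?:\.(\d+))?")


def select_version(api_hint, supported, preferred):
    """Pick the (major, minor, patch) version used to serve an unversioned request.

    `supported` is an iterable of version tuples and `preferred` is the
    version served when no hint is given.
    Returns (status, version); status 553 means "Version Not Supported".
    """
    if api_hint is None:
        return 200, preferred
    match = API_HINT.fullmatch(api_hint)
    if match is None:
        # Assumption: a malformed hint is ignored rather than rejected.
        return 200, preferred
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) is not None else 0
    # Same major version, at the same or a higher minor version.
    candidates = [v for v in supported if v[0] == major and v[1] >= minor]
    if candidates:
        return 200, max(candidates)
    # A best-effort fallback to the closest supported version is also allowed.
    return 553, None
```

Because the server may fall back to another version, a client still needs to read `api_version` from the response instead of relying on the hint.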

For end users: Users are strongly encouraged to include the api_hint query parameter for URLs in, e.g., journal publications for queries on endpoints under the unversioned base URL. The version hint will make it possible to serve such queries in a reasonable way even after the server changes the major API version used for requests without version hints.

Index Meta-Database

A database provider MAY publish a special Index Meta-Database base URL. The main purpose of this base URL is to allow for automatic discoverability of all databases of the provider. Thus, it acts as a meta-database for the database provider's implementation(s).

The index meta-database MUST only provide the info and links endpoints, see sections Info Endpoints and Links Endpoint. It MUST NOT expose any entry listing endpoints (e.g., structures).

These endpoints do not need to be queryable, i.e., they MAY be provided as static JSON files. However, they MUST return the correct and updated information on all currently provided implementations.

The is_index field under attributes, as well as the relationships field, MUST be included in the info endpoint for the index meta-database (see section Base Info Endpoint). The value for is_index MUST be true.

A few suggestions and mandatory requirements of the OPTIMADE specification are specifically relaxed only for index meta-databases to make it possible to serve them in the form of static files on restricted third-party hosting platforms:

  • When serving an index meta-database in the form of static files, it is RECOMMENDED that the response excludes the subfields in the top-level meta field that would need to be dynamically generated (as described in the section JSON Response Schema: Common Fields). The motivation is that static files cannot keep dynamic fields such as time_stamp updated.
  • The JSON API specification requirements on content negotiation using the HTTP headers Content-type and Accept are NOT mandatory for index meta-databases. Hence, API Implementations MAY ignore the content of these headers and respond to all requests. The motivation is that static file hosting is typically not flexible enough to support these requirements on HTTP headers.
  • API implementations SHOULD serve JSON content with either the JSON API mandated HTTP header Content-Type: application/vnd.api+json or Content-Type: application/json. However, if the hosting platform does not allow this, JSON content MAY be served with Content-Type: text/plain.

Database-Provider-Specific Namespace Prefixes

This standard refers to database-provider-specific prefixes and database providers.

A list of known providers and their assigned prefixes is published in the form of an OPTIMADE Index Meta-Database with base URL https://providers.optimade.org. Visiting this URL in a web browser gives a human-readable description of how to retrieve the information in the form of a JSON file, and specifies the procedure for registration of new prefixes.

API implementations SHOULD NOT make up and use new prefixes without first getting them registered in the official list.

Examples:

  • A database-provider-specific prefix: exmpl. Used as a field name in a response: _exmpl_custom_field.

The initial underscore indicates an identifier that is under a separate namespace under the ownership of that organization. Identifiers prefixed with underscores will not be used for standardized names.

URL Encoding

Clients SHOULD encode URLs according to RFC 3986. API implementations MUST decode URLs according to RFC 3986.
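As a client-side illustration, the Python standard library can percent-encode a filter expression before it is placed in a URL. This is a minimal sketch; the filter string and the example.com URL are illustrative values only.

```python
from urllib.parse import parse_qs, quote, urlencode

raw_filter = "_exmpl1_band_gap<2.0 OR _exmpl2_band_gap<2.5"

# quote_via=quote percent-encodes spaces as %20 instead of "+".
query_string = urlencode({"filter": raw_filter}, quote_via=quote)
url = "http://example.com/optimade/v1/structures?" + query_string

# Server side: decoding the query string recovers the original filter.
decoded = parse_qs(query_string)["filter"][0]
```

Here `query_string` becomes `filter=_exmpl1_band_gap%3C2.0%20OR%20_exmpl2_band_gap%3C2.5`, and `decoded` equals `raw_filter` again.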

Relationships

The API implementation MAY describe many-to-many relationships between entries along with OPTIONAL human-readable descriptions that describe each relationship. These relationships can be to the same, or to different, entry types. Response formats have to encode these relationships in ways appropriate for each format.

In the default response format, relationships are encoded as JSON API Relationships, see section Entry Listing JSON Response Schema.

For implementers: For database-specific response formats without a dedicated mechanism to indicate relationships, it is suggested that they are encoded alongside the entry properties. For each entry type, the relationships with entries of that type can then be encoded in a field with the name of the entry type, which are to contain a list of the IDs of the referenced entries alongside the respective human-readable description of the relationships. It is the intent that future versions of this standard uphold the viability of this encoding by not standardizing property names that overlap with the entry type names.

Properties with an unknown value

Many databases allow specific data values to exist for some of the entries, whereas for others, no data value is present. This is referred to as the property having an unknown value, or equivalently, that the property value is null.

The text in this section describes how the API handles properties with the value null. The use of null values inside nested property values (e.g., lists or dictionaries) is described in the definitions of those data structures elsewhere in the specification, see section Entry List. For these properties, null MAY carry a special meaning.

REQUIRED properties with an unknown value MUST be included and returned in the response with the value null.

OPTIONAL properties with an unknown value, if requested explicitly via the response_fields query parameter, MUST be included and returned in the response with the value null. (For more info on the response_fields query parameter, see section Entry Listing URL Query Parameters.)

The interaction of properties with an unknown value with query filters is described in the section Filtering on Properties with an unknown value. In particular, filters with IS UNKNOWN and IS KNOWN can be used to match entries with values that are, or are not, unknown for some property, respectively.
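The rules for REQUIRED and explicitly requested properties can be sketched as a small server-side helper. The function and the property names used with it are hypothetical. As a simplification, the sketch returns only REQUIRED and explicitly requested properties; the text above does not prescribe that behavior.

```python
def build_attributes(stored, required, response_fields=None):
    """Assemble the attributes of one entry (hypothetical helper).

    stored: dict of property values present in the database.
    required: names of REQUIRED properties for the entry type.
    response_fields: names requested explicitly by the client, or None.
    """
    requested = list(response_fields or [])
    attributes = {}
    for name in list(required) + requested:
        # dict.get returns None for missing keys; None serializes to JSON null.
        attributes[name] = stored.get(name)
    return attributes
```

For an entry that stores only `_exmpl_a`, with `_exmpl_a` and `_exmpl_b` REQUIRED and `_exmpl_c` requested via response_fields, the result contains all three keys, with `_exmpl_b` and `_exmpl_c` set to `None`.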

Handling unknown property names

When an implementation receives a request with a query filter that refers to an unknown property name, the request is handled differently depending on the database-specific prefix:

  • If the property name has no database-specific prefix, or if it has the database-specific prefix that belongs to the implementation itself, the error 400 Bad Request MUST be returned with a message indicating the offending property name.
  • If the property name has a database-specific prefix that does not belong to the implementation itself, it MUST NOT treat this as an error, but rather MUST evaluate the query with the property treated as unknown, i.e., comparisons are evaluated as if the property has the value null.
    • Furthermore, if the implementation does not recognize the prefix at all, it SHOULD return a warning that indicates that the property has been handled as unknown.
    • On the other hand, if the prefix is recognized, i.e., as belonging to a known database provider, the implementation SHOULD NOT issue a warning but MAY issue diagnostic output with a note explaining how the request was handled.

The rationale for treating properties from other databases as unknown rather than triggering an error is for OPTIMADE to support queries using database-specific properties that can be sent to multiple databases.

For example, the following query can be sent to API implementations exmpl1 and exmpl2 without generating any errors:

filter=_exmpl1_band_gap<2.0 OR _exmpl2_band_gap<2.5
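The decision procedure above can be sketched as a small classifier. The function name and return values are hypothetical, and the sketch assumes that the prefix is the text between the leading underscore and the next underscore, which this section does not state explicitly.

```python
import re

# Assumption: a prefixed name has the form _<prefix>_<rest>.
PREFIXED_NAME = re.compile(r"_([a-z0-9]+)_[a-z0-9_]*")


def classify_unknown_property(name, own_prefix, known_prefixes):
    """Decide how to handle a filter property the implementation does not know.

    Returns (action, warn): action is "error_400" or "treat_as_unknown";
    warn tells whether a warning SHOULD accompany the response.
    """
    match = PREFIXED_NAME.fullmatch(name)
    prefix = match.group(1) if match else None
    if prefix is None or prefix == own_prefix:
        # No prefix, or the implementation's own prefix: 400 Bad Request.
        return "error_400", False
    # Foreign prefix: evaluate as null; warn only if the prefix is unrecognized.
    return "treat_as_unknown", prefix not in known_prefixes
```

For the implementation exmpl1, the name `_exmpl2_band_gap` is treated as unknown without a warning when exmpl2 is a recognized provider, whereas an unrecognized `band_gap` without any prefix leads to 400 Bad Request.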

Responses

Response Format

This section defines a JSON response format that complies with the JSON API v1.0 specification. All endpoints of an API implementation MUST be able to provide responses in the JSON format specified below and MUST respond in this format by default.

Each endpoint MAY support additional formats, and SHOULD declare these formats under the endpoint /info/<entry type> (see section Entry Listing Info Endpoints). Clients can request these formats using the response_format URL query parameter. Specifying a response_format different from json (e.g. response_format=xml) allows the API to break conformance not only with the JSON response format specification, but also, e.g., in terms of how content negotiation is implemented.

Database-provider-specific response_format identifiers MUST include a database-provider-specific prefix (see section Database-Provider-Specific Namespace Prefixes).

JSON Response Schema: Common Fields

In the JSON response format, property types translate as follows:

  • string, boolean, list are represented by their similarly named counterparts in JSON.
  • integer, float are represented as the JSON number type.
  • timestamp uses a string representation of date and time as defined in RFC 3339 Internet Date/Time Format.
  • dictionary is represented by the JSON object type.
  • unknown properties are represented by either omitting the property or by a JSON null value.
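The type mapping above can be illustrated with Python's standard `json` module. This is a sketch only: the `_exmpl_`-prefixed property names are invented for the example, and timestamps are assumed to be timezone-aware `datetime` objects rendered in UTC.

```python
import json
from datetime import datetime, timezone


def to_json_value(value):
    """Convert a timestamp to an RFC 3339 string; pass other values through."""
    if isinstance(value, datetime):
        return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return value


properties = {
    "_exmpl_label": "sample",    # string
    "_exmpl_count": 2,           # integer -> JSON number
    "_exmpl_ratio": 0.5,         # float -> JSON number
    "_exmpl_flag": True,         # boolean
    "_exmpl_tags": ["a", "b"],   # list -> JSON array
    "_exmpl_modified": datetime(2007, 4, 5, 14, 30, 20, tzinfo=timezone.utc),
    "_exmpl_missing": None,      # unknown -> JSON null
}

encoded = json.dumps({key: to_json_value(val) for key, val in properties.items()})
```

In `encoded`, the timestamp appears as the string `2007-04-05T14:30:20Z` and the unknown property as `null`.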

Every response SHOULD contain the following fields, and MUST contain at least meta:

  • meta: a JSON API meta member that contains JSON API meta objects of non-standard meta-information. It MUST be a dictionary with these fields:

    • api_version: a string containing the full version of the API implementation. The version number string MUST NOT be prefixed by, e.g., "v". Examples: 1.0.0, 1.0.0-rc.2.
    • query: information on the query that was requested. It MUST be a dictionary with this field:
      • representation: a string with the part of the URL following the versioned or unversioned base URL that serves the API. Query parameters that have not been used in processing the request MAY be omitted. In particular, if no query parameters have been involved in processing the request, the query part of the URL MAY be excluded. Example: /structures?filter=nelements=2.
    • more_data_available: false if the response contains all data for the request (e.g., a request issued to a single entry endpoint, or a filter query at the last page of a paginated response) and true if the response is incomplete in the sense that multiple objects match the request, and not all of them have been included in the response (e.g., a query with multiple pages that is not at the last page).

    meta SHOULD also include these fields:

    • schema: a JSON API links object that points to a schema for the response. If it is a string, or a dictionary containing no meta field, the provided URL MUST point at an OpenAPI schema. It is possible that future versions of this specification allow for alternative schema types. Hence, if the meta field of the JSON API links object is provided and contains a field schema_type that is not equal to the string OpenAPI, the client MUST NOT handle failures to parse the schema or to validate the response against the schema as errors.
    • time_stamp: a timestamp containing the date and time at which the query was executed.
    • data_returned: an integer containing the total number of data resource objects returned for the current filter query, independent of pagination.
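    • provider: information on the database provider of the implementation. It MUST be a dictionary with these fields:

      • name: a short name for the database provider.
      • description: a longer description of the database provider.
      • prefix: the database-provider-specific prefix (see section Database-Provider-Specific Namespace Prefixes).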

      provider MAY include these fields:

      • homepage: a JSON API links object, pointing to the homepage of the database provider, either directly as a string, or as a link object which can contain the following fields:
        • href: a string containing the homepage URL.
        • meta: a meta object containing non-standard meta-information about the database provider's homepage.

    meta MAY also include these fields:

    • data_available: an integer containing the total number of data resource objects available in the database for the endpoint.
    • last_id: a string containing the last ID returned.
    • response_message: response string from the server.
    • request_delay: a non-negative float giving time in seconds that the client is suggested to wait before issuing a subsequent request.

    Implementation note: the functionality of this field overlaps to some degree with features provided by the HTTP error 429 Too Many Requests and the Retry-After HTTP header. Implementations are suggested to provide consistent handling of request overload through both mechanisms.

    • implementation: a dictionary describing the server implementation, containing the OPTIONAL fields:
      • name: name of the implementation.
      • version: version string of the current implementation.
      • homepage: a JSON API links object, pointing to the homepage of the implementation.
      • source_url: a JSON API links object pointing to the implementation source, either downloadable archive or version control system.
      • maintainer: a dictionary providing details about the maintainer of the implementation, MUST contain the single field:
        • email with the maintainer's email address.
      • issue_tracker: a JSON API links object pointing to the implementation's issue tracker.
    • warnings: a list of warning resource objects representing non-critical errors or warnings. A warning resource object is defined similarly to a JSON API error object, but MUST also include the field type, which MUST have the value "warning". The field detail MUST be present and SHOULD contain a non-critical message, e.g., reporting unrecognized search attributes or deprecated features. The field status, representing an HTTP response status code, MUST NOT be present for a warning resource object. This is an exclusive field for error resource objects.

      Example for a deprecation warning:

      {
        "id": "dep_chemical_formula_01",
        "type": "warning",
        "code": "_exmpl_dep_chemical_formula",
        "title": "Deprecation Warning",
        "detail": "chemical_formula is deprecated, use instead chemical_formula_hill"
      }

      Note: warning ids MUST NOT be trusted to identify the exceptional situations (i.e., they are not error codes), use instead the field code for this. Warning ids can only be trusted to be unique in the list of warning resource objects, i.e., together with the type.

      General OPTIMADE warning codes are specified in section Warnings.

    • Other OPTIONAL additional information global to the query that is not specified in this document, MUST start with a database-provider-specific prefix (see section Database-Provider-Specific Namespace Prefixes).
    • Example for a request made to http://example.com/optimade/v1/structures/?filter=a=1 AND b=2:

      {
        "meta": {
          "query": {
            "representation": "/structures/?filter=a=1 AND b=2"
          },
          "api_version": "1.0.0",
          "schema": "http://schemas.optimade.org/openapi/v1/optimade.json",
          "time_stamp": "2007-04-05T14:30:20Z",
          "data_returned": 10,
          "data_available": 10,
          "more_data_available": false,
          "provider": {
            "name": "Example provider",
            "description": "Provider used for examples, not to be assigned to a real database",
            "prefix": "exmpl",
            "homepage": "http://example.com"
          },
          "implementation": {
            "name": "exmpl-optimade",
            "version": "0.1.0",
            "source_url": "http://git.example.com/exmpl-optimade",
            "maintainer": {
              "email": "admin@example.com"
            },
            "issue_tracker": "http://tracker.example.com/exmpl-optimade"
          }
        }
        // ...
      }
  • data: The schema of this value varies by endpoint. It can be either a single JSON API resource object or a list of JSON API resource objects. Every resource object needs the type and id fields, and its attributes (described in section API Endpoints) need to be in a dictionary corresponding to the attributes field.

The response MAY also return resources related to the primary data in the field:

  • links: JSON API links is REQUIRED for implementing pagination (see section Entry Listing URL Query Parameters). Each field of a links object, i.e., a "link", MUST be one of:

    • null
    • a string representing a URI, or
    • a dictionary ("link object") with fields
      • href: a string representing a URI
      • meta: (OPTIONAL) a meta object containing non-standard meta-information about the link

    Example links objects:

    • base_url: a links object representing the base URL of the implementation. Example:

      {
        "links": {
          "base_url": {
            "href": "http://example.com/optimade",
            "meta": {
              "_exmpl_db_version": "3.2.1"
            }
          }
          // ...
        }
        // ...
      }

    The following fields are REQUIRED for implementing pagination:

    • next: represents a link to fetch the next set of results. When the current response is the last page of data, this field MUST be either omitted or null-valued.

    An implementation MAY also use the following reserved fields for pagination. They represent links in a similar way as for next.

    • prev: the previous page of data. null or omitted when the current response is the first page of data.
    • last: the last page of data.
    • first: the first page of data.
  • included: a list of JSON API resource objects related to the primary data contained in data. Responses that contain related resources under included are known as compound documents in the JSON API.

    The definition of this field is found in the JSON API specification. Specifically, if the query parameter include is included in the request, included MUST NOT include unrequested resource objects. For further information on the parameter include, see section Entry Listing URL Query Parameters.

    This value MUST be either an empty array or an array of related resource objects.
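
The three link forms listed for the links field, together with the rule that next is omitted or null on the last page, can be sketched in Python as follows. This is a minimal illustration and not part of the specification: the names link_to_url and iter_pages are hypothetical, and fetch is an assumed caller-supplied function that takes a URL and returns the decoded JSON response.

```python
from typing import Callable, Iterator, Optional, Union

# A "link" is null, a URI string, or a link object with an href field.
Link = Union[None, str, dict]


def link_to_url(link: Link) -> Optional[str]:
    """Return the URI carried by a link, or None if the link is null."""
    if link is None:
        return None
    if isinstance(link, str):
        return link
    # Otherwise a "link object": the URI is stored in its href field.
    return link.get("href")


def iter_pages(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield each response, following links.next until it is omitted or null."""
    url: Optional[str] = first_url
    while url is not None:
        response = fetch(url)
        yield response
        url = link_to_url(response.get("links", {}).get("next"))
```

Passing fetch as an argument keeps the sketch independent of any particular HTTP library.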

If there were errors in producing the response, all other fields MAY be present, but the top-level data field MUST be skipped, and the following field MUST be present:

  • errors: a list of JSON API error objects, where the field detail MUST be present. All other fields are OPTIONAL.

An example of a full response:

{
  "links": {
    "next": null,
    "base_url": {
      "href": "http://example.com/optimade",
      "meta": {
         "_exmpl_db_version": "3.2.1"
      }
    }
  },
  "meta": {
    "query": {
      "representation": "/structures?filter=a=1 AND b=2"
    },
    "api_version": "1.0.0",
    "time_stamp": "2007-04-05T14:30:20Z",
    "data_returned": 10,
    "data_available": 10,
    "last_id": "xy10",
    "more_data_available": false,
    "provider": {
      "name": "Example provider",
      "description": "Provider used for examples, not to be assigned to a real database",
      "prefix": "exmpl",
      "homepage": {
        "href": "http://example.com",
        "meta": {
          "_exmpl_title": "This is an example site"
        }
      }
    },
    "response_message": "OK"
    // <OPTIONAL implementation- or database-provider-specific metadata, global to the query>
  },
  "data": [
    // ...
  ],
  "included": [
    // ...
  ]
}

HTTP Response Status Codes

All HTTP response status codes MUST conform to RFC 7231: HTTP Semantics. The code registry is maintained by IANA and can be found here.

See also the JSON API definitions of responses when fetching data, i.e., sending an HTTP GET request.

Important: If a client receives an unexpected 404 error when making a query to a base URL, and is aware of the index meta-database that belongs to the database provider (as described in section Index Meta-Database), the next course of action SHOULD be to fetch the resource objects under the links endpoint of the index meta-database and redirect the original query to the corresponding database ID that was originally queried, using the object's base_url value.

HTTP Response Headers

There are relevant use-cases for allowing data served via OPTIMADE to be accessed from in-browser JavaScript, e.g. to enable server-less data aggregation. For such use, many browsers need the server to include the header Access-Control-Allow-Origin: * in its responses, which indicates that in-browser JavaScript access is allowed from any site.

Warnings

Non-critical exceptional situations occurring in the implementation SHOULD be reported to the referrer as warnings. Warnings MUST be expressed as a human-readable message, OPTIONALLY coupled with a warning code.

Warning codes starting with an alphanumeric character are reserved for general OPTIMADE error codes (currently, none are specified). For implementation-specific warnings, they MUST start with _ and the database-provider-specific prefix of the implementation (see section Database-Provider-Specific Namespace Prefixes).

API Endpoints

Access to API endpoints as described in the subsections below is to be provided under the versioned and/or the unversioned base URL as explained in the section Base URL.

The endpoints are:

  • a versions endpoint
  • an "entry listing" endpoint
  • a "single entry" endpoint
  • an introspection info endpoint
  • an "entry listing" introspection info endpoint
  • a links endpoint to discover related implementations
  • a custom extensions endpoint prefix

These endpoints are documented below.

Query parameters

Query parameters to the endpoints are documented in the respective subsections below. However, in addition, all API endpoints MUST accept the api_hint parameter described under Version Negotiation.

Versions Endpoint

The versions endpoint aims at providing a stable and future-proof way for a client to discover the major versions of the API that the implementation provides. This endpoint is special in that it MUST be provided directly on the unversioned base URL at /versions and MUST NOT be provided under the versioned base URLs.

The response to a query to this endpoint is in a restricted subset of the RFC 4180 CSV (text/csv; header=present) format. The restrictions are: (i) field values and header names MUST NOT contain commas, newlines, or double quote characters; (ii) field values and header names MUST NOT be enclosed by double quotes; (iii) the first line MUST be a header line. These restrictions allow clients to parse the file line-by-line, where each line can be split on all occurrences of the comma ',' character to obtain the header names and field values.

In the present version of the API, the response contains only a single field that is used to list the major versions of the API that the implementation supports. The CSV format header line MUST specify version as the name for this field. However, clients MUST accept responses that include other fields that follow the version.

The major API versions in the response are to be ordered according to the preference of the API implementation. If a version of the API is served on the unversioned base URL as described in the section Base URL, that version MUST be the first value in the response (i.e., it MUST be on the second line of the response directly following the required CSV header).

It is the intent that all future versions of this specification retain this endpoint, its restricted CSV response format, and the meaning of the first field of the response.

Example response:

version
1
0

The above response means that the API versions 1 and 0 are served under the versioned base URLs /v1 and /v0, respectively. The order of the versions indicates that the API implementation regards version 1 as preferred over version 0. If the API implementation allows access to the API on the unversioned base URL, this access has to be to version 1, since the number 1 appears in the first (non-header) line.
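
Because of the restrictions on the CSV format, a client can parse the response with plain string splitting. The following Python sketch illustrates this under stated assumptions: the function name parse_versions is hypothetical, and the response body is assumed to have already been decoded to a string.

```python
from typing import List


def parse_versions(body: str) -> List[str]:
    """Return the major versions listed in a /versions response, in order of preference."""
    lines = [line for line in body.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty /versions response")
    # The first line is the header; its first field MUST be named "version".
    header = lines[0].split(",")
    if header[0] != "version":
        raise ValueError("first header field is not 'version'")
    # Fields after the first are accepted but ignored by this sketch.
    return [line.split(",")[0] for line in lines[1:]]
```

For the example response above, parse_versions returns the strings "1" and "0" in that order. If the implementation serves the API on the unversioned base URL, the first element is the version served there.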

Entry Listing Endpoints

Entry listing endpoints return a list of resource objects representing entries of a specific type. For example, a list of structures, or a list of calculations.

Each entry in the list includes a set of properties and their corresponding values. The section Entry list specifies properties as belonging to one of three categories:

  1. Properties marked as REQUIRED in the response. These properties MUST always be present for all entries in the response.
  2. Properties marked as REQUIRED only if the query parameter response_fields is not part of the request, or if they are explicitly requested in response_fields. Otherwise they MUST NOT be included. One can think of these properties as constituting a default value for response_fields when that parameter is omitted.
  3. Properties not marked as REQUIRED in any case MUST be included only if explicitly requested in the query parameter response_fields. Otherwise they SHOULD NOT be included.
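
The three categories above amount to a simple selection rule, sketched below in Python. All names are hypothetical, and the choice to silently drop requested fields that the server does not know is an assumption of this sketch rather than something stated in this section.

```python
from typing import Iterable, Optional, Set


def properties_to_return(
    always_required: Iterable[str],            # category 1
    default_required: Iterable[str],           # category 2
    available: Iterable[str],                  # every property the server can return
    response_fields: Optional[Iterable[str]],  # None if the parameter was omitted
) -> Set[str]:
    """Select the property names to include for each entry in the response."""
    selected = set(always_required)
    if response_fields is None:
        # No response_fields: category 2 acts as the default selection.
        selected |= set(default_required)
    else:
        # Otherwise only explicitly requested properties are added.
        selected |= set(response_fields) & set(available)
    return selected
```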

Examples of valid entry listing endpoint URLs:

There MAY be multiple entry listing endpoints, depending on how many types of entries an implementation provides. Specific standard entry types are specified in section Entry list.

The API implementation MAY provide other entry types than the ones standardized in this specification. Such entry types MUST be prefixed by a database-provider-specific prefix (i.e., the resource objects' type value should start with the database-provider-specific prefix, e.g., type = _exmpl_workflows). Each custom entry type SHOULD be served at a corresponding entry listing endpoint under the versioned or unversioned base URL that serves the API with the same name (i.e., equal to the resource objects' type value, e.g., /_exmpl_workflows). It is RECOMMENDED to align with the OPTIMADE API specification practice of using a plural for entry resource types and entry type endpoints. Any custom entry listing endpoint MUST also be added to the available_endpoints and entry_types_by_format attributes of the Base Info Endpoint.

For more on custom endpoints, see Custom Extension Endpoints.

Entry Listing URL Query Parameters

The client MAY provide a set of URL query parameters in order to alter the response and provide usage information. While these URL query parameters are OPTIONAL for clients, API implementations MUST accept and handle them. To adhere to the requirement on implementation-specific URL query parameters of JSON API v1.0, query parameters that are not standardized by that specification have been given names that consist of at least two words separated by an underscore (a LOW LINE character '_').

Standard OPTIONAL URL query parameters standardized by the JSON API specification:

  • filter: a filter string, in the format described below in section API Filtering Format Specification.
  • page_limit: sets a numerical limit on the number of entries returned. See JSON API 1.0. The API implementation MUST return no more than the number specified. It MAY return fewer. The database MAY have a maximum limit and not accept larger numbers (in which case an error code -- 403 Forbidden -- MUST be returned). The default limit value is up to the API implementation to decide. Example: http://example.com/optimade/v1/structures?page_limit=100
  • page_{offset, number, cursor, above, below}: A server MUST implement pagination in the case of no user-specified sort parameter (via the links response field, see section JSON Response Schema: Common Fields). A server MAY implement pagination in concert with sort. The following parameters, all prefixed by "page_", are RECOMMENDED for use with pagination. If an implementation chooses

    • offset-based pagination: using page_offset and page_limit is RECOMMENDED.
    • cursor-based pagination: using page_cursor and page_limit is RECOMMENDED.
    • page-based pagination: using page_number and page_limit is RECOMMENDED. It is RECOMMENDED that the first page has number 1, i.e., that page_number is 1-based.
    • value-based pagination: using page_above/page_below and page_limit is RECOMMENDED.

    Examples (all OPTIONAL behavior a server MAY implement):

    • skip 50 structures and fetch up to 100: /structures?page_offset=50&page_limit=100.
    • fetch page 2 of up to 50 structures per page: /structures?page_number=2&page_limit=50.
    • fetch up to 100 structures above sort-field value 4000 (in this example, server chooses to fetch results sorted by increasing id, so page_above value refers to an id value): /structures?page_above=4000&page_limit=100.
  • sort: If supporting sortable queries, an implementation MUST use the sort query parameter with format as specified by JSON API 1.0.

    An implementation MAY support multiple sort fields for a single query. If it does, it again MUST conform to the JSON API 1.0 specification.

    If an implementation supports sorting for an entry listing endpoint, then the /info/<entries> endpoint MUST include, for each field name <fieldname> in its data.properties.<fieldname> response value that can be used for sorting, the key sortable with value true. If a field name under an entry listing endpoint supporting sorting cannot be used for sorting, the server MUST either leave out the sortable key or set it equal to false for the specific field name. The field names with sortable equal to true are allowed to be used in the "sort fields" list according to its definition in the JSON API 1.0 specification. The field sortable is in addition to each property description and other OPTIONAL fields. An example is shown in section Entry Listing Info Endpoints.

  • include: A server MAY implement the JSON API concept of returning compound documents by utilizing the include query parameter as specified by JSON API 1.0.

    All related resource objects MUST be returned as part of an array value for the top-level included field, see section JSON Response Schema: Common Fields.

    The value of include MUST be a comma-separated list of "relationship paths", as defined in the JSON API. If relationship paths are not supported, or a server is unable to identify a relationship path, a 400 Bad Request response MUST be made.

    The default value for include is references. This means references entries MUST always be included under the top-level field included as default, since a server assumes if include is not specified by a client in the request, it is still specified as include=references. Note, if a client explicitly specifies include and leaves out references, references resource objects MUST NOT be included under the top-level field included, as per the definition of included, see section JSON Response Schema: Common Fields.

    Note: A query with the parameter include set to the empty string means no related resource objects are to be returned under the top-level field included.
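
The default and empty-string behaviour of include described above can be sketched as a small Python helper. The function name requested_includes is hypothetical; the sketch only splits the parameter value and does not check whether each relationship path is supported.

```python
from typing import List, Optional


def requested_includes(include: Optional[str]) -> List[str]:
    """Return the relationship paths implied by the include query parameter."""
    if include is None:
        # Parameter not given: treated as include=references.
        return ["references"]
    # An empty string yields an empty list, i.e. no related resources.
    return [path for path in include.split(",") if path]
```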

Standard OPTIONAL URL query parameters not in the JSON API specification:

  • response_format: the output format requested (see section Response Format). Defaults to the format string 'json', which specifies the standard output format described in this specification. Example: http://example.com/optimade/v1/structures?response_format=xml
  • email_address: an email address of the user making the request. The email SHOULD be that of a person and not an automatic system. Example: http://example.com/optimade/v1/structures?email_address=user@example.com
  • response_fields: a comma-delimited set of fields to be provided in the output. If provided, these fields MUST be returned along with the REQUIRED fields. Other OPTIONAL fields MUST NOT be returned when this parameter is present. Example: http://example.com/optimade/v1/structures?response_fields=last_modified,nsites
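
As an illustration of how a client might assemble an entry listing URL from the parameters above, the following Python sketch uses the standard library function urllib.parse.urlencode. The function name entry_listing_url and its argument names are hypothetical; only filter, page_limit and response_fields are handled.

```python
from typing import Dict, Iterable, Optional
from urllib.parse import urlencode


def entry_listing_url(
    base_url: str,
    entry_type: str,
    filter_string: Optional[str] = None,
    page_limit: Optional[int] = None,
    response_fields: Optional[Iterable[str]] = None,
) -> str:
    """Build an entry listing URL with percent-encoded query parameters."""
    params: Dict[str, object] = {}
    if filter_string is not None:
        params["filter"] = filter_string
    if page_limit is not None:
        params["page_limit"] = page_limit
    if response_fields is not None:
        # response_fields is a comma-delimited set of field names.
        params["response_fields"] = ",".join(response_fields)
    url = f"{base_url.rstrip('/')}/{entry_type}"
    # safe="," keeps the commas in response_fields readable.
    query = urlencode(params, safe=",")
    return f"{url}?{query}" if query else url
```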

Additional OPTIONAL URL query parameters not described above are not considered to be part of this standard, and are instead considered to be "custom URL query parameters". These custom URL query parameters MUST be of the format "<database-provider-specific prefix><url_query_parameter_name>". These names adhere to the requirements on implementation-specific query parameters of JSON API v1.0 since the database-provider-specific prefixes contain at least two underscores (a LOW LINE character '_').

Example uses of custom URL query parameters include providing an access token for the request, telling the database to increase verbosity in error output, or providing a database-specific extended searching format.

Examples:

  • http://example.com/optimade/v1/structures?_exmpl_key=A3242DSFJFEJE
  • http://example.com/optimade/v1/structures?_exmpl_warning_verbosity=10
  • http://example.com/optimade/v1/structures?_exmpl_filter="elements all in [Al, Si, Ga]"

    Note: the specification presently makes no attempt to standardize access control mechanisms. There are security concerns with access control based on URL tokens, and the above example is not to be taken as a recommendation for such a mechanism.
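
A server or client can recognise custom URL query parameters by their prefix. The Python sketch below assumes that the prefix is given without underscores (e.g., exmpl) and that custom names take the form _exmpl_<name>, as in the examples above. The function name is hypothetical.

```python
def is_custom_query_parameter(name: str, prefix: str) -> bool:
    """Return True if name has the form _<prefix>_<parameter name>."""
    marker = f"_{prefix}_"
    # At least one character has to follow the prefix.
    return name.startswith(marker) and len(name) > len(marker)
```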

Entry Listing JSON Response Schema

"Entry listing" endpoint response dictionaries MUST have a data key. The value of this key MUST be a list containing dictionaries that represent individual entries. In the default JSON response format every dictionary (resource object) MUST have the following fields:

  • type: field containing the Entry type as defined in section Definition of Terms
  • id: field containing the ID of the entry as defined in section Definition of Terms. This can be the local database ID.
  • attributes: a dictionary, containing key-value pairs representing the entry's properties, except for type and id.

    Database-provider-specific properties need to include the database-provider-specific prefix (see section Database-Provider-Specific Namespace Prefixes).

OPTIONALLY it can also contain the following fields:

  • links: a JSON API links object that can OPTIONALLY contain the field
    • self: the entry's URL
  • meta: a JSON API meta object that contains non-standard meta-information about the object.
  • relationships: a dictionary containing references to other entries according to the description in section Relationships encoded as JSON API Relationships. The OPTIONAL human-readable description of the relationship MAY be provided in the description field inside the meta dictionary of the JSON API resource identifier object. All relationships to entries of the same entry type MUST be grouped into the same JSON API relationship object and placed in the relationships dictionary with the entry type name as key (e.g., structures).

Example:

{
  "data": [
    {
      "type": "structures",
      "id": "example.db:structs:0001",
      "attributes": {
        "chemical_formula_descriptive": "Es2 O3",
        "url": "http://example.db/structs/0001",
        "immutable_id": "http://example.db/structs/0001@123",
        "last_modified": "2007-04-05T14:30:20Z"
      }
    },
    {
      "type": "structures",
      "id": "example.db:structs:1234",
      "attributes": {
        "chemical_formula_descriptive": "Es2",
        "url": "http://example.db/structs/1234",
        "immutable_id": "http://example.db/structs/1234@123",
        "last_modified": "2007-04-07T12:02:20Z"
      }
    }
    // ...
  ]
  // ...
}
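
A client may want to check that each resource object in data carries the fields required above. The following Python sketch shows one way to do this. The function name missing_entry_fields is hypothetical, and the check covers only the presence of the three REQUIRED fields, not their contents.

```python
from typing import Dict, List

REQUIRED_ENTRY_FIELDS = ("type", "id", "attributes")


def missing_entry_fields(response: dict) -> Dict[int, List[str]]:
    """Map the index of each incomplete entry in data to its missing fields."""
    problems: Dict[int, List[str]] = {}
    for index, resource in enumerate(response["data"]):
        missing = [name for name in REQUIRED_ENTRY_FIELDS if name not in resource]
        if missing:
            problems[index] = missing
    return problems
```

An empty dictionary means that every entry has type, id and attributes.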

Single Entry Endpoints

A client can request a specific entry by appending a URL-encoded ID path segment to the URL of an entry listing endpoint. This will return properties for the entry with that ID.

In the default JSON response format, the ID component MUST be the content of the id field.

Examples:

  • http://example.com/optimade/v1/structures/exmpl%3Astruct_3232823
  • http://example.com/optimade/v1/calculations/232132

The rules for which properties are to be present for an entry in the response are the same as defined in section Entry Listing Endpoints.
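
The URL encoding of the ID path segment can be sketched in Python with the standard library function urllib.parse.quote. The function name single_entry_url is hypothetical. Passing safe="" is a choice of this sketch so that a "/" inside an ID is also percent-encoded and cannot be mistaken for a path separator.

```python
from urllib.parse import quote


def single_entry_url(listing_url: str, entry_id: str) -> str:
    """Append a URL-encoded ID path segment to an entry listing URL."""
    return f"{listing_url.rstrip('/')}/{quote(entry_id, safe='')}"
```

With the listing URL http://example.com/optimade/v1/structures and the ID exmpl:struct_3232823, this produces the first example URL above.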

Single Entry URL Query Parameters

The client MAY provide a set of additional URL query parameters for this endpoint type. URL query parameters not recognized MUST be ignored. While the following URL query parameters are OPTIONAL for clients, API implementations MUST accept and handle them: response_format, email_address, response_fields. The URL query parameter include is OPTIONAL for both clients and API implementations. The meaning of these URL query parameters is as defined above in section Entry Listing URL Query Parameters.

Single Entry JSON Response Schema

The response for a 'single entry' endpoint is the same as for 'entry listing' endpoint responses, except that the value of the data field MUST have only one or zero entries. In the default JSON response format, this means the value of the data field MUST be a single resource object or null if there is no resource object to return.

Example:

{
  "data": {
    "type": "structures",
    "id": "example.db:structs:1234",
    "attributes": {
      "chemical_formula_descriptive": "Es2",
      "url": "http://example.db/structs/1234",
      "immutable_id": "http://example.db/structs/1234@123",
      "last_modified": "2007-04-07T12:02:20Z"
    }
  },
  "meta": {
    "query": {
      "representation": "/structures/example.db:structs:1234?"
    }
    // ...
  }
  // ...
}

Info Endpoints

Info endpoints provide introspective information, either about the API implementation itself, or about specific entry types.

There are two types of info endpoints:

  1. Base info endpoints: placed directly under the versioned or unversioned base URL that serves the API (e.g., http://example.com/optimade/v1/info or http://example.com/optimade/info)
  2. Entry listing info endpoints: placed under the endpoints belonging to specific entry types (e.g., http://example.com/optimade/v1/info/structures or http://example.com/optimade/info/structures)

The types and output content of these info endpoints are described in more detail in the subsections below. Common to them all is that the data field SHOULD return only a single resource object. If no resource object is provided, the value of the data field MUST be null.

Base Info Endpoint

The Info endpoint under a versioned or unversioned base URL serving the API (e.g. http://example.com/optimade/v1/info or http://example.com/optimade/info) returns information relating to the API implementation.

The single resource object's response dictionary MUST include the following fields:

  • type: "info"
  • id: "/"

  • attributes: Dictionary containing the following fields:

  • api_version: Presently used full version of the OPTIMADE API. The version number string MUST NOT be prefixed by, e.g., "v". Examples: 1.0.0, 1.0.0-rc.2.
  • available_api_versions: MUST be a list of dictionaries, each containing the fields:
    • url: a string specifying a versioned base URL that MUST adhere to the rules in section Base URL
    • version: a string containing the full version number of the API served at that versioned base URL. The version number string MUST NOT be prefixed by, e.g., "v". Examples: 1.0.0, 1.0.0-rc.2.
  • formats: List of available output formats.
  • entry_types_by_format: Available entry endpoints as a function of output formats.
  • available_endpoints: List of available endpoints (i.e., the string to be appended to the versioned or unversioned base URL serving the API).
  • license: A JSON API links object giving a URL to a web page containing a human-readable text describing the license (or licensing options if there are multiple) covering all the data and metadata provided by this database. Clients are advised not to try automated parsing of this link or its content, but rather rely on the field available_licenses instead. Example: https://example.com/licenses/example_license.html.

attributes MAY also include the following OPTIONAL fields:

  • is_index: if true, this is an index meta-database base URL (see section Index Meta-Database).

    If this member is not provided, the client MUST assume this is not an index meta-database base URL (i.e., the default is for is_index to be false).

  • available_licenses: List of SPDX license identifiers (https://spdx.org/licenses/) specifying a set of alternative licenses under which the client is granted access to all the data and metadata in this database. If the data and metadata are available under multiple alternative licenses, identifiers of these multiple licenses SHOULD be provided to let clients know under which conditions the data and metadata can be used. Inclusion of a license identifier in the list is a commitment of the database that the rights are in place to grant clients access to all the data and metadata according to the terms of any of these licenses (at the choice of the client). If the licensing information provided via the field license omits licensing options specified in available_licenses, or if it otherwise contradicts them, a client MUST still be allowed to interpret the inclusion of a license in available_licenses as a full commitment from the database that the data and metadata are available, without exceptions, under the respective licenses. If the database cannot make that commitment, e.g., if only part of the data is available under a license, the corresponding license identifier MUST NOT appear in available_licenses (but, rather, the field license is to be used to clarify the licensing situation). An empty list indicates that none of the SPDX licenses apply for the entirety of the database and that the licensing situation is clarified in human-readable form in the field license.

If this is an index meta-database base URL (see section Index Meta-Database), then the response dictionary MUST also include the field:

  • relationships: Dictionary that MAY contain a single JSON API relationships object:

    • default: Reference to the links identifier object under the links endpoint that the provider has chosen as their "default" OPTIMADE API database. A client SHOULD present this database as the first choice when an end-user chooses this provider. This MUST include the field:
      • data: JSON API resource linkage. It MUST be either null or contain a single links identifier object with the fields:
        • type: links
        • id: ID of the provider's chosen default OPTIMADE API database. MUST be equal to a valid child object's id under the links endpoint.

    Lastly, is_index MUST also be included in attributes and be true.

Example:

{
  "data": {
    "type": "info",
    "id": "/",
    "attributes": {
      "api_version": "1.0.0",
      "available_api_versions": [
        {"url": "http://db.example.com/optimade/v0/", "version": "0.9.5"},
        {"url": "http://db.example.com/optimade/v0.9/", "version": "0.9.5"},
        {"url": "http://db.example.com/optimade/v0.9.2/", "version": "0.9.2"},
        {"url": "http://db.example.com/optimade/v0.9.5/", "version": "0.9.5"},
        {"url": "http://db.example.com/optimade/v1/", "version": "1.0.0"},
        {"url": "http://db.example.com/optimade/v1.0/", "version": "1.0.0"}
      ],
      "formats": [
        "json",
        "xml"
      ],
      "entry_types_by_format": {
        "json": [
          "structures",
          "calculations"
        ],
        "xml": [
          "structures"
        ]
      },
      "available_endpoints": [
        "structures",
        "calculations",
        "info",
        "links"
      ],
      "is_index": false
    }
  }
  // ...
}

Example for an index meta-database:

{
  "data": {
    "type": "info",
    "id": "/",
    "attributes": {
      "api_version": "1.0.0",
      "available_api_versions": [
        {"url": "http://db.example.com/optimade/v0/", "version": "0.9.5"},
        {"url": "http://db.example.com/optimade/v0.9/", "version": "0.9.5"},
        {"url": "http://db.example.com/optimade/v0.9.2/", "version": "0.9.2"},
        {"url": "http://db.example.com/optimade/v1/", "version": "1.0.0"},
        {"url": "http://db.example.com/optimade/v1.0/", "version": "1.0.0"}
      ],
      "formats": [
        "json",
        "xml"
      ],
      "entry_types_by_format": {
        "json": [],
        "xml": []
      },
      "available_endpoints": [
        "info",
        "links"
      ],
      "is_index": true
    },
    "relationships": {
      "default": {
        "data": { "type": "links", "id": "perovskites" }
      }
    }
  }
  // ...
}
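
A client can use the base info response to decide whether it is talking to an index meta-database and, if so, which database the provider has marked as its default. The Python sketch below is illustrative only. The function name default_database_id is hypothetical, and the response is assumed to have already been decoded from JSON.

```python
from typing import Optional


def default_database_id(info_response: dict) -> Optional[str]:
    """Return the id of the default database of an index meta-database, if any."""
    data = info_response.get("data") or {}
    attributes = data.get("attributes", {})
    # If is_index is not provided, the client MUST assume it is false.
    if not attributes.get("is_index", False):
        return None
    default = data.get("relationships", {}).get("default", {})
    linkage = default.get("data")  # either null or a single links identifier object
    return linkage["id"] if linkage else None
```

For the index meta-database example above, the function returns "perovskites". For the first example, where is_index is false, it returns None.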

Entry Listing Info Endpoints

Entry listing info endpoints are accessed under the versioned or unversioned base URL serving the API as /info/<entry_type> (e.g., http://example.com/optimade/v1/info/structures or http://example.com/optimade/info/structures). The response for these endpoints MUST include the following information in the data field:

  • description: Description of the entry.
  • properties: A dictionary describing properties for this entry type, where each key is a property name and the value is an OPTIMADE Property Definition described in detail in the section Property Definitions.
  • formats: List of output formats available for this type of entry.
  • output_fields_by_format: Dictionary of available output fields for this entry type, where the keys are the values of the formats list and the values are the keys of the properties dictionary.

Example (note: the description strings have been wrapped for readability only):

{
  "data": {
    "description": "a structures entry",
    "properties": {
      "nelements": {
        "title": "Number of elements",
        "type": ["integer", "null"],
        "description": "Number of different elements in the structure as an integer.\n
         \n
         -  Note: queries on this property can equivalently be formulated using `elements LENGTH`.\n
         -  A filter that matches structures that have exactly 4 elements: `nelements=4`.\n
         -  A filter that matches structures that have between 2 and 7 elements: `nelements>=2 AND nelements<=7`.",
        "examples": [
          3
        ],
        "x-optimade-property": {
          "property-uri": "urn:uuid:10a05e55-0c20-4f68-89ad-35a18eb7076f"
        },
        "x-optimade-unit": "dimensionless",
        "x-optimade-implementation": {
          "sortable": true,
          "query-support": "all mandatory"
        },
        "x-optimade-requirements": {
          "support": "should",
          "sortable": false,
          "query-support": "all mandatory"
        }
      },
      "lattice_vectors": {
        "title": "Unit cell lattice vectors",
        "type": ["array", "null"],
        "description": "The three lattice vectors in Cartesian coordinates, in ångström (Å).\n
        \n
        - MUST be a list of three vectors *a*, *b*, and *c*, where each of the vectors MUST BE a
          list of the vector's coordinates along the x, y, and z Cartesian coordinates.
        ",
        "examples": [
          [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 1.0, 4.0]]
        ],
        "x-optimade-unit": "inapplicable",
        "x-optimade-property": {
          "property-uri": "urn:uuid:81edf372-7b1b-4518-9c14-7d482bd67834",
          "unit-definitions": [
            {
              "symbol": "angstrom",
              "title": "ångström",
              "description": "The ångström unit of length.",
              "standard": {
                "name": "gnu units",
                "version": "3.09",
                "symbol": "angstrom"
              }
            }
          ]
        },
        "x-optimade-implementation": {
          "sortable": false,
          "query-support": "none"
        },
        "x-optimade-requirements": {
          "support": "should",
          "sortable": false,
          "query-support": "none"
        },
        "maxItems": 3,
        "minItems": 3,
        "items": {
           "type": "array",
           "x-optimade-unit": "inapplicable",
           "maxItems": 3,
           "minItems": 3,
           "items": {
             "type": "number",
             "x-optimade-unit": "angstrom",
             "x-optimade-implementation": {
               "sortable": true,
               "query-support": "none"
             },
             "x-optimade-requirements": {
               "sortable": false,
               "query-support": "none"
             }
           }
        }
      }
      // ... <other property descriptions>
    },
    "formats": ["json", "xml"],
    "output_fields_by_format": {
      "json": [
        "nelements",
        "lattice_vectors",
        // ...
      ],
      "xml": ["nelements"]
    }
  }
  // ...
}
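
To find out which properties can be used with the sort query parameter, a client can scan the properties dictionary of this response. The Python sketch below assumes, following the example above, that the implementation reports sortability under the key sortable inside x-optimade-implementation. The function name sortable_properties is hypothetical.

```python
from typing import List


def sortable_properties(info_response: dict) -> List[str]:
    """Return the names of the properties that are marked as sortable."""
    properties = info_response["data"]["properties"]
    return sorted(
        name
        for name, definition in properties.items()
        # A missing key is treated the same as sortable set to false.
        if definition.get("x-optimade-implementation", {}).get("sortable") is True
    )
```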

Links Endpoint

This endpoint exposes information on other OPTIMADE API implementations that are related to the current implementation. The links endpoint MUST be provided under the versioned or unversioned base URL serving the API at /links.

Each link has a link_type attribute that specifies the type of the linked relation.

The link_type MUST be one of the following values:

  • child: a link to another OPTIMADE implementation that MUST be within the same provider. This allows the creation of a tree-like structure of databases by pointing to children sub-databases.
  • root: a link to the root implementation within the same provider. This is RECOMMENDED to be an Index Meta-Database. There MUST be only one root implementation per provider and all implementations MUST have a link to this root implementation. If the provider only supplies a single implementation, the root link links to the implementation itself.
  • external: a link to an external OPTIMADE implementation. This MAY be used to point to any other implementation, also in a different provider.
  • providers: a link to a List of Providers Links implementation that includes the current implementation, e.g. providers.optimade.org.

Limiting to the root and child link types, links can be used as an introspective endpoint, similar to the Info Endpoints, but at a higher level, i.e., Info Endpoints provide information on the given implementation, while the /links endpoint provides information on the links between immediately related implementations (in particular, an array of none or a single object with link type root and none or more objects with link type child, see section Internal Links: Root and Child Links).

For /links endpoints, the API implementation MAY ignore any provided query parameters. Alternatively, it MAY handle the parameters specified in section Entry Listing URL Query Parameters for entry listing endpoints.

Links Endpoint JSON Response Schema

The resource objects' response dictionaries MUST include the following fields:

  • type: MUST be "links".
  • id: MUST be unique.
  • attributes: Dictionary that MUST contain the following fields:
    • name: Human-readable name for the OPTIMADE API implementation, e.g., for use in clients to show the name to the end-user.
    • description: Human-readable description for the OPTIMADE API implementation, e.g., for use in clients to show a description to the end-user.
    • base_url: JSON API links object, pointing to the base URL for this implementation, either directly as a string, or as a links object, which can contain the following fields:
      • href: a string containing the OPTIMADE base URL.
      • meta: a meta object containing non-standard meta-information about the implementation.
    • homepage: JSON API links object, pointing to a homepage URL for this implementation, either directly as a string, or as a links object, which can contain the following fields:
      • href: a string containing the implementation homepage URL.
      • meta: a meta object containing non-standard meta-information about the homepage.
    • link_type: a string containing the link type. It MUST be one of the values listed above in section Link Types.
    • aggregate: a string indicating whether a client that is following links to aggregate results from different OPTIMADE implementations should follow this link or not. This flag SHOULD NOT be indicated for links where link_type is not child.

      If not specified, clients MAY assume that the value is ok. If specified, and the value is anything different than ok, the client MUST assume that the server is suggesting not to follow the link during aggregation by default (also if the value is not among the known ones, in case a future specification adds new accepted values).

      Specific values indicate the reason why the server is providing the suggestion. A client MAY follow the link anyway if it has reason to do so (e.g., if the client is looking for all test databases, it MAY follow the links where aggregate has value test).

      If specified, it MUST be one of the values listed in section Link Aggregate Options.

    • no_aggregate_reason: an OPTIONAL human-readable string indicating the reason for suggesting not to aggregate results following the link. It SHOULD NOT be present if aggregate has value ok.

Example:

{
  "data": [
    {
      "type": "links",
      "id": "index",
      "attributes": {
        "name": "Index",
        "description": "Index for example's OPTIMADE databases",
        "base_url": "http://example.com/optimade",
        "homepage": "http://example.com",
        "link_type": "root"
      }
    },
    {
      "type": "links",
      "id": "cat_zeo",
      "attributes": {
        "name": "Catalytic Zeolites",
        "description": "Zeolites for deNOx catalysis",
        "base_url": {
          "href": "http://example.com/optimade/denox/zeolites",
          "meta": {
            "_exmpl_catalyst_group": "denox"
          }
        },
        "homepage": "http://example.com",
        "link_type": "child"
      }
    },
    {
      "type": "links",
      "id": "frameworks",
      "attributes": {
        "name": "Zeolitic Frameworks",
        "description": "",
        "base_url": "http://example.com/zeo_frameworks/optimade",
        "homepage": "http://example.com",
        "link_type": "child"
      }
    },
    {
      "type": "links",
      "id": "testdb",
      "attributes": {
        "name": "Test database",
        "description": "A test database",
        "base_url": "http://example.com/testdb/optimade",
        "homepage": "http://example.com",
        "link_type": "child",
        "aggregate": "test"
      }
    },
    {
      "type": "links",
      "id": "internaldb",
      "attributes": {
        "name": "Database for internal use",
        "description": "An internal database",
        "base_url": "http://example.com/internaldb/optimade",
        "homepage": "http://example.com",
        "link_type": "child",
        "aggregate": "no",
        "no_aggregate_reason": "This is a database for internal use and might contain nonsensical data"
      }
    },
    {
      "type": "links",
      "id": "other_db",
      "attributes": {
        "name": "Some other DB",
        "description": "A DB by the example2 provider",
        "base_url": "http://example2.com/some_db/optimade",
        "homepage": "http://example2.com",
        "link_type": "external"
      }
    },
    {
      "type": "links",
      "id": "optimade",
      "attributes": {
        "name": "Materials Consortia",
        "description": "List of OPTIMADE providers maintained by the Materials Consortia organisation",
        "base_url": "https://providers.optimade.org",
        "homepage": "https://optimade.org",
        "link_type": "providers"
      }
    }
  ]
}

Internal Links: Root and Child Links

Any number of resource objects with link_type equal to child MAY be present as part of the data list. A child object represents a "link" to an OPTIMADE implementation within the same provider exactly one layer below the current implementation's layer.

Exactly one resource object with link_type equal to root MUST be present as part of the data list. Note: the same implementation may of course be linked by other implementations via a /links endpoint with link_type equal to external.

The root resource object represents a link to the topmost OPTIMADE implementation of the current provider. By following child links from the root object recursively, it MUST be possible to reach the current OPTIMADE implementation.

In practice, this forms a tree structure for the OPTIMADE implementations of a provider. Note: The RECOMMENDED number of layers is two.

List of Providers Links

Resource objects with link_type equal to providers MUST point to an Index Meta-Database that supplies a list of OPTIMADE database providers. The intention is to be able to auto-discover all providers of OPTIMADE implementations.

A list of known providers can be retrieved as described in section Database-Provider-Specific Namespace Prefixes. This section also describes where to find information for how a provider can be added to this list.

Index Meta-Database Links

If the provider implements an Index Meta-Database, it is RECOMMENDED to adopt a structure where the index meta-database is the root implementation of the provider.

This will make all OPTIMADE databases and implementations by the provider discoverable as links with child link type, under the links endpoint of the Index Meta-Database.

Link Aggregate Options

If specified, the aggregate attribute MUST have one of the following values:

  • ok (default value, if unspecified): it is ok to follow this link when aggregating OPTIMADE results.
  • test: the linked database is a test database, whose content might not be correct or might not represent physically-meaningful data. Therefore by default the link should not be followed.
  • staging: the linked database is almost production-ready, but final checks on its content are being performed, so the content might still contain errors. Therefore by default the link should not be followed.
  • no: any other reason to suggest not to follow the link during aggregation of OPTIMADE results. The implementation MAY provide more details in a human-readable form via the attribute no_aggregate_reason.
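
The aggregation rules above can be sketched as a small client-side helper. This is a minimal illustration in Python and not part of the specification: the function name links_to_aggregate and the sample data are hypothetical, the response is assumed to be already fetched and decoded from JSON, and treating a null aggregate value like an unspecified one is an assumption of this sketch.

```python
def links_to_aggregate(links_response):
    """Return base URLs of the child links a client would follow by default."""
    urls = []
    for link in links_response.get("data", []):
        attributes = link.get("attributes", {})
        # Assumption of this sketch: only child links are followed when aggregating.
        if attributes.get("link_type") != "child":
            continue
        # An unspecified value may be treated as "ok". Any other value,
        # including values unknown to this client, means "do not follow by default".
        aggregate = attributes.get("aggregate") or "ok"
        if aggregate != "ok":
            continue
        base_url = attributes["base_url"]
        # base_url is either a plain string or a links object with an "href" field.
        if isinstance(base_url, dict):
            base_url = base_url["href"]
        urls.append(base_url)
    return urls


# Hypothetical sample data, modelled on the example response above.
sample = {
    "data": [
        {"type": "links", "id": "index",
         "attributes": {"base_url": "http://example.com/optimade", "link_type": "root"}},
        {"type": "links", "id": "cat_zeo",
         "attributes": {"base_url": {"href": "http://example.com/optimade/denox/zeolites"},
                        "link_type": "child"}},
        {"type": "links", "id": "testdb",
         "attributes": {"base_url": "http://example.com/testdb/optimade",
                        "link_type": "child", "aggregate": "test"}},
        {"type": "links", "id": "futuredb",
         "attributes": {"base_url": "http://example.com/futuredb/optimade",
                        "link_type": "child", "aggregate": "some_future_value"}},
    ]
}

print(links_to_aggregate(sample))
```

A client that deliberately looks for test databases could relax the check for the value test, as permitted above.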

Custom Extension Endpoints

API implementations MAY provide custom endpoints under the Extensions endpoint. Custom extension endpoints MUST be placed under the versioned or unversioned base URL serving the API at /extensions. The API implementation is free to define roles of further URL path segments under this URL.

API Filtering Format Specification

An OPTIMADE filter expression is passed in the parameter filter as a URL query parameter as specified by JSON API. The filter expression allows desired properties to be compared against search values; several such comparisons can be combined using the logical conjunctions AND, OR, NOT, and parentheses, with their usual semantics.

All properties marked as REQUIRED in section Entry list MUST be queryable with all mandatory filter features. The level of query support REQUIRED for other properties is described in Entry list.

When provided as a URL query parameter, the contents of the filter parameter are URL-encoded by the client in the HTTP GET request, and then URL-decoded by the API implementation before any further parsing takes place. In particular, this means the client MUST escape special characters in string values as described below for String values before the URL encoding, and the API implementation MUST first URL-decode the filter parameter before reversing the escaping of string tokens.

Examples of syntactically correct query strings embedded in queries:

  • http://example.org/optimade/v1/structures?filter=_exmpl_melting_point%3C300+AND+nelements=4+AND+chemical_formula_descriptive="SiO2"&response_format=xml

Or, fully URL-encoded:

  • http://example.org/optimade/v1/structures?filter=_exmpl_melting_point%3C300+AND+nelements%3D4+AND+chemical_formula_descriptive%3D%22SiO2%22&response_format=xml
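
As an illustration of the client-side encoding step, the fully encoded URL above can be produced with Python's standard library. This is a sketch only; the variable names are arbitrary, and any string values in the filter are assumed to have been escaped already, before URL encoding.

```python
from urllib.parse import urlencode

base_url = "http://example.org/optimade/v1/structures"
filter_expression = (
    '_exmpl_melting_point<300 AND nelements=4 '
    'AND chemical_formula_descriptive="SiO2"'
)

# urlencode percent-encodes reserved characters and encodes spaces as "+".
query = urlencode({"filter": filter_expression, "response_format": "xml"})
url = f"{base_url}?{query}"
print(url)
```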

Lexical Tokens

The following tokens are used in the filter query component:

  • Property names: the first character MUST be a lowercase letter, the subsequent symbols MUST be composed of lowercase letters or digits; the underscore ("_", ASCII 95 dec (0x5F)) is considered to be a lower-case letter when defining identifiers. The length of the identifiers is not limited, except that when passed as a URL query parameter the whole query SHOULD NOT be longer than the limits imposed by the URI specification. This definition is similar to the one used in most widespread programming languages, except that OPTIMADE limits the allowed letter set to lowercase letters only. This makes it possible to tell OPTIMADE identifiers and operator keywords apart unambiguously without consulting a reserved word table and to encode this distinction concisely in the EBNF Filter Language grammar.

    Examples of valid property names:

    • band_gap
    • cell_length_a
    • cell_volume

    Examples of incorrect property names:

    • 0_kvak (starts with a number);
    • "foo bar" (contains space; contains quotes)
    • BadLuck (contains upper-case letters)

    Identifiers that start with an underscore are specific to a database provider, and MUST be in the format of a database-provider-specific prefix (see section Database-Provider-Specific Namespace Prefixes).

    Examples:

    • _exmpl_formula_sum (a property specific to that database)
    • _exmpl_band_gap
    • _exmpl_supercell
    • _exmpl_trajectory
    • _exmpl_workflow_id
  • Nested property names: a nested property name is composed of at least two identifiers separated by periods (.).
  • String values MUST be surrounded by double quote characters (", ASCII symbol 34 dec, 0x22 hex). A double quote that is a part of the value, not a delimiter, MUST be escaped by prepending it with a backslash character (\, ASCII symbol 92 dec, 0x5C hex). A backslash character that is part of the value (i.e., not used to escape a double quote) MUST be escaped by prepending it with another backslash. An example of an escaped string value, including the enclosing double quotes, is given below:

    • "A double quote character (\", ASCII symbol 34 dec) MUST be prepended by a backslash (\\, ASCII symbol 92 dec) when it is a part of the value and not a delimiter; the backslash character \"\\\" itself MUST be preceded by another backslash, forming a double backslash: \\\\"

    (Note that at the end of the string value above the four final backslashes represent the two terminal backslashes in the value, and the final double quote is a terminator; it is not escaped.)

    String value tokens are also used to represent timestamps in form of the RFC 3339 Internet Date/Time Format.

  • Numeric values are represented as decimal integers or in scientific notation, using the usual programming language conventions. A regular expression giving the number syntax is given below as a POSIX Extended Regular Expression (ERE) or as a Perl-Compatible Regular Expression (PCRE):
    • ERE: [-+]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][-+]?[0-9]+)?
    • PCRE: [-+]?(?:\d+(\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?

An implementation of the search filter MAY reject numbers that are outside the machine representation of the underlying hardware; in such a case it MUST return the error 501 Not Implemented with an appropriate error message that indicates the cause of the error and an acceptable number range.

  • Examples of valid numbers:
    • 12345, +12, -34, 1.2, .2E7, -.2E+7, +10.01E-10, 6.03e23, .1E1, -.1e1, 1.e-12, -.1e-12, 1000000000.E1000000000, 1., .1
  • Examples of invalid numbers (although they MAY contain correct numbers as substrings):
    • 1.234D12, .e1, -.E1, +.E2, 1.23E+++, +-123
  • Note: this number representation is more general than the number representation in JSON (for instance, 1. is a valid numeric value for the filtering language specified here, but is not a valid float number in JSON, where the correct format is 1.0 instead).

While the filtering language supports tests for equality between properties of floating point type and decimal numbers given in the filter string, such comparisons come with the usual caveats for testing for equality of floating point numbers. Normally, a client cannot rely on a floating point number stored in a database having a representation that exactly matches the one obtained for a number given in the filtering string as a decimal number or as an integer. However, testing for equality to zero MUST be supported.

More examples of the number tokens and machine-readable definitions and tests can be found in the Materials-Consortia API Git repository (files integers.lst, not-numbers.lst, numbers.lst, and reals.lst).
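
The number syntax can be checked against the examples above with a short Python sketch. It uses the PCRE form of the expression, with the inner group made non-capturing and the ASCII flag set so that \d matches only 0-9; the helper name is_number_token is hypothetical. A whole-token match is required, because the invalid examples may contain valid numbers as substrings.

```python
import re

# PCRE form of the number syntax; re.ASCII restricts \d to the digits 0-9.
NUMBER = re.compile(r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?", re.ASCII)


def is_number_token(text):
    """Return True if the whole of text is a valid numeric token."""
    return NUMBER.fullmatch(text) is not None


valid = ["12345", "+12", "-34", "1.2", ".2E7", "-.2E+7", "+10.01E-10",
         "6.03e23", ".1E1", "-.1e1", "1.e-12", "-.1e-12",
         "1000000000.E1000000000", "1.", ".1"]
invalid = ["1.234D12", ".e1", "-.E1", "+.E2", "1.23E+++", "+-123"]

print(all(is_number_token(t) for t in valid))
print(any(is_number_token(t) for t in invalid))
```

Note that a token such as 1000000000.E1000000000 is syntactically valid even though it cannot be represented as a machine float; rejecting it is a separate range check.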

  • Boolean values are represented with the tokens TRUE and FALSE.
  • Operator tokens are represented by usual mathematical relation symbols or by case-sensitive keywords. Currently the following operators are supported: =, !=, <=, >=, <, > for tests of number, string (lexicographical) or timestamp (temporal) equality, inequality, less-than-or-equal, greater-than-or-equal, less-than, and greater-than relations; AND, OR, NOT for logical conjunctions, and a number of keyword operators discussed in the next section.

    In future extensions, operator tokens that are words MUST contain only upper-case letters. This requirement guarantees that no operator token will ever clash with a property name.
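
The escaping rules for string values given above can be sketched as a pair of helper functions. This is an illustrative Python sketch; the function names are hypothetical, and rejecting a backslash that is followed by anything other than a double quote or a backslash is this sketch's reading of the rules.

```python
def escape_string_value(value):
    """Turn a raw value into a filter string token, including the quotes."""
    # Backslashes are doubled first, so that the backslashes added in
    # front of double quotes are not doubled again.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def unescape_string_token(token):
    """Reverse escape_string_value; raise ValueError for malformed tokens."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("a string token must be enclosed in double quotes")
    body = token[1:-1]
    chars = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # Only \" and \\ are defined escape sequences.
            if i + 1 >= len(body) or body[i + 1] not in '"\\':
                raise ValueError("invalid escape sequence")
            chars.append(body[i + 1])
            i += 2
        elif ch == '"':
            raise ValueError("unescaped double quote inside string token")
        else:
            chars.append(ch)
            i += 1
    return "".join(chars)


raw = 'He said "hi" \\ bye'
token = escape_string_value(raw)
print(token)
print(unescape_string_token(token) == raw)
```

The resulting token is what a client would subsequently URL-encode as part of the filter parameter.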

The Filter Language Syntax

All filtering expressions MUST follow the EBNF grammar of appendix The Filter Language EBNF Grammar of this specification. The appendix contains a complete machine-readable EBNF, including the definition of the lexical tokens described above in section Lexical Tokens. The EBNF is enclosed in special strings constructed as BEGIN and END, both followed by EBNF GRAMMAR Filter, to enable automatic extraction.

Basic boolean operations

The filter language supports conjunctions of comparisons using the boolean algebra operators "AND", "OR", and "NOT" and parentheses to group conjunctions. A comparison clause prefixed by NOT matches entries for which the comparison is false.

Examples:

  • NOT ( chemical_formula_hill = "Al" AND chemical_formula_anonymous = "A" OR chemical_formula_anonymous = "H2O" AND NOT chemical_formula_hill = "Ti" )

Numeric and String comparisons

Comparisons involving Numeric and String properties can be expressed using the usual comparison operators: '<', '>', '<=', '>=', '=', '!='. Implementations MUST support comparisons in the forms:

identifier <operator> constant
constant <operator> identifier

Where identifier is a property name and constant is either a numerical or string type constant.

Implementations MAY also support comparisons with identifiers on both sides, and comparisons with numerical type constants on both sides, i.e., in the forms:

identifier <operator> identifier
constant <operator> constant

However, the latter form, constant <operator> constant where the constants are strings MUST return the error 501 Not Implemented.

Note: The motivation to exclude the form constant <operator> constant for strings is that filter language strings can refer to data of different data types (e.g., strings and timestamps), and thus this construct is not unambiguous. The OPTIMADE specification will strive to address this issue in a future version.

Examples:

  • nelements > 3
  • chemical_formula_hill = "H2O" AND chemical_formula_anonymous != "AB"
  • _exmpl_aax <= +.1e8 OR nelements >= 10 AND NOT ( _exmpl_x != "Some string" OR NOT _exmpl_a = 7)
  • _exmpl_spacegroup="P2"
  • _exmpl_cell_volume<100.0
  • _exmpl_band_gap > 5.0 AND _exmpl_molecular_weight < 350
  • _exmpl_melting_point<300 AND nelements=4 AND chemical_formula_descriptive="SiO2"
  • _exmpl_some_string_property = 42 (This is syntactically allowed without putting 42 in quotation marks, see the notes about comparisons of values of different types below.)
  • 5 < _exmpl_a
  • OPTIONAL: ((NOT (_exmpl_a>_exmpl_b)) AND _exmpl_x>0)
  • OPTIONAL: 5 < 7

Substring comparisons

In addition to the standard equality and inequality operators, matching of partial strings is provided by keyword operators:

  • identifier CONTAINS x: Is true if the substring value x is found anywhere within the property.
  • identifier STARTS WITH x: Is true if the property starts with the substring value x. The WITH keyword MAY be omitted.
  • identifier ENDS WITH x: Is true if the property ends with the substring value x. The WITH keyword MAY be omitted.

OPTIONAL features:

  • Support for x to be an identifier, rather than a string is OPTIONAL.

Examples:

  • chemical_formula_anonymous CONTAINS "C2" AND chemical_formula_anonymous STARTS WITH "A2"
  • chemical_formula_anonymous STARTS "A2" AND chemical_formula_anonymous ENDS WITH "D1"

Comparisons of boolean values

Straightforward comparisons ('=' and '!=') MUST be supported for boolean values. Other comparison operators ('<', '>', '<=', '>=') MUST NOT be supported. Boolean values are only supposed to be used in direct comparisons with properties, but not compound comparisons. For example, (nsites = 3 AND nelements = 3) = FALSE is not supported.

A boolean property (here called property) MAY be compared with TRUE by omitting the = TRUE altogether: property. Conversely, it MAY be compared with FALSE by negating the comparison with TRUE: NOT property.

Examples:

  • property = TRUE
  • property != FALSE
  • _exmpl_has_inversion_symmetry AND NOT _exmpl_is_primitive

Comparisons of list properties

In the following, list is a list-type property, and values is one or more values separated by commas (","), i.e., strings or numbers. An implementation MAY also support property names and nested property names in values.

The following constructs MUST be supported:

  • list HAS value: matches if at least one element in list is equal to value. (If list has no duplicate elements, this implements the set operator IN.)
  • list HAS ALL values: matches if, for each value, there is at least one element in list equal to that value. (If both list and values do not contain duplicate values, this implements the set operator >=.)
  • list HAS ANY values: matches if at least one element in list is equal to at least one value. (This is equivalent to a number of HAS statements separated by OR.)
  • list LENGTH value: matches if the number of items in the list property is equal to value.

The HAS ONLY construct MAY be supported:

  • OPTIONAL: list HAS ONLY values: matches if all elements in list are equal to at least one value. (If both list and values do not contain duplicate values, this implements the <= set operator.)

This construct is OPTIONAL as it can be difficult to realize in some underlying database implementations. However, if the desired search is over a property that can only take on a finite set of values (e.g., chemical elements) a client can formulate an equivalent search by inverting the list of values into inverse and express the filter as NOT list HAS ANY inverse.
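
The semantics of the list operators can be summarized with plain Python predicates. This is a sketch for illustration only: the function names are hypothetical, equality is the only comparison considered, and the four-element "universe" is a toy stand-in for the finite set of allowed values. The last part checks the inversion described above, reading the negated filter as a test against every value in inverse.

```python
def has(lst, value):
    return value in lst


def has_all(lst, values):
    return all(v in lst for v in values)


def has_any(lst, values):
    return any(v in lst for v in values)


def has_only(lst, values):
    # Every element of the list must equal at least one of the values.
    return all(element in values for element in lst)


def length(lst, n):
    return len(lst) == n


# Toy universe of allowed values (hypothetical; real data would use all elements).
universe = {"H", "He", "Li", "Be"}
allowed = ["H", "He"]
inverse = sorted(universe - set(allowed))

candidates = [["H"], ["H", "He"], ["H", "Li"], ["Be"], ["He", "Li", "Be"]]
equivalent = all(
    has_only(elements, allowed) == (not has_any(elements, inverse))
    for elements in candidates
)
print(equivalent)
```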

Furthermore, there is a set of OPTIONAL constructs that allows filters to be formulated over the values in correlated positions in multiple list properties. An implementation MAY support this syntax selectively only for specific properties. This type of filter is useful for, e.g., filtering on elements and correlated element counts available as two separate list properties.

  • list1:list2:... HAS val1:val2:...
  • list1:list2:... HAS ALL val1:val2:...
  • list1:list2:... HAS ANY val1:val2:...
  • list1:list2:... HAS ONLY val1:val2:...

Finally, all the above constructs that allow a value or lists of values on the right-hand side MAY allow <operator> value in each place a value can appear. In that case, a match requires that the <operator> comparison is fulfilled instead of equality. Strictly, the definitions of the HAS, HAS ALL, HAS ANY, HAS ONLY and LENGTH operators as written above apply, but with the word 'equal' replaced with the <operator> comparison.

For example:

  • OPTIONAL: list HAS < 3: matches all entries for which list contains at least one element that is less than three.
  • OPTIONAL: list HAS ALL < 3, > 3: matches only those entries for which list simultaneously contains at least one element less than three and one element greater than three.

An implementation MAY support combining the operator syntax with the syntax for correlated lists, in particular on a list correlated with itself. For example:

  • OPTIONAL: list:list HAS >=2:<=5: matches all entries for which list contains at least one element that is between the values 2 and 5.

Further examples of various comparisons of list properties:

  • OPTIONAL: elements HAS "H" AND elements HAS ALL "H","He","Ga","Ta" AND elements HAS ONLY "H","He","Ga","Ta" AND elements HAS ANY "H", "He", "Ga", "Ta"
  • OPTIONAL: elements HAS ONLY "H","He","Ga","Ta"
  • OPTIONAL: elements:_exmpl_element_counts HAS "H":6 AND elements:_exmpl_element_counts HAS ALL "H":6,"He":7 AND elements:_exmpl_element_counts HAS ONLY "H":6 AND elements:_exmpl_element_counts HAS ANY "H":6,"He":7 AND elements:_exmpl_element_counts HAS ONLY "H":6,"He":7
  • OPTIONAL: _exmpl_element_counts HAS < 3 AND _exmpl_element_counts HAS ANY > 3, = 6, 4, != 8 (note: specifying the = operator after HAS ANY is redundant here, if no operator is given, the test is for equality.)
  • OPTIONAL: elements:_exmpl_element_counts:_exmpl_element_weights HAS ANY > 3:"He":>55.3 , = 6:>"Ti":<37.6 , 8:<"Ga":0
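
The correlated-list constructs can be pictured as matching on tuples taken from the same position in each list. The following Python sketch covers only the equality case (no <operator> prefixes); the function names and sample data are hypothetical.

```python
def correlated_has(lists, values):
    """list1:list2:... HAS val1:val2:... (equality only)."""
    # zip pairs up the elements found at the same position in each list.
    return any(row == tuple(values) for row in zip(*lists))


def correlated_has_all(lists, value_tuples):
    return all(correlated_has(lists, vt) for vt in value_tuples)


def correlated_has_any(lists, value_tuples):
    return any(correlated_has(lists, vt) for vt in value_tuples)


def correlated_has_only(lists, value_tuples):
    allowed = {tuple(vt) for vt in value_tuples}
    return all(row in allowed for row in zip(*lists))


elements = ["H", "He"]
element_counts = [6, 7]
lists = [elements, element_counts]

print(correlated_has(lists, ("H", 6)))                     # "H":6
print(correlated_has(lists, ("H", 7)))                     # "H":7
print(correlated_has_all(lists, [("H", 6), ("He", 7)]))    # "H":6,"He":7
print(correlated_has_only(lists, [("H", 6)]))              # "H":6 only
```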

Nested property names

Everywhere in a filter string where a property name is accepted, the API implementation MAY accept nested property names as described in section Lexical Tokens, consisting of identifiers separated by periods ('.'). A filter on a nested property name consisting of two identifiers identifier1.identifier2 matches if either of the following is true:

  • identifier1 references a dictionary-type property that contains as an identifier identifier2 and the filter matches for the content of identifier2.
  • identifier1 references a list of dictionaries that contain as an identifier identifier2 and the filter matches for a flat list containing only the contents of identifier2 for every dictionary in the list. E.g., if identifier1 is the list [{"identifier2":42, "identifier3":36}, {"identifier2":96, "identifier3":66}], then identifier1.identifier2 is understood in the filter as the list [42, 96].

The API implementation MAY allow this notation to generalize to arbitrary depth. A nested property name that combines more than one list MUST, if accepted, be interpreted as a completely flattened list.
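
One way to picture the resolution of nested property names, including the flattening of lists, is the recursive Python sketch below. The function name resolve_nested is hypothetical, the entry is assumed to be available as decoded JSON, and missing keys are not handled.

```python
def resolve_nested(value, path):
    """Resolve a list of identifiers against nested dictionaries and lists."""
    if not path:
        return value
    if isinstance(value, dict):
        return resolve_nested(value[path[0]], path[1:])
    if isinstance(value, list):
        # Apply the remaining path to every item and flatten the results.
        flat = []
        for item in value:
            resolved = resolve_nested(item, path)
            if isinstance(resolved, list):
                flat.extend(resolved)
            else:
                flat.append(resolved)
        return flat
    raise TypeError(f"cannot resolve {path[0]!r} inside a non-container value")


entry = {
    "identifier1": [
        {"identifier2": 42, "identifier3": 36},
        {"identifier2": 96, "identifier3": 66},
    ]
}
print(resolve_nested(entry, "identifier1.identifier2".split(".")))

# More than one list along the path yields a completely flattened list.
deep = {"a": [{"b": [{"c": 1}, {"c": 2}]}, {"b": [{"c": 3}]}]}
print(resolve_nested(deep, ["a", "b", "c"]))
```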

Filtering on relationships

As described in the section Relationships, it is possible for the API implementation to describe relationships between entries of the same, or different, entry types. The API implementation MAY support queries on relationships with an entry type <entry type> by using special nested property names:

  • <entry type>.id references a list of IDs of relationships with entries of the type <entry type>.
  • <entry type>.description references a correlated list of the human-readable descriptions of these relationships.

Hence, the filter language acts as if, for every entry type, there is a property with that name which contains a list of dictionaries with two keys, id and description. For example: a client queries the structures endpoint with a filter that references calculations.id. For a specific structures entry, the nested property behaves as the list ["calc-id-43", "calc-id-96"] and would then, e.g., match the filter calculations.id HAS "calc-id-96". This means that the structures entry has a relationship with the calculations entry of that ID.

Note: formulating queries on relationships with entries that have specific property values is a multi-step process. For example, finding all structures with bibliographic references where one of the authors has the last name "Schmit" is performed in the following two steps:

  • Query the references endpoint with a filter authors.lastname HAS "Schmit" and store the id values of the returned entries.
  • Query the structures endpoint with a filter references.id HAS ANY <list-of-IDs>, where <list-of-IDs> are the IDs retrieved from the first query separated by commas.

(Note: the type of query discussed here corresponds to a "join"-type operation in a relational data model.)
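
The two steps can be sketched as the construction of two filter strings. This Python sketch performs no HTTP requests; the helper names and the IDs "ref-1" and "ref-7" are hypothetical placeholders for the result of the first query.

```python
def quote(value):
    """Escape a value as a filter string token."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def references_filter(last_name):
    # Step 1: filter for the references endpoint.
    return f"authors.lastname HAS {quote(last_name)}"


def structures_filter(reference_ids):
    # Step 2: filter for the structures endpoint, built from the stored IDs.
    if not reference_ids:
        raise ValueError("no reference IDs to filter on")
    return "references.id HAS ANY " + ",".join(quote(i) for i in reference_ids)


step1 = references_filter("Schmit")
# Placeholder for the id values returned by the first query.
ids_from_step1 = ["ref-1", "ref-7"]
step2 = structures_filter(ids_from_step1)

print(step1)
print(step2)
```

For long ID lists, a client may need to split the second query into several requests to stay within URL length limits.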

Filtering on Properties with an unknown value

Properties can have an unknown value, see section Properties with an unknown value.

Filters that match when the property is known, or unknown, respectively can be constructed using the following syntax:

identifier IS KNOWN
identifier IS UNKNOWN

Except for the above constructs, filters that use any form of comparison that involve properties of unknown values MUST NOT match. Hence, by definition, an identifier of value null never matches equality (=), inequality (<, <=, >, >=, !=) or other comparison operators besides identifier IS UNKNOWN and NOT identifier IS KNOWN. In particular, a filter that compares two properties that are both null for equality or inequality does not match.

Examples:

  • chemical_formula_hill IS KNOWN AND NOT chemical_formula_anonymous IS UNKNOWN
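
The handling of unknown values can be sketched as follows, with Python's None standing in for null. This is an illustration under that assumption; the names compare, is_known and is_unknown are hypothetical.

```python
import operator

OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def compare(left, op, right):
    """Evaluate a comparison; any unknown (None) operand means no match."""
    if left is None or right is None:
        return False
    return OPERATORS[op](left, right)


def is_known(value):
    return value is not None


def is_unknown(value):
    return value is None


print(compare(None, "=", None))   # two unknown properties do not match
print(compare(None, "!=", 3))     # neither does inequality
print(compare(2, "<", 3))
print(is_unknown(None), is_known(None))
```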

Precedence

The precedence (priority) of the operators MUST be as indicated in the list below:

  1. Comparison and keyword operators (<, <=, =, HAS, STARTS, etc.) -- highest priority;
  2. NOT
  3. AND
  4. OR -- lowest priority.

Examples:

  • NOT a > b OR c = 100 AND f = "C2 H6": this is interpreted as (NOT (a > b)) OR ( (c = 100) AND (f = "C2 H6") ) when fully braced.
  • a >= 0 AND NOT b < c OR c = 0: this is interpreted as ((a >= 0) AND (NOT (b < c))) OR (c = 0) when fully braced.
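
Python's comparison operators and its not, and, and or keywords happen to follow the same relative precedence, so the two examples can be checked by brute force. This sketch only illustrates the grouping; it is not a filter parser, and the sampled values are arbitrary.

```python
from itertools import product


def first_example(a, b, c, f):
    implicit = not a > b or c == 100 and f == "C2 H6"
    explicit = (not (a > b)) or ((c == 100) and (f == "C2 H6"))
    return implicit, explicit


def second_example(a, b, c):
    implicit = a >= 0 and not b < c or c == 0
    explicit = ((a >= 0) and (not (b < c))) or (c == 0)
    return implicit, explicit


first_agrees = all(
    implicit == explicit
    for implicit, explicit in (
        first_example(a, b, c, f)
        for a, b, c, f in product([0, 1], [0, 1], [0, 100], ["C2 H6", "X"])
    )
)
second_agrees = all(
    implicit == explicit
    for implicit, explicit in (
        second_example(a, b, c)
        for a, b, c in product([-1, 0, 1], repeat=3)
    )
)
print(first_agrees, second_agrees)
```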

Type handling and conversions in comparisons

The definitions of specific properties in this standard define their types. Similarly, for database-provider-specific properties, the database provider decides their types. In the syntactic constructs that can accommodate values of more than one type, types of all participating values are REQUIRED to match, with a single exception of timestamps (see below). Comparisons between values of different types MUST be reported as 501 Not Implemented errors, meaning that type conversion is not implemented in the specification.

As the filter language syntax does not define a lexical token for timestamps, values of this type are expressed using string tokens in RFC 3339 Internet Date/Time Format. In a comparison with a timestamp property, a string token represents a timestamp value that would result from parsing the string according to RFC 3339 Internet Date/Time Format. Interpretation failures MUST be reported with error 400 Bad Request.
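
A server-side interpretation of a string token as a timestamp might look like the Python sketch below. It relies on datetime.fromisoformat, which is not a strict RFC 3339 parser (it accepts some strings that RFC 3339 does not), so it is an approximation; the function name is hypothetical, and the ValueError stands for the condition that would be reported as 400 Bad Request.

```python
from datetime import datetime


def parse_timestamp_token(text):
    """Interpret a string token as a timestamp; raise ValueError on failure."""
    # Older Python versions do not accept a trailing "Z", so normalize it.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    timestamp = datetime.fromisoformat(text)
    if timestamp.tzinfo is None:
        # RFC 3339 date-times always carry a UTC offset.
        raise ValueError("timestamp lacks a UTC offset")
    return timestamp


earlier = parse_timestamp_token("2019-06-08T05:13:37Z")
later = parse_timestamp_token("2019-06-08T07:13:37+01:00")
print(earlier < later)

try:
    parse_timestamp_token("yesterday")
except ValueError:
    print("would be reported as 400 Bad Request")
```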

Optional filter features

Some features of the filtering language are marked OPTIONAL. An implementation that encounters an OPTIONAL feature that it does not support MUST respond with error 501 Not Implemented with an explanation of which OPTIONAL construct the error refers to.

Property Definitions

An OPTIMADE Property Definition defines a specific property, which will be referred to as the defined property throughout this section. The definition uses a dictionary-based construct that, when represented in the JSON output format, is compatible with the JSON Schema standard (for more information, see Property Definition keys from JSON Schema). The format of Property Definitions defined below allows nesting inner Property Definitions to define properties that are composed of values organized in lists and dictionaries to arbitrary depth.

To make a property definition expressible in any output format, the fields of the property definition below are specified using OPTIMADE data types. When a property definition is communicated using a specific data format (e.g., JSON), the property definition is implemented in that data format by mapping the OPTIMADE data types into the corresponding data types for that output format.

A Property Definition MUST be composed according to the combination of the requirements in the subsection Property Definition keys from JSON Schema below and the following additional requirements:

REQUIRED keys for the outermost level of the Property Definition:

  • title: String and description: String. See the subsection Property definition keys from JSON Schema for the definitions of these fields. They are defined in that subsection as OPTIONAL on any level of the Property Definition, but are REQUIRED on the outermost level.
  • x-optimade-property: Dictionary. Additional information to define the property that is not covered by fields in the JSON Schema standard.

    REQUIRED keys:

    • property-format: String. Specifies the minor version of the property definition format used. The string MUST be of the format "MAJOR.MINOR", referring to the version of the OPTIMADE standard that describes the format in which this property definition is expressed. The version number string MUST NOT be prefixed by, e.g., "v". In implementations of the present version of the standard, the value MUST be exactly 1.2. A client MUST disregard the property definition if the field is not a string of the format MAJOR.MINOR or if the MAJOR version number is unrecognized. This field allows future versions of this standard to support implementations keeping definitions that adhere to older versions of the property definition format.
    • property-uri: String. A static URI identifier that is a URN or URL representing the specific version of the property. It SHOULD NOT be changed as long as the property definition remains the same, and SHOULD be changed when the property definition changes. (If it is a URL, clients SHOULD NOT assign any interpretation to the response when resolving that URL.)

    OPTIONAL keys:

    • version: String. This string indicates the version of the property definition. The string SHOULD be in the format described by the semantic versioning v2 standard.
    • unit-definitions: List. A list of definitions of the symbols used in the Property Definition (including its nested levels) for physical units given as values of the x-optimade-unit field. This field MUST be included if the defined property, at any level, includes an x-optimade-unit with a value that is not dimensionless or inapplicable. See subsection Physical Units in Property Definitions for the details on how units are represented in OPTIMADE Property Definitions and the precise format of this list.
    • resource-uris: List. A list of dictionaries that references remote resources that describe the property. The format of each dictionary is:

      REQUIRED keys:

      • relation: String. A human-readable description of the relationship between the property and the remote resource, e.g., a "natural language description".
      • uri: String. A URI of the external resource (which MAY be a resolvable URL).

REQUIRED keys for all levels of the Property Definition:

  • x-optimade-unit: String. A (compound) symbol for the physical unit in which the value of the defined property is given or one of the strings dimensionless or inapplicable. See subsection Physical Units in Property Definitions for the details on how compound units are represented in OPTIMADE Property Definitions and the precise format of this string.

OPTIONAL keys at all nested levels of the Property Definition:

  • x-optimade-implementation: Dictionary. A dictionary describing the level of OPTIMADE API functionality provided by the present implementation. If an implementation omits this field in its response, a client interacting with that implementation SHOULD NOT make any assumptions about the availability of these features. The dictionary has the following format:

    OPTIONAL keys:

    • sortable: Boolean. If TRUE, specifies that results can be sorted on this property (see Entry Listing URL Query Parameters for more information on this field). If FALSE, specifies that results cannot be sorted on this property. Omitting the field is equivalent to FALSE.
    • query-support: String. Defines a required level of support in formulating queries on this field. The string MUST be one of the following:
      • all mandatory: the defined property MUST be queryable using the OPTIMADE filter language with support for all mandatory filter features.
      • equality only: the defined property MUST be queryable using the OPTIMADE filter language equality and inequality operators. Other filter language features do not need to be available.
      • partial: the defined property MUST be queryable with support for a subset of the filter language operators as specified by the field query-support-operators.
      • none: the defined property does not need to be queryable with any features of the filter language.
    • query-support-operators: List of Strings. Defines the filter language features supported on this property. Each string in the list MUST be one of <, <=, >, >=, =, !=, CONTAINS, STARTS WITH, ENDS WITH, HAS, HAS ALL, HAS ANY, HAS ONLY, IS KNOWN, IS UNKNOWN with the following meanings:
      • <, <=, >, >=, =, !=: indicating support for filtering this property using the respective operator. If the property is of Boolean type, support for = also designates support for boolean comparisons with the property being true that omit "= TRUE", e.g., specifying that filtering for "is_yellow = TRUE" is supported also implies support for "is_yellow" (which means the same thing). Support for "NOT is_yellow" also follows.
      • CONTAINS, STARTS WITH, ENDS WITH: indicating support for substring filtering of this property using the respective operator. MUST NOT appear if the property is not of type String.
      • HAS, HAS ALL, HAS ANY: indicating support of the MANDATORY features for list property comparison using the respective operator. MUST NOT appear if the property is not of type List.
      • HAS ONLY: indicating support for list property comparison with all or a subset of the OPTIONAL constructs using this operator. MUST NOT appear if the property is not of type List.
      • IS KNOWN, IS UNKNOWN: indicating support for filtering this property on unknown values using the respective operator.
  • x-optimade-requirements: Dictionary. A dictionary describing the level of OPTIMADE API functionality required by all implementations of this property. Omitting this field means the corresponding functionality is OPTIONAL. The dictionary has the same format as x-optimade-implementation, except that it also allows the following OPTIONAL field:
    • support: String. Describes the minimal required level of support for the Property by an implementation. This field SHOULD only appear in an x-optimade-requirements that is at the outermost level of a Property Definition, as the meaning of its inclusion on other levels is not defined. The string MUST be one of the following:

      • must: the defined property MUST be recognized by the implementation (e.g., in filter strings) and MUST NOT be null.
      • should: the defined property MUST be recognized by the implementation (e.g., in filter strings) and SHOULD NOT be null.
      • may: it is OPTIONAL for the implementation to recognize the defined property and it MAY be equal to null.

      Omitting the field is equivalent to may.

      Note: the specification by this field of whether the defined property can be null or not MUST match the value of the type field. If null values are allowed, that field must be a list where the string "null" is the second element.
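As an illustration of the requirements above, the following Python sketch builds a minimal Property Definition and checks the keys that are REQUIRED on the outermost level. The property name, texts, and URI are hypothetical, and the helper functions are not part of this specification. Treating a missing property-format as "not a string of the format MAJOR.MINOR" is an assumption of the sketch.

```python
import re

# Hypothetical definition of a dimensionless integer property (illustration only).
EXAMPLE_DEFINITION = {
    "title": "Example site count",
    "description": "Number of sites in the entry.\n\nA hypothetical property used for illustration.",
    "type": ["integer", "null"],
    "x-optimade-unit": "dimensionless",
    "x-optimade-property": {
        "property-format": "1.2",
        "property-uri": "https://example.org/properties/_exmpl_site_count/v1.0.0",
        "version": "1.0.0",
    },
    "x-optimade-requirements": {"support": "may", "sortable": False},
}

# MAJOR.MINOR without a "v" prefix.
FORMAT_RE = re.compile(r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)")


def outermost_problems(definition):
    """List keys that are REQUIRED on the outermost level but are missing."""
    problems = []
    for key in ("title", "description", "type", "x-optimade-unit", "x-optimade-property"):
        if key not in definition:
            problems.append(key)
    meta = definition.get("x-optimade-property", {})
    for key in ("property-format", "property-uri"):
        if key not in meta:
            problems.append("x-optimade-property/" + key)
    return problems


def client_should_disregard(definition, known_majors=("1",)):
    """True if a client has to disregard the definition based on property-format."""
    value = definition.get("x-optimade-property", {}).get("property-format")
    if not isinstance(value, str):
        return True
    match = FORMAT_RE.fullmatch(value)
    return match is None or match.group(1) not in known_majors
```

A client could call client_should_disregard before interpreting any other part of a definition, since the remaining keys are only meaningful for a recognized format version.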

Property Definition keys from JSON Schema

In addition to the requirements on the format of a Property Definition in the previous section, it MUST also adhere to the OPTIONAL and REQUIRED keys described in this subsection. The format described in this subsection forms a subset of the JSON Schema Validation Draft 2020-12 and JSON Schema Core Draft 2020-12 standards.

REQUIRED keys

  • type: String or List. The string or list specifies the type of the defined property. It MUST be one of:

    • One of the strings "boolean", "object" (refers to an OPTIMADE dictionary), "array" (refers to an OPTIMADE list), "number" (refers to an OPTIMADE float), "string", or "integer".
    • A list where the first item MUST be one of the strings above, and the second item MUST be the string "null".

    For OPTIMADE data types not covered above:

    • timestamps are represented by setting the type field to "string" and the format field to "date-time". In this case it is MANDATORY to include the field format.

    Output formats that represent these OPTIMADE data types in other ways have to recognize them and reinterpret the definition accordingly.
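The rules for the type field can be sketched in Python as follows. The function names are hypothetical. The mapping to OPTIMADE data type names follows the parenthetical remarks above, and timestamps are recognized by the combination of "string" and the format "date-time".

```python
BASE_TYPES = ("boolean", "object", "array", "number", "string", "integer")

# JSON Schema type name -> OPTIMADE data type name.
OPTIMADE_NAMES = {
    "boolean": "boolean",
    "object": "dictionary",
    "array": "list",
    "number": "float",
    "string": "string",
    "integer": "integer",
}


def split_type(definition):
    """Return (base_type, allows_null); raise ValueError for an invalid type field."""
    value = definition.get("type")
    if isinstance(value, str) and value in BASE_TYPES:
        return value, False
    if (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and value[0] in BASE_TYPES
        and value[1] == "null"
    ):
        return value[0], True
    raise ValueError(f"invalid type field: {value!r}")


def optimade_type(definition):
    """Name the OPTIMADE data type described by a Property Definition."""
    base, _ = split_type(definition)
    if base == "string" and definition.get("format") == "date-time":
        return "timestamp"
    return OPTIMADE_NAMES[base]
```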

OPTIONAL keys

  • title: String. A short single-line human-readable explanation of the defined property appropriate to show as part of a user interface.
  • description: String. A human-readable multi-line description that explains the purpose, requirements, and conventions of the defined property. The format SHOULD be a one-line description, followed by a new paragraph (two newlines), followed by a more detailed description of all the requirements and conventions of the defined property. Formatting in the text SHOULD use Markdown in the CommonMark v0.3 format.
  • deprecated: Boolean. If TRUE, implementations SHOULD NOT use the defined property, and it MAY be removed in the future. If FALSE, the defined property is not deprecated. The field not being present means FALSE.
  • enum: List. The defined property MUST take one of the values given in the provided list. The items in the list MUST all be of a data type that matches the type field and otherwise adhere to the rest of the Property Definition. If this key is given, for null to be a valid value of the defined property, the list MUST contain a null value and the type MUST be a list where the second value is the string "null".
  • examples: List. A list of example values that the defined property can have. These examples MUST all be of a data type that matches the type field and otherwise adhere to the rest of the Property Definition.

Depending on what string the type is equal to, or contains as first element, the following additional requirements also apply:

  • "object":

    REQUIRED

    • properties: Dictionary. Gives key-value pairs where each value is an inner Property Definition. The defined property is a dictionary that can only contain keys present in this dictionary, and, if so, the corresponding value is described by the respective inner Property Definition. (Or, if the type field is the list "object" and "null", it can also be null.)

    OPTIONAL

    • required: List. The list MUST only contain strings. The defined property MUST have keys that match all the strings in this list. Other keys present in the properties field are OPTIONAL in the defined property. If not present or empty, all keys in properties are regarded as OPTIONAL.
    • maxProperties: Integer. The defined property is a dictionary where the number of keys MUST be less than or equal to the number given.
    • minProperties: Integer. The defined property is a dictionary where the number of keys MUST be greater than or equal to the number given.
    • dependentRequired: Dictionary. The dictionary keys are strings and the values are lists of unique strings. If the defined property has a key that is equal to a key in the given dictionary, the defined property MUST also have keys that match each of the corresponding values. No restriction is inferred from this field for keys in the defined property that do not match any key in the given dictionary.
  • "array":

    REQUIRED

    • items: Dictionary. Specifies an inner Property Definition. The defined property is a list where each item MUST match this inner Property Definition.

    OPTIONAL

    • maxItems: Integer. A non-negative integer. The defined property is an array that MUST contain a number of items that is less than or equal to the given integer.
    • minItems: Integer. A non-negative integer. The defined property is an array that MUST contain a number of items that is greater than or equal to the given integer.
    • uniqueItems: Boolean. If TRUE, the defined property is an array that MUST only contain unique items. If FALSE, this field sets no limitation on the defined property.
  • "integer":

    OPTIONAL

    • multipleOf: Integer. An integer strictly greater than 0. The defined property MUST have an integer value that when divided by the given integer results in an integer (i.e., it must be evenly divisible by this integer without a fractional part).
    • maximum: Integer. The defined property is an integer that MUST be less than or equal to this number.
    • exclusiveMaximum: Integer. The defined property is an integer that MUST be strictly less than this number; it cannot be equal to the number.
    • minimum: Integer. The defined property is an integer that MUST be greater than or equal to this number.
    • exclusiveMinimum: Integer. The defined property is an integer that MUST be strictly greater than this number; it cannot be equal to the number.
  • "number":

    OPTIONAL

    • multipleOf: Float. A number strictly greater than 0. The defined property MUST have a value that when divided by the given number results in an integer (i.e., it must be evenly divisible by this number without a fractional part).
    • maximum: Float. The defined property is a float that MUST be less than or equal to this number.
    • exclusiveMaximum: Float. The defined property is a float that MUST be strictly less than this number; it cannot be equal to the number.
    • minimum: Float. The defined property is a float that MUST be greater than or equal to this number.
    • exclusiveMinimum: Float. The defined property is a float that MUST be strictly greater than this number; it cannot be equal to the number.
  • "string":

    OPTIONAL

    • maxLength: Integer. A non-negative integer. The defined property is a string that MUST have a length that is less than or equal to the given integer. (The length of the string is the number of individual Unicode characters it is composed of.)
    • minLength: Integer. A non-negative integer. The defined property is a string that MUST have a length that is greater than or equal to the given integer. (The definition of the length of a string is the same as in the field maxLength.)
    • format: String. Choose one of the following values to indicate that the defined property is a string that MUST adhere to the specified format:
      • "date-time": the date-time production in RFC 3339 section 5.6.
      • "date": the full-date production in RFC 3339 section 5.6.
      • "time": the full-time production in RFC 3339 section 5.6.
      • "duration": the duration production in RFC 3339 Appendix A.
      • "email": the "Mailbox" ABNF rule in RFC 5321 section 4.1.2.
      • "uri": a string instance is valid against this attribute if it is a valid URI, according to RFC 3986.
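The scalar constraints above can be checked along the following lines. This Python sketch is hypothetical and covers only the string length keys and the integer range keys. Floating point multipleOf checks are left out, since they need a tolerance that this specification does not define. Python's len() counts Unicode code points, which is how the sketch interprets "individual Unicode characters".

```python
def violations(value, definition):
    """Return the names of the constraint keys that the value violates."""
    found = []
    if isinstance(value, str):
        if "maxLength" in definition and len(value) > definition["maxLength"]:
            found.append("maxLength")
        if "minLength" in definition and len(value) < definition["minLength"]:
            found.append("minLength")
    elif isinstance(value, int) and not isinstance(value, bool):
        if "multipleOf" in definition and value % definition["multipleOf"] != 0:
            found.append("multipleOf")
        if "maximum" in definition and value > definition["maximum"]:
            found.append("maximum")
        if "exclusiveMaximum" in definition and value >= definition["exclusiveMaximum"]:
            found.append("exclusiveMaximum")
        if "minimum" in definition and value < definition["minimum"]:
            found.append("minimum")
        if "exclusiveMinimum" in definition and value <= definition["exclusiveMinimum"]:
            found.append("exclusiveMinimum")
    return found
```

For example, violations(10, {"type": "integer", "exclusiveMaximum": 10}) reports "exclusiveMaximum", whereas the same value passes a definition that uses "maximum": 10.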

Physical Units in Property Definitions

In OPTIMADE, there is no facility to allow a property to be represented in a choice of units, e.g., either ångström (Å) or meter (m). The unit is always permanently fixed by the Property Definition. Clients and servers that use other units internally thus have to do unit conversions as part of preparing and processing OPTIMADE responses.

The physical unit of a property, the embedded items of a list, or values of a dictionary, are defined with the field x-optimade-unit with the following requirements:

  • The field MUST be given with a non-null value both at the highest level in the OPTIMADE Property Definition and all inner Property Definitions.
  • If the property refers to a physical quantity that is dimensionless (often also referred to as having the dimension 1) or refers to a dimensionless count of something (e.g., the number of protons in a nucleus) the field MUST have the value dimensionless.
  • If the property refers to an entity for which the assignment of a unit would not make sense, e.g., a string representing a chemical formula or a serial number the field MUST have the value inapplicable.

A standard set of unit symbols for OPTIMADE is taken from version 3.15 of the (separately versioned) unit database definitions.units included with the source distribution of GNU Units version 2.22. If the unit is available in this database, or if it can be expressed as a compound unit expression using these units, the value of x-optimade-unit SHOULD use the corresponding (compound) string symbol, and a corresponding definition referring to the same symbol SHOULD be given in the field standard.

A compound unit expression based on the GNU Units symbols is created by a sequence of unit symbols separated by a single multiplication * symbol. Each unit symbol can also be suffixed by a single ^ symbol followed by a positive or negative integer to indicate the power of the preceding unit, e.g., m^3 for cubic meter, m^-3 for inverse cubic meter. (Positive integers MUST NOT be preceded by a plus sign.) The unit symbols MAY be prefixed by one (but not more than one) of the prefixes defined in the definitions.units file. A prefix is indicated in the file by a trailing -, but that trailing character MUST NOT be included when using it as a prefix. If there are multiple prefixes in the file with the same meaning, an implementation SHOULD use the shortest one consisting of only lowercase letters a-z and underscores, but no other symbols. If there are multiple ones with the same shortest length, then the first one of those SHOULD be used. For example, "km" is used for kilometers. Furthermore:

  • No whitespace, parentheses, or other symbols than specified above are permitted.
  • If multiple string representations of the same unit exist in definitions.units, the first one in that file consisting of only lowercase letters a-z and underscores, but no other symbols, SHOULD be used.
  • The unit symbols MUST appear in alphabetical order.
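The syntax rules for compound unit expressions can be sketched as a small parser. This Python example is hypothetical and purely syntactic: it does not consult definitions.units, it assumes that unit symbols consist of letters and underscores only, and it assumes that alphabetical order is judged on the symbols as written (including any prefix), which this specification does not spell out.

```python
import re

# A unit symbol, optionally followed by ^ and a non-zero integer without a plus sign.
TERM_RE = re.compile(r"([A-Za-z_]+)(?:\^(-?[1-9][0-9]*))?")

SPECIAL_VALUES = ("dimensionless", "inapplicable")


def parse_unit(expression):
    """Split a compound unit expression into (symbol, power) pairs."""
    if expression in SPECIAL_VALUES:
        return []
    terms = []
    for term in expression.split("*"):
        match = TERM_RE.fullmatch(term)
        if match is None:
            raise ValueError(f"malformed unit term: {term!r}")
        terms.append((match.group(1), int(match.group(2) or 1)))
    symbols = [symbol for symbol, _ in terms]
    if symbols != sorted(symbols):
        raise ValueError("unit symbols are not in alphabetical order")
    return terms


def is_valid_unit(expression):
    """True if the expression passes the syntactic checks in parse_unit."""
    try:
        parse_unit(expression)
    except ValueError:
        return False
    return True
```

With these assumptions, "kg*m^2*s^-2" is accepted, while "m*kg", "m^+3", and "m / s" are rejected.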

The string in x-optimade-unit MUST be defined in the unit-definitions field inside the x-optimade-property field in the outermost level of the Property Definition.

If provided, the unit-definitions in x-optimade-property MUST be a list of dictionaries, all adhering to the following format:

REQUIRED keys:

  • symbol: String. Specifies the symbol to be used in x-optimade-unit to reference this unit.
  • title: String. A human-readable single-line string name for the unit.
  • description: String. A human-readable multiple-line detailed description of the unit.
  • standard: Dictionary. This field is used to define the unit symbol using a preexisting standard. The dictionary has the following format:

    REQUIRED keys:

    • name: String. The abbreviated name of the standard being referenced. One of the following:
      • "gnu units": the symbol is a (compound) unit expression based on the symbols in the file definitions.units distributed with GNU Units software, created according to the scheme described above.
      • "ucum": the symbol comes from The Unified Code for Units of Measure (UCUM) standard.
      • "qudt": the symbol comes from the QUDT standard. Not only symbols strictly defined within the standard are allowed, but also other compound unit expressions created according to the scheme for how new such symbols are formed in this standard.
    • version: String. The version string of the referenced standard.
    • symbol: String. The symbol to use from the referenced standard, expressed according to that standard. This field MAY be different from symbol directly under unit-definitions, meaning that the unit is referenced in x-optimade-unit fields using a different symbol than the one used in the standard. However, the symbol fields SHOULD be the same unless multiple units sharing the same symbol need to be referenced.

OPTIONAL keys:

  • resource-uris: List. A list of dictionaries that reference remote resources that describe the unit. The format of each dictionary is:

    REQUIRED keys:

    • relation: String. A human-readable description of the relationship between the unit and the remote resource, e.g., a "natural language description".
    • uri: String. A URI of the external resource (which MAY be a resolvable URL).
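The requirement that every unit used at any level is defined in unit-definitions can be checked with a recursive walk over the nested definitions. The following Python sketch uses a hypothetical definition of a list of lengths. The title and description texts are illustrative, and the helper names are not part of this specification.

```python
# Hypothetical definition of a list of floats given in ångström (illustration only).
EXAMPLE_LENGTHS = {
    "title": "Example lengths",
    "description": "A list of lengths.\n\nA hypothetical property used for illustration.",
    "type": "array",
    "x-optimade-unit": "inapplicable",
    "items": {"type": "number", "x-optimade-unit": "angstrom"},
    "x-optimade-property": {
        "property-format": "1.2",
        "property-uri": "https://example.org/properties/_exmpl_lengths/v1.0.0",
        "unit-definitions": [
            {
                "symbol": "angstrom",
                "title": "ångström",
                "description": "A unit of length equal to 1e-10 meter.",
                "standard": {"name": "gnu units", "version": "3.15", "symbol": "angstrom"},
            }
        ],
    },
}

NO_DEFINITION_NEEDED = ("dimensionless", "inapplicable")


def used_units(definition):
    """Collect the x-optimade-unit values at this level and all nested levels."""
    units = set()
    if "x-optimade-unit" in definition:
        units.add(definition["x-optimade-unit"])
    if isinstance(definition.get("items"), dict):
        units |= used_units(definition["items"])
    for inner in definition.get("properties", {}).values():
        units |= used_units(inner)
    return units


def undefined_units(definition):
    """Return the units that are used but lack an entry in unit-definitions."""
    entries = definition.get("x-optimade-property", {}).get("unit-definitions", [])
    defined = {entry["symbol"] for entry in entries}
    return {
        unit
        for unit in used_units(definition)
        if unit not in NO_DEFINITION_NEEDED and unit not in defined
    }
```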

Unrecognized keys in property definitions

Implementations MAY add their own keys in Property Definitions, both inside and outside of the fields x-optimade-property, x-optimade-implementation, and x-optimade-requirements in the form of x-exmpl-name where exmpl is the database-provider-specific prefix (without underscore characters) and name is the part of the key chosen by the implementation. Implementations MUST NOT add keys of any other form to Property Definitions.

Client and server implementations that interpret an OPTIMADE Property Definition and encounter unrecognized keys starting with x-exmpl- where exmpl is a recognized database prefix MAY issue errors or warnings. Other unrecognized keys starting with x- MUST NOT issue errors, SHOULD NOT issue warnings, and MUST otherwise be ignored.

To allow forward compatibility with future versions of both OPTIMADE and the JSON Schema standards, unrecognized keys that do not start with x- SHOULD issue a warning but MUST otherwise be ignored.
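The handling of unrecognized keys described above can be summarized in a small decision function. This Python sketch is hypothetical: the caller is assumed to pass only keys that it does not already recognize, and the three return values are labels chosen for the example.

```python
def unrecognized_key_action(key, recognized_prefixes):
    """Decide how to treat a key that the implementation does not recognize.

    Returns one of:
      "may-report"      - x-<prefix>-... with a recognized database prefix;
                          errors or warnings MAY be issued.
      "ignore"          - any other key starting with x-; no errors, and
                          preferably no warnings.
      "warn-and-ignore" - a key not starting with x-; a warning SHOULD be
                          issued, and the key is otherwise ignored.
    """
    if key.startswith("x-"):
        prefix, separator, _ = key[2:].partition("-")
        if separator and prefix in recognized_prefixes:
            return "may-report"
        return "ignore"
    return "warn-and-ignore"
```

For instance, with recognized_prefixes={"exmpl"}, the key "x-exmpl-color" yields "may-report", "x-other-color" yields "ignore", and a future JSON Schema keyword such as "newKeyword" yields "warn-and-ignore".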

Entry List

This section defines standard entry types and their properties.

Properties Used by Multiple Entry Types

id

  • Description: An entry's ID as defined in section Definition of Terms.
  • Type: string.
  • Requirements/Conventions:
    • Support: MUST be supported by all implementations, MUST NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
    • Response: REQUIRED in the response.
    • See section Definition of Terms.
  • Examples:
    • "db/1234567"
    • "cod/2000000"
    • "cod/2000000@1234567"
    • "nomad/L1234567890"
    • "42"

type

  • Description: The name of the type of an entry.
  • Type: string.
  • Requirements/Conventions:
    • Support: MUST be supported by all implementations, MUST NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
    • Response: REQUIRED in the response.
    • MUST be an existing entry type.
    • The entry of type <type> and ID <id> MUST be returned in response to a request for /<type>/<id> under the versioned or unversioned base URL serving the API.
  • Examples:
    • "structures"

immutable_id

  • Description: The entry's immutable ID (e.g., a UUID). This is important for databases having preferred IDs that point to "the latest version" of a record, but still offer access to older variants. This ID maps to the version-specific record, in case it changes in the future.
  • Type: string.
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
  • Examples:
    • "8bd3e750-b477-41a0-9b11-3a799f21b44f"
    • "fjeiwoj,54;@=%<>#32" (Strings that are not URL-safe are allowed.)

last_modified

  • Description: Date and time representing when the entry was last modified.
  • Type: timestamp.
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
    • Response: REQUIRED in the response unless the query parameter response_fields is present and does not include this property.
  • Examples:

database-provider-specific properties

  • Description: Database providers are allowed to add database-provider-specific properties in the output of both standard entry types and database-provider-specific entry types. Similarly, an implementation MAY add keys with a database-provider-specific prefix to dictionary properties and their sub-dictionaries. For example, the database-provider-specific property _exmpl_oxidation_state, can be placed within the OPTIMADE property species.
  • Type: Decided by the API implementation. MUST be one of the OPTIMADE Data types.
  • Requirements/Conventions:
    • Support: Support for database-provider-specific properties is fully OPTIONAL.
    • Query: Support for queries on these properties is OPTIONAL. If supported, only a subset of the filter features MAY be supported.
    • Response: API implementations are free to choose whether database-provider-specific properties are only included when requested using the query parameter response_fields, or if they are included also when response_fields is not present. Implementations are thus allowed to decide that some of these properties are part of what can be seen as the default value of response_fields when that query parameter is omitted. Implementations SHOULD NOT include database-provider-specific properties when the query parameter response_fields is present but does not list them.
    • These MUST be prefixed by a database-provider-specific prefix (see appendix Database-Provider-Specific Namespace Prefixes).
    • Implementations MUST add the properties to the list of properties under the respective entry listing info endpoint (see Entry Listing Info Endpoints).
  • Examples: A few examples of valid database-provider-specific property names follow:
    • _exmpl_formula_sum
    • _exmpl_band_gap
    • _exmpl_supercell
    • _exmpl_trajectory
    • _exmpl_workflow_id

Structures Entries

structures entries (or objects) have the properties described above in section Properties Used by Multiple Entry Types, as well as the following properties:

elements

  • Description: The chemical symbols of the different elements present in the structure.
  • Type: list of strings.
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
    • The strings are the chemical symbols, i.e., either a single uppercase letter or an uppercase letter followed by a number of lowercase letters.
    • The order MUST be alphabetical.
    • MUST refer to the same elements in the same order, and therefore be of the same length, as elements_ratios, if the latter is provided.
    • Note: This property SHOULD NOT contain the string "X" to indicate non-chemical elements or "vacancy" to indicate vacancies (in contrast to the field chemical_symbols for the species property).
  • Examples:
    • ["Si"]
    • ["Al","O","Si"]
  • Query examples:
    • A filter that matches all records of structures that contain Si, Al and O, and possibly other elements: elements HAS ALL "Si", "Al", "O".
    • To match structures with exactly these three elements, use elements HAS ALL "Si", "Al", "O" AND elements LENGTH 3.
    • Note: length queries on this property can be equivalently formulated by filtering on the nelements property directly.

nelements

  • Description: Number of different elements in the structure as an integer.
  • Type: integer
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
    • MUST be equal to the lengths of the list properties elements and elements_ratios, if they are provided.
  • Examples:
    • 3
  • Querying:
    • Note: queries on this property can equivalently be formulated using elements LENGTH.
    • A filter that matches structures that have exactly 4 elements: nelements=4.
    • A filter that matches structures that have between 2 and 7 elements: nelements>=2 AND nelements<=7.

elements_ratios

  • Description: Relative proportions of different elements in the structure.
  • Type: list of floats
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
    • Composed by the proportions of elements in the structure as a list of floating point numbers.
    • The sum of the numbers MUST be 1.0 (within floating point accuracy)
    • MUST refer to the same elements in the same order, and therefore be of the same length, as elements, if the latter is provided.
  • Examples:
    • [1.0]
    • [0.3333333333333333, 0.2222222222222222, 0.4444444444444444]
  • Query examples:
    • Note: Useful filters can be formulated using the set operator syntax for correlated values. However, since the values are floating point values, the use of equality comparisons is generally inadvisable.
    • OPTIONAL: a filter that matches structures where approximately 1/3 of the atoms in the structure are the element Al is: elements:elements_ratios HAS ALL "Al":>0.3333, "Al":<0.3334.
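The consistency requirements between elements, nelements, and elements_ratios can be illustrated by deriving all three from the same input. This Python sketch is hypothetical and assumes a structure given as one chemical symbol per fully occupied site, with no "X" or "vacancy" entries.

```python
from collections import Counter


def composition_properties(site_symbols):
    """Derive elements, nelements, and elements_ratios from per-site symbols."""
    counts = Counter(site_symbols)
    elements = sorted(counts)  # alphabetical order, as required for elements
    total = sum(counts.values())
    return {
        "elements": elements,
        "nelements": len(elements),
        # Same order and length as elements; the ratios sum to 1.0 within
        # floating point accuracy.
        "elements_ratios": [counts[element] / total for element in elements],
    }
```

For the site list ["Si", "O", "O"], this gives elements ["O", "Si"], nelements 2, and ratios of two thirds and one third in that order.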

chemical_formula_descriptive

  • Description: The chemical formula for a structure as a string in a form chosen by the API implementation.
  • Type: string
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
    • The chemical formula is given as a string consisting of properly capitalized element symbols followed by integers or decimal numbers, balanced parentheses, square, and curly brackets (,), [,], {, }, commas, the +, -, : and = symbols. The parentheses are allowed to be followed by a number. Spaces are allowed anywhere except within chemical symbols. The order of elements and any groupings indicated by parentheses or brackets are chosen freely by the API implementation.
    • The string SHOULD be arithmetically consistent with the element ratios in the chemical_formula_reduced property.
    • It is RECOMMENDED, but not mandatory, that symbols, parentheses and brackets, if used, are used with the meanings prescribed by IUPAC's Nomenclature of Organic Chemistry.
  • Examples:
    • "(H2O)2 Na"
    • "NaCl"
    • "CaCO3"
    • "CCaO3"
    • "(CH3)3N+ - [CH2]2-OH = Me3N+ - CH2 - CH2OH"
  • Query examples:
    • Note: the free-form nature of this property is likely to make queries on it across different databases inconsistent.
    • A filter that matches an exactly given formula: chemical_formula_descriptive="(H2O)2 Na".
    • A filter that does a partial match: chemical_formula_descriptive CONTAINS "H2O".

chemical_formula_reduced

  • Description: The reduced chemical formula for a structure as a string with element symbols and integer chemical proportion numbers. The proportion number MUST be omitted if it is 1.
  • Type: string
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property. However, support for filters using partial string matching with this property is OPTIONAL (i.e., STARTS WITH, ENDS WITH, and CONTAINS). Intricate queries on formula components are instead suggested to be formulated using set-type filter operators on the multi-valued elements and elements_ratios properties.
    • Element symbols MUST have proper capitalization (e.g., "Si", not "SI" for "silicon").
    • Elements MUST be placed in alphabetical order, followed by their integer chemical proportion number.
    • For structures with no partial occupation, the chemical proportion numbers are the smallest integers for which the chemical proportion is exactly correct.
    • For structures with partial occupation, the chemical proportion numbers are integers that within reasonable approximation indicate the correct chemical proportions. The precise details of how to perform the rounding is chosen by the API implementation.
    • No spaces or separators are allowed.
  • Examples:
    • "H2NaO"
    • "ClNa"
    • "CCaO3"
  • Query examples:
    • A filter that matches an exactly given formula is chemical_formula_reduced="H2NaO".
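For structures without partial occupation, the reduced formula can be computed from integer element counts as in the following Python sketch. The function name is hypothetical, and the input is assumed to be a non-empty dictionary of positive integers.

```python
from functools import reduce
from math import gcd


def reduced_formula(counts):
    """Build chemical_formula_reduced from a dict of element symbol -> atom count."""
    divisor = reduce(gcd, counts.values())
    parts = []
    for element, count in sorted(counts.items()):  # alphabetical order
        number = count // divisor
        # The proportion number is omitted when it is 1.
        parts.append(element if number == 1 else f"{element}{number}")
    return "".join(parts)
```

Applied to the counts {"H": 4, "Na": 2, "O": 2}, this reproduces the example "H2NaO" above.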

chemical_formula_hill

  • Description: The chemical formula for a structure in Hill form with element symbols followed by integer chemical proportion numbers. The proportion number MUST be omitted if it is 1.
  • Type: string
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL. If supported, only a subset of the filter features MAY be supported.
    • The overall scale factor of the chemical proportions is chosen such that the resulting values are integers that indicate the most chemically relevant unit of which the system is composed. For example, if the structure is a repeating unit cell with four hydrogens and four oxygens that represents two hydrogen peroxide molecules, chemical_formula_hill is "H2O2" (i.e., not "HO", nor "H4O4").
    • If the chemical insight needed to ascribe a Hill formula to the system is not present, the property MUST be handled as unset.
    • Element symbols MUST have proper capitalization (e.g., "Si", not "SI" for "silicon").
    • Elements MUST be placed in Hill order, followed by their integer chemical proportion number. Hill order means: if carbon is present, it is placed first, and if also present, hydrogen is placed second. After that, all other elements are ordered alphabetically. If carbon is not present, all elements are ordered alphabetically.
    • If the system has sites with partial occupation and the total occupations of each element do not all sum up to integers, then the Hill formula SHOULD be handled as unset.
    • No spaces or separators are allowed.
  • Examples:
    • "H2O2"
  • Query examples:
    • A filter that matches an exactly given formula is chemical_formula_hill="H2O2".
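
The Hill ordering rules above can be illustrated with a short, non-normative Python sketch. The function name hill_formula and its input (a dictionary of element symbols mapped to integer counts that are already scaled to the chemically relevant unit) are assumptions made for this illustration; they are not part of the specification.

```python
def hill_formula(counts: dict[str, int]) -> str:
    """Build a Hill-order formula string from integer element counts."""
    symbols = sorted(counts)  # without carbon, all elements are alphabetical
    if "C" in counts:
        # Carbon first, hydrogen second (if present), the rest alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        symbols = ["C"] + (["H"] if "H" in counts else []) + rest
    # The proportion number is omitted when it equals 1; no separators are used.
    return "".join(s + (str(counts[s]) if counts[s] != 1 else "") for s in symbols)


print(hill_formula({"O": 2, "H": 2}))
print(hill_formula({"O": 1, "H": 6, "C": 2}))
```

The sketch does not decide the overall scale factor or handle partial occupation; both require chemical insight that the rules above leave to the implementation.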

chemical_formula_anonymous

  • Description: The anonymous formula is the chemical_formula_reduced, but where the elements are instead first ordered by their chemical proportion number, and then, in order left to right, replaced by anonymous symbols A, B, C, ..., Z, Aa, Ba, ..., Za, Ab, Bb, ... and so on.
  • Type: string
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property. However, support for filters using partial string matching with this property is OPTIONAL (i.e., BEGINS WITH, ENDS WITH, and CONTAINS).
  • Examples:
    • "A2B"
    • "A42B42C16D12E10F9G5"
  • Query examples:
    • A filter that matches an exactly given formula is chemical_formula_anonymous="A2B".
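
As a non-normative illustration, the anonymous formula can be derived from element counts as sketched below in Python. The names anonymous_symbol and anonymous_formula are hypothetical. The sketch assumes that elements are ordered by descending proportion number and that a proportion of 1 is omitted, both of which are inferred from the examples above rather than stated explicitly.

```python
from functools import reduce
from math import gcd


def anonymous_symbol(index: int) -> str:
    """Return the index-th symbol of A, B, ..., Z, Aa, Ba, ..., Za, Ab, ..."""
    letter = chr(ord("A") + index % 26)
    cycle = index // 26  # 0 for A..Z, 1 for Aa..Za, 2 for Ab..Zb, ...
    return letter if cycle == 0 else letter + chr(ord("a") + cycle - 1)


def anonymous_formula(counts: dict[str, int]) -> str:
    """Build the anonymous formula from integer element counts."""
    divisor = reduce(gcd, counts.values())  # reduce to the smallest integers
    # Assumption: largest proportion first, as in "A42B42C16D12E10F9G5".
    proportions = sorted((n // divisor for n in counts.values()), reverse=True)
    return "".join(
        anonymous_symbol(i) + (str(n) if n != 1 else "")
        for i, n in enumerate(proportions)
    )


print(anonymous_formula({"H": 2, "Na": 1, "O": 1}))
```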

dimension_types

  • Description: List of three integers describing the periodicity of the boundaries of the unit cell. For each direction indicated by the three lattice_vectors, this list indicates if the direction is periodic (value 1) or non-periodic (value 0). Note: the elements in this list each refer to the direction of the corresponding entry in lattice_vectors and not the Cartesian x, y, z directions.
  • Type: list of integers.
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: Support for queries on this property is OPTIONAL.
    • MUST be a list of length 3.
    • Each integer element MUST assume only the value 0 or 1.
  • Examples:
    • A non-periodic structure, for example, for a single molecule: [0, 0, 0]
    • A unit cell that is periodic in the direction of the third lattice vector, for example for a carbon nanotube: [0, 0, 1]
    • For a 2D surface/slab, with a unit cell that is periodic in the direction of the first and third lattice vectors: [1, 0, 1]
    • For a bulk 3D system with a unit cell that is periodic in all directions: [1, 1, 1]

nperiodic_dimensions

  • Description: An integer specifying the number of periodic dimensions in the structure, equivalent to the number of non-zero entries in dimension_types.
  • Type: integer
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
    • The integer value MUST be between 0 and 3 inclusive and MUST be equal to the sum of the items in the dimension_types property.
    • This property only reflects the treatment of the lattice vectors provided for the structure, and not any physical interpretation of the dimensionality of its contents.
  • Examples:
    • 2 should be indicated in cases where dimension_types is any of [1, 1, 0], [1, 0, 1], [0, 1, 1].
  • Query examples:
    • Match only structures with exactly 3 periodic dimensions: nperiodic_dimensions=3
    • Match all structures with 2 or fewer periodic dimensions: nperiodic_dimensions<=2
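
The relation between dimension_types and nperiodic_dimensions can be checked with a few lines of Python. This is a non-normative sketch; the function name is chosen here for illustration only.

```python
def nperiodic_dimensions(dimension_types: list[int]) -> int:
    """Validate dimension_types and return the number of periodic directions."""
    if len(dimension_types) != 3 or any(d not in (0, 1) for d in dimension_types):
        raise ValueError("dimension_types must be three integers, each 0 or 1")
    # The value equals the sum of the items, hence lies between 0 and 3.
    return sum(dimension_types)


print(nperiodic_dimensions([1, 0, 1]))
```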

lattice_vectors

  • Description: The three lattice vectors in Cartesian coordinates, in ångström (Å).
  • Type: list of list of floats or unknown values.
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: Support for queries on this property is OPTIONAL. If supported, filters MAY support only a subset of comparison operators.
    • MUST be a list of three vectors a, b, and c, where each of the vectors MUST be a list of the vector's coordinates along the x, y, and z Cartesian coordinates. (Therefore, the first index runs over the three lattice vectors and the second index runs over the x, y, z Cartesian coordinates).
    • For databases that do not define an absolute Cartesian system (e.g., only defining the length and angles between vectors), the first lattice vector SHOULD be set along x and the second in the xy-plane.
    • MUST always contain three vectors of three coordinates each, independently of the elements of property dimension_types. The vectors SHOULD by convention be chosen so the determinant of the lattice_vectors matrix is different from zero. The vectors in the non-periodic directions have no significance beyond fulfilling these requirements.
    • The coordinates of the lattice vectors of non-periodic dimensions (i.e., those dimensions for which dimension_types is 0) MAY be given as a list of all null values. If a lattice vector contains the value null, all coordinates of that lattice vector MUST be null.
  • Examples:
    • [[4.0,0.0,0.0],[0.0,4.0,0.0],[0.0,1.0,4.0]] represents a cell, where the first vector is (4, 0, 0), i.e., a vector aligned along the x axis of length 4 Å; the second vector is (0, 4, 0); and the third vector is (0, 1, 4).
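
A client or validator might check the shape, null-handling, and determinant conventions above roughly as follows. This Python sketch is non-normative; check_lattice_vectors is a hypothetical helper, JSON null is assumed to be decoded as None, and the exact comparison of the determinant with zero is a simplification (a real check would probably use a tolerance).

```python
def check_lattice_vectors(vectors):
    """Return a list of problems found in a lattice_vectors value."""
    if len(vectors) != 3 or any(len(v) != 3 for v in vectors):
        return ["lattice_vectors must contain three vectors of three coordinates"]
    problems = []
    for i, vec in enumerate(vectors):
        nulls = [c is None for c in vec]
        # A vector that contains null MUST consist of null values only.
        if any(nulls) and not all(nulls):
            problems.append(f"vector {i} mixes null and non-null coordinates")
    if all(c is not None for v in vectors for c in v):
        (ax, ay, az), (bx, by, bz), (cx, cy, cz) = vectors
        # Determinant of the 3x3 matrix by cofactor expansion along the first row.
        det = ax * (by * cz - bz * cy) - ay * (bx * cz - bz * cx) + az * (bx * cy - by * cx)
        if det == 0:
            problems.append("determinant is zero")
    return problems


print(check_lattice_vectors([[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 1.0, 4.0]]))
```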

space_group_hall

  • Description: A Hall space group symbol representing the symmetry of the structure as defined in Hall, S. R. (1981), Acta Cryst. A37, 517-525 and erratum (1981), A37, 921.
  • Type: string
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
    • Each component of the Hall symbol MUST be separated by a single space symbol.
    • If there exists a standard Hall symbol which represents the symmetry it SHOULD be used.
    • MUST be null if nperiodic_dimensions is not equal to 3.
  • Examples:
    • P 2c -2ac
    • -I 4bd 2ab 3

space_group_it_number

  • Description: Space group number for the structure assigned by the International Tables for Crystallography Vol. A.
  • Type: integer
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
    • The integer value MUST be between 1 and 230.
    • MUST be null if nperiodic_dimensions is not equal to 3.

cartesian_site_positions

  • Description: Cartesian positions of each site in the structure. A site is usually used to describe positions of atoms; what atoms can be encountered at a given site is conveyed by the species_at_sites property, and the species themselves are described in the species property.
  • Type: list of list of floats
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: Support for queries on this property is OPTIONAL. If supported, filters MAY support only a subset of comparison operators.
    • It MUST be a list of length equal to the number of sites in the structure, where every element is a list of the three Cartesian coordinates of a site expressed as float values in the unit angstrom (Å).
    • An entry MAY have multiple sites at the same Cartesian position (for a relevant use of this, see e.g., the property assemblies).
  • Examples:
    • [[0,0,0],[0,0,2]] indicates a structure with two sites, one sitting at the origin and one along the (positive) z-axis, 2 Å away from the origin.

nsites

  • Description: An integer specifying the length of the cartesian_site_positions property.
  • Type: integer
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: MUST be a queryable property with support for all mandatory filter features.
  • Examples:
    • 42
  • Query examples:
    • Match only structures with exactly 4 sites: nsites=4
    • Match structures that have between 2 and 7 sites: nsites>=2 AND nsites<=7

species_at_sites

  • Description: Name of the species at each site (where values for sites are specified with the same order of the property cartesian_site_positions). The properties of the species are found in the property species.
  • Type: list of strings.
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: Support for queries on this property is OPTIONAL. If supported, filters MAY support only a subset of comparison operators.
    • MUST have length equal to the number of sites in the structure (first dimension of the list property cartesian_site_positions).
    • Each species name mentioned in the species_at_sites list MUST be described in the list property species (i.e. for each value in the species_at_sites list there MUST exist exactly one dictionary in the species list with the name attribute equal to the corresponding species_at_sites value).
    • Each site MUST be associated with only a single species. Note: However, species can represent mixtures of atoms, and multiple species MAY be defined for the same chemical element. This latter case is useful when different atoms of the same type need to be grouped or distinguished, for instance in simulation codes to assign different initial spin states.
  • Examples:
    • ["Ti","O2"] indicates that the first site is hosting a species labeled "Ti" and the second a species labeled "O2".
    • ["Ac", "Ac", "Ag", "Ir"] indicates that the first two sites contain the "Ac" species, while the third and fourth sites contain the "Ag" and "Ir" species, respectively.

species

  • Description: A list describing the species of the sites of this structure. Species can represent pure chemical elements, virtual-crystal atoms representing a statistical occupation of a given site by multiple chemical elements, and/or a location to which there are attached atoms, i.e., atoms whose precise locations are unknown beyond that they are attached to that position (frequently used to indicate hydrogen atoms attached to another element, e.g., a carbon with three attached hydrogens might represent a methyl group, -CH3).
  • Type: list of dictionary with keys:
    • name: string (REQUIRED)
    • chemical_symbols: list of strings (REQUIRED)
    • concentration: list of float (REQUIRED)
    • attached: list of strings (OPTIONAL)
    • nattached: list of integers (OPTIONAL)
    • mass: list of floats (OPTIONAL)
    • original_name: string (OPTIONAL).
  • Requirements/Conventions:
    • Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be null.
    • Query: Support for queries on this property is OPTIONAL. If supported, filters MAY support only a subset of comparison operators.
    • Each list member MUST be a dictionary with the following keys:
      • name: REQUIRED; gives the name of the species; the name value MUST be unique in the species list;
      • chemical_symbols: REQUIRED; MUST be a list of strings of all chemical elements composing this species. Each item of the list MUST be one of the following:

        • a valid chemical-element symbol, or
        • the special value "X" to represent a non-chemical element, or
        • the special value "vacancy" to represent that this site has a non-zero probability of having a vacancy (the respective probability is indicated in the concentration list, see below).

        If any one entry in the species list has a chemical_symbols list that is longer than 1 element, the correct flag MUST be set in the list structure_features (see property structure_features).

      • concentration: REQUIRED; MUST be a list of floats, with same length as chemical_symbols. The numbers represent the relative concentration of the corresponding chemical symbol in this species. The numbers SHOULD sum to one. Cases in which the numbers do not sum to one typically fall only in the following two categories:

        • Numerical errors when representing float numbers in fixed precision, e.g. for two chemical symbols with concentrations 1/3 and 2/3, the concentration might look something like [0.33333333333, 0.66666666666]. If the client is aware that the sum is not one because of numerical precision, it can renormalize the values so that the sum is exactly one.
        • Experimental errors in the data present in the database. In this case, it is the responsibility of the client to decide how to process the data.

        Note that concentrations are uncorrelated between different sites (even of the same species).

      • attached: OPTIONAL; if provided MUST be a list of length 1 or more of strings of chemical symbols for the elements attached to this site, or "X" for a non-chemical element.
      • nattached: OPTIONAL; if provided MUST be a list of length 1 or more of integers indicating the number of attached atoms of the kind specified in the value of the attached key.

        The implementation MUST include either both or none of the attached and nattached keys, and if they are provided, they MUST be of the same length. Furthermore, if they are provided, the structure_features property MUST include the string site_attachments.

      • mass: OPTIONAL. If present MUST be a list of floats, with the same length as chemical_symbols, providing element masses expressed in a.m.u. Elements denoting vacancies MUST have masses equal to 0.
      • original_name: OPTIONAL. Can be any valid Unicode string, and SHOULD contain (if specified) the name of the species that is used internally in the source database.

        Note: With regard to "source database", we refer to the immediate source being queried via the OPTIMADE API implementation. The main use of this field is for source databases that use species names containing characters that are not allowed (see description of the list property species_at_sites).

    • Systems that have only species formed by a single chemical symbol, and that have at most one species per chemical symbol, SHOULD use the chemical symbol as species name (e.g., "Ti" for titanium, "O" for oxygen, etc.). However, note that this is OPTIONAL, and client implementations MUST NOT assume that the key corresponds to a chemical symbol, nor assume that if the species name is a valid chemical symbol, that it represents a species with that chemical symbol. This means that a species {"name": "C", "chemical_symbols": ["Ti"], "concentration": [1.0]} is valid and represents a titanium species (and not a carbon species).
    • It is NOT RECOMMENDED that a structure includes species that do not have at least one corresponding site.
  • Examples:
    • [ {"name": "Ti", "chemical_symbols": ["Ti"], "concentration": [1.0]} ]: any site with this species is occupied by a Ti atom.
    • [ {"name": "Ti", "chemical_symbols": ["Ti", "vacancy"], "concentration": [0.9, 0.1]} ]: any site with this species is occupied by a Ti atom with 90 % probability, and has a vacancy with 10 % probability.
    • [ {"name": "BaCa", "chemical_symbols": ["vacancy", "Ba", "Ca"], "concentration": [0.05, 0.45, 0.5], "mass": [0.0, 137.327, 40.078]} ]: any site with this species is occupied by a Ba atom with 45 % probability, a Ca atom with 50 % probability, and by a vacancy with 5 % probability.
    • [ {"name": "C12", "chemical_symbols": ["C"], "concentration": [1.0], "mass": [12.0]} ]: any site with this species is occupied by a carbon isotope with mass 12.
    • [ {"name": "C13", "chemical_symbols": ["C"], "concentration": [1.0], "mass": [13.0]} ]: any site with this species is occupied by a carbon isotope with mass 13.
    • [ {"name": "CH3", "chemical_symbols": ["C"], "concentration": [1.0], "attached": ["H"], "nattached": [3]} ]: any site with this species is occupied by a methyl group, -CH3, which is represented without specifying precise positions of the hydrogen atoms.
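
Several of the consistency rules for species and species_at_sites lend themselves to automated checking. The Python sketch below is non-normative: check_species and renormalize are hypothetical helpers, and the tolerance used to decide whether a deviation from one is due to numerical precision is an assumption, since the specification leaves that decision to the client.

```python
import math


def check_species(species, species_at_sites):
    """Return a list of problems (empty if none were found)."""
    problems = []
    names = [s["name"] for s in species]
    if len(names) != len(set(names)):
        problems.append("species names are not unique")
    for name in sorted(set(species_at_sites) - set(names)):
        problems.append(f"site species {name!r} is not described in species")
    for s in species:
        n = len(s["chemical_symbols"])
        if len(s["concentration"]) != n:
            problems.append(f"{s['name']}: concentration length differs from chemical_symbols")
        if "mass" in s and len(s["mass"]) != n:
            problems.append(f"{s['name']}: mass length differs from chemical_symbols")
        # attached and nattached must be given together and have equal length.
        if ("attached" in s) != ("nattached" in s):
            problems.append(f"{s['name']}: attached and nattached must be given together")
        elif "attached" in s and len(s["attached"]) != len(s["nattached"]):
            problems.append(f"{s['name']}: attached and nattached differ in length")
    return problems


def renormalize(concentration, tolerance=1e-6):
    """Rescale to sum exactly to one if the sum is within tolerance of one."""
    total = math.fsum(concentration)
    if abs(total - 1.0) > tolerance:
        return list(concentration)  # likely not a rounding issue; left unchanged
    return [c / total for c in concentration]


ti = {"name": "Ti", "chemical_symbols": ["Ti"], "concentration": [1.0]}
print(check_species([ti], ["Ti", "O2"]))
```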

assemblies

  • Description: A description of groups of sites that are statistically correlated.
  • Type: list of dictionary with keys:
    • sites_in_groups: list of list of integers (REQUIRED)
    • group_probabilities: list of floats (REQUIRED)
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL. If supported, filters MAY support only a subset of comparison operators.
    • The property SHOULD be null for entries that have no partial occupancies.
    • If present, the correct flag MUST be set in the list structure_features (see property structure_features).
    • Client implementations MUST check its presence (as its presence changes the interpretation of the structure).
    • If present, it MUST be a list of dictionaries, each of which represents an assembly and MUST have the following two keys:
      • sites_in_groups: Index of the sites (0-based) that belong to each group for each assembly.

        Example: [[1], [2]]: two groups, one with the second site, one with the third.

        Example: [[1,2], [3]]: one group with the second and third site, one with the fourth.

      • group_probabilities: Statistical probability of each group. It MUST have the same length as sites_in_groups. It SHOULD sum to one. See below for examples of how to specify the probability of the occurrence of a vacancy. The possible reasons for the values not to sum to one are the same as already specified above for the concentration of each species, see property species.
    • If a site is not present in any group, it means that it is present with 100 % probability (as if no assembly was specified).
    • A site MUST NOT appear in more than one group.
  • Examples (for each entry of the assemblies list):
    • {"sites_in_groups": [[0], [1]], "group_probabilities": [0.3, 0.7]}: the first site and the second site never occur at the same time in the unit cell. Statistically, 30 % of the times the first site is present, while 70 % of the times the second site is present.
    • {"sites_in_groups": [[1,2], [3]], "group_probabilities": [0.3, 0.7]}: the second and third site are either present together or not present; they form the first group of atoms for this assembly. The second group is formed by the fourth site. Sites of the first group (the second and the third) are never present at the same time as the fourth site. 30 % of times sites 1 and 2 are present (and site 3 is absent); 70 % of times site 3 is present (and sites 1 and 2 are absent).
  • Notes:
    • Assemblies are essential to represent, for instance, the situation where an atom can statistically occupy two different positions (sites).
    • By defining groups, it is possible to represent, e.g., the case where a functional molecule (and not just one atom) is either present or absent (or the case where it is present in two conformations).
    • Considerations on virtual alloys and on vacancies: In the special case of a virtual alloy, these specifications allow two different, equivalent ways of specifying them. For instance, for a site at the origin with 30 % probability of being occupied by Si, 50 % probability of being occupied by Ge, and 20 % of being a vacancy, the following two representations are possible:
      • Using a single species:

        {
          "cartesian_site_positions": [[0,0,0]],
          "species_at_sites": ["SiGe-vac"],
          "species": [
            {
              "name": "SiGe-vac",
              "chemical_symbols": ["Si", "Ge", "vacancy"],
              "concentration": [0.3, 0.5, 0.2]
            }
          ]
          // ...
        }
      • Using multiple species and the assemblies:

        {
          "cartesian_site_positions": [ [0,0,0], [0,0,0], [0,0,0] ],
          "species_at_sites": ["Si", "Ge", "vac"],
          "species": [
            { "name": "Si", "chemical_symbols": ["Si"], "concentration": [1.0] },
            { "name": "Ge", "chemical_symbols": ["Ge"], "concentration": [1.0] },
            { "name": "vac", "chemical_symbols": ["vacancy"], "concentration": [1.0] }
          ],
          "assemblies": [
            {
              "sites_in_groups": [ [0], [1], [2] ],
              "group_probabilities": [0.3, 0.5, 0.2]
            }
          ]
          // ...
        }
    • It is up to the database provider to decide which representation to use, typically depending on the internal format in which the structure is stored. However, given a structure identified by a unique ID, the API implementation MUST always provide the same representation for it.
    • The probabilities of occurrence of different assemblies are uncorrelated. So, for instance in the following case with two assemblies:

      {
        "assemblies": [
          {
            "sites_in_groups": [ [0], [1] ],
            "group_probabilities": [0.2, 0.8]
          },
          {
            "sites_in_groups": [ [2], [3] ],
            "group_probabilities": [0.3, 0.7]
          }
        ]
      }

      Site 0 is present with a probability of 20 % and site 1 with a probability of 80 %. These two sites are correlated (either site 0 or 1 is present). Similarly, site 2 is present with a probability of 30 % and site 3 with a probability of 70 %. These two sites are correlated (either site 2 or 3 is present). However, the presence or absence of sites 0 and 1 is not correlated with the presence or absence of sites 2 and 3 (in the specific example, the pair of sites (0, 2) can occur with 0.2*0.3 = 6 % probability; the pair (0, 3) with 0.2*0.7 = 14 % probability; the pair (1, 2) with 0.8*0.3 = 24 % probability; and the pair (1, 3) with 0.8*0.7 = 56 % probability).
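
The arithmetic in this example can be reproduced with a short, non-normative Python sketch. The function name configuration_probabilities is hypothetical; it picks one group per assembly and multiplies the group probabilities, which is valid because different assemblies are uncorrelated.

```python
import math
from itertools import product


def configuration_probabilities(assemblies):
    """Map each combination of present sites to its joint probability."""
    choices = [
        list(zip(a["sites_in_groups"], a["group_probabilities"]))
        for a in assemblies
    ]
    result = {}
    # One group is chosen from every assembly, independently of the others.
    for combo in product(*choices):
        sites = tuple(sorted(site for group, _ in combo for site in group))
        result[sites] = math.prod(p for _, p in combo)
    return result


assemblies = [
    {"sites_in_groups": [[0], [1]], "group_probabilities": [0.2, 0.8]},
    {"sites_in_groups": [[2], [3]], "group_probabilities": [0.3, 0.7]},
]
for sites, probability in configuration_probabilities(assemblies).items():
    print(sites, round(probability, 2))
```

Sites that do not appear in any group are always present and are therefore not listed in the keys of the result.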

structure_features

  • Description: A list of strings that flag which special features are used by the structure.
  • Type: list of strings
  • Requirements/Conventions:
    • Support: MUST be supported by all implementations, MUST NOT be null.
    • Query: MUST be a queryable property. Filters on the list MUST support all mandatory HAS-type queries. Filter operators for comparisons on the string components MUST support equality; support for other comparison operators is OPTIONAL.
    • MUST be an empty list if no special features are used.
    • MUST be sorted alphabetically.
    • If a special feature listed below is used, the list MUST contain the corresponding string.
    • If a special feature listed below is not used, the list MUST NOT contain the corresponding string.
    • List of strings used to indicate special structure features:
      • disorder: this flag MUST be present if any one entry in the species list has a chemical_symbols list that is longer than 1 element.
      • implicit_atoms: this flag MUST be present if the structure contains atoms that are not assigned to sites via the property species_at_sites (e.g., because their positions are unknown). When this flag is present, the properties related to the chemical formula will likely not match the type and count of atoms represented by the species_at_sites, species, and assemblies properties.
      • site_attachments: this flag MUST be present if any one entry in the species list includes attached and nattached.
      • assemblies: this flag MUST be present if the property assemblies is present.
  • Examples:
    • A structure having implicit atoms and using assemblies: ["assemblies", "implicit_atoms"]
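
Three of the four flags follow mechanically from other properties, so an implementation could derive them as in the non-normative Python sketch below. The function name derive_structure_features is hypothetical. The implicit_atoms flag cannot be inferred from the listed properties, so the sketch takes it as an explicit argument.

```python
def derive_structure_features(structure: dict, implicit_atoms: bool = False) -> list[str]:
    """Return the alphabetically sorted structure_features list."""
    species = structure.get("species") or []
    features = set()
    if any(len(s["chemical_symbols"]) > 1 for s in species):
        features.add("disorder")
    if any("attached" in s and "nattached" in s for s in species):
        features.add("site_attachments")
    if structure.get("assemblies") is not None:
        features.add("assemblies")
    if implicit_atoms:
        features.add("implicit_atoms")
    return sorted(features)  # empty list if no special features are used


structure = {
    "species": [{"name": "Ti", "chemical_symbols": ["Ti"], "concentration": [1.0]}],
    "assemblies": [{"sites_in_groups": [[0], [1]], "group_probabilities": [0.3, 0.7]}],
}
print(derive_structure_features(structure, implicit_atoms=True))
```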

Calculations Entries

The calculations entries have the properties described above in section Properties Used by Multiple Entry Types.

References Entries

The references entries describe bibliographic references. The following properties are used to provide the bibliographic details:

  • address, annote, booktitle, chapter, crossref, edition, howpublished, institution, journal, key, month, note, number, organization, pages, publisher, school, series, title, volume, year: meanings of these properties match the BibTeX specification, values are strings;
  • bib_type: type of the reference, corresponding to type property in the BibTeX specification, value is string;
  • authors and editors: lists of person objects which are dictionaries with the following keys:
    • name: Full name of the person, REQUIRED.
    • firstname, lastname: Parts of the person's name, OPTIONAL.
  • doi and url: values are strings.
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., any of the properties MAY be null.
    • Query: Support for queries on any of these properties is OPTIONAL. If supported, filters MAY support only a subset of comparison operators.
    • Every references entry MUST contain at least one of the properties.

Example:

{
  "data": {
    "type": "references",
    "id": "Dijkstra1968",
    "attributes": {
      "authors": [
        {
          "name": "Edsger Dijkstra",
          "firstname": "Edsger",
          "lastname": "Dijkstra"
        }
      ],
      "year": "1968",
      "title": "Go To Statement Considered Harmful",
      "journal": "Communications of the ACM",
      "doi": "10.1145/362929.362947"
    }
  }
}

Files Entries

The files entries describe files. The following properties are used to do so:

url

  • Description: The URL to get the contents of a file.
  • Type: string
  • Requirements/Conventions:
    • Support: MUST be supported by all implementations, MUST NOT be null.
    • Query: Support for queries on this property is OPTIONAL.
    • Response: REQUIRED in the response.
    • The URL MUST point to the actual contents of a file (i.e. byte stream), not an intermediate (preview) representation. For example, if referring to a file on GitHub, a link should point to raw contents.
  • Examples:
    • "https://example.org/files/cifs/1000000.cif"

url_stable_until

  • Description: Point in time until which the URL in url is guaranteed to stay stable.
  • Type: timestamp
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
    • null means that there is no stability guarantee for the URL in url. Indefinite support could be communicated by providing a date sufficiently far in the future, for example, 9999-12-31.

name

  • Description: Base name of a file.
  • Type: string
  • Requirements/Conventions:
    • Support: MUST be supported by all implementations, MUST NOT be null.
    • Query: Support for queries on this property is OPTIONAL.
    • File name extension is an integral part of a file name and, if available, MUST be included.
  • Examples:
    • "1000000.cif"

size

  • Description: Size of a file in bytes.
  • Type: integer
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
    • If provided, the value MUST be either the exact size of the file or an upper bound on it. This way, a client that reserves a static buffer or truncates the download stream after this many bytes still receives the whole file. This provision allows providers to serve on-the-fly compressed files.

media_type

  • Description: Media type identifier (also known as MIME type), for a file as per RFC 6838 Media Type Specifications and Registration Procedures.
  • Type: string
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
  • Examples:
    • "chemical/x-cif"

version

  • Description: Version information of a file (e.g. commit, revision, timestamp).
  • Type: string
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
    • If provided, it MUST be guaranteed that file contents pertaining to the same combination of id and version are the same.

modification_timestamp

  • Description: Timestamp of the last modification of file contents. A modification is understood as an addition, change or deletion of one or more bytes, resulting in file contents different from the previous.
  • Type: timestamp
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
    • Timestamps of subsequent file modifications SHOULD be increasing (not earlier than previous timestamps).

description

  • Description: Free-form description of a file.
  • Type: string
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
  • Examples:
    • "POSCAR format file"

checksums

  • Description: Dictionary providing checksums of file contents.
  • Type: dictionary with keys identifying checksum functions and values (strings) giving the actual checksums
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
    • Supported dictionary keys: md5, sha1, sha224, sha256, sha384, sha512. Checksums outside this list MAY be used, but their names MUST be prefixed by the database-provider-specific namespace prefix (see appendix Database-Provider-Specific Namespace Prefixes).
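
A client could verify downloaded file contents against the checksums dictionary as sketched below in Python. This is non-normative: verify_checksums is a hypothetical helper, and the sketch assumes that checksum values are hexadecimal digest strings, which the text above does not state. Provider-specific checksum names are skipped because their algorithms are not known to a generic client.

```python
import hashlib

STANDARD_CHECKSUMS = {"md5", "sha1", "sha224", "sha256", "sha384", "sha512"}


def verify_checksums(content: bytes, checksums: dict[str, str]) -> dict[str, bool]:
    """Compare each standard checksum with the digest of the given content."""
    results = {}
    for name, expected in checksums.items():
        if name not in STANDARD_CHECKSUMS:
            continue  # prefixed, provider-specific checksum: not verifiable here
        digest = hashlib.new(name, content).hexdigest()
        results[name] = digest == expected.lower()  # assumes hex-encoded values
    return results


content = b"data_example\n"
print(verify_checksums(content, {"sha256": hashlib.sha256(content).hexdigest()}))
```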

atime

  • Description: Time of last access of a file as per POSIX standard.
  • Type: timestamp
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.

ctime

  • Description: Time of last status change of a file as per POSIX standard.
  • Type: timestamp
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.

mtime

  • Description: Time of last modification of a file as per POSIX standard.
  • Type: timestamp
  • Requirements/Conventions:
    • Support: OPTIONAL support in implementations, i.e., MAY be null.
    • Query: Support for queries on this property is OPTIONAL.
    • It should be noted that the values of last_modified, modification_timestamp and mtime do not necessarily match. last_modified pertains to the modification of the OPTIMADE metadata, modification_timestamp pertains to file contents and mtime pertains to the modification of the file (not necessarily changing its contents). For example, appending an empty string to a file would result in the change of mtime in some operating systems, but this would not be deemed as a modification of its contents.

Database-Provider-Specific Entry Types

Names of database-provider-specific entry types MUST start with the database-provider-specific namespace prefix (see appendix Database-Provider-Specific Namespace Prefixes). Database-provider-specific entry types MUST have all properties described above in section Properties Used by Multiple Entry Types.

  • Requirements/Conventions for properties in database-provider-specific entry types:
    • Support: Support for any properties in database-provider-specific entry types is fully OPTIONAL.
    • Query: Support for queries on these properties is OPTIONAL. If supported, only a subset of the filter features MAY be supported.

Relationships Used by Multiple Entry Types

In accordance with section Relationships, all entry types MAY use relationships to describe relations to other entries.

References

The references relationship is used to provide bibliographic references for any of the entry types. It relates an entry with any number of references entries.

If the response format supports inclusion of entries of a different type in the response, then the response SHOULD include all references-type entries mentioned in the response.

For example, for the JSON response format, the top-level included field SHOULD be used as per the JSON API 1.0 specification:

{
  "data": {
    "type": "structures",
    "id": "example.db:structs:1234",
    "attributes": {
      "formula": "Es2",
      "url": "http://example.db/structs/1234",
      "immutable_id": "http://example.db/structs/1234@123",
      "last_modified": "2007-04-07T12:02:20Z"
    },
    "relationships": {
      "references": {
        "data": [
          { "type": "references", "id": "Dijkstra1968" },
          {
            "type": "references",
            "id": "1234",
            "meta": {
              "description": "Reference for the general crystal prototype."
            }
          }
        ]
      }
    }
  },
  "included": [
    {
      "type": "references",
      "id": "Dijkstra1968",
      "attributes": {
        "authors": [
          {
            "name": "Edsger Dijkstra",
            "firstname": "Edsger",
            "lastname": "Dijkstra"
          }
        ],
        "year": "1968",
        "title": "Go To Statement Considered Harmful",
        "journal": "Communications of the ACM",
        "doi": "10.1145/362929.362947"
      }
    },
    {
      "type": "references",
      "id": "1234",
      "attributes": {
        "doi": "10.1234/1234"
      }
    }
  ]
}
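A client can pair each relationship linkage with the corresponding resource in the top-level included field. The following is a minimal Python sketch of that lookup, assuming a single-entry JSON response shaped like the example above. The function name `resolve_relationship` and the trimmed `EXAMPLE` document are illustrative and not part of the specification.

```python
# Trimmed copy of the example response above (illustrative only).
EXAMPLE = {
    "data": {
        "type": "structures",
        "id": "example.db:structs:1234",
        "relationships": {
            "references": {
                "data": [
                    {"type": "references", "id": "Dijkstra1968"},
                    {"type": "references", "id": "1234"},
                ]
            }
        },
    },
    "included": [
        {
            "type": "references",
            "id": "Dijkstra1968",
            "attributes": {"doi": "10.1145/362929.362947"},
        },
        {
            "type": "references",
            "id": "1234",
            "attributes": {"doi": "10.1234/1234"},
        },
    ],
}


def resolve_relationship(response, name):
    """Return the included resources linked from data.relationships[name].

    Assumes a single-entry response, i.e. that "data" is an object and
    not a list. Linkages without a matching included resource are skipped.
    """
    # Index the included resources by their (type, id) pair.
    included = {
        (res["type"], res["id"]): res
        for res in response.get("included", [])
    }
    linkage = (
        response["data"]
        .get("relationships", {})
        .get(name, {})
        .get("data", [])
    )
    return [
        included[(ref["type"], ref["id"])]
        for ref in linkage
        if (ref["type"], ref["id"]) in included
    ]
```

The same lookup applies to any other relationship name, since JSON API identifies every resource by its type and id.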

Calculations

Relationships with calculations MAY be used to indicate provenance where a structure is either an input to or an output of calculations.

Note: We intend to implement in a future version of this API a standardized mechanism to differentiate these two cases, thus allowing databases a common way of exposing the full provenance tree with inputs and outputs between structures and calculations.

At the moment, database providers are advised to extend their API as they see fit, always using their database-provider-specific prefix in non-standardized fields.

Files

Relationships with files may be used to relate an entry with any number of files entries.

{
  "data": {
    "type": "structures",
    "id": "example.db:structs:1234",
    "attributes": {
      "chemical_formula_descriptive": "H2O"
    },
    "relationships": {
      "files": {
        "data": [
          { "type": "files", "id": "example.db:files:1234" }
        ]
      }
    }
  },
  "included": [
    {
      "type": "files",
      "id": "example.db:files:1234",
      "attributes": {
        "media_type": "chemical/x-cif",
        "url": "https://example.org/files/cifs/1234.cif"
      }
    }
  ]
}

Appendices

The Filter Language EBNF Grammar

(* BEGIN EBNF GRAMMAR Filter *)
(* The top-level 'filter' rule: *)

Filter = [Spaces], Expression ;

(* Values *)

OrderedConstant = String | Number ;
UnorderedConstant = ( TRUE | FALSE ) ;

Value = ( UnorderedConstant | OrderedValue ) ;

OrderedValue = ( OrderedConstant | Property ) ;
(* Note: support for Property in OrderedValue is OPTIONAL *)

ValueListEntry = ( Value | ValueEqRhs | ValueRelCompRhs | FuzzyStringOpRhs ) ;
(* Note: support for ValueEqRhs, ValueRelCompRhs and FuzzyStringOpRhs in ValueListEntry is OPTIONAL *)

ValueList = ValueListEntry, { Comma, ValueListEntry } ;
ValueZip = ValueListEntry, Colon, ValueListEntry, { Colon, ValueListEntry } ;

ValueZipList = ValueZip, { Comma, ValueZip } ;

(* Expressions *)

Expression = ExpressionClause, [ OR, Expression ] ;

ExpressionClause = ExpressionPhrase, [ AND, ExpressionClause ] ;

ExpressionPhrase = [ NOT ], ( Comparison | OpeningBrace, Expression, ClosingBrace ) ;

Comparison = ConstantFirstComparison
           | PropertyFirstComparison ;
(* Note: support for ConstantFirstComparison is OPTIONAL *)

ConstantFirstComparison = ( OrderedConstant, ValueOpRhs
                          | UnorderedConstant, ValueEqRhs ) ;

PropertyFirstComparison = Property, [ ValueOpRhs
                                    | KnownOpRhs
                                    | FuzzyStringOpRhs
                                    | SetOpRhs
                                    | SetZipOpRhs
                                    | LengthOpRhs ] ;
(* Note: support for SetZipOpRhs in Comparison is OPTIONAL *)

ValueOpRhs = ( ValueEqRhs | ValueRelCompRhs ) ;

ValueEqRhs = EqualityOperator, Value ;

ValueRelCompRhs = RelativeComparisonOperator, OrderedValue ;

KnownOpRhs = IS, ( KNOWN | UNKNOWN ) ;

FuzzyStringOpRhs = CONTAINS, Value
                 | STARTS, [ WITH ], Value
                 | ENDS, [ WITH ], Value ;

SetOpRhs = HAS, ( ( Value | EqualityOperator, Value | RelativeComparisonOperator, OrderedValue | FuzzyStringOpRhs ) | ALL, ValueList | ANY, ValueList | ONLY, ValueList ) ;
(* Note: support for the alternatives with EqualityOperator, RelativeComparisonOperator, FuzzyStringOpRhs, and ONLY in SetOpRhs is OPTIONAL *)

SetZipOpRhs = PropertyZipAddon, HAS, ( ValueZip | ONLY, ValueZipList | ALL, ValueZipList | ANY, ValueZipList ) ;

PropertyZipAddon = Colon, Property, { Colon, Property } ;

LengthOpRhs = LENGTH, [ Operator ], Value ;
(* Note: support for [ Operator ] in LengthOpRhs is OPTIONAL *)

(* Property *)

Property = Identifier, { Dot, Identifier } ;

(* TOKENS *)

(* Separators: *)

OpeningBrace = '(', [Spaces] ;
ClosingBrace = ')', [Spaces] ;

Dot = '.', [Spaces] ;
Comma = ',', [Spaces] ;
Colon = ':', [Spaces] ;

(* Boolean relations: *)

AND = 'AND', [Spaces] ;
NOT = 'NOT', [Spaces] ;
OR = 'OR', [Spaces] ;

IS = 'IS', [Spaces] ;
KNOWN = 'KNOWN', [Spaces] ;
UNKNOWN = 'UNKNOWN', [Spaces] ;

CONTAINS = 'CONTAINS', [Spaces] ;
STARTS = 'STARTS', [Spaces] ;
ENDS = 'ENDS', [Spaces] ;
WITH = 'WITH', [Spaces] ;

LENGTH = 'LENGTH', [Spaces] ;
HAS = 'HAS', [Spaces] ;
ALL = 'ALL', [Spaces] ;
ONLY = 'ONLY', [Spaces] ;
ANY = 'ANY', [Spaces] ;

(* Comparison operator tokens: *)

Operator = ( EqualityOperator | RelativeComparisonOperator ) ;
EqualityOperator = [ '!' ], '=' , [Spaces] ;
RelativeComparisonOperator = ( '<' | '>' ), [ '=' ], [Spaces] ;

(* Boolean values *)

TRUE = 'TRUE', [Spaces] ;
FALSE = 'FALSE', [Spaces] ;

(* Property syntax *)

Identifier = LowercaseLetter, { LowercaseLetter | Digit }, [Spaces] ;

Letter = UppercaseLetter | LowercaseLetter ;

UppercaseLetter = 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I'
                | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P' | 'Q' | 'R'
                | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z' ;

LowercaseLetter = 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i'
                | 'j' | 'k' | 'l' | 'm' | 'n' | 'o' | 'p' | 'q' | 'r'
                | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z' | '_' ;

(* Strings: *)

String = '"', { EscapedChar }, '"', [Spaces] ;

EscapedChar = UnescapedChar | '\', '"' | '\', '\' ;

UnescapedChar = Letter | Digit | Space | Punctuator | UnicodeHighChar ;

Punctuator = '!' | '#' | '$' | '%' | '&' | "'" | '(' | ')' | '*' | '+'
           | ',' | '-' | '.' | '/' | ':' | ';' | '<' | '=' | '>' | '?'
           | '@' | '[' | ']' | '^' | '`' | '{' | '|' | '}' | '~' ;

(* BEGIN EBNF GRAMMAR Number *)
(* Number token syntax: *)

Number = [ Sign ] ,
         ( Digits, [ '.', [ Digits ] ] | '.' , Digits ),
         [ Exponent ], [Spaces] ;

Exponent =  ( 'e' | 'E' ) , [ Sign ] , Digits ;

Sign = '+' | '-' ;

Digits =  Digit, { Digit } ;

Digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;

(* White-space: *)

(* Special character tokens: *)

tab = ? \t ?;
nl  = ? \n ?;
cr  = ? \r ?;
vt  = ? \v ?;
ff  = ? \f ?;

Space = ' ' | tab | nl | cr | vt | ff ;

Spaces = Space, { Space } ;

(* The 'UnicodeHighChar' specifies any Unicode character above 0x7F.
   It is specified in this grammar by an extension to EBNF that allows a
   regular expression to specify terminal symbol ranges. *)

UnicodeHighChar = ? [^\x00-\x7F] ? ;

(* END EBNF GRAMMAR Number *)
(* END EBNF GRAMMAR Filter *)

Note: when implementing a parser according to this grammar, implementers MAY choose to construct a lexer that ignores all whitespace (spaces, tabs, newlines, carriage returns, vertical tabulation and form feed characters, as described in the grammar 'Space' definition), and use such a lexer to recognize language elements that are described in the (* TOKENS *) section of the grammar. In that case, it can be beneficial to remove the '[Spaces]' element from the Filter = [Spaces], Expression definition as well and use the remaining grammar rules as a parser generator input (e.g., for yacc, bison, antlr).
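A whitespace-skipping lexer of the kind described in the note can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: the token category names and the function `tokenize` are hypothetical, and keywords are recognized by matching any run of uppercase letters and then checking it against the keyword list from the (* TOKENS *) section.

```python
import re

# Keywords taken from the (* TOKENS *) section of the grammar.
KEYWORDS = {
    "AND", "NOT", "OR", "IS", "KNOWN", "UNKNOWN", "CONTAINS", "STARTS",
    "ENDS", "WITH", "LENGTH", "HAS", "ALL", "ONLY", "ANY", "TRUE", "FALSE",
}

# Category names are hypothetical; the patterns follow the grammar tokens.
TOKEN_SPEC = [
    ("SPACES", r"[ \t\n\r\v\f]+"),
    ("STRING", r'"(?:[^\\"]|\\.)*"'),
    ("NUMBER", r"[-+]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][-+]?[0-9]+)?"),
    ("KEYWORD", r"[A-Z]+"),
    ("IDENTIFIER", r"[a-z_][a-z_0-9]*"),
    ("OPERATOR", r"!=|<=|>=|=|<|>"),
    ("PUNCT", r"[().,:]"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def tokenize(text):
    """Split a filter string into (category, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind, value = match.lastgroup, match.group()
        if kind == "KEYWORD" and value not in KEYWORDS:
            raise ValueError(f"unknown keyword {value!r} at offset {pos}")
        if kind != "SPACES":  # whitespace is recognized but not emitted
            tokens.append((kind, value))
        pos = match.end()
    return tokens
```

The resulting token stream contains no whitespace, so a parser built on top of it can use the grammar rules with every '[Spaces]' element removed.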

Regular Expressions for OPTIMADE Filter Tokens

The strings below contain Perl-Compatible Regular Expressions (PCREs) to recognize identifiers, numbers, and string values as specified in this specification.

#BEGIN PCRE identifiers
[a-z_][a-z_0-9]*
#END PCRE identifiers

#BEGIN PCRE numbers
[-+]?(?:\d+(\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?
#END PCRE numbers

#BEGIN PCRE strings
"([^\\"]|\\.)*"
#END PCRE strings
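The PCRE patterns above can be tried out with Python's `re` module, which is not PCRE itself but supports every construct these three patterns use. The sketch below is illustrative only: the function name `classify` is hypothetical, and `re.ASCII` is passed so that `\d` matches only the digits 0-9, as in the grammar's Digit rule.

```python
import re

# The three PCRE patterns above, compiled unchanged.
IDENTIFIER = re.compile(r"[a-z_][a-z_0-9]*", re.ASCII)
NUMBER = re.compile(r"[-+]?(?:\d+(\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?", re.ASCII)
STRING = re.compile(r'"([^\\"]|\\.)*"', re.ASCII)


def classify(token):
    """Return the token kind if the whole token matches a pattern, else None."""
    for kind, pattern in (
        ("identifier", IDENTIFIER),
        ("number", NUMBER),
        ("string", STRING),
    ):
        if pattern.fullmatch(token):
            return kind
    return None
```

Using `fullmatch` rather than `match` ensures that a token with trailing garbage, such as an exponent marker without digits, is rejected instead of being partially accepted.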

The strings below contain Extended Regular Expressions (EREs) to recognize identifiers, numbers, and string values as specified in this specification.

#BEGIN ERE identifiers
[a-z_][a-z_0-9]*
#END ERE identifiers

#BEGIN ERE numbers
[-+]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][-+]?[0-9]+)?
#END ERE numbers

#BEGIN ERE strings
"([^\"]|\\.)*"
#END ERE strings