ASDF schemas are YAML documents that describe validations to be performed on tagged objects nested within the ASDF tree or on the tree itself. Schemas can validate the presence, datatype, and value of objects and their properties, and can be combined in different ways to facilitate reuse.
These schemas, though expressed in YAML, are structured according to the JSON Schema Draft 4 specification. The Understanding JSON Schema book is a good place to start for users not already familiar with JSON Schema. Keep in mind that the book also covers later drafts of the JSON Schema spec, so certain features (constant values, conditional subschemas, etc.) are not available when writing schemas for ASDF. The book makes clear which features were introduced after Draft 4.
Here is an example of an ASDF schema that validates an object with a numeric value and corresponding unit:
%YAML 1.1
---
$schema: http://stsci.edu/schemas/yaml-schema/draft-01
id: asdf://asdf-format.org/core/schemas/quantity-2.0.0
title: Quantity object containing numeric value and unit
description: >-
  An object with a numeric value, which may be a scalar
  or an array, and associated unit.
type: object
properties:
  value:
    description: A vector of one or more values
    anyOf:
      - type: number
      - tag: tag:stsci.edu:asdf/core/ndarray-1.0.0
  unit:
    description: The unit corresponding to the values
    tag: tag:stsci.edu:asdf/unit/unit-1.0.0
required: [value, unit]
...
This is similar to the quantity schema of the ASDF Standard, found here <asdf-standard:unit-schema>, but has been updated to reflect current recommendations regarding schemas. Let's walk through this schema line by line.
%YAML 1.1
---
These first two lines form the header of the file. The %YAML 1.1 directive indicates that we're following version 1.1 of the YAML spec. The --- marks the start of a new YAML document.
$schema: http://stsci.edu/schemas/yaml-schema/draft-01
The $schema property contains the URI of the schema that validates this document. Since our document is itself a schema, the URI refers to a metaschema. ASDF comes with three built-in metaschemas:
- http://json-schema.org/draft-04/schema - The JSON Schema Draft 4 metaschema. Includes basic validators and combiners.
- http://stsci.edu/schemas/yaml-schema/draft-01 - The YAML Schema metaschema. Includes everything in JSON Schema Draft 4, plus additional YAML-specific validators including tag and propertyOrder.
- http://stsci.edu/schemas/asdf/asdf-schema-1.0.0 - The ASDF Schema metaschema. Includes everything in YAML Schema, plus additional ASDF-specific validators that check ndarray properties.
Our schema makes use of the tag validator, so we're specifying the YAML Schema URI here.
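The choice among the three metaschemas can be sketched as a small helper that scans a loaded schema for validator names. This is only a rough heuristic, not part of the asdf library: the function names are hypothetical, the ndarray validator names (ndim, max_ndim, datatype) are an assumption about what the ASDF Schema metaschema adds, and a property that merely happens to be named like a validator would also be counted.

```python
# Metaschema URIs listed above, from least to most specific.
JSON_SCHEMA = "http://json-schema.org/draft-04/schema"
YAML_SCHEMA = "http://stsci.edu/schemas/yaml-schema/draft-01"
ASDF_SCHEMA = "http://stsci.edu/schemas/asdf/asdf-schema-1.0.0"

# Validators the text names as YAML Schema additions.
YAML_VALIDATORS = {"tag", "propertyOrder"}
# Assumption: examples of ndarray validators added by the ASDF Schema metaschema.
NDARRAY_VALIDATORS = {"ndim", "max_ndim", "datatype"}


def collect_keys(node):
    """Return every mapping key that appears anywhere in a loaded schema."""
    keys = set()
    if isinstance(node, dict):
        for key, value in node.items():
            keys.add(key)
            keys |= collect_keys(value)
    elif isinstance(node, list):
        for item in node:
            keys |= collect_keys(item)
    return keys


def suggest_metaschema(schema):
    """Pick the least specific metaschema that covers the keys in use."""
    keys = collect_keys(schema)
    if keys & NDARRAY_VALIDATORS:
        return ASDF_SCHEMA
    if keys & YAML_VALIDATORS:
        return YAML_SCHEMA
    return JSON_SCHEMA


quantity = {
    "type": "object",
    "properties": {
        "value": {
            "anyOf": [
                {"type": "number"},
                {"tag": "tag:stsci.edu:asdf/core/ndarray-1.0.0"},
            ]
        },
        "unit": {"tag": "tag:stsci.edu:asdf/unit/unit-1.0.0"},
    },
    "required": ["value", "unit"],
}
print(suggest_metaschema(quantity))
```

For the quantity schema the helper lands on the YAML Schema URI, because the tag validator is the most specialized validator in use.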
id: asdf://asdf-format.org/core/schemas/quantity-2.0.0
The id property contains the URI that uniquely identifies our schema. This URI is how we'll refer to the schema when using the asdf library.
title: Quantity object containing numeric value and unit
description: >-
  An object with a numeric value, which may be a scalar
  or an array, and associated unit.
Title and description are optional (but recommended) documentation properties. They may appear at any level of the schema, as often as needed, and have no effect on the validation process.
type: object
This line invokes the type validator to check the data type of the top-level value. We're asserting that the type must be a YAML mapping, which in Python is represented as a dict.
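As a rough illustration of how the JSON Schema Draft 4 type names line up with Python objects once a YAML document has been loaded, here is a minimal sketch. The matches_type function is hypothetical and is not how the asdf library implements the type validator; note that bool has to be excluded explicitly from the numeric types because it is a subclass of int in Python.

```python
def matches_type(instance, type_name):
    """Return True if instance corresponds to the given Draft 4 type name."""
    if type_name == "object":
        return isinstance(instance, dict)
    if type_name == "array":
        return isinstance(instance, list)
    if type_name == "string":
        return isinstance(instance, str)
    if type_name == "boolean":
        return isinstance(instance, bool)
    if type_name == "null":
        return instance is None
    if type_name == "integer":
        # bool is a subclass of int, so rule it out explicitly.
        return isinstance(instance, int) and not isinstance(instance, bool)
    if type_name == "number":
        return isinstance(instance, (int, float)) and not isinstance(instance, bool)
    raise ValueError(f"unknown type name: {type_name!r}")


print(matches_type({"value": 1.0, "unit": "m"}, "object"))
print(matches_type([1, 2, 3], "object"))
```

The first call reports a match because a YAML mapping loads as a dict; the second does not, because a YAML sequence loads as a list.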
properties:
The properties validator announces that we'd like to validate certain named properties of the mapping. If a property is listed here and is present in the mapping, it will be validated accordingly.
  value:
    description: A vector of one or more values
Here we're identifying a property named value that we'd like to validate. The description is used to add some additional documentation.
    anyOf:
The anyOf validator is one of JSON Schema's combiners. The value property will be validated against each of the following subschemas, and if any validates successfully, the entire anyOf will be considered valid. Other available combiners are allOf, which requires that all subschemas validate successfully, oneOf, which requires that one and only one of the subschemas validates, and not, which requires that a single subschema does not validate.
      - type: number
The first subschema in the list contains a type validator that succeeds if the entity assigned to value is a numeric literal.
      - tag: tag:stsci.edu:asdf/core/ndarray-1.0.0
The second subschema contains a tag validator, which makes an assertion regarding the YAML tag URI of the object assigned to value. In this subschema we're requiring the tag of an ndarray-1.0.0 object, which is how n-dimensional arrays are represented in an ASDF tree.
The net effect of the anyOf combiner and its two subschemas is: validate successfully if the value object is either a numeric literal or an n-dimensional array.
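The semantics of the four combiners can be sketched in plain Python by treating each subschema as a function that returns True or False. This is an illustration only, not the validation code used by asdf; the Tagged class is a hypothetical stand-in for a tagged object in the tree, and the helper names are made up for the example.

```python
from dataclasses import dataclass

NDARRAY_TAG = "tag:stsci.edu:asdf/core/ndarray-1.0.0"


@dataclass
class Tagged:
    """Hypothetical stand-in for a tagged object in an ASDF tree."""
    tag: str
    data: object


def any_of(instance, checks):
    """Valid if at least one subschema check succeeds."""
    return any(check(instance) for check in checks)


def all_of(instance, checks):
    """Valid only if every subschema check succeeds."""
    return all(check(instance) for check in checks)


def one_of(instance, checks):
    """Valid only if exactly one subschema check succeeds."""
    return sum(1 for check in checks if check(instance)) == 1


def not_(instance, check):
    """Valid only if the single subschema check fails."""
    return not check(instance)


def is_number(instance):
    # Mirrors "type: number"; bool is excluded because it subclasses int.
    return isinstance(instance, (int, float)) and not isinstance(instance, bool)


def is_ndarray(instance):
    # Mirrors the tag validator for ndarray-1.0.0.
    return isinstance(instance, Tagged) and instance.tag == NDARRAY_TAG


def value_is_valid(value):
    """The anyOf from the quantity schema: a number or a tagged ndarray."""
    return any_of(value, [is_number, is_ndarray])


print(value_is_valid(3.5))
print(value_is_valid(Tagged(NDARRAY_TAG, [1, 2, 3])))
print(value_is_valid("3.5"))
```

The first two calls succeed because one of the two checks passes; the string fails both checks, so the anyOf as a whole fails.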
  unit:
    description: The unit corresponding to the values
    tag: tag:stsci.edu:asdf/unit/unit-1.0.0
The unit property has another bit of documentation and a tag validator that requires it to be a unit-1.0.0 object.
required: [value, unit]
Since the properties validator does not require the presence of its listed properties, we need another validator to do that. The required validator defines a list of properties that need to be present if validation is to succeed.
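The division of labor between properties and required can be sketched as follows. This is a simplified model, not asdf's implementation: check_object is a hypothetical helper, and each property's subschema is reduced to a single True/False function.

```python
def check_object(instance, properties, required=()):
    """Return a list of error messages for a mapping.

    properties maps a property name to a function returning True or False.
    required lists the names that must be present.
    """
    errors = []
    for name, check in properties.items():
        # Like "properties": only names that are actually present are checked.
        if name in instance and not check(instance[name]):
            errors.append(f"{name!r} is invalid")
    for name in required:
        # Like "required": absence is the error, whatever the value would be.
        if name not in instance:
            errors.append(f"{name!r} is a required property")
    return errors


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_string(value):
    return isinstance(value, str)


properties = {"value": is_number, "unit": is_string}

# Without "required", an empty mapping passes because nothing is checked.
print(check_object({}, properties))
# With "required", the missing names are reported.
print(check_object({}, properties, required=["value", "unit"]))
```

The unit check here is a plain string test purely to keep the sketch short; the schema in the text uses a tag validator for that property.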
...
Finally, the YAML document end indicator marks the end of the schema.
The asdf.schema.check_schema function performs basic syntax checks on a schema and will raise an error if it discovers a problem. It does not currently accept URIs and requires that the schema already be loaded into Python objects. If the schema is already registered with the asdf library as a resource (see extending_resources), it can be loaded and checked like this:
from asdf.schema import load_schema, check_schema
schema = load_schema("asdf://example.com/example-project/schemas/foo-1.0.0")
check_schema(schema)
Otherwise, the schema can be loaded using pyyaml directly:
from asdf.schema import check_schema
import yaml
# Load the schema file into plain Python objects; the with block closes the file.
with open("/path/to/foo-1.0.0.yaml") as f:
    schema = yaml.safe_load(f)
check_schema(schema)
Getting a schema to validate as intended can be a tricky business, so it's helpful to test validation against some example objects as you go along. The asdf.schema.validate function will validate a Python object against a schema:
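from asdf.schema import validate
import yaml
# Load the schema file into plain Python objects; the with block closes the file.
with open("/path/to/foo-1.0.0.yaml") as f:
    schema = yaml.safe_load(f)
# Example object to validate against the schema.
obj = {"foo": "bar"}
validate(obj, schema=schema)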
The validate function will return successfully if the object is valid, or raise an error if not.
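Since validation either returns or raises, trying several example objects in one pass can be sketched with a small loop that records each outcome. The try_examples helper below is hypothetical and not part of asdf. It catches Exception broadly because the specific error class is not named here, and require_keys is only a stand-in validator so the sketch runs without asdf installed.

```python
def try_examples(validate, schema, examples):
    """Call validate(obj, schema=schema) on each example and record the outcome.

    examples is a list of (label, obj) pairs. The result maps each label to
    None when validation passed, or to the error message when it raised.
    """
    outcomes = {}
    for label, obj in examples:
        try:
            validate(obj, schema=schema)
        except Exception as exc:  # the specific error class is not assumed
            outcomes[label] = str(exc)
        else:
            outcomes[label] = None
    return outcomes


def require_keys(obj, schema):
    """Stand-in validator (hypothetical): raise if a required key is missing."""
    missing = [key for key in schema.get("required", []) if key not in obj]
    if missing:
        raise ValueError(f"missing required properties: {missing}")


outcomes = try_examples(
    require_keys,
    {"required": ["foo"]},
    [("has foo", {"foo": "bar"}), ("empty", {})],
)
for label, error in outcomes.items():
    print(f"{label}: {'ok' if error is None else error}")
```

With asdf installed, the validate function imported from asdf.schema could be passed in place of require_keys, together with a schema loaded as shown earlier.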
- JSON Schema Draft 4
- Understanding JSON Schema
- Unit Schemas <asdf-standard:unit-schema>