ASDF Format Validator
Landing page for all things ASDF: https://seismic-data.org
This is the Adaptable Seismic Data Format. If you are looking for the Advanced Scientific Data Format, go here: https://asdf.readthedocs.io/en/latest/
This module validates ASDF (Adaptable Seismic Data Format) files to ensure consistency and compatibility between implementations.
Although it is written in Python, it is completely independent of any ASDF implementation.
It requires HDF5 in a recent version to be installed (in particular the
h5ls programs must be installed and in the
Python versions 2.7 and 3.4 have been tested; others might well work. Additional required Python modules are:
Cloning the repository is currently necessary:
$ git clone https://github.com/SeismicData/asdf_validate.git
$ cd asdf_validate
$ pip install -v -e .
The module will install a single command:
$ asdf-validate seismo.h5
Valid ASDF File!
Any other output means your file is not valid. The error messages should give hints on how to fix it.
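For scripted use, the command can be wrapped from Python. The following is a minimal sketch and not part of this module: the helper names `output_indicates_valid` and `validate_file` are hypothetical, and it assumes that the success message shown above is printed verbatim as the only output. It uses `subprocess.run`, which needs Python 3.5 or newer.

```python
import subprocess

# Success message as shown in the usage example above.
VALID_MESSAGE = "Valid ASDF File!"


def output_indicates_valid(output):
    """Return True only if the output is exactly the success message.

    Assumption: a valid file produces this message and nothing else.
    """
    return output.strip() == VALID_MESSAGE


def validate_file(path, command="asdf-validate"):
    """Run the validator on `path` and return (is_valid, raw_output)."""
    result = subprocess.run(
        [command, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture error messages as well
        universal_newlines=True,   # decode the output to text
    )
    return output_indicates_valid(result.stdout), result.stdout
```

Calling `validate_file("seismo.h5")` would then return a flag plus the raw output, which can be logged when the file is not valid.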
What Does it Do?
It performs a couple of validations:
- Checks if the file exists.
- Checks if it is an HDF5 file.
- Checks if the file_format_version attributes are set and correspond to expected values.
- It transforms the structure of the file to a JSON representation which is then checked against a JSON Schema. This ensures a number of things:
  - The general layout and naming scheme are enforced.
  - Data spaces and data types of attributes are enforced.
  - Waveform data can only be 32/64 bit, little/big endian, IEEE floats or two's complement integers.
  - Waveform data sets must have sampling_rate attributes of the correct data space and type.
  - Naming scheme of the auxiliary data is enforced.
  - XML files are stored in a consistent manner.
- It makes sure all waveforms are in the correct station group.
- It validates the QuakeML file against the QuakeML schema.
- It validates all found StationXML files against the StationXML schema.
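The first two checks in the list above can be approximated with the standard library alone. This is a sketch under stated assumptions rather than the validator's actual code: the function name `looks_like_hdf5` and the `max_offset` limit are hypothetical. It relies on the HDF5 format signature, the eight bytes `\x89HDF\r\n\x1a\n`, which the HDF5 file format specification allows at byte offset 0, 512, 1024, 2048, and so on.

```python
import os

# Eight-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path, max_offset=2 ** 20):
    """Return True if `path` exists and carries an HDF5 signature.

    `max_offset` is an arbitrary limit for this sketch so that the
    search stops early on large non-HDF5 files.
    """
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as fh:
        offset = 0
        while offset <= max_offset:
            fh.seek(offset)
            chunk = fh.read(len(HDF5_SIGNATURE))
            if chunk == HDF5_SIGNATURE:
                return True
            if len(chunk) < len(HDF5_SIGNATURE):
                break  # reached the end of the file
            # Candidate offsets are 0, 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    return False
```

A signature match only shows that the file claims to be HDF5; the remaining checks still need to read the actual structure of the file.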
A number of checks that should be implemented in the future in no particular order:
- The times in the data set names of the waveforms should correspond to the actual times of the data.
- The various event resource identifiers on the waveform datasets are valid identifiers.
- StationXML files only contain information about the current station.
- Provenance is not yet validated (this has to wait until SEIS-PROV is done).
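As an illustration of the first item in this list, the sketch below compares the times encoded in a waveform data set name with the times implied by the data. It is not implemented in the validator, and it rests on several assumptions: that names follow the layout `NET.STA.LOC.CHA__START__END__TAG` with second-resolution timestamps, that the start time of the data is already available as a `datetime`, and that a tolerance of one second is acceptable. The function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout inside the data set names.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_waveform_name(name):
    """Split an assumed 'ID__START__END__TAG' name into its parts."""
    channel_id, start, end, tag = name.split("__")
    return (
        channel_id,
        datetime.strptime(start, TIME_FORMAT),
        datetime.strptime(end, TIME_FORMAT),
        tag,
    )


def name_matches_data(name, starttime, sampling_rate, npts,
                      tolerance=timedelta(seconds=1)):
    """Check that the times in `name` agree with the data's time span.

    The end time of the data is the time of its last sample:
    starttime + (npts - 1) / sampling_rate.
    """
    _, name_start, name_end, _ = parse_waveform_name(name)
    duration = timedelta(seconds=(npts - 1) / float(sampling_rate))
    data_end = starttime + duration
    return (abs(name_start - starttime) <= tolerance
            and abs(name_end - data_end) <= tolerance)
```

The tolerance exists because names with second resolution cannot represent sub-second start and end times exactly.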