JSON schema and validation code for HEPData submissions
GraemeWatt schema: impose "maxLength" on strings with DB restrictions, tidy up docs
* Remove unused "associated_records" from submission schema.
* Require "description" for submission "additional_resources".
* Remove optional "name" from data schema.
* Impose "maxLength" on strings with database length restrictions.
* Tidy up docs and remove redundant files.

Signed-off-by: Graeme Watt <graeme.watt@durham.ac.uk>
Latest commit ee7c79b Jun 8, 2018

README.rst

HEPData Validator

Installation

If you can, install LibYAML (a C library for parsing and emitting YAML) on your machine. This allows the use of CLoader for faster loading of YAML files. The difference is negligible for small files, but loading is markedly faster for larger documents.
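Whether the faster loader is available can be checked from Python. The following is a minimal sketch of the usual PyYAML fallback idiom, not code taken from this package: it uses CSafeLoader (the safe variant of CLoader) when PyYAML was built against LibYAML and falls back to the pure-Python SafeLoader otherwise. The sample document string is made up for illustration.

```python
import yaml

try:
    # only importable when PyYAML was built with the LibYAML bindings
    from yaml import CSafeLoader as Loader
except ImportError:
    # pure-Python fallback: same result, slower on large documents
    from yaml import SafeLoader as Loader

# made-up sample content, standing in for a real data file
document = "independent_variables: []\ndependent_variables: []\n"
data = yaml.load(document, Loader=Loader)

# report which loader was picked up and what it parsed
print(Loader.__name__, data)
```

If the printed loader name is SafeLoader, LibYAML is not being used and large files will load more slowly.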

Via pip:

pip install hepdata-validator

Via GitHub (for developers):

git clone https://github.com/HEPData/hepdata-validator
cd hepdata-validator
pip install -e . -r requirements.txt
py.test testsuite

Usage

To validate files, you need to instantiate a validator (I love OO).

from hepdata_validator.submission_file_validator import SubmissionFileValidator

submission_file_validator = SubmissionFileValidator()
submission_file_path = 'submission.yaml'

# the validate method takes a string representing the file path.
is_valid_submission_file = submission_file_validator.validate(file_path=submission_file_path)

# if there are any error messages, they are retrievable through this call
submission_file_validator.get_messages()

# the error messages can be printed
submission_file_validator.print_errors(submission_file_path)

Data file validation is exactly the same.

from hepdata_validator.data_file_validator import DataFileValidator

data_file_validator = DataFileValidator()

# the validate method takes a string representing the file path.
data_file_validator.validate(file_path='data.yaml')

# if there are any error messages, they are retrievable through this call
data_file_validator.get_messages()

# the error messages can be printed
data_file_validator.print_errors('data.yaml')
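When a submission has several data files, the same calls can be wrapped in a loop. The following is a minimal sketch under two assumptions drawn from the examples in this README: validate returns a truthy value for a valid file, and get_messages accepts a file path. The helper name validate_many is hypothetical and not part of hepdata_validator.

```python
def validate_many(validator, file_paths):
    """Run one validator over several files.

    Returns a dict mapping each invalid file path to its messages.
    validate_many is a hypothetical helper, not part of hepdata_validator.
    """
    failures = {}
    for path in file_paths:
        # validate() is assumed to return a truthy value for a valid file
        if not validator.validate(file_path=path):
            # get_messages(path) is assumed to return that file's messages
            failures[path] = validator.get_messages(path)
    return failures
```

For example, validate_many(DataFileValidator(), ['data1.yaml', 'data2.yaml']) would return an empty dict when both (hypothetical) files are valid.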

Optionally, if you have already loaded the YAML document, you can pass the loaded object to validate through the data argument. You must still pass file_path, since it is used as the key in the error message lookup map.

from hepdata_validator.data_file_validator import DataFileValidator
import yaml

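# load the YAML document yourself; safe_load avoids constructing arbitrary Python objects
with open('data.yaml', 'r') as stream:
    file_contents = yaml.safe_load(stream)
data_file_validator = DataFileValidator()

data_file_validator.validate(file_path='data.yaml', data=file_contents)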

data_file_validator.get_messages('data.yaml')

An example offline validation script uses the hepdata_validator package to validate the submission.yaml file and all YAML data files of a HEPData submission.
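One part of such a script is finding the data files to check. The sketch below is not the script referred to above. It assumes submission.yaml has already been parsed into a sequence of documents (for example with yaml.safe_load_all) and that each table document names its file under a data_file key, as in the HEPData submission format. The helper list_data_files and the sample documents are hypothetical.

```python
import os


def list_data_files(documents, directory):
    """Return paths of the data files referenced by parsed submission documents.

    list_data_files is a hypothetical helper, not part of hepdata_validator.
    """
    paths = []
    for document in documents:
        # skip empty documents and documents that do not reference a data file
        if isinstance(document, dict) and 'data_file' in document:
            paths.append(os.path.join(directory, document['data_file']))
    return paths


# made-up documents standing in for the parsed contents of submission.yaml
documents = [
    {'comment': 'Example submission'},
    {'name': 'Table 1', 'data_file': 'data1.yaml'},
    None,
    {'name': 'Table 2', 'data_file': 'data2.yaml'},
]
data_files = list_data_files(documents, 'submission')
print(data_files)
```

Each returned path can then be passed to a DataFileValidator in the same way as 'data.yaml' in the examples above.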