Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remote schema validations #241

Merged
merged 5 commits into from
Oct 5, 2020
Merged

Remote schema validations #241

merged 5 commits into from
Oct 5, 2020

Conversation

Sinclert
Copy link
Contributor

@Sinclert Sinclert commented Sep 2, 2020

This PR substituted the previous fork-based PR #208.

In a nutshell, it makes use of hepdata-validator (0.2.2) 0.2.3 new features to support remotely-defined schemas when validating submission data.

Clarifications:

  • I treated the data_schema submission field as optional (check here).
  • I have not bumped up the HEPData global version.

@coveralls
Copy link

coveralls commented Sep 2, 2020

Coverage Status

Coverage increased (+0.2%) to 81.94% when pulling 7ceb122 on remote-schemas into c1a426c on master.

@cranmer
Copy link

cranmer commented Sep 25, 2020

Hi all, I just wanted to check in on this.

@GraemeWatt
Copy link
Member

Thanks for the PR and sorry for the slow response. I'd be happy to merge to master after a few changes.

  1. I constructed a test submission using some of the test files we used for the validator. Specifically, I copied these lines into a submission.yaml file and packaged it together with valid_file_custom_remote.json in a zip file. This test submission passes validation and processing, but the data file valid_file_custom_remote.json cannot be rendered because it does not contain the usual independent_variables and dependent_variables of a normal HEPData table. Specifically, an exception KeyError: 'independent_variables' is returned by this line. I think you need to modify ACCEPTED_REMOTE_SCHEMAS so that schemas is just an empty list [] for now, i.e. we don't currently support any pyhf schemas. At a later stage, once support is added elsewhere in the HEPData code, specific pyhf schemas could be added. From the discussion in the main pyhf issue submission: provide native support for individual "pyhf" JSON files #164, none of these schemas ['jsonpatch.json', 'measurement.json', 'model.json', 'patchset.json', 'workspace.json'] correspond to what will eventually be uploaded. I think the pyhf authors will need to define a new schema called something like 'likelihood.json' corresponding to the new YAML data format they're proposing. I didn't get time yet to following up on the discussion in submission: provide native support for individual "pyhf" JSON files #164, since it's still a low priority for me compared to other outstanding tasks.

  2. For the new validators.py module, is it possible to write some tests to increase coverage?

  3. In get_submission_validator you can remove the option for old_hepdata (and the comment), since I've fixed oldhepdata: remove default None values for the data_license in submission.yaml hepdata-converter#32 and updated the hepdata/hepdata-converter-ws Docker image today. I was going to make the change directly to master, but it might be easier for you to make the change in this PR, even though it's not strictly related.

  4. Please merge the latest changes from master into your branch.

@Sinclert
Copy link
Contributor Author

Sinclert commented Oct 2, 2020

Hello @GraemeWatt ,

Thanks for the clarification on the renderable conditions for new schemas, that was really useful.

This PR have been updated, including:

  • Rebase the PR branch with the latest version from master.
  • Comment the listed pyhf schemas, until they comply the necessary conditions to be renderable. (d2eebf6)
  • Remove the unnecessary check and comment on the get_submission_validator function (4d201f3).
  • Define validators_test.py module (which derived into a separate PR within hepdata-validator) (ce05ac2).
  • Bump up hepdata-validator requirement version to 0.2.3 (7ceb122).

The CI validation of this PR (✅ / ❌) will be subject to the acceptance of the hepdata-validator derived PR, and the publication of its resulting Python package (hepdata-validator 0.2.3).

@GraemeWatt
Copy link
Member

Thanks for addressing all my comments. I reran the Travis jobs now that hepdata-validator 0.2.3 is released and all checks have passed, with test coverage increased by 0.2%. I'll merge to master now.

@cranmer
Copy link

cranmer commented Oct 5, 2020

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants