Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: HiFa v1.1.0 schema development #1978

Open
wants to merge 40 commits into
base: main
Choose a base branch
from
Open

Conversation

kratsg
Copy link
Contributor

@kratsg kratsg commented Sep 1, 2022

Pull Request Description

The following is happening:

  • resolves Update validator for jsonschema #1980
  • start a HiFa JSON v1.1.0, but not release it.
  • allow for defining a default schema version for the different schemas for pyhf to support in the current implementation
  • provide a utility to migrate (upgrade/downgrade) provided specifications (workspace, patchset) to a different version
  • provide command-line interface for upgrading/downgrading
  • flesh out the type-hints for pyhf.schema

To-Do:

  • agree on how upgrade/downgrade works internally: separate classes, or single class like suggested by @matthewfeickert in comments, or more like functional form similar to what nbconvert does (see my comment below feat: HiFa v1.1.0 schema development #1978 (comment))
  • determine whether we allow workspace/model and patchset to have different versions or not, or require them to always be synced (this changes what we do for pyhf.schema.versions to pyhf.schema.version depending...)

Schema updates:

  • Update from Draft 6 to 2020-12 Draft instead
  • Support serialization of custom function (another PR)
  • Support serialization of multiple POIs (another PR)
  • Support serialization of bin-wise fixed/constant parameters (another PR)
  • Support HS3's HistFactory schema (another PR)

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR

@kratsg kratsg added feat/enhancement New feature or request schema and spec JSON schema type checking Related to types and type checking labels Sep 1, 2022
@kratsg kratsg self-assigned this Sep 1, 2022
@kratsg kratsg marked this pull request as draft September 1, 2022 22:10
@matthewfeickert matthewfeickert changed the base branch from master to main September 21, 2022 20:52
@kratsg kratsg marked this pull request as ready for review December 6, 2022 21:56
@kratsg kratsg changed the title feat: Update JSON Schema draft feat: HiFa v1.1.0 schema development Dec 6, 2022
@matthewfeickert
Copy link
Member

@kratsg you have this slated for v0.7.1 at the moment. Does this mean that you feel we have a problem that needs to be patched? If not, we should probably place this in v0.8.0.

@kratsg
Copy link
Contributor Author

kratsg commented Dec 6, 2022

@kratsg you have this slated for v0.7.1 at the moment. Does this mean that you feel we have a problem that needs to be patched? If not, we should probably place this in v0.8.0.

It's going into the next release though. I don't know if we want to start juggling development HiFa v1.0.0 and v1.1.0 at the same time? But this will need to go in, so that we can deal with multiple schema versions.

@matthewfeickert
Copy link
Member

It's going into the next release though. I don't know if we want to start juggling development HiFa v1.0.0 and v1.1.0 at the same time? But this will need to go in, so that we can deal with multiple schema versions.

Okay, then I've switched it to be v0.8.0. After the issues with getting v0.7.0 out and also accepting PRs I think it makes sense (for me to figure out how to do release branches) to start using patch releases for when we actually have a bug we need to go back and patch and start having other releases be minor (until we hit v1.0.0).

@codecov
Copy link

codecov bot commented Dec 7, 2022

Codecov Report

Base: 98.30% // Head: 98.30% // No change to project coverage 👍

Coverage data is based on head (2b168b7) compared to base (2b168b7).
Patch has no changes to coverable lines.

❗ Current head 2b168b7 differs from pull request most recent head 178b3db. Consider uploading reports for the commit 178b3db to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1978   +/-   ##
=======================================
  Coverage   98.30%   98.30%           
=======================================
  Files          69       69           
  Lines        4531     4531           
  Branches      645      645           
=======================================
  Hits         4454     4454           
  Misses         45       45           
  Partials       32       32           
Flag Coverage Δ
contrib 97.88% <0.00%> (ø)
doctest 61.15% <0.00%> (ø)
unittests-3.10 96.31% <0.00%> (ø)
unittests-3.8 96.33% <0.00%> (ø)
unittests-3.9 96.35% <0.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@matthewfeickert matthewfeickert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kratsg for this nice work. Overall this looks good. 👍

I've added a question and some small formatting suggestions, but as we need to fix up the coverage thing I could add those in this weekend once I am at CERN so that you don't waste time having to poke at coverage's decisions. It is probably faster to locally run

$ nox --session tests --python 3.10 -- coverage
$ nox --session coverage
$ xdg-open ./htmlcov/index.html

to debug coverage than to have GHA do all of that and then have Codecov report it back to us.

If you can also fill out the PR body with some additional context that would be helpful, given that you've done a lot of nice work here (I can also summarize this later, but am not able to succinctly do so now given a lot was added to support this).

Comment on lines +7 to +12
class Upgrade_1_0_1:
"""
Used for testing functionality of upgrade.
"""

version: SchemaVersion = '1.0.1'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class Upgrade_1_0_1:
"""
Used for testing functionality of upgrade.
"""
version: SchemaVersion = '1.0.1'
class Upgrade(version: SchemaVersion = "1.0.1"):
"""
Used for testing functionality of upgrade.
"""

Maybe I'm missing something about what the purpose of upgrader is, but is there a reason why we need to make this class schema v1.0.1 hardcoded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need an Upgrade_X_Y_Z class for every single version we plan to support upgrading to. When we add a new version of the spec, we have to have a new upgrade class that knows how to upgrade from an old version to the version the class represents.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kratsg Ah okay, so the real reason is that we can't deterministically know what version we're upgrading from in advance?

Do we have a plan for how to handle upgrading multiple versions in one go? e.g. Let's say we have v1.0.0, v1.1.0, v1.2.0, and v2.0.0. If at this point someone comes along with a v1.0.0schema and wants to move fromv1.0.0tov2.0.0` what's the plan for doing this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have ideas on how to do this with python's mro

Comment on lines +57 to +59
def upgrade(*, to_version: SchemaVersion | None = None) -> type[UpgradeProtocol]:
if to_version is None or to_version == '1.0.1':
return Upgrade_1_0_1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def upgrade(*, to_version: SchemaVersion | None = None) -> type[UpgradeProtocol]:
if to_version is None or to_version == '1.0.1':
return Upgrade_1_0_1
def upgrade(*, to_version: SchemaVersion | None = None) -> type[UpgradeProtocol]:
if to_version is None or to_version == "1.0.1":
return Upgrade("1.0.1")

Follow up to the above.

tests/test_upgrade.py Outdated Show resolved Hide resolved
@matthewfeickert
Copy link
Member

matthewfeickert commented Dec 9, 2022

(Ignore the docs failing as this is Springer's fault: https://twitter.com/HEPfeickert/status/1600915112403304449?s=20&t=kjTW5TNbU_Jukxb0ALd0uQ. We can either fix this up locally in another PR or just wait for them to fix it.)

Copy link
Member

@matthewfeickert matthewfeickert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kratsg sorry just a few more questions to help me understand this better. As this will go in this weekend can I also ask you sometime before Monday to write up a summary in the PR body and the squash and merge summary commit? This is a huge amount of work (:bow:) and I want to make sure that we have a summary of this as I think if I had to try to sit down and reread all of this again late at night I might be pressed to do so without some accompanying text.

Comment on lines +7 to +12
class Upgrade_1_0_1:
"""
Used for testing functionality of upgrade.
"""

version: SchemaVersion = '1.0.1'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kratsg Ah okay, so the real reason is that we can't deterministically know what version we're upgrading from in advance?

Do we have a plan for how to handle upgrading multiple versions in one go? e.g. Let's say we have v1.0.0, v1.1.0, v1.2.0, and v2.0.0. If at this point someone comes along with a v1.0.0schema and wants to move fromv1.0.0tov2.0.0` what's the plan for doing this?

@@ -453,6 +453,7 @@ def test_combine_workspace_incompatible_poi(workspace_factory, join):
def test_combine_workspace_diff_version(workspace_factory, join):
ws = workspace_factory()
ws.version = '1.0.0'
ws['version'] = '1.0.0'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably missing the point of this test, but if you put a breakpoint just after ws = workspace_factory()

(Pdb) ws.version
'1.0.0'
(Pdb) ws["version"]
'1.0.0'

so is the reason to explicitly set this a means to solidify the versions that will be used in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a test to enforce absolutely different versions.

@kratsg
Copy link
Contributor Author

kratsg commented Mar 23, 2023

Look into how nbformat handles converting between versions: (thanks @agoose77 ) which is done by defining a v#/convert.py such as for v2 here.

@kratsg kratsg mentioned this pull request Dec 8, 2023
13 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat/enhancement New feature or request schema and spec JSON schema type checking Related to types and type checking
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

Update validator for jsonschema
2 participants