Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New metadata API: add support for ADR 0008 #1345

Merged
merged 1 commit into from
Apr 22, 2021

Conversation

MVrachev
Copy link
Collaborator

@MVrachev MVrachev commented Apr 9, 2021

Description of the changes being introduced by the pull request:

In order to support ADR 0008 we would want to accept unrecognized
fields in all metadata classes.
Input that contains unknown fields in the 'signed' dictionary should
successfully deserialize into a Metadata object, and that object should
successfully serialize with the unknown fields intact.

Also, we should test that we support unrecognized fields when adding
new classes or modifying existing ones to make sure we support
ADR 0008.

Signed-off-by: Martin Vrachev mvrachev@vmware.com

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit out of my depth with metadata API still but let's try:

  • Please don't expect commit message reader to read the ADR: mentioning it is good but especially the title should explain what the commit does
  • I wonder if the repetition could be fixed by making the implementation part of Signed -- I left more details in the code

Other comments are more trivial

tests/test_api.py Outdated Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 16, 2021

Thank you @jku for your great review!
I amended my last commit to fix the commit message and used the opportunity to apply your suggestions.

I particularly found that comment really useful which simplified the code a lot!
Also, it seems to me that this comment is no longer applicable because I don't have unrecognized_fields as an argument for the Timestamp constructor.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some code comments to get them out of the way:

  • The derived classes can just set the known_fields class variable in the class definition, no need to have the new code in _get_unrecognized_fields()
  • _get_unrecognized_fields should probably be _unrecognised_fields_from_dict() to match the existing function for common fields.

But before you get to those: there may be an alternative solution that means we don't need to track known fields... If the code uses pop() consistently when looking up fields from dict (as it seems to) then whatever is left after collecting all known fields is unrecognised_fields.

@MVrachev
Copy link
Collaborator Author

But before you get to those: there may be an alternative solution that means we don't need to track known fields... If the code uses pop() consistently when looking up fields from dict (as it seems to) then whatever is left after collecting all known fields is unrecognised_fields.

Now, it makes sense why Lukas decided to use pop versus just accessing the fields.
I used that suggestion in my latest changes squashed everything into one commit.
It does look better now, even though I am not such a fan of the side effects when using from_dict.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, some small comments inline. Also this part of the commit message:

including the classes that would be
added representing a subportion of a role like "meta", "delegations"
and "roles".

I'm not quite sure what this means... maybe something like "Input that contains unknown fields in the 'signed' dictionary should succesfully deserialize into a Metadata object, and that object should succesfully serialize with the unknown fields intact".

The only other thing I'm wondering is if we need more substantial tests (more complex content?) I don't know what this would accomplish so I guess I'm fine with the tests as they are...

tuf/api/metadata.py Outdated Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
@MVrachev
Copy link
Collaborator Author

Fixed all of the comments you mentioned and the commit message.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on the iterations even when feedback came in dripping... result looks nice to me.

I still feel like the testing is pretty light... but I'm also not sure what to suggest so ✔️

@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 20, 2021

Thanks for working on the iterations even when feedback came in dripping... result looks nice to me.

I still feel like the testing is pretty light... but I'm also not sure what to suggest so ✔️

Not sure what other tests should I add.
I make sure that between calling from_dict and to_dict there is no change and I thought this is the idea right?
Is there another property we want to check for?

@jku
Copy link
Member

jku commented Apr 21, 2021

Not sure what other tests should I add.

Just to clarify: this is just fine for me.

@jku
Copy link
Member

jku commented Apr 22, 2021

So after I was schooled yesterday on how the unrecognised fields are planned to operate (that unrecognised fields are allowed to appear at many levels of the file format hierarchy -- e.g. tap15 adds a field into delegations) I've started thinking the testing could reflect that from the beginning: the testing proposed here is very focused on a specific place in the file format

Is it possible to add a test that inserts extra fields at every possible location in our file format and does a basic serialize/deserialize test? The naive attempt would be to go through the json and add a bogus field to every dictionary... but there are exceptions: METAFILES in timestamp.json is explicitly not allowed to have extra fields. But maybe you could add extra fields to every dict except a hand-crafted list of exceptions?

This could be a new issue if it looks like a major change... but implementing this might make the future object creation PRs easier (as test would exist already)

Copy link
Member

@joshuagl joshuagl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really appreciate the iteration here, the end result looks good.

I left a couple of optional suggestions that would reduce the lines of code in metadata.py by ~7LOC. Nothing major.

tuf/api/metadata.py Outdated Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
@joshuagl
Copy link
Member

Is it possible to add a test that inserts extra fields at every possible location in our file format and does a basic serialize/deserialize test? The naive attempt would be to go through the json and add a bogus field to every dictionary... but there are exceptions: METAFILES in timestamp.json is explicitly not allowed to have extra fields. But maybe you could add extra fields to every dict except a hand-crafted list of exceptions?

This could be a new issue if it looks like a major change... but implementing this might make the future object creation PRs easier (as test would exist already)

This sounds like a good test to add indeed. I'd be happy to see it in a future PR, does not have to be this one.

In order to support ADR 0008 we would want to accept unrecognized
fields in all metadata classes.
Input that contains unknown fields in the 'signed' dictionary should
successfully deserialize into a Metadata object, and that object should
successfully serialize with the unknown fields intact.

Also, we should test that we support unrecognized fields when adding
new classes or modifying existing ones to make sure we support
ADR 0008.

Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
@MVrachev
Copy link
Collaborator Author

So after I was schooled yesterday on how the unrecognised fields are planned to operate (that unrecognised fields are allowed to appear at many levels of the file format hierarchy -- e.g. tap15 adds a field into delegations) I've started thinking the testing could reflect that from the beginning: the testing proposed here is very focused on a specific place in the file format

Is it possible to add a test that inserts extra fields at every possible location in our file format and does a basic serialize/deserialize test? The naive attempt would be to go through the json and add a bogus field to every dictionary... but there are exceptions: METAFILES in timestamp.json is explicitly not allowed to have extra fields. But maybe you could add extra fields to every dict except a hand-crafted list of exceptions?

This could be a new issue if it looks like a major change... but implementing this might make the future object creation PRs easier (as test would exist already)

Yes, that's true and I am adding unrecognized fields and tests for each of the subfields I am replacing with a new class.
See #1360.

Also, I addressed the two comments by @jku.
Thank you both for your patience and detail reviews!

@jku
Copy link
Member

jku commented Apr 22, 2021

Yes, that's true and I am adding unrecognized fields and tests for each of the subfields I am replacing with a new class.
See #1360.

So just to make it clear: I'm trying to propose that we don't try to cover each case individually, but instead do it in an automated manner. No need to block this PR though.

@jku jku merged commit feb340f into theupdateframework:develop Apr 22, 2021
@MVrachev MVrachev deleted the implement-adr-8 branch April 29, 2021 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants