ENH: add layout metadata compiler and examples #5

Merged: 8 commits, Sep 24, 2019

SantiagoTorres
Member

Hello, this commit adds the metadata compiler and a couple of metadata samples (debian grep and seattle).
Don't review the link metadata files, just the readme and the compiler please.

Member

@lukpueh lukpueh left a comment

Sorry for the extremely late review, this really slipped under my radar. As requested, I read over compile-examples.py and I like it. I wonder if it's worth building this into in-toto? AFAICS, the larger part of the script takes care of traversing the metadata, in order to sort fields and truncate long values, which largely aligns with the feature request in in-toto/in-toto#18. What do you think?

If we do want to merge this here, there are a couple of things that need to be fixed:

  • The script is not compatible with the current metadata specification (nor are the ~30K added lines of metadata)
  • It shouldn't be required that the layout is called root.layout
  • Links are loaded by globbing for *.link files and shown in the order glob returns them. Should we not rather load them as they are defined in the layout, as e.g. in_toto.verifylib.load_links_for_layout does?
  • Link file globbing did not work for me (I didn't troubleshoot though)
  • Sublayouts are not handled
  • IMO it is unexpected that the displayed materials and products are random samples if there are more than 9 of them
  • Adding an ellipsis as the last item (to show that a dict is truncated) does not guarantee that it is indeed printed as the last item. Python does not guarantee to keep the order of dict items as they were inserted.
  • A bare minimum of documentation would be nice

Let me know if I should help.

Member

lukpueh commented Apr 17, 2019

Btw. here's a list of required metadata schema changes:

  • metablock
    • all top level fields are now under the signed field, and the signatures field is a sibling of the signed field
  • layout
    • expected_command and run must be lists
    • expires must not have milliseconds
    • material_matchrules are now expected_materials
    • product_matchrules are now expected_products
    • MATCH rule syntax has changed
    • threshold is a mandatory field
  • link
    • _type value is lowercase
    • return_value is now part of byproducts
    • environment is a mandatory field
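A rough sketch of how some of the changes listed above could be applied to old-style metadata that has already been parsed from JSON. The function names `migrate_link` and `migrate_layout` are hypothetical, and the `steps` and `inspect` keys, the `return-value` byproducts key, the default `threshold` of 1 and the empty default `environment` are assumptions rather than anything stated in this thread. The `expires` and `MATCH` rule changes are deliberately left out, because their exact old and new forms are not spelled out here.

```python
import copy
import shlex


def _wrap(signed):
    """Move all top-level fields under `signed`, with `signatures` as a sibling."""
    signatures = signed.pop("signatures", [])
    return {"signed": signed, "signatures": signatures}


def migrate_link(old_link):
    """Return a new-style copy of an old-style link dict (sketch)."""
    link = copy.deepcopy(old_link)
    # `_type` value is lowercase
    link["_type"] = link.get("_type", "link").lower()
    # `return_value` is now part of `byproducts` (key name is an assumption)
    byproducts = link.setdefault("byproducts", {})
    if "return_value" in link:
        byproducts["return-value"] = link.pop("return_value")
    # `environment` is a mandatory field (empty default is an assumption)
    link.setdefault("environment", {})
    return _wrap(link)


def _rename_rules(item):
    """Rename the old matchrule fields to their new names."""
    for old, new in (("material_matchrules", "expected_materials"),
                     ("product_matchrules", "expected_products")):
        if old in item:
            item[new] = item.pop(old)


def migrate_layout(old_layout):
    """Return a new-style copy of an old-style layout dict (sketch)."""
    layout = copy.deepcopy(old_layout)
    for step in layout.get("steps", []):
        _rename_rules(step)
        # `threshold` is a mandatory field (default of 1 is an assumption)
        step.setdefault("threshold", 1)
        # `expected_command` must be a list
        if isinstance(step.get("expected_command"), str):
            step["expected_command"] = shlex.split(step["expected_command"])
    for inspection in layout.get("inspect", []):
        _rename_rules(inspection)
        # `run` must be a list
        if isinstance(inspection.get("run"), str):
            inspection["run"] = shlex.split(inspection["run"])
    return _wrap(layout)
```

Both functions work on a deep copy, so the original dict can be kept around to compare against the migrated result.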

Member Author

SantiagoTorres commented Apr 17, 2019

Hi!

> I wonder if it's worth building this into in-toto? AFAICS, the larger part of the script takes care of traversing the metadata, in order to sort fields and truncate long values, which largely aligns with the feature request in in-toto/in-toto#18. What do you think?

I'm not entirely sure this is what should be addressed on this side. I intentionally avoided using in-toto as a dependency (or any templating library for that matter) so as to keep this script-y. I'm afraid that issue has been floating around and has taken on a different meaning every time we revisit it.

As for the point issues:

> The script is not compatible with the current metadata specification (nor are the ~30K added lines of metadata)

Yes, unfortunately time has gone by, we may want to update it to conform to the latest spec.

> It shouldn't be required that the layout is called root.layout

Probably not, but considering we're the ones who will be running this to update our examples in the docs, I don't see why we should care much about the user interface right away (this could also be ticketed).

> Links are loaded by globbing for *.link files and shown in the order glob returns them. Should we not rather load them as they are defined in the layout, as e.g. in_toto.verifylib.load_links_for_layout does?

This requires a dependency on in-toto or smarter parsing of json objects, which I tried to avoid. Again, we could make the tool smarter if we'd like by adding more deps/code, but we may want to think about it for just a docs repo.
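For illustration only, here is a minimal sketch of loading links in the order the layout defines its steps, using just the standard library. It assumes the layout stores its steps under `signed.steps` with a `name` field, and that link files are named `<step name>.<keyid prefix>.link`. Both are assumptions about the metadata, and `load_links_in_layout_order` is a hypothetical helper, not part of in-toto.

```python
import glob
import json
import os


def load_links_in_layout_order(layout_path, link_dir):
    """Return (step name, link dict) pairs, ordered as the steps appear in the layout."""
    with open(layout_path) as f:
        layout = json.load(f)

    links = []
    # Assumption: steps live under signed.steps and each has a "name".
    for step in layout["signed"]["steps"]:
        # Assumption: link files are named "<step name>.<keyid prefix>.link".
        pattern = os.path.join(
            glob.escape(link_dir), glob.escape(step["name"]) + ".*.link")
        # Sort so that several links for one step come back in a stable order.
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                links.append((step["name"], json.load(f)))
    return links
```

The layout path is passed in explicitly, so nothing here depends on the file being called root.layout. Sublayouts and step names that are prefixes of other step names are not handled in this sketch.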

> Link file globbing did not work for me (I didn't troubleshoot though)

I suspect it's because of the keyid prefix.

> Sublayouts are not handled

No, as we don't have any examples on the docs repo that use sublayouts. We can always add support for this as the need arises.

> IMO it is unexpected that the displayed materials and products are random samples if there are more than 9 of them

I can't personally think of any other way to keep things succinct, but I'm open to suggestions. I do agree this is not a perfect solution to overly verbose metadata.

> Adding an ellipsis as the last item (to show that a dict is truncated) does not guarantee that it is indeed printed as the last item. Python does not guarantee to keep the order of dict items as they were inserted.

This is true, and it's something I bailed on working on back then. We could use a frozendict or serialize and then append on the printout after-the-fact (which would be messy).
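One possible way around the ordering problem, sketched here with `collections.OrderedDict` from the standard library, is to build the truncated dict in the order it should be printed and add the ellipsis entry last. The helper name `truncate_dict` and the `max_items` parameter are hypothetical. Keeping the first keys in sorted order (instead of a random sample) is a choice made for this sketch, not something the script does.

```python
from collections import OrderedDict
import json


def truncate_dict(d, max_items=9):
    """Keep the first `max_items` keys (sorted) and mark truncation with a final "..." entry."""
    truncated = OrderedDict()
    for index, key in enumerate(sorted(d)):
        if index >= max_items:
            # Inserted last, so it is also serialized last.
            truncated["..."] = "..."
            break
        truncated[key] = d[key]
    return truncated


def to_json(d, max_items=9):
    """Serialize without sort_keys so the insertion order is preserved."""
    return json.dumps(truncate_dict(d, max_items))
```

Because `json.dumps` iterates over the `OrderedDict` in insertion order, the ellipsis entry stays at the end as long as `sort_keys` is not enabled.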

> A bare minimum of documentation would be nice

Agreed. Let's decide on whether this goes here and work accordingly.

SantiagoTorres and others added 3 commits May 14, 2019 13:12
- metablock
  - all top level fields are now under the `signed` field, and the
    `signatures` field is a sibling of the `signed` field
- layout
  - `expected_command` and `run` must be lists
  - `expires` must not have milliseconds
  - `material_matchrules` are now `expected_materials`
  - `product_matchrules` are now `expected_products`
  - `MATCH` rule syntax has changed
  - `threshold` is a mandatory field
- link
  - `_type` value is lowercase
  - `return_value` is now part of `byproducts`
  - `environment` is a mandatory field

Add function that recursively traverses a passed python object,
e.g. in-toto metadata, allowing to truncate long strings, lists and
dicts, and to reorder dict keys, using OrderedDict.

Update and merge template populating functions, make them use
the newly added metadata beautifier (truncate and order) and
rename to create_markdown_summary.

Update markdown summaries for debian, polypasswordhasher and
seattle sample in-toto metadata.

lukpueh added a commit to lukpueh/in-toto.github.io that referenced this pull request May 15, 2019
Update sample metadata summaries for pph, seattle and debian
supply chain metadata, using latest version of Santiago's
metadata compiler script (see
in-toto/specification#5).

This commit also adds jekyll frontmatter to auto convert markdown
to html and make the metadata sample pages part of the website
layout.
Member

lukpueh commented May 15, 2019

@SantiagoTorres, I updated the sample layout and link metadata to meet the latest version of the spec, cleaned up the compiler script, and used it to create new versions of the markdown-formatted summaries, which I have already published on our website (see pph, seattle and debian).

Let me know what you think.

@lukpueh lukpueh closed this May 15, 2019
@lukpueh lukpueh reopened this May 15, 2019
Member

lukpueh commented May 15, 2019

Btw. 088277b is my stab at in-toto/in-toto#18. IMO especially the dict field ordering for pretty printing is very useful.
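To give a rough idea of what such a truncate-and-order traversal can look like, here is a small self-contained sketch. It is not the code from the commit: the name `beautify_sketch` and the parameters `max_str_len`, `max_items` and `key_order` are hypothetical, and sorting the remaining keys alphabetically is an assumption made for this example.

```python
from collections import OrderedDict


def beautify_sketch(obj, max_str_len=20, max_items=3, key_order=()):
    """Recursively truncate long strings, lists and dicts, and order dict keys."""
    # Truncate long strings, leaving room for the trailing "..."
    if isinstance(obj, str):
        if len(obj) > max_str_len:
            obj = obj[:max_str_len - 3] + "..."
        return obj

    # Truncate lists and recurse into each kept item
    if isinstance(obj, list):
        items = [beautify_sketch(item, max_str_len, max_items, key_order)
                 for item in obj[:max_items]]
        if len(obj) > max_items:
            items.append("...")
        return items

    # Order dict keys (preferred keys first, the rest sorted), then truncate
    if isinstance(obj, dict):
        preferred = [key for key in key_order if key in obj]
        rest = sorted(key for key in obj if key not in preferred)
        keys = preferred + rest
        result = OrderedDict()
        for key in keys[:max_items]:
            result[key] = beautify_sketch(
                obj[key], max_str_len, max_items, key_order)
        if len(keys) > max_items:
            # Added last so that it is also printed last
            result["..."] = "..."
        return result

    # Numbers, booleans and None are returned unchanged
    return obj
```

Passing something like `key_order=("signed", "signatures")` would keep those two fields at the top of the printed metadata, with all other fields following alphabetically.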

if len(obj) > kw["max_str_len"]:
    obj = obj[:kw["max_str_len"] - 3] + "..."

# Truncate list and recurse into _beautify for each item
Member

Extra space after each

Member

Good eyes! Thanks for the review. :)
