Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major rewrite of artifact rule verification #262

Merged
merged 1 commit into from
Mar 22, 2019

Conversation

lukpueh
Copy link
Member

@lukpueh lukpueh commented Mar 11, 2019

Fixes issue #:
#261, takes over #259

Description of the changes being introduced by the pull request:

This PR includes several changes to verify_item_rules and the from there called verify_*_rule functions (plus corresponding tests):

  • Harmonization of artifact queue modification

    Before, verify_item_rules used a material queue, a product queue and an artifact queue to keep track of the artifacts, consumed in the course of verifying all rules (of a type of an item). These queues were modified in different places (in different functions passed to by reference) and then re-assigned to each other, sometimes as copy, sometimes as reference, which led to unexpected states of queues.

    This commit removes the materials and products queue (the generic artifacts queue is enough for keeping track of consumed artifacts).

    It further focuses all artifact queue updates in a single place in verify_item_rules and changes the verify_*_rule functions to return the consumed artifacts instead of the updated queue, which also aligns with the pseudo code in the specification. This change should make the rule processing a lot more transparent to the reader.

  • Additional code simplifications
    This commit also makes better use of set operations in verify_*_rule functions and removes clutter like

    • redundant rule unpacking code
    • unnecessary if/elses related to "materials" and "products"
    • unnecessarily long variable names
      the source_ prefix only makes sense in the scope of the match rule, where there is also a destination.
      The commit also updates and simplifies a lot of related code comments and docstrings.
  • Full adoption of Fix firewall rules #204
    Remove raised exceptions in verify_match_rule and verify_delete_rule, where they should just not consume the corresponding artifacts.

  • Rethink tests
    Some of them were still tailored to pre-Fix firewall rules #204 logic. This commit makes them more meaningful (and table driven).

  • Adopt queue traceback
    There is only one queue now.

Please fill in the fields below to submit a pull request. The more information
that is provided, the better.

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

This commit includes several changes to `verify_item_rules` and
the from there called `verify_*_rule` functions (plus corresponding
tests):

- Harmonization of artifact queue modification

Before, `verify_item_rules` used a material queue, a product queue
and an artifact queue to keep track of the artifacts, consumed in
the course of verifying all rules (of a type of an item).  These
queues were modified in different places (in different functions
passed to by reference) and then re-assigned to each other,
sometimes as copy, sometimes as reference, which led to unexpected
states of queues.

This commit removes the materials and products queue (the generic
artifacts queue is enough for keeping track of consumed artifacts).

It further focuses all artifact queue updates in a single place in
`verify_item_rules` and changes the `verify_*_rule` functions to
return the consumed artifacts instead of the updated queue, which
also aligns with the pseudo code in the specification. This change
should make the rule processing a lot more transparent to the
reader.

- Additional code simplifications
This commit also makes better use of set operations in
`verify_*_rule` functions and removes clutter like
  - redundant rule unpacking code
  - unnecessary `if/else`s related to "materials" and "products"
  - unnecessarily long variable names
    the `source_` prefix only makes sense in the scope of the match
    rule, where there is also a destination.
The commit also updates and simplifies a lot of related code
comments and docstrings.

- Full adoption of in-toto#204
Remove raised exceptions in `verify_match_rule` and
`verify_delete_rule`, where they should just not consume the
corresponding artifacts.

- Rethink tests
Some of them were still tailored to pre-in-toto#204 logic. This commit
makes them more meaningful (and table driven).

- Adopt queue traceback
There is only one queue now.
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.0009%) to 99.947% when pulling f23618d on lukpueh:rewrite-rule-verification into ab1e904 on in-toto:develop.

2 similar comments
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.0009%) to 99.947% when pulling f23618d on lukpueh:rewrite-rule-verification into ab1e904 on in-toto:develop.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.0009%) to 99.947% when pulling f23618d on lukpueh:rewrite-rule-verification into ab1e904 on in-toto:develop.

@lukpueh
Copy link
Member Author

lukpueh commented Mar 13, 2019

It might be worth cross-checking with the corresponding section in the specification.

@lukpueh
Copy link
Member Author

lukpueh commented Mar 15, 2019

Re simplification:
I just ported this to our golang version of in-toto's verifylib and made a slight modification.

In go I calculate sets of created, modified and deleted artifacts only once and before even looking at the rules, which pays off as soon as there is more than one rule of the corresponding type.

Here I re-calculate the created, deleted and modified sets each time I encounter the corresponding rule.

The ideal solution would probably be to calculate each set once the first time the corresponding rule is encountered. Any thoughts?

@SantiagoTorres
Copy link
Member

The ideal solution would probably be to calculate each set once the first time the corresponding rule is encountered. Any thoughts?

I actually like this "artifact queue compression" idea. Let me walk through it. For now I think this re-write makes things clearer to read and as such is valuable by itself...

Copy link
Member

@SantiagoTorres SantiagoTorres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I wanted to leave these three comments yet they don't affect the functionality of this.

if matched_product in source_materials_queue:
unmatched_materials.add(matched_product)
# Consume filtered artifacts that are products but not materials
consumed = set(filtered_artifacts) & (products - materials)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love set operations 👍

for trace_entry in RULE_TRACE["trace"]:
traceback_str += "Queue after '{0}':\n".format(
" ".join(trace_entry["rule"]))
traceback_str += "{}\n".format(trace_entry["queue"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random thought: should we encode RULE_TRACE as part of a decorator for function that need it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? To avoid having a global variable? There are only two places that access it -- verify_item_rules writes to it, and _get_artifact_rule_traceback reads from it. I think a decorator might be a bit of an overkill.

On a side note, we could easily remove the rule queue trace and instead update the logging statements to provide the same information. See #243 (comment) ff.


else: # pragma: no cover (unreachable)
raise securesystemslib.exceptions.FormatError(
"Invaldid rule type '{}'.".format(_type))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we've coalesced this interfaces almost there, wouldn't it be worth moving this into something more akin to a dispatch table??

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, why not, as long as it enhances readability. How would you go about different parameters? Use some *args/**kwargs wizardry? For comparison, here's how we do it in the go implementation.

@SantiagoTorres
Copy link
Member

Tested with the some integrator's pipeline and against old metadata. Merging 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants