Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to configure which hooks to run per scan #757

Merged
merged 7 commits into from
Nov 10, 2021

Conversation

EndPositive
Copy link
Contributor

@EndPositive EndPositive commented Oct 22, 2021

Description

This PR, if applied, closes #728, mainly by adding the field hookSelector in the scan spec. Besides, this PR adds the cascading scans field inheritHookSelector (default false) for more control over the hookSelector field in cascading scans.

Related documentation PR secureCodeBox/documentation#152

Example usage

To select all hooks, leave the hookSelector field undefined or specify:

hookSelector: {}

To select specific hooks:

hookSelector:
  matchExpressions:
   - key: app.kubernetes.io/instance
     operator: In
     values: [ "defectdojo", "update-field" ]

To ignore certain hooks:

hookSelector:
  matchExpressions:
  -  key: app.kubernetes.io/instance
     operator: NotIn
     values: [ "defectdojo" ]

Available operators are:
In, NotIn, Exists, DoesNotExist

You may also use matchLabels for exact matches on labels:

hookSelector:
  matchLabels:
    app.kubernetes.io/instance: "defectdojo"

matchLabels and matchExpressions may be used simultaneously.

Default Hook labels

To select a certain hook deployment, use the label app.kubernetes.io/instance (e.g. update-severity for the update-field hook). To select a certain hook type, use the label app.kubernetes.io/name (e.g. update-field).

Extra hook labels may be added by deploying the hook with the hook.labels field set in values.yaml.

Usage in combination with Cascading Scans

Besides existing fields (e.g. inheritVolumes, inheritLabels, etc.), the field inheritHookSelector is now available with the same behavior as the other existing fields.

Note: to use cascading scans in combination with hookSelector, ensure that you also select the cascading scans hook. You can use the existing label securecodebox.io/internal to select core features like cascading scans. You should manually ensure that your other selectors override this selector in behavior.

hookSelector:
  matchLabels:
    securecodebox.io/internal: "true"

Checklist

  • Test your changes as thoroughly as possible before you commit them. Preferably, automate your test by unit/integration tests.
  • Make sure npm test runs for the whole project.
  • Make codeclimate checks happy

Signed-off-by: Jop Zitman <jop.zitman@secura.com>
@malexmave
Copy link
Member

malexmave commented Oct 27, 2021

Thank you very much for the pull request! I've done some local testing, so here's my thoughts on the current state of the PR and on the open questions that we need to answer before we can proceed with it (because I really want to proceed with it, I love the idea and it elegantly solves an issue I have been struggling with):

Default Behavior

At the moment, the new code seems to change the default behavior: If I do not provide any hookSelector, no hooks are run (previous behavior: all hooks are run). Is this intentional / desired?

Both options seem a bit difficult:

  • If we leave it as it is right now in this PR, it is going to break a bunch of people's workflows because suddenly their hooks are no longer running.
  • If we change it ("no explicitly specified selector means all hooks are run"), how do we specify that for a specific scan, we don't want any hooks to be run?

Interactions with CascadingScans

Cascading Scans are implemented as a hook. Thus, we would have two distinct ways of enabling / disabling cascading scans for a single scan: Using the cascades definition, and using the hookSelector. Is this desireable? Should the cascading scan hook always be run, regardless of the provided spec (let its behavior be handled only by the cascades definition)? Is hiding the implementation detail desired here, or not?

Labels

Where are the labels for the selections coming from? By default, the defectDojo hook has the following labels:

$ kubectl describe scancompletionhooks.execution.securecodebox.io dd-persistence-defectdojo 
Name:         dd-persistence-defectdojo
Namespace:    default
Labels:       app.kubernetes.io/instance=dd
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=persistence-defectdojo
              app.kubernetes.io/version=1.12.0
              helm.sh/chart=persistence-defectdojo-3.3.0
              type=Unstructured

What label do we base our selection on for the scans? At least for the documentation, we need to give some examples. Do we choose app.kubernetes.io/name=persistence-defectdojo? (what if there are multiple hooks of the same type installed? Is that even possible?).

Also: Is there an easy way to add new labels to a hook? If so, we should document it somewhere so people can also have engagement- or scope-specific hooks etc., which brings me to...

Documentation

A fairly obvious point, but at the moment, this PR does not update the docs and does not reference a PR in the docs repository that does so. It would be really helpful for reviewing the changes if there was some draft documentation that explicitly states the expected behavior of the code, so that it is easier both to discuss if this specified behavior is the one we want, and to compare it against the reality implemented in the code. Bonus points for including a couple of examples, which make testing a lot easier as well :).

@malexmave malexmave added CRD Improvements or additions to CRDs enhancement New feature or request labels Oct 27, 2021
@malexmave malexmave added this to In progress in secureCodeBox v3 via automation Oct 27, 2021
@EndPositive
Copy link
Contributor Author

Hey @malexmave, thanks for the comments. Our eagerness for fixing this issue is shared 😃 .

At the moment, the new code seems to change the default behavior: If I do not provide any hookSelector, no hooks are run (previous behavior: all hooks are run). Is this intentional / desired? Both options seem a bit difficult:

Interesting, this seems to be another side-effect of including omitEmpty in the operator's CRD definition. If the pipeline would work on PRs, I would have noticed 😉. Specifying hookSelector: {} would select all hooks. I'll update behavior such that not defining hookSelector will also result in all hooks being selected. If you don't want to run any hooks, simply specify a selector which will never match. I think that way you're as close to cli as possible.

Cascading Scans are implemented as a hook. Thus, we would have two distinct ways of enabling / disabling cascading scans for a single scan: Using the cascades definition, and using the hookSelector. Is this desireable? Should the cascading scan hook always be run, regardless of the provided spec (let its behavior be handled only by the cascades definition)? Is hiding the implementation detail desired here, or not?

Good question. I believe this problem stems from the fact that the Cascading Scans hook is implemented both as a Hook and in the Operator.. Notice that all other hooks only have deploy-time configurations (except some envs).

I don't think there's an easy way to let the Cascading Scans hook always run. If a user adds a labelSelector which filters all Hooks but update-field, there's no easy way for the operator to know what selector to add such that the Cascading Scans hook is also matched (app.kubernetes.io/name is user-defined). @J12934 what do you think?

Where are the labels for the selections coming from? By default, the defectDojo hook has the following labels:

User-defined. It's up to the users to add labels (or re-use labels added by tools like Helm) and select them. @malexmave If you'd like, we could add default labels to all secureCodeBox managed hooks that match their CI/directory names.

What if there are multiple hooks of the same type installed? Is that even possible?.

Yes, just like the default label selectors. Think of them like a filter instead of a selector.

A fairly obvious point, but at the moment, this PR does not update the docs and does not reference a PR in the docs repository that does so.

Thanks for bringing this up. There are no public guidelines concerning documentation for contributions. I've brought this up earlier in secureCodeBox/documentation#122. Within Secura, we try to add requirements within the related issue and implementation details in the PR. The former is properly done in #728 and the latter could show improvements. However with such few changes and its common use on cli, I thought it unnecessary. We usually add documentation after the first review / merge to prevent writing documentation on unfinished/poc/controversial implementations, but I'd be happy to adhere to your methods of working.

I'll fix the omitEmpty thing tomorrow. Let me know what you think about adding default labels for managed hooks. Let's wait for comment by Jannik about cascades.

@malexmave
Copy link
Member

Thanks for the quick reply. :)

If the pipeline would work on PRs, I would have noticed 😉.

I think the pipeline should work, there were some issues with GitHub actions last friday if I recall correctly. It should run once you have pushed another commit.

If you'd like, we could add default labels to all secureCodeBox managed hooks that match their CI/directory names.

I think it would be good to have some form of consistent default label that we can use in the documentation - I don't particularly mind if that label is autogenerated by helm (as long as helm does so in a consistent way). Additionally, we should at least have a specified, recommended way to add additional labels to hooks during the install / configuration of the hook. I don't have enough experience with k8s to recommend a specific way, so this is more of an open question to everyone how we can do this.

A possible scenario for this: We have several different networks / deployments that we are scanning, and want to do different things with the data collected from each deployment (push them to different DefectDojos / notification hooks / cascading scans / ...). How do I configure the hooks and the scans to achieve this? Once the feature is merged and released, I'd be happy to write an article for the secureCodeBox blog about this that explains how to use this feature in practice (similar to my forthcoming semgrep article) - or you can of course also write the article yourself if you want, it's your feature after all, I don't want to steal your contributions, I just want to avoid giving you extra work :).

There are no public guidelines concerning documentation for contributions.

True! We've been discussing this internally as well during the last retro, but have not yet come to a conclusion on how we want to address this. My personal approach is to document as I go, but I completely understand your point about not spending a lot of time on documentation while it is still unclear what the final version is going to look like. I don't think that we (should) expect complete documentation on a PR that is in a state intended to serve as a basis for further discussions, so it's definitely not a problem that you submitted without docs, it's just something that, once the implementation is settled, should be added quickly and ideally before merging (but at least before releasing) the feature. But this has not always been true for our own code, either :).

Perhaps a good compromise would be that PRs with new features should specify one complete example (in the PR or in the issue, I don't mind either way) on how to use the feature that can easily be tested and adapted locally, so that the reviewer has a basis to work on? And then the in-depth documentation can be added once the reviewers are satisfied and it is clear that the feature is not going to require basic changes? But this is getting a bit off topic for this pull request, and may also be a bit controversial inside the team, so don't take this as any kind of policy :).

@J12934
Copy link
Member

J12934 commented Oct 28, 2021

I think the pipeline should work, there were some issues with GitHub actions last friday if I recall correctly. It should run once you have pushed another commit.

Not unfortunately not, pipelines are still disabled for forks as the hooks build / tests are not completly migrated yet :(

Cascading hook stuff

Interesting questions, never thought about that before 😕

I dont think we should develop some special exception for that. I'd add a specific label to that hook maybe securecodebox.io/internal=true and add to the documentation that if you are adding hookSelectors that you should include this selector to ensure that the SCB features relying on hooks (currently only cascadingScans) will not work correctly without it.

Jop Zitman added 4 commits October 28, 2021 12:34
Signed-off-by: Jop Zitman <jop.zitman@secura.com>
Signed-off-by: Jop Zitman <jop.zitman@secura.com>
Signed-off-by: Jop Zitman <jop.zitman@secura.com>
Signed-off-by: Jop Zitman <jop.zitman@secura.com>
@EndPositive EndPositive changed the title Add hookSelector to ScanSpec in CRD and use it to retrieve hooks Add ability to configure which hooks to run per scan Oct 28, 2021
@EndPositive
Copy link
Contributor Author

@J12934 @malexmave I've updated this PR. See commit log for specific changes.

Some specifics:

I don't particularly mind if that label is autogenerated by helm (as long as helm does so in a consistent way)

Unlike I expected, it actually is autogenerated in each hook's _helpers.tpl and seems very consistent. I've added some explanation on the labels in PR body.

Documentation stuff

I've added loads of examples and context in this PR body. Once this PR is approved I'll add a related PR in the docs repo to update the CRDs. It would be nice if @malexmave could write a blog-post with your specific use case. You can always use the example given in #728 😄.

Can someone run helm-docs and trigger the pipeline?

Copy link
Member

@malexmave malexmave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that the changes lead to hooks running normally if no selectors are provided, as expected. I added two comments (see below), both once again in the area of system behavior vs. user expectations.

Jop Zitman added 2 commits October 28, 2021 17:23
Signed-off-by: Jop Zitman <jop.zitman@secura.com>
…nd volumes)

Signed-off-by: Jop Zitman <jop.zitman@secura.com>
Copy link
Member

@malexmave malexmave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again for your work on this. I think that this state is ready to merge. For documentation, I think we will need doc changes in the following places:

  • Explanations of the CRDs (API reference)
  • Explanations of how a scan is created (adding the selectors)
  • Explanations for the different hooks (adding labels)
  • Explanations for contributing a new hook (updating the examples given in the docs for values.yaml and the template)
  • Explanations of cascading scans (new inheritance fields, behavior as discussed)
  • Ideally, we could add another article in the "How to" section on "Select which hooks to run", which explains the whole process in an end-to-end way (installing hooks with custom labels, selecting which labels to run, ...).

Since #695 will require significant documentation changes in very similar areas as well, it may be easier to combine the documentation updates for both features. In that case, you could also make a more general how-to on "controlling hook execution" which covers the features introduced in both PRs together.

But this can be done in a separate PR (or two, if it covers both repos).

@J12934 is there an easy way to get the CI to run on this, or do we merge through the warning (I don't have the necessary rights for this at the moment). Also, at the moment CodeClimate seems a bit miffed about the new code. I don't really see an easier way to implement the part it complains about though.

secureCodeBox v3 automation moved this from In progress to Reviewer approved Nov 1, 2021
@EndPositive
Copy link
Contributor Author

@malexmave see the new documentation PR secureCodeBox/documentation#152. Feel free to add/correct wherever necessary. Did you want to make a new blogpost too?

@malexmave
Copy link
Member

Thanks! I'll take a look at the PR later today or tomorrow at the latest.

For the blog post: I understood that you aren't eager to write the blogpost yourself? Because if you want, you can definitely do so (they are your features, after all :). If you don't want to write the post, I will see if I can find time to write one, but I don't consider it an absolute requirement to have a blog post, especially if we have a good documentation article (I see that you wrote one, I just haven't read it in detail yet).

In general: I consider this PR ready to merge, I'm just holding off on pressing the button until I've had time to also read the documentation PR, in case anything comes up while going through it.

@EndPositive
Copy link
Contributor Author

I understood that you aren't eager to write the blogpost yourself?

Indeed. I'm personally not a big fan of writing blog-posts (or documentation in general 😬 ), and we don't really have the capacity to work on these, also considering their value for Secura. Though, feel free to write any blog posts about anything we contribute, especially if you're so closely involved in the review process.

@malexmave malexmave merged commit 26e7605 into secureCodeBox:main Nov 10, 2021
secureCodeBox v3 automation moved this from Reviewer approved to Done Nov 10, 2021
@J12934 J12934 moved this from Done to counter in secureCodeBox v3 Nov 19, 2021
@EndPositive EndPositive deleted the hook-selectors branch November 22, 2021 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CRD Improvements or additions to CRDs enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Configure Hooks to run per scan
3 participants