
[operator] Add Helm chart #204

Merged
merged 15 commits into aquasecurity:master on Oct 16, 2020

Conversation


@consideRatio consideRatio commented Oct 14, 2020

I understood that you are accepting contributions where a Helm chart is defined for the starboard-operator, so this is meant to fix #187. @krol3 and I both started working on this in parallel, but as discussed in #197 (comment), we'll continue with this PR, which @krol3 will review!

PR ambition

In this PR, I've tried to follow the current state of the evolving best practices of Helm charts and created a foundation that will be relatively easy to maintain in the future. As an example, I've tried to avoid the anti-patterns of hardcoding support for a limited set of environment variables which would have needed to be updated as the starboard-operator binary added more configuration options.
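
For illustration, a values-driven passthrough along these lines avoids hardcoding a fixed set of environment variables; the operator.env key name and the example variable below are assumptions for this sketch, not necessarily what the chart actually uses:

# values.yaml (hypothetical key for this sketch)
operator:
  env:
    OPERATOR_LOG_DEV_MODE: "true"

# templates/deployment.yaml (excerpt)
# Render whatever key/value pairs the user provides instead of
# hardcoding a fixed list of supported environment variables.
env:
  {{- range $name, $value := .Values.operator.env }}
  - name: {{ $name }}
    value: {{ $value | quote }}
  {{- end }}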

Part of this PR

Undecided if part of PR

  • Create a GitHub Actions CI test running helm template, which only triggers when the chart folder is changed (a minimal workflow is sketched after this list).
  • Document the experimental state of the Helm chart somewhere
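
A minimal GitHub Actions workflow for the helm template check could look roughly like the sketch below; the file name, trigger paths, and chart path are assumptions, and it relies on helm being preinstalled on the GitHub-hosted runner:

# .github/workflows/helm.yaml (hypothetical)
name: Helm chart
on:
  pull_request:
    paths:
      - deploy/helm/**
jobs:
  template:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # Rendering the chart with default values catches template syntax errors.
      - run: helm template deploy/helm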

Not part of this PR

  • Create a Helm chart registry where this Helm chart can be published
  • Define CI jobs to package and publish the Helm chart to a Helm chart registry
  • Define CI jobs to start a local k8s cluster, install the chart in it, and verify it functions
  • Create a values.schema.json that automatically validates passed Helm values to helm template|install|upgrade

Things to consider

  • Are configuration options named sensibly?
  • Are the configuration options documented well enough in values.yaml? (An illustrative excerpt follows this list.)
  • I've assumed that starboard-operator does not support being run alongside another starboard-operator, and that it wouldn't make sense to require it to be running in a highly available configuration. Is this correct?
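
As a point of reference for the values.yaml documentation question, an excerpt in the style many charts use might look like this; the keys echo ones discussed in this PR (targetNamespaces, rbac.create, service.metricsPort), but the comments and defaults shown here are illustrative rather than quoted from the chart:

# targetNamespaces is passed to the operator via OPERATOR_TARGET_NAMESPACES
# and limits which namespaces are watched.
targetNamespaces: ""

rbac:
  # create determines whether the chart creates the ServiceAccount and
  # RBAC resources it needs.
  create: true

service:
  # metricsPort is the port on which the Kubernetes Service exposes the
  # operator's metrics endpoint.
  metricsPort: 8080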


codecov bot commented Oct 14, 2020

Codecov Report

Merging #204 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #204   +/-   ##
=======================================
  Coverage   35.46%   35.46%           
=======================================
  Files          37       37           
  Lines        1844     1844           
=======================================
  Hits          654      654           
  Misses       1079     1079           
  Partials      111      111           

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ad6b9b9...a46df45.

@consideRatio
Contributor Author

@krol3 I consider this ready for review at this point! I'm running low on time to do everything I planned (a basic CI test, for example), but from my local testing it now appears to be in a functional state.


krol3 commented Oct 15, 2020

Hi @consideRatio, I tested the Helm chart and it's working. I will close my PR.

- name: OPERATOR_TARGET_NAMESPACES
  value: {{ tpl .Values.targetNamespaces . | quote }}
- name: OPERATOR_METRICS_BIND_ADDRESS
  value: ":8080"
Contributor


We could use: {{ print ":" .Values.image.metricsPort | quote }}

Contributor Author


I think we can avoid needing to add configuration of this because users will only access this through the k8s Service, which in turn points to the Pod's port named metrics.

Hmmm, the k8s Service port can currently be configured with service.port, but I think we should rename that to metricsPort as you suggest here: it makes it clearer what the service is currently meant for, and leaves room in case we want to expose something else that isn't metrics on another port.
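
(For illustration, the Service/port wiring being discussed looks roughly like this; the metadata and selector labels below are placeholders, not the PR's exact template:)

# templates/service.yaml (sketch)
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}
spec:
  selector:
    app.kubernetes.io/name: {{ .Chart.Name }}
  ports:
    - name: metrics
      port: {{ .Values.service.metricsPort }}
      # targetPort refers to the Pod's named port, so the container port
      # itself does not need to be user configurable.
      targetPort: metrics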

Contributor Author


Users of another Helm chart I've worked on have been fine without configuring the container/pod port for a very long time, so we opted not to add it at any point in time: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/master/jupyterhub/templates/hub/deployment.yaml#L199-L201

Contributor Author


I added 9143ca1 to rename service.port to service.metricsPort - do you agree this is sufficient?

# have annotations which will help prometheus as a target for
# scraping of metrics
- name: metrics
  containerPort: 8080
Contributor


Also here: {{ .Values.image.metricsPort }}

@consideRatio
Contributor Author

Thank you @krol3 for your review and testing that things seem to work on your end as well ❤️ 🎉

@danielpacak danielpacak self-requested a review October 16, 2020 15:13

@danielpacak danielpacak left a comment


Great job @consideRatio! I really appreciate your contribution. I also tested the Helm chart in my cluster and it works as expected.

I left a few comments / questions, and in general I agree with your assumption that the follow-up tasks, such as CI integration and updates to the README, can be done in separate PRs.

Before we merge this one, I'm wondering whether we should add a template to define the VulnerabilityReports custom resource. Some people say that the operator itself should not install the CRDs that it manages, but here we have a Helm chart, which provides means of installation.

Beyond that, I assume that the Helm chart is self-contained, i.e. it does not require the $ starboard init or $ kubectl starboard init command to be run.

That said, do you think we can somehow symbolically link to https://github.com/aquasecurity/starboard/blob/master/deploy/crd/vulnerabilityreports.crd.yaml and send it to the Kubernetes API along with the other Helm templates? Otherwise we may assume that a deployer defines the CRDs with the kubectl create command.

I also realized that the operator should (programmatically) check whether the CRDs are defined before it spawns any scan job. Otherwise we're wasting resources just to find out that we cannot save a report because of an unknown resource.


consideRatio commented Oct 16, 2020

Thank you for your review @danielpacak! 🎉 ❤️

> Before we merge this one, I'm wondering whether we should add a template to define the VulnerabilityReports custom resource. Some people say that the operator itself should not install the CRDs that it manages, but here we have a Helm chart, which provides means of installation.

Helm 3 supports this, but it is an evolving best practice. I think it's the right call to bundle the CRDs with the Helm chart. They won't be templates that render with values, and in general I think they won't be managed by Helm after install. For more info, see: https://helm.sh/docs/chart_best_practices/custom_resource_definitions/
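
(For reference, the crds/ convention means the chart ends up laid out roughly like this; file names other than those already linked in this thread are indicative only:)

deploy/helm/
  Chart.yaml
  values.yaml
  crds/
    vulnerabilityreports.crd.yaml   # applied by Helm 3 at install time, never templated or upgraded by Helm
  templates/
    deployment.yaml
    rbac.yaml
    service.yaml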

  • CRDs added
  • container securityContext runAsUser/runAsGroup updated to match predefined user/group in Dockerfile
  • Note about OPERATOR_NAMESPACE being inferred added.
  • Decide if rbac.create: true is to remain or be hardcoded as true. I'm not confident about the matter myself, but I suggest letting it remain unless we better understand why so many Helm charts have this setup and conclude it's no longer relevant for this chart (the common pattern is sketched below).
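
(For reference, the common rbac.create pattern is just a conditional wrapper around the RBAC templates; the rules below are placeholders, not this chart's actual RBAC:)

# templates/rbac.yaml (generic sketch)
{{- if .Values.rbac.create }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ .Release.Name }}
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
{{- end }}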

@consideRatio consideRatio changed the title Add Helm chart [operator] Add Helm chart Oct 16, 2020
@danielpacak
Contributor


Thanks for adding the CRDs to the chart. I didn't know that it's supported by Helm 3! Regarding caveats, I think we're good. If someone wants to manage CRD upgrades, we'd suggest installing with OLM / https://operatorhub.io/operator/starboard-operator anyway.


@danielpacak danielpacak left a comment


Once again, great job @consideRatio! I'm going to merge the PR. As mentioned in the conversation, we can follow up with dedicated PRs for tasks such as automated integration tests run as part of our CI workflow.

Regarding support for multiple Helm releases running in the same cluster, I don't think we can do much about that here. This problem is addressed by the Operator Lifecycle Manager, where you define an OperatorGroup to configure the operator's multi-tenancy support. For example, if the target namespaces specified by two different OperatorGroup instances intersect, OLM won't validate such a configuration.
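
(For context, an OperatorGroup scopes an OLM-managed operator to a set of namespaces roughly like this; the namespace names below are made up:)

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: starboard-operator
  namespace: operators
spec:
  # Namespaces the operator is allowed to watch; overlapping target
  # namespaces across OperatorGroups is what OLM flags as invalid.
  targetNamespaces:
    - dev
    - staging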

I'll update the docs / installation guide to cover the Helm chart in PR #201.

@danielpacak danielpacak merged commit 2c0bcbc into aquasecurity:master Oct 16, 2020
@consideRatio
Contributor Author

Wieee! Thank you for your review and encouragement @danielpacak ❤️ 🎉 🌻
