Add a Strimzi Helm Chart #565
Conversation
Hello Strimzi committers. I've been working on adding this feature, but I have a number of outstanding questions that should be addressed before I proceed. I've summarized these questions and other comments below. Outstanding questions:

- By making this the primary installation means we could remove duplicate install examples from
- The current implementation simply bundles the chart into the release process by calling
- TODO for adding to Kubeapps
- Other TODO
b7abeee → 578486f (force-push)
.travis.yml (outdated)
Not a Helm expert, but doesn't this install just the client and not the Tiller server? Should we test in some way that the chart works? In that case the server would be needed.
@ppatierno Yes, the `helm init --client-only` was to prepare for CLI chart release-management activities like those described in the Makefile. We can add a system test which verifies that the generated chart works (which would require us to initialize Tiller on the test cluster).
I haven't looked at the Strimzi systemtests project in detail, but we could also think about making helm the default deployment mechanism (see question 1 in my last comment). I'll get the systemtests running locally so I can test the feasibility of this. Do you think that is worth exploring?
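For readers unfamiliar with the client-only mode, the packaging workflow alluded to here looks roughly like the following sketch (Helm 2 CLI; the chart path matches this PR, but the destination directory is illustrative, not taken from the actual Makefile):

```shell
# Prepare the local Helm home without installing Tiller into any cluster
helm init --client-only

# Validate and package the chart (destination path illustrative)
helm lint helm-charts/strimzi-kafka-operator
helm package helm-charts/strimzi-kafka-operator --destination target/charts

# Regenerate index.yaml so the directory can serve as a chart repository
helm repo index target/charts
```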
I'm not quite sure that we want Helm Charts as the default deployment mechanism, but certainly as one of the available mechanisms (so without removing what we have in place today). Regarding the systemtests, I really think it's worth exploring having them for the helm deployment. @scholzj @tombentley wdyt?
I agree about having an automated test that deploying through helm results in a functioning Strimzi deployment.
Ok, I'll add a system test.
Is this the chart version, or should it be the "software" version (in this case Strimzi)? Should they be aligned? Just asking ...
`appVersion` is the "software" version. `version` is the version of the actual chart.
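To illustrate the distinction, a hypothetical Chart.yaml fragment (field values invented for this example):

```yaml
name: strimzi-kafka-operator
version: 0.5.0      # version of the chart packaging itself; must be SemVer
appVersion: 0.5.0   # version of the software the chart deploys (Strimzi)
```

The two can be aligned, but Helm treats them independently.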
Often the charts in the kubeapps repo are managed by different people from the upstream project. This is because the upstream project committers may not be concerned with how their software is deployed, so somebody volunteers to do this on their behalf. This can lead to an inefficient release process whenever you want to publish a new version of the chart, because the chart on kubeapps may significantly lag the latest release of the software.

IMO, since Strimzi is a K8s-specific project, it makes sense to integrate the chart release into the overall release process. A chart artifact could even be hosted by you immediately on a GitHub Pages site (a Strimzi chart repo, basically).

Once the chart gets approved by kubeapps, they have processes that let chart authors fast-track new versions of the chart based on the GitHub user details in the OWNERS file. This essentially allows chart authors to create a PR that gets automatically merged once an owner approves it. See the documentation on Owning and Maintaining a Chart for more details. I think these steps could be added to your release process too, after initial approval of the chart.
Also, `appVersion` and `version` are hardcoded in the Chart.yaml, but they're substituted with the actual RELEASE_VERSION when the chart is packaged during release (see the `release_helm_pkg` Make target). Because `version` must comply with semantic versioning, I used sed to extract the version component from RELEASE_VERSION and use that. For example, if RELEASE_VERSION is 0.5.0-SNAPSHOT, we use 0.5.0 as the semantic version of the chart.
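A minimal sketch of that sed extraction (the exact expression in the Makefile may differ):

```shell
# Strip any pre-release suffix such as -SNAPSHOT, keeping only MAJOR.MINOR.PATCH
RELEASE_VERSION="0.5.0-SNAPSHOT"
CHART_SEMVER=$(printf '%s' "$RELEASE_VERSION" | sed -E 's/^([0-9]+\.[0-9]+\.[0-9]+).*/\1/')
echo "$CHART_SEMVER"   # prints 0.5.0
```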
Why did you change the metadata.name? I see that you have done something similar for all the other templates; is there any specific reason?
That's a good question. In the Review Guidelines they state that charts should have the following metadata for every resource.
Resources and labels should follow some conventions. The standard resource metadata should be this:
```yaml
name: {{ template "myapp.fullname" . }}
labels:
  app: {{ template "myapp.name" . }}
  chart: {{ template "myapp.chart" . }}
  release: {{ .Release.Name }}
  heritage: {{ .Release.Service }}
```
In this case myapp.fullname (strimzi.fullname in this PR) refers to a templating function in _helpers.tpl which concatenates the Release.Name and Chart.Name. I think the reasoning for this is so that you can install multiple instances of the same chart in the same namespace. I initially applied this convention to all metadata.name values, but then I discovered that the operator depends on certain names being constant, such as the service account created for the Kafka broker pods. In this PR I only use the templated name for the deployment resource.
I took a look at other charts in kubeapps and noticed a mix of charts enforcing and not enforcing this requirement. I haven't yet received an answer to my questions about when it's appropriate (i.e. what type of resource in what context) to use the templated name, but I'll follow up by asking on the helm mailing list and get back to you.
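For reference, the fullname helper discussed here is standard Helm boilerplate; a typical definition in _helpers.tpl looks roughly like this (the exact contents in the PR may differ):

```yaml
{{- define "strimzi.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
```

The trunc/trimSuffix keeps the result within the 63-character limit that Kubernetes imposes on most resource names.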
.travis.yml (outdated)
topic
Can we parameterize this so it points people to the right version of the docs?
This is tricky from the build perspective because the CRD resources are themselves generated from the Java model in api. We have to avoid having screeds of YAML which need to be hand-maintained. We could perhaps generate the example resources from the helm charts, but to do that for the CRD resources we'd need to be generating the resources with these extra labels. Which is doable. I assume that, apart from these Helm-specific labels, the rest of the CRD templates are just a verbatim copy of what's currently in examples?
@tombentley Yes I just copied the latest from ./examples/install/cluster-operator/. The main differences are the additions to metadata and Deployment in 08-deployment.yaml.
Can you point me to the process in the build that generates the CRD? I can take a crack at making these steps of the build process work together.
We're aware of users who do this, but I wouldn't consider it necessary to have a chart for the topic operator in this PR.
@tombentley The reason I suggested creating the chart for the topic-operator as well was to remain consistent with how the resources are templated during the build process under
e3650ef → c06e5e4 (force-push)
Hi @ppatierno @tombentley. I added a system test, but I have some new questions. Another review would be good too.
It looks like some tests in
api/pom.xml (outdated)
We can consolidate this later. I just added the 2nd execution to get fresh CRDs in the helm chart.
b3c57c3 → db71fcc (force-push)
Hi @seglo, I'll do a proper review tomorrow, but I'll answer this one now:
Setting up a CO instance takes quite a while, so the idea of putting the annotation at the class level is that all the test methods in the class share a CO instance, thus sharing the cost of setting it up. This is important, as we've found over time that we can easily hit the 50-minute limit on Travis. If you want to set up the CO in a different way, then maybe you can achieve the same thing by subclassing the existing test but changing the class annotation in the subclass. It's not ideal, but it might be enough for the time being.
@tombentley Got it. That makes sense.
Yes, that would be a good place to end up, but I think it will be simpler to get there in steps.
tombentley left a comment
One thing we need to avoid is duplicating the installation resources (currently in examples and helm-charts/strimzi-kafka-operator/templates). If you tweak the CrdGenerator to be able to supply arbitrary metadata.labels to be included in the generated output, you could then generate the template CRDs. Then we could see whether we can use helm template to generate the examples. Wdyt?
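Assuming Helm 2 syntax, the generation step proposed here might look roughly like this (output path hypothetical):

```shell
# Render the chart templates locally, without Tiller, into plain YAML
# that could replace the hand-maintained examples.
helm template --name strimzi helm-charts/strimzi-kafka-operator \
  > examples/install/cluster-operator/rendered.yaml
```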
HACKING.md (outdated)
Since helm is now used by the build process, that's something we should document. I know we've not documented the existing build requirements here until now, but could you add a section for that to this HACKING.md: make, mvn, helm, asciidoctor for the docs, I suppose.
I'll add a section for prerequisites.
"512Mi" etc. should probably be constants, so they get modified in tandem between the helm and examples installations.
I'll make them constants.
d35a395 → 1ca710c (force-push)
@tombentley @ppatierno I think I have all the pieces in place now and am ready for a full review. Some notes.
@seglo thanks! Regarding the message you got for failing tests: that line is related to the logger method, which is not able to print the log when an error occurs due to the addition of one more container (as you said) in a pod; btw, it's not the cause of the test failure.
@seglo actually, the problem with the log was already fixed by @tombentley a few days ago; maybe you should rebase against the latest master.
6402000 → 893d781 (force-push)
7ca79b5 → c0f92df (force-push)
@ppatierno The
@seglo I'm sorry, but due to new changes in master there are new conflicts :(
438e60e → 97157bc (force-push)
@ppatierno I rebased and resolved conflicts, but I'll need to do more troubleshooting to figure out the new
@seglo The
I would like to merge this quickly to avoid more conflicts and rebasing :-). So I had a look at this and noticed two things:
I think I fixed both of these in my branch. Maybe you can have a look at it and use it for your PR.
@scholzj Thanks a lot for taking the time to get the tests passing. It helped me fix the underlying issue I had. I would also like to get this merged in ASAP, because it touches a big chunk of your release process and the management of the cluster operator resource files, which see a lot of changes. Regarding the two things you noticed:
I had originally subclassed
It looks like the underlying issue was related to not deploying the new
The Tiller service account isn't necessary to run the tests on Minikube, but I think it would be useful to include (though it won't be required once Helm 3 is released, since Tiller will no longer exist). The reason I didn't keep the part that wasn't merged in was due to an issue with loading the resource file properly (
Thanks for the explanations. We will have a look at it right in the morning to get it merged without any further rebasing. However, what makes me curious is why the service account should not be needed. Without it, it was using just the default account, which IMHO should have no rights by default. However, you are right that the RBAC error I saw could also have been caused by the missing CRD, as I noticed it only after adding the service account. So as long as the tests pass it should be fine. Thanks for the PR.
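For completeness, the usual Helm 2 Tiller setup under discussion is a service account bound to cluster-admin. A sketch of the manifests (names follow common convention, not necessarily this PR):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
```

Tiller is then initialized with `helm init --service-account tiller`.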
ppatierno left a comment
LGTM. @seglo thanks for this PR!
tombentley left a comment
LGTM. Let's not delay merging this any longer. @seglo, if you could open a PR to fix the typo and possibly move the KafkaUser to the 'normal' CrdGenerator execution, that would be great.
```xml
<argument>io.strimzi.api.kafka.model.KafkaConnectAssembly=${pom.basedir}${file.separator}..${file.separator}helm-charts${file.separator}strimzi-kafka-operator${file.separator}templates${file.separator}04-Crd-kafkaconnect.yaml</argument>
<argument>io.strimzi.api.kafka.model.KafkaConnectS2IAssembly=${pom.basedir}${file.separator}..${file.separator}helm-charts${file.separator}strimzi-kafka-operator${file.separator}templates${file.separator}04-Crd-kafkaconnects2i.yaml</argument>
<argument>io.strimzi.api.kafka.model.KafkaTopic=${pom.basedir}${file.separator}..${file.separator}helm-charts${file.separator}strimzi-kafka-operator${file.separator}templates${file.separator}04-Crd-kafkatopic.yaml</argument>
<argument>io.strimzi.api.kafka.model.KafkaUser=${pom.basedir}${file.separator}..${file.separator}helm-charts${file.separator}strimzi-kafka-operator${file.separator}templates${file.separator}04-Crd-kafkauser.yaml</argument>
```
I don't think KafkaUser needs to be here (@scholzj?)
| Parameter | Description | Default |
|---|---|---|
| `image.repository` | Cluster Operator image repository | `strimzi` |
| `image.name` | Cluster Operator image name | `cluster-operator` |
| `image.tag` | Cluster Operator image tag | `latest` |
| `image.imagePullPolicy` | Cluster Operator image pull policy | `IfNotPrsent` |
IfNotPresent
Great. Thanks! @tombentley I'll create a PR to fix that typo. There are some follow-up tasks I would like to suggest.
WDYT? @ppatierno @scholzj @tombentley
+1 on hosting a helm repo on the strimzi website. I'm just reading about the kubeapps contribution guidelines to understand that better.
Type of change
Enhancement / new feature
Description
Support Helm Charts as a means to install Strimzi into a Kubernetes cluster. #539
Checklist
Please go through this checklist and make sure all applicable tasks have been done
Check RBAC rights for Kubernetes / OpenShift roles
Try your changes from a Pod inside your Kubernetes and OpenShift cluster, not just locally