-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#9 Use a C* build from the source as the target cluster image #34
#9 Use a C* build from the source as the target cluster image #34
Conversation
…fly based on a git hash.
…for overriding the C* image used by the Operator.
…r-overriding-cass-version
…some recent change upstream had broken the build); Added missing entry-point script.
…ion for the target cluster; Fixed the `jvm-server-options` param in the target CassandraDatacenter template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable for what it does, but I have some serious concerns about bringing all this code into the code base rather than referencing external sources. I think I have an even larger concern about the increase in work that we're now asking the k8s cluster to do; we're now adding logic to download, build and package C* into Docker containers. This cost is paid every time we run the Helm chart (unless there's a bypass or some kind of opt-out logic I missed). I'm very concerned that this general approach just won't scale over time as more and more features are added.
args: | ||
- git clone https://github.com/apache/cassandra.git cassandra; | ||
cd cassandra; | ||
git checkout {{`{{inputs.parameters.commit_hash}}`}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: in the case of, say, "cassandra-4.0-beta2" (which looks to be the default given values.yaml) this represents a tag rather than a commit hash. "tree-ish" is probably the right name here (since that's ultimately what's being passed to "git checkout") but that name... isn't great. This name could be a branch, it could be a tag, or it could indeed be a commit. Maybe... "git_identifier" or something?
That said, the term "tree-ish" is reasonably well understood, at least within the git community. So maybe we just go with that... ?
I guess I lean a bit more towards "git_identifier"... but I could be swayed either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@absurdfarce I gave this some more thought as well, I wasn't really happy with "hash" either for the same reasons you mentioned.
Tree-ish is a bit confusing to me actually (it makes me think of a path) even though that's what I see in the "git checkout" command help. Strictly speaking, it seems like "commit" and "tree" are two different object types in git: https://matthew-brett.github.io/curious-git/git_object_types.html.
Having said that, I'm also leaning towards "git_identifier" or maybe even "git_checkout" to be self-explanatory (either way, this will be documented).
Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think "git_identifier" is the way to go here. We're talking about something in the git log so "checkout" doesn't really seem right since that's not really what the user is specifying with that value. That value specifies a tag or a specific commit, something like that... and "identifier" seems general enough to account for all of those options.
name: cassandra-dockerfile-configmap | ||
namespace: {{ .Values.namespace }} | ||
data: | ||
Dockerfile: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The remainder of this file represents an implicit import of the logic in https://github.com/docker-library/cassandra/blob/b6554329fe112243d16861b441067227eedcbdf9/4.0/Dockerfile and https://github.com/docker-library/cassandra/blob/c03e9828c699b8c22bcb17f82356ae90123d1d10/4.0/docker-entrypoint.sh into the project. I'm not sure this is ideal for the following reasons:
- Fixes/enhancements to these files over time won't be automatically picked up and re-used here
- These files aren't easily identifiable in the context in which they serve, which makes them harder to find and review. If we had a file named "Dockerfile" in a directory named "cassandra-4.0" the intent is pretty clear, but as it stands we have to get all of this from a config map.
Is there some reason we can't grab these files directly (from github, via git, or via some other mechanism) and then build the YAML programmatically to include them here? I'm just concerned we're seeing ourselves up for a maintenance nightmare by bringing this stuff in in this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Dockerfile in Adelphi is a modified version from the one you linked in docker-library. The original own downloads a prebuilt tarball from the Apache site, while ours copies it from the volume where we buid C* from the source.
Usually these files are "templates" that we modify to our needs, the DataStax Management API follows a similar approach: https://github.com/datastax/management-api-for-apache-cassandra/blob/master/upstream-4.0/Dockerfile
Adelphi is all about building images and orchestrating containers, so I think it makes sense to have those things under our control instead of having it change under our feet.
Also, that Dockerfile is pretty straightforward and looks stable, it hasn't changed much in docker-library since older versions of C*.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another example of this approach are the CRDs manifests for C* Operator and Argo.
We can't import them as subcharts in Helm (because they're CRDs) and they are also tweaked for changing RoleBindings and stuff like that. The recommendation I heard from Jim Dickinson (from the C* Operator project) is that these files are mostly templates that have to be modified depending on one's needs.
@@ -1,19 +1,44 @@ | |||
apiVersion: argoproj.io/v1alpha1 | |||
kind: Workflow | |||
metadata: | |||
name: workflow-nosqlbench-cassdiff-{{ .Release.Revision }} | |||
name: workflow-adelphi-{{ .Release.Revision }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
args: | ||
- --dockerfile=/dockerfile/Dockerfile | ||
- --context=dir:///workspace | ||
- {{`--destination={{inputs.parameters.registry_ip}}:30000/cassandra-quality/cassandra`}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we want this to be "adelphi" rather than "cassandra-quality"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in the follow-up PR #35
args: | ||
- --dockerfile=/generated/Dockerfile | ||
- --context=dir:///build | ||
- {{`--destination={{inputs.parameters.registry_ip}}:30000/cassandra-quality/management-api-for-apache-cassandra`}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: s/cassandra-quality/adelphi/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in the follow-up PR #35
I tried to address that concern in the comments above, but simply put the idea is that IMO Adelphi should own the image building process and we can't use the referenced resources directly because they do other things that we don't want to do. We have modified versions.
For serious testing, most people will run this on some cloud environment. In my GKE test, it took 4 minutes to build C* from the source and 3 minutes to build the C* Management API, I think that's mostly negligible for workload generation tests that will run for maybe 30min on each anonymized schema (so it could be a few hours total in CI). The image has to come from somewhere before running the workflow and it has to be built from the source since we'll want to test trunk or a dev branch. In a follow-up work we can let the user specify a pre-built tarball for local testing (let's investigate that in #32), but I still think there's value in building C* in the k8s cluster for real testing for the following reasons: @absurdfarce Let me know if that clarifies things a bit or if you still have any questions or suggestions for improvement. |
…r-overriding-cass-version # Conflicts: # helm/adelphi/templates/spark-image-builder.yaml
@absurdfarce I just fixed a conflict with the base branch (it turns out I had also added that same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing this with @grighetto recently the plan is to move towards Docker images (where possible) for C* versions falling back to source builds only when necessary. Approving this PR to get source builds in place; we'll add the logic around using the image (if available) later.
Thanks @absurdfarce. For future reference, these are the follow-up improvement tickets: |
┆Issue is synchronized with this Asana task by Unito