Support containerd as the container runtime #1048
Conversation
It is tested with microk8s and minikube with containerd. k3s will be tested
Not sure I agree with the deletion of the node agent:
I guess part of the reason for this is the same reason we are using different images to interface with different CRIs? #1048 (comment) Is the requirement for the cluster nodes to have a homogeneous runtime dictated by a lack of time (i.e. to keep scope down) or by a technical requirement?
I agree, this has that downside, but if we want to use the node-agent we need some mechanism to delay pod deployment until the node-agent has informed us that the image has been pulled.
It is a technical requirement: we don't know on which node the pod will end up, so in a non-homogeneous cluster Orchest does not know which image to use.
This would only affect the "puller" part of the node-agent, so I think we should at least retain the "deleter" part of it and aim to keep the puller part of it if possible. To be more precise, it only affects environment images and custom jupyter images (essentially all images that are pushed to the internal registry), so other images that we are pre-pulling are not part of the issue and would still benefit from being pre-pulled. I think we have some architectural issues/questions here to solve:
Note that, assuming we keep the node-agent around, the new images introduced by this PR should be part of the set of images that the node puller and/or deleter considers.
I see, makes sense. I think there might be some hacky workarounds for this, but I'd rather keep the abstraction; I guess not many people have heterogeneous clusters anyway (?).
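For what it's worth, one such hacky workaround (explicitly not what this PR does, and all names below are made up) could be to label nodes by their runtime and have Orchest-managed pods select on that label:

```yaml
# Hypothetical sketch only: the label key/value and image are invented for illustration.
# Operators would first label nodes, e.g.:
#   kubectl label node <node-name> orchest.io/container-runtime=containerd
apiVersion: v1
kind: Pod
metadata:
  name: pipeline-step
spec:
  nodeSelector:
    orchest.io/container-runtime: containerd  # pin the pod to containerd nodes
  containers:
    - name: step
      image: orchest-env-example:latest
```

That would keep scheduling homogeneous per pod without requiring a homogeneous cluster, at the cost of managing node labels.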
Note that users can create custom Jupyter images that are pushed to the internal registry, so the same kind of pre-pull behaviour should take place (you should look into the sessions
Would Jupyter kernels actually work? I think
Another thing that might need looking into: user services allow you to use an environment image built in Orchest as a service; the logic happens at
Another thing I would like to mention is the path of the init container directories, e.g.
```diff
@@ -164,13 +164,13 @@ controller:
   # -- Specifies the container runtime interface to use (one of: `docker`, `kubelet`, `k8sapi`, `pns`, `emissary`)
   ## Ref: https://argoproj.github.io/argo-workflows/workflow-executors/
-  containerRuntimeExecutor: docker
+  containerRuntimeExecutor: emissary
```
If a user does `orchest update`, does it automatically pick up the `emissary` executor? I guess so, right?
I think not, the third parties are only checked at installation time.
I thought that changes to third parties would also be applied on `orchest update`? Otherwise `orchest update` would not actually 100% update Orchest.
I feel the update should indeed take care of updating third parties as much as possible
I spent quite some time on it; it is quite complex to solve, as we use the OrchestCluster hash to detect whether the cluster needs updating or not, and this part is not part of the OrchestCluster. So the way we detect upgrades needs to be changed. I implemented that part, but I'm afraid it might take more time.
As it does not affect our current users, I suggest creating a ticket so we can properly fix it later.
services/orchest-controller/pkg/controller/orchestcluster/cluster_utils.go
Some questions I have (in no particular order, but I figured the numbers could help to reference the respective question):
(1) I am not so sure about the deletion of the
(2) @nhaghighat You said in #1048 (comment) that you tested
(3) The newly added
(4) As mentioned in #1048 (comment), we might want to change the path from
(5) Indeed the init container needs to be added to more places:
@fruttasecca: You mean concurrent pull of different images or of the same image?
I also like the idea of pre-pulling, but I think at the moment it does not bring us much: most of our users have a limited number of nodes and the images will be pulled by the first step container, so for the rest of the steps the image is probably already present on the node.
If we decide to keep node-agent that would also be my preference.
The kubelet has functionality to regularly remove unused, untagged images from the nodes.
No, I have changed the OrchestCluster CRD; I will add an example and a section about it to the docs.
The image size will not grow that much if we have both in the same image: now it is about 50MB each, then it would be 100MB. I think the ultimate solution to all the mentioned issues is to eventually have our own scheduler.
same, was thinking about race conditions etc.
I don't feel very strongly about the need for a pre-puller like the current state of the node-agent, although I think it still brings a significant UX improvement on first use(s), and first impressions matter, though I don't have anything quantifiable. About pulling images where they are not needed: I get the sentiment, but any serious multi-node use of Orchest will eventually lead to all "types" of pods being scheduled on different nodes, so I am not sure we should weigh this point very much.
same :)
I think it might help us reduce complexity by offloading more logic to k8s. I've skimmed through the docs very quickly and I think it has some downsides:
So imo, even if this kubelet functionality had come up during the implementation/review of the node-deleter, I am not sure we would have gone with it.
Agree
I agree, or at least I agree that the current solution is not the final one, but when it comes to managing scope/roadmap/timelines I think this init container "hack" might be the right call in order to save time.
Agree, it was the right call.
Garbage collecting of images and containers is the default behavior of the kubelet, so it is safe to assume users have not disabled it.
There are different TTL and threshold configurations. I think the defaults are fine for us.
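For reference, a minimal sketch of the kubelet settings in question (the values shown are, as far as I know, the upstream defaults; the field names come from the `KubeletConfiguration` API):

```yaml
# Image garbage-collection knobs of the kubelet (defaults shown).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageMinimumGCAge: 2m            # minimum age of an unused image before it may be removed
imageGCHighThresholdPercent: 85  # disk usage percentage that always triggers image GC
imageGCLowThresholdPercent: 80   # GC deletes images until disk usage drops below this
```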
This is true, but it can still happen with
For example, more information about it can be found here:
Change two-column design to one-column design.
So regarding #1048 (comment) on
Thus, for the time being, I guess we should not make changes to third-party helm charts, as existing users won't get the update. Luckily, adding this functionality to the
services/orchest-controller/deploy/examples/example-orchestcluster-micrik8s.yaml
That is true, this change is not reflected for existing users.
I agree, but that requires detecting changes to the third parties, which is not that easy, as we use an in-container values file for that.
Ohhyeahhhh!! Finally 😉 🕺
Great job 💯
Description
This PR enables working with the containerd runtime by introducing an init container to pull images; the controller detects the runtime and configures Orchest accordingly.
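To make the idea concrete, here is a rough, hand-written sketch of what such a pod could look like; the actual manifests are generated by the orchest-controller, and the helper image name, environment variable, and socket paths below are assumptions rather than the PR's exact implementation:

```yaml
# Hedged sketch: an init container that pulls the environment image onto the node
# before the step container starts. All names/paths here are illustrative guesses.
apiVersion: v1
kind: Pod
metadata:
  name: pipeline-step
spec:
  initContainers:
    - name: image-puller
      image: orchest/image-puller:v2022.06.4        # hypothetical helper image
      env:
        - name: IMAGE_TO_PULL                       # hypothetical variable name
          value: registry.orchest.svc/orchest-env-example:latest
      volumeMounts:
        - name: containerd-socket
          mountPath: /run/containerd/containerd.sock
  containers:
    - name: step
      image: registry.orchest.svc/orchest-env-example:latest
  volumes:
    - name: containerd-socket
      hostPath:
        # On microk8s the socket lives under /var/snap/microk8s/common/run/ (see below).
        path: /var/snap/microk8s/common/run/containerd.sock
        type: Socket
```

The runtime detection in the controller would then decide whether to inject such an init container and which socket path to mount.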
In order to test this PR you need a cluster with the containerd runtime; microk8s is suggested. After microk8s is installed (here), the following addons need to be enabled.
In order to be able to push images to the microk8s node, after rebuilding all the images with some valid tag (for example `v2022.06.4`), you can save them to a tar file via the following command:

```bash
docker save $(docker images | awk '{if ($1 ~ /^orchest\//) new_var=sprintf("%s:%s", $1, $2); print new_var}' | grep v2022.06.4 | sort | uniq) -o orchest-images.tar
```
Then this tar file can be shipped to the microk8s node via scp:

```bash
scp ./orchest-images.tar {your_user}@${microk8s node ip}:~/
```
Then, inside the microk8s node, you can import the images via the following command (note that `ctr` has to be installed, binaries can be found here):

```bash
sudo ctr -n k8s.io -a /var/snap/microk8s/common/run/containerd.sock i import orchest-images.tar
# Or use microk8s ctr
microk8s ctr --namespace k8s.io --address /var/snap/microk8s/common/run/containerd.sock image import orchest-images.tar
```
Then Orchest can be installed with the orchest-cli, with the following command:
Note: the manifests must be generated via `make manifestgen` in the orchest-controller directory.

Checklist
- The PR targets `dev` instead of `master`.
- In case I changed `requirements.in`, I have run `pip-compile` to update the corresponding `requirements.txt`.
- In case I changed `models.py`, I have performed the appropriate database migrations (refer to the DB migration docs).
- In case I changed the `orchest-sdk`, I followed its release checklist.
- In case I changed the `orchest-cli`, I followed its release checklist.

update thirdparties on update