Helm chart for Flyte #550
@rstanevich I have been looking at Helm and I am liking it. I will review your PR, and I feel we can build on it. One of the problems seems to be how to use remote configurations, such as the PyTorch operator.
@@ -0,0 +1,136 @@
{{- if .Values.contour.enabled }}
On EKS you can now use ALB.
Oh, yes, I've just found this announcement: https://aws.amazon.com/blogs/aws/new-application-load-balancer-support-for-end-to-end-http-2-and-grpc/
It looks like this is possible now. It also requires aws-load-balancer-controller 2.0+ to be installed in Kubernetes.
Thank you!
Yes, I already set it up on a personal account and it works really well. I will be updating the EKS manifests.
Since we're stuck with Helm for the time being we'd like to contribute here. We could help test the chart and perhaps work on the GKE config.
I still haven't tried the new AWS ALB feature with gRPC support. Provisioning it in Kubernetes requires the new AWS Load Balancer Controller. I need some time to set up my own devbox with the new controller to test this.
@sbrunk I would love to help you with some testing as well.
@kumare3 @rstanevich what do you think about an approach that minimizes the diff between
That way we can make sure the Helm installation is on par with what we have right now, and it could also provide a smoother upgrade path. My first crude try looks like this:

    gh pr checkout 550
    kustomize build kustomize/base/single_cluster/complete > base_deployment.yaml
    kubectl apply -f base_deployment.yaml
    helm template . -f values-sandbox.yaml | kubectl diff -f -

Then incrementally work through the errors (and the diff later), change the Helm chart accordingly to minimize the diff and run
A slightly better approach could be using a structural diff of the rendered YAML output. That's because
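A rough sketch of what such a structural diff could look like, for illustration only. It assumes both rendered outputs have already been parsed into lists of dictionaries (for example with PyYAML's `yaml.safe_load_all`, which is not used in this thread) and that resources can be matched by kind, namespace, and name. The function names and the report format are hypothetical and not part of the chart or of any tool mentioned here.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

# (kind, namespace, name) identifies a resource across both outputs.
Key = Tuple[str, str, str]


def index_manifests(docs: Iterable[Optional[Dict[str, Any]]]) -> Dict[Key, Dict[str, Any]]:
    """Index parsed manifests by (kind, namespace, name), skipping empty documents."""
    indexed: Dict[Key, Dict[str, Any]] = {}
    for doc in docs:
        if not doc:
            continue
        meta = doc.get("metadata", {})
        # Assumption: resources without a namespace are treated as "default".
        key = (doc.get("kind", ""), meta.get("namespace", "default"), meta.get("name", ""))
        indexed[key] = doc
    return indexed


def diff_values(old: Any, new: Any, path: str = "") -> List[str]:
    """Recursively compare two parsed values; key order in mappings is ignored."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes: List[str] = []
        for k in sorted(set(old) | set(new), key=str):
            sub = f"{path}.{k}" if path else str(k)
            if k not in old:
                changes.append(f"+ {sub}")
            elif k not in new:
                changes.append(f"- {sub}")
            else:
                changes.extend(diff_values(old[k], new[k], sub))
        return changes
    if isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        changes = []
        for i, (a, b) in enumerate(zip(old, new)):
            changes.extend(diff_values(a, b, f"{path}[{i}]"))
        return changes
    if old != new:
        return [f"~ {path}: {old!r} -> {new!r}"]
    return []


def diff_manifests(base_docs, chart_docs) -> Dict[Key, List[str]]:
    """Report resources that are missing on one side or differ in content."""
    base = index_manifests(base_docs)
    chart = index_manifests(chart_docs)
    report: Dict[Key, List[str]] = {}
    for key in sorted(set(base) | set(chart)):
        if key not in chart:
            report[key] = ["- resource only in kustomize output"]
        elif key not in base:
            report[key] = ["+ resource only in helm output"]
        else:
            changes = diff_values(base[key], chart[key])
            if changes:
                report[key] = changes
    return report
```

Compared with a textual diff, this ignores ordering of mapping keys and of resources within the rendered stream, so only real differences in content show up. Lists are still compared position by position, which may be too strict for fields such as container environment variables.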
@sbrunk, do you mean we just need to compare the generated Helm manifest and
If the main goal is just to check for a smooth upgrade from the kustomize installation to the Helm installation, I can check it out. And an obvious note: if we'd like to use
@rstanevich yes, I meant to use the diff only to help during development of the chart. It actually came up when I was looking into this PR to see how far you got compared with the kustomize-based deployment, i.e. whether the sandbox is on par. I guess this is something you can answer, too. 😉 For us the upgrade path is actually not important because we don't run Flyte in prod yet, but I guess for most people running Flyte in prod it's quite important.
Resolved in #916
Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>
* fix tag issue in ci Signed-off-by: Yuvraj <evalsocket@users.noreply.github.com> * remove welcome bot from boilerplate config Signed-off-by: Yuvraj <evalsocket@users.noreply.github.com> Co-authored-by: Yuvraj <evalsocket@users.noreply.github.com>
* Infer GOOS and GOARCH from environment Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * Multiarch builds for flytescheduler Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * fix makefile to read variables from environment and overrides Signed-off-by: Jeev B <jeevb@users.noreply.github.com> --------- Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
* updated flyteidl to local to get ArrayNode Signed-off-by: Daniel Rammer <daniel@union.ai> * added boilerplate to support ArrayNode Signed-off-by: Daniel Rammer <daniel@union.ai> * pushing forward Signed-off-by: Daniel Rammer <daniel@union.ai> * refactored node executor interfaces to fix dependency cycle Signed-off-by: Daniel Rammer <daniel@union.ai> * refactoring almost complete Signed-off-by: Daniel Rammer <daniel@union.ai> * refactor complete Signed-off-by: Daniel Rammer <daniel@union.ai> * supporting environment variables Signed-off-by: Daniel Rammer <daniel@union.ai> * minimum viable product Signed-off-by: Daniel Rammer <daniel@union.ai> * update print statements for debugging Signed-off-by: Daniel Rammer <daniel@union.ai> * massive refactor fixing NodeExecutionContext override for ArrayNode Signed-off-by: Daniel Rammer <daniel@union.ai> * refactoring TODOs Signed-off-by: Daniel Rammer <daniel@union.ai> * subnode retries working Signed-off-by: Daniel Rammer <daniel@union.ai> * parallelism working Signed-off-by: Daniel Rammer <daniel@union.ai> * cache and cache_serialize working - first new functionality in maptask Signed-off-by: Daniel Rammer <daniel@union.ai> * adding implementation notes Signed-off-by: Daniel Rammer <daniel@union.ai> * removed eventing from subtasks Signed-off-by: Daniel Rammer <daniel@union.ai> * adding correct requirements Signed-off-by: Daniel Rammer <daniel@union.ai> * working end-2-end with flytekit Signed-off-by: Daniel Rammer <daniel@union.ai> * reporting output directory on success Signed-off-by: Daniel Rammer <daniel@union.ai> * fixed output directory append Signed-off-by: Daniel Rammer <daniel@union.ai> * mocking TaskTemplate interface to enable caching Signed-off-by: Daniel Rammer <daniel@union.ai> * capture failure reasons Signed-off-by: Daniel Rammer <daniel@union.ai> * wrapped up abort and finalize functionality Signed-off-by: Daniel Rammer <daniel@union.ai> * mocking initialization events Signed-off-by: Daniel Rammer 
<daniel@union.ai> * sending all events Signed-off-by: Daniel Rammer <daniel@union.ai> * minor refactoring of debug prints and formatting Signed-off-by: Daniel Rammer <daniel@union.ai> * intratask checkpointing working Signed-off-by: Daniel Rammer <daniel@union.ai> * support for and Signed-off-by: Daniel Rammer <daniel@union.ai> * setting node log ids correctly Signed-off-by: Daniel Rammer <daniel@union.ai> * reporting cache status Signed-off-by: Daniel Rammer <daniel@union.ai> * correctly setting subnode abort phase Signed-off-by: Daniel Rammer <daniel@union.ai> * removing dead code Signed-off-by: Daniel Rammer <daniel@union.ai> * cleaned up most random TODO items Signed-off-by: Daniel Rammer <daniel@union.ai> * refactored into new files Signed-off-by: Daniel Rammer <daniel@union.ai> * refactoring for ArrayNode unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * refactored for unit testing to allow creation of NodeExecutor in array package Signed-off-by: Daniel Rammer <daniel@union.ai> * first unit test for handling ArrayNodePhaseNone Signed-off-by: Daniel Rammer <daniel@union.ai> * most of executing unit tests completed Signed-off-by: Daniel Rammer <daniel@union.ai> * finished executing unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * finished succeeding unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * wrote failing phase unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * moving towards complete unit_test success Signed-off-by: Daniel Rammer <daniel@union.ai> * unit tests passing Signed-off-by: Daniel Rammer <daniel@union.ai> * fixed lint issues Signed-off-by: Daniel Rammer <daniel@union.ai> * updated flyteidl dep Signed-off-by: Daniel Rammer <daniel@union.ai> * added unit tests for Abort Signed-off-by: Daniel Rammer <daniel@union.ai> * adding unit test for Finalize Signed-off-by: Daniel Rammer <daniel@union.ai> * added utils unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * moved state structs to handler package 
Signed-off-by: Daniel Rammer <daniel@union.ai> * added docs Signed-off-by: Daniel Rammer <daniel@union.ai> * cleaned up abort event reporting Signed-off-by: Daniel Rammer <daniel@union.ai> * fixed RecordNodeEvent unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * removed taskEventRecorder from nodes package Signed-off-by: Daniel Rammer <daniel@union.ai> * adding interface checking for arraynode Signed-off-by: Daniel Rammer <daniel@union.ai> * added transform unit test Signed-off-by: Daniel Rammer <daniel@union.ai> * fixed input bindings issue Signed-off-by: Daniel Rammer <daniel@union.ai> * fixed unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * fixed unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * go generate Signed-off-by: Daniel Rammer <daniel@union.ai> * addressing random TODO Signed-off-by: Daniel Rammer <daniel@union.ai> * fixed unit tests Signed-off-by: Daniel Rammer <daniel@union.ai> * addressing pr comments Signed-off-by: Daniel Rammer <daniel@union.ai> --------- Signed-off-by: Daniel Rammer <daniel@union.ai>
This PR contains a Helm chart for Flyte with sandbox and EKS configurations.
The sandbox configuration (values-sandbox.yaml) is ready for deployment in Minikube. The EKS configuration (values-eks.yaml), however, should be edited before installation in the cloud: the S3 bucket, RDS hosts, IAM roles, secrets, etc. need to be configured and modified.