How to improve run time on larger desired state files? #22
Comments
Thank you @bgeesaman for reporting this. I have been facing this situation recently as well and I was planning to focus on performance improvement in the next release.
In the meantime, the way I currently deal with this situation is by splitting my desired state into multiple files and configuring how they are run through the CI pipeline. For example, the desired state for production apps lives on a release branch, the desired state for dev apps on another branch, and so on .. I sometimes also use scripting in the pipeline to check whether a desired state file has changed in the last commit before running Helmsman on it.
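As an illustration only, one of those split files might look roughly like this (the file name and the app below are made up, not taken from an actual setup):

# production.toml -- hypothetical desired state file holding only production apps,
# run by the CI job on the release branch
[apps]
[apps.frontend]
name = "frontend" # hypothetical app; should be unique across all apps
description = "frontend"
namespace = "production"
enabled = true
chart = "stable/nginx"
version = "0.14.3"
valuesFile = "" # leaving it empty uses the default chart values
purge = false
test = false
protected = true

A second file (e.g. staging.toml) would hold the non-production apps and be wired to a different branch or pipeline job.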
@sami-alajrami Thanks for that. It's pretty much what I was thinking, too. Reducing the number of calls to helm where possible is great, but the real pain is how long helm takes to run each command. If I were to break up my deploys, say, per namespace, so that I'd have 5-6 apps instead of 30, it'd be a good bit faster, but I think I'd still have 4+ second calls to helm because of a shared Tiller deployment. What about supporting a Tiller per namespace? That would also have the benefit of making Helmsman available to "soft multi-tenant" clusters.
That's a good idea @bgeesaman .. how does the format below sound?
Then, when running helm commands, Helmsman would implicitly either use the shared Tiller (if the desired namespace does not have its own) or use the namespace's own Tiller (using helm's --tiller-namespace flag).
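As a rough sketch only (the key names here mirror the ones discussed later in this thread, and the apps are placeholders):

[namespaces]
[namespaces.staging]
installTiller = true # Helmsman deploys a Tiller into this namespace

[apps]
[apps.jenkins]
name = "jenkins"
namespace = "staging" # has its own Tiller, so helm would be called with --tiller-namespace staging
enabled = true
chart = "stable/jenkins"
version = "0.14.3"
[apps.nginx]
name = "nginx"
namespace = "dev" # no Tiller of its own, so the shared kube-system Tiller would be used
enabled = true
chart = "stable/nginx"
version = "0.14.3"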
I was thinking it would be an app-level option, something like:

[apps]
[apps.jenkins]
name = "jenkins" # should be unique across all apps
description = "jenkins"
namespace = "staging" # maps to the namespace as defined in environments above
tillerNamespace = "staging"
enabled = true # change to false if you want to delete this app release [empty = false]
chart = "stable/jenkins" # changing the chart name means delete and recreate this chart
version = "0.14.3" # chart version
valuesFile = "" # leaving it empty uses the default chart values
purge = false # will only be considered when there is a delete operation
test = false # run the tests when this release is installed for the first time only
protected = true
[apps.nginx]
name = "nginx" # should be unique across all apps
description = "nginx"
namespace = "staging" # maps to the namespace as defined in environments above
enabled = true # change to false if you want to delete this app release [empty = false]
chart = "stable/nginx" # changing the chart name means delete and recreate this chart
version = "0.14.3" # chart version
valuesFile = "" # leaving it empty uses the default chart values
purge = false # will only be considered when there is a delete operation
test = false # run the tests when this release is installed for the first time only
protected = true
wait = true

Where the tillerNamespace key is optional; an app that doesn't set it (like nginx above) would fall back to the shared Tiller.

Also, I'm curious to know what it would take to provide the TLS configuration items needed to communicate securely. Thanks!
Tiller would need to be deployed into a namespace before using it for that namespace's apps, so this could be declared in the namespaces section:

[namespaces]
[namespaces.staging]
protected = false
installTiller = true

And then, in the apps section, tillerNamespace becomes optional:

[apps.jenkins]
name = "jenkins"
description = "jenkins"
namespace = "staging" # maps to the namespace as defined in namespaces above
tillerNamespace = "staging" # optional and can point to another namespace, e.g. dev

I.e. eliminating the need to define tillerNamespace for apps whose namespace deploys its own Tiller. Does that make sense? @bgeesaman
That seems reasonable as it would skip over an already deployed Tiller in that namespace and not re-init it unnecessarily. I deploy Tiller with TLS enabled, so being able to configure that would be great.
@bgeesaman could you please try the latest release? It adds the options below:

[settings]
# other options
serviceAccount = "foo" # the service account used to deploy tiller into namespaces if they do not have a specific service account defined for them in the namespaces section below. If this one is not defined, the namespace default service account is used
storageBackend = "secret" # default is configMap
[namespaces]
[namespaces.production]
protected = true
installTiller = true # the foo service account above is used
[namespaces.staging]
protected = false
installTiller = true
tillerServiceAccount = "tiller-staging" # should already exist in the staging namespace
caCert = "secrets/ca.cert.pem" # or an env var, e.g. "$CA_CERT_PATH"
tillerCert = "secrets/tiller.cert.pem" # or S3 bucket s3://mybucket/tiller.crt
tillerKey = "secrets/tiller.key.pem" # or GCS bucket gs://mybucket/tiller.key
clientCert = "secrets/helm.cert.pem"
clientKey = "secrets/helm.key.pem"
[apps]
# jenkins will be deployed using the Tiller deployed in the staging namespace
# if the staging namespace isn't configured to deploy Tiller, the kube-system Tiller is used
# Tiller will always be deployed into kube-system regardless of whether it is defined in the namespaces section. You can, however, configure it to use a specific service account and TLS
[apps.jenkins]
name = "jenkins"
description = "jenkins"
namespace = "staging"
enabled = true
chart = "stable/jenkins"
version = "0.14.3"
valuesFile = ""
Regarding performance:
It would be great if you could test this release in your context and let me know how the performance improves when using multiple Tillers. Thanks!
@sami-alajrami This is great to hear! I'll try this out today and let you know shortly.
I have 28 applications deployed via a single Tiller in kube-system, and a full run of all of them was taking well over 8 minutes.

With the new version, I first re-ran everything against that single shared Tiller. Next, I created 5 new namespaces, each with its own Tiller, and spread the applications across them.

Using separate Tillers per namespace and limiting each to about 5 applications, deploying the 28 applications got dramatically faster. Kudos and THANK YOU @sami-alajrami
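For reference, a rough sketch of that kind of layout; the namespace names and the way the apps are spread below are hypothetical:

[namespaces]
[namespaces.team-a]
installTiller = true # each namespace gets its own Tiller
[namespaces.team-b]
installTiller = true

[apps]
[apps.app-one]
name = "app-one" # hypothetical app, managed by the team-a Tiller
namespace = "team-a"
enabled = true
chart = "stable/nginx"
version = "0.14.3"
[apps.app-two]
name = "app-two" # hypothetical app, managed by the team-b Tiller
namespace = "team-b"
enabled = true
chart = "stable/nginx"
version = "0.14.3"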
Awesome! Glad to hear that .. And thanks for the great tests/stats 👍
First, thank you for sharing helmsman with the community. When I kicked the tires on it using a smallish toml configuration with 2 apps, performance of a full run took a handful of seconds. Everything was good. But now, with ~30 helm charts defined and installed in a 10-node cluster, a CI run takes well over 8 minutes. For bumping, say, a container version in a values.yaml for one app, this greatly slows iteration speed. It appears to be an ever-growing problem with regard to how many calls to helm there are and the increased time each helm call takes. The logs show quite a few helm calls per chart, and each one takes 4 seconds to return.

Also, with so many long runs, there is now a much higher likelihood of a race condition between multiple overlapping CI runs even 2+ minutes apart. helmsman seems to gather its state in the beginning and then implement the work, so the second CI job will notice a chart is not there and then later go to implement it, only to error out because the first CI run placed it there in the meantime.

Any ideas/strategies for improving overall performance of each run? I'd love to see things get under a minute, if possible. Thanks!