
Watch hooks logs and log release-checking status in helm install/upgrade #3481

Closed · distorhead opened this issue Feb 9, 2018 · 25 comments

@distorhead
Contributor

distorhead commented Feb 9, 2018

There are two issues related to log streaming that we are experiencing when using Helm.

Watch hooks logs

We have a project that uses Helm for deployment (https://github.com/flant/dapp). In our scheme, deployment runs as follows:

  • run helm install/upgrade in the main thread (blocking);
  • parse hook-job templates and show logs for jobs with the annotation dapp/watch-logs = true in a separate thread.

It works OK, but we strongly believe that showing logs is Helm's job.
We have done some work on this in Ruby (https://github.com/flant/dapp/blob/master/lib/dapp/kube/dapp/command/deploy.rb#L119) and can now use that experience to improve Helm.

So the idea is to use a hook-job annotation like watch-logs = true|false.
Say, by default none of the hooks are logged, i.e. watch-logs = false is assumed for all hooks; in that case Tiller streams logs only for the hooks annotated with watch-logs = true.
As a next step, an option could be added to the Helm client and Tiller to log all hooks by default; with that option Tiller would assume watch-logs = true for every hook.
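
For illustration, a minimal sketch of what such an annotated hook template might look like (the annotation is only proposed here and does not exist in Helm; it is written as helm/watch-logs, the name used later in this thread, and the job name, image and command are made up):

cat > templates/db-migrate-job.yaml <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade   # standard Helm hook annotation
    "helm/watch-logs": "true"                 # hypothetical annotation from this proposal: stream this hook's logs
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: alpine:3.9
          command: ["sh", "-c", "echo running migrations"]
EOF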

That log streaming could happen directly in the install / upgrade release gRPC action, using server-side streaming.

Check release state and show status during checking

Related to the logging issue, there is also the question of checking release state (--wait in Helm).

Release checking could also log status information about what is being checked and what we are waiting for. This does not have to be pod logs; it could be container statuses, or the target and current replica counts of a Deployment.

Again, in our project we have done some work on Deployment status watching (https://github.com/flant/dapp/blob/master/lib/dapp/kube/kubernetes/manager/deployment.rb#L33). After helm install / upgrade finishes and the thread that watches hook logs is done, we run a deployment-watching procedure.

It is similar to what --wait does, but it gives interactive information about what is being checked and shows, among other things, pod logs. That logging is not ideal, but the idea is that the release readiness check could work this way: poll the release checks and, at the same time, give some information to the user.

The main purpose of this logging is to show the end user enough information during the install / upgrade operation to diagnose a problem and fix the templates without having to dig around with kubectl.

Again, we strongly believe that Helm in --wait mode could do something like this to become a truly end-user Kubernetes deployment tool.


There is also the related issue #2298, but a separate helm command for showing logs would not make Helm a more user-friendly deployment tool that works out of the box.

@distorhead distorhead changed the title Watch hooks logs and print release checking status Watch hooks logs and log release checking status Feb 9, 2018
@distorhead distorhead changed the title Watch hooks logs and log release checking status Watch hooks logs and log release-checking status in helm install/upgrade Feb 9, 2018
@distorhead
Contributor Author

distorhead commented Feb 16, 2018

Here is an update to the list of related PRs about logging:

Here are my team's experiments with a first implementation: https://github.com/kubernetes/helm/pull/3479/files.

It contains all the layers needed to pass information through the chain: kube-client - KubeClient interface - tiller ReleaseServer - helm-client. The job-watching implementation in kube-client is the most complex part and contains experimental code for now.

This feature aims to bring the following possibilities to Helm users:

  • Give the user enough information to debug templates without additional kubectl invocations.
  • The user can enable a stream of information about hooks with the new helm/watch-logs annotation. Tiller will stream events and logs only for annotated hooks during helm install/helm upgrade, so no unnecessary data is shown or sent over the network.

This feature is useful for CI/CD, where Helm is used as an end-user deployment tool. Streaming all logs and Kubernetes system events could also be useful for some purposes, but that scope of release monitoring is not addressed in this PR.

Implementation details:

  • The Helm client will receive events such as LogChunk or userspace errors and will build its log output from these events.
  • The stream is oriented not towards raw logs or low-level server info, but towards userspace-level events about the resources being deployed, including log chunks for containers. These events are delivered not randomly, but in a logically ordered manner.

The way it works is the same as in #3263: a streaming response to the install/upgrade request. The main problem with this way of streaming is backward compatibility between different versions of the Helm client and Tiller (https://github.com/kubernetes/helm/pull/3263/files#diff-e337e5a949100848dad15963f6fc8b02R64). I actually started experimenting with getting info from Tiller the same way, and then received a link to that issue from people who had already worked on the problem.

Here is my team's proposal for that problem:

  • First, make API version negotiation between the Helm client and Tiller work in a backward-compatible manner. Somehow.
  • Then add new extended InstallReleaseV2 and UpgradeReleaseV2 gRPC methods with response streaming.
  • Make the new Helm client use InstallReleaseV2/UpgradeReleaseV2 instead of InstallRelease/UpgradeRelease when available.

@distorhead
Contributor Author

Another alternative could be an api-version parameter for the Helm CLI that enables an API with a certain feature set. In that case the helm/watch annotation would simply be ignored until helm install/upgrade is called with --api-version SomeVersionId.

@distorhead
Contributor Author

I implemented the complex part related to job/pod/container watching. The result is the JobWatchMonitor and PodWatchMonitor primitives.

On the client side Helm prints logs and errors the same way tail -f does on multiple files. If a hook job spawns multiple pods, or multiple containers within pods, the Helm client switches the log header and prints each information chunk for its (job, pod, container) in realtime. This is acceptable logging behavior for a hook job, because a hook job typically does something sequential. But a hook job doing things in parallel is supported too; the output is then given as-is.

The overall job-watch procedure is split out of WatchUntilReady into WatchJobsUntilReady (https://github.com/kubernetes/helm/pull/3479/files#diff-73ee28a4d39ab9ab84ba5a6f9dee3867R365), because the JobWatchMonitor primitive makes the code more specific and focused on the end goal of sequentially watching hook jobs. This primitive should also watch job-related events by itself.

Passing logs and events from kube-client to the release server is implemented through the WatchFeed interface. It is actually just a simple set of callbacks, because there is no real need for Go channels at this level.

For now this feature works only for the install request. Adding streaming to upgrade and rollback is mostly "copy-paste", so that is not the main problem. The helm/watch=true annotation is not implemented yet either, but that is not a problem too. There is also a need for more debugging of the cleanup of all spawned monitor-primitive goroutines (WatchContainerLogs can currently hang when a container remains in the Waiting state).

The main problem for now is API compatibility, and I need some feedback to integrate this work properly. What do you think about this feature, generally speaking? It is about 80% done and needs some adjustments; maybe something was done wrong.

What about extending the install/upgrade/rollback gRPC procedures into server-side streamed versions under different names, with the Helm client using the extended versions only when they are available on Tiller? What is wrong with that idea?

I also think some parts of the log streaming could be reused for PR #2342.

@distorhead
Contributor Author

@bacongobbler Hey! You set the proposal tag on this; could you answer me? Half a month has passed with no reaction.

If this work goes totally against your plans, could you say so directly? And are there any plans at all?

@lawrencejones

Really keen on seeing this; it's pretty essential for us to have this debug output.

@unicell

unicell commented Apr 27, 2018

@distorhead I just cherry-picked your patches onto the latest 2.9 release and added pre/post upgrade hook support [1]. It works like a charm! Really looking forward to seeing the feature incorporated into master.

[1] https://github.com/unicell/helm/tree/upgrade-hook-log-streaming

@distorhead
Contributor Author

Cool, thanks for trying it out!

Actually, job watching is only the first step. The next step could be support for watching all other resources besides Job. This watch code should also be written so that it can be used not only for "Watch hooks logs", but also for "Check release state and show status during checking".

But as a first step, the current implementation can already be useful for watching job hooks only. The main problems to solve for this step:

  • more debugging of cleanup for all spawned monitor-primitive goroutines (WatchContainerLogs can currently hang when a container remains in the Waiting state);
  • watch only when the helm/watch=true annotation is set; this experimental code should be optional and non-default;
  • make this change API-compatible, perhaps by using extended install/upgrade/delete gRPC requests with server-side streaming (?).

My team is here and we are ready to solve all of that. For now, work on this is frozen until we receive feedback, because someone responsible among the Helm maintainers should agree, disagree, or make corrections to this direction of work.

@billylindeman

billylindeman commented Aug 3, 2018

Any progress on this? I'd love to see it make it into a public release.

@bacongobbler
Member

bacongobbler commented Aug 14, 2018

Apologies for the delay in responding. The core team (including myself) has been focusing solely on getting an alpha release of Helm 3 out the door, so it has been much harder to dedicate resources to peer review and design work for significant contributions to Helm 2 at this time. We highly suggest bringing proposals/discussions to the weekly public dev calls if you need feedback: https://github.com/helm/community/blob/master/communication.md

First, make API version negotiation between the Helm client and Tiller work in a backward-compatible manner. Somehow.

If this would break existing gRPC endpoints, what about holding off until we have an alpha release of Helm 3 available and rebasing this work against Helm 3? That way we won't have to write an API negotiation layer between Helm and Tiller... especially since Tiller is being removed.

@distorhead
Contributor Author

distorhead commented Nov 28, 2018

Hi, People!

The code from PR #3479 related to this issue has been ported to a separate project: https://github.com/flant/kubedog.

We need this functionality, so we had to create a separate project and start using it. The code has been further debugged, tested, and improved a lot (for example, it now uses Kubernetes informers, a reliable primitive from the Kubernetes client library, instead of the raw watch API).

The kubedog project is mainly a library that provides an API to enable resource tracking in an application; there is also a CLI interface to these tracking functions. The library can track Jobs and Deployments, and there are plans to support DaemonSets, StatefulSets, and other resource kinds.

So if someone needs these functions right now, try it out and give us some feedback ;)

As for the future of this issue: when Helm 3 is released, we can try to reintegrate these functions.

@distorhead
Contributor Author

Happy to announce that the kubedog library has improved a lot since the last update. Some new features are available out of the box right now (a usage sketch follows the list):

  • Track multiple resources at the same time.
  • Give the user realtime logs and other service messages about all resources being deployed.
  • Give periodic "status reports" about the overall rollout picture, answering the question "what are we waiting for?".
  • The output is designed as a simple text stream, so it can be used from any CI/CD or other tooling.
  • There are so-called termination modes, configurable per resource. By default kubedog waits until the resource is ready; optionally it can be configured to track a resource only while other resources are not yet ready (non-blocking mode).
  • There are different failure detection modes. By default each resource has an allowed failures count, and once it is exceeded the tracker fails the whole deploy. There is also a so-called "hope" failure mode in which resource errors are ignored until all other resources become ready.
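
As a rough usage sketch of the CLI (the multitrack JSON field names ResourceName and Namespace follow the kubedog README's multitrack spec as best I recall; the resource names are made up, so check the project docs for the exact format):

echo '{
  "Deployments": [
    {"ResourceName": "backend", "Namespace": "production"},
    {"ResourceName": "frontend", "Namespace": "production"}
  ],
  "Jobs": [
    {"ResourceName": "db-migrate", "Namespace": "production"}
  ]
}' | kubedog multitrack

kubedog then follows all the listed resources, streams their logs and status reports, and (as in the rollback pipeline shown in a later comment) exits non-zero when tracking fails.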

@StupidScience

@distorhead @bacongobbler any chance of getting this feature into Helm 3?

@michael-bouvy

michael-bouvy commented Apr 14, 2020

Thanks @distorhead for the great work!

I'm currently using Helm 3 and have no intention of switching to werf (too many changes): what would be the best option to get kubedog's output (dropping the --wait flag from Helm) while still leveraging --cleanup-on-fail when kubedog times out?

Edit

Tried the following and it works as expected:

helm upgrade --install my-app chart/
cat kubedog-config.json | kubedog multitrack || helm rollback my-app

Edit 2

The problem now is how to guess the resource names to track with kubedog; it may require an additional kubectl -n <ns> get deployments -l app.kubernetes.io/instance=my-app to build the JSON.

I eventually got something usable with the following (dirty) command, which uses yq:

helm -n <ns> status my-app -o json | \
  yq r --prettyPrint - 'manifest' | \
  yq r -d'*' -c - | \
  yq r - "(kind==Deployment).metadata.name"

This prints one deployment name per line.
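
On a newer yq (v4, mikefarah syntax) the same extraction can likely be collapsed into a single expression over helm get manifest; treat this as an untested sketch rather than a recipe:

helm -n <ns> get manifest my-app | \
  yq eval 'select(.kind == "Deployment") | .metadata.name' -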

@distorhead
Contributor Author

Hi!

I think it may become possible to use kubedog with Helm 3 after some small refactoring in the Helm mainline code: move all resource-watching code into a "backend" object with a default implementation, namely the current helm --wait implementation (see for example https://github.com/flant/helm/pull/37/files#diff-d18c4d83690679deed0c84f55bf476bdR45).

Then we could add a "kubedog" implementation, isolated from the rest of the Helm code, and the user could switch to it, to some other implementation, or keep the default one, as needed.

@abdennour

Hello @distorhead, please elaborate more on kubedog in this context.

@bridgetkromhout
Member

Hi, @distorhead - Thank you for your ideas to improve Helm. We now have a defined process for this: if you'd like, you can submit a Helm Improvement Proposal and work with the community to make these ideas a reality: https://github.com/helm/community/blob/master/hips/hip-0001.md - meanwhile, I will close this issue. Thanks!

@rafis

rafis commented Mar 24, 2021

Any update?
#2298

17:56:45  client.go:258: [debug] Starting delete for "db-migration" Job
17:56:45  client.go:108: [debug] creating 1 resource(s)
17:56:45  client.go:467: [debug] Watching for changes to Job db-migration with timeout of 5m0s
17:56:45  client.go:495: [debug] Add/Modify event for db-migration: ADDED
17:56:45  client.go:534: [debug] db-migration: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
17:56:45  client.go:495: [debug] Add/Modify event for db-migration: MODIFIED
17:56:45  client.go:534: [debug] db-migration: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
17:57:12  client.go:495: [debug] Add/Modify event for db-migration: MODIFIED
17:57:12  client.go:173: [debug] checking 28 resources for changes

It would be nice if the Helm client could stream the db-migration job logs back to its own stdout.
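
Until Helm can do this itself, a hedged workaround is to follow the hook pods via the Job controller's job-name label in a second terminal while the upgrade runs (the names and namespace below are placeholders for this example):

# terminal 1
helm upgrade --install my-app ./chart -n my-ns --debug

# terminal 2: wait for the hook pod to start, then follow its logs
kubectl -n my-ns wait --for=condition=Ready pod -l job-name=db-migration --timeout=300s
kubectl -n my-ns logs -f -l job-name=db-migration

If the job finishes before the pod ever reports Ready, plain kubectl logs (without -f) still works afterwards, as long as the hook-delete-policy keeps the Job around.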

@nuved

nuved commented Jul 8, 2021

Please add this feature at least for pre-install and pre-upgrade hooks.

To show a simple log we have to use other tools and tricks alongside Helm, which is not great.

Also, using kubectl directly from Helm is not possible, so there is no simple way to show logs even in Helm's pre-delete hooks.

If someone uses this policy, the Job or Pod is deleted by Helm and its logs are lost forever:
"helm.sh/hook-delete-policy": hook-succeeded,hook-failed

@felipecrs

I think that unless this is submitted as a HIP, it's not going to get any attention.

@felipecrs

Just for reference, I created a script called helm-upgrade-logs in my dotfiles which does the job for me:

WindowsTerminal_QivAJWLmZL.mp4 (attached screen recording)
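
The script itself lives in the dotfiles linked above; as a rough idea of the approach (a sketch under assumptions, not the actual script), such a wrapper can run the upgrade in the background and keep re-attaching to the logs of the release's pods via the standard app.kubernetes.io/instance label:

#!/usr/bin/env bash
# Sketch of a "helm upgrade with logs" wrapper; re-attaching may repeat earlier log lines.
set -uo pipefail
release="$1"; shift

helm upgrade --install "${release}" "$@" &
helm_pid=$!

# While the upgrade is running, repeatedly try to follow the release's pod logs.
while kill -0 "${helm_pid}" 2>/dev/null; do
  timeout 30 kubectl logs -f -l "app.kubernetes.io/instance=${release}" \
    --all-containers --prefix --max-log-requests=20 2>/dev/null || true
  sleep 2
done

wait "${helm_pid}"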

@SamuelGarratt

I also needed something that could show the logs of pods associated with a release when testing a Helm chart via kind.
I made a simple Ruby gem at https://github.com/SamuelGarrattIqa/helm_upgrade_logs.

Basically you can install it with

gem install helm_upgrade_logs

and then install a chart with the syntax

helm_upgrade_logs --install nginx bitnami/nginx --wait --debug --set service.type=ClusterIP

You can see it running in a GitHub Actions pipeline at https://github.com/SamuelGarrattIqa/helm_upgrade_logs/runs/6049884913

@flaneur2020

any update?

@yuchao4x

any update?

@felipecrs

I don't think that's ever gonna happen. If someone needs this, the best solution I found was to use Helmwave.

By the way, asking for updates does nothing; this issue is closed.

@felipecrs

Some people mentioned werf here, but just to provide a quick start:

# install the werf binary. it comes with helm
wget https://tuf.werf.io/targets/releases/1.2.288/linux-amd64/bin/werf

chmod +x werf

./werf version

./werf helm version

./werf helm upgrade --install --wait jenkins https://github.com/jenkinsci/helm-charts/releases/download/jenkins-5.0.12/jenkins-5.0.12.tgz

The outcome is truly amazing:

WindowsTerminal_WffbuDaqqs.mp4 (attached screen recording)
