New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repair 'helm init --wait': context deadline exceeded #3506

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
7 participants
@alexey-igrychev
Copy link
Contributor

alexey-igrychev commented Feb 14, 2018

Wait now works that way:

  • Connect to kubernetes api and wait till tiller pod appeared:
    • Poll every half a second for tiller pod name.
    • Timeout is 5 minutes by default (--wait-timeout).
  • Create connection over grpc and execute PingTiller.

Also:

  • setupConnection function is not taking arguments now to be used for helm init --wait.
  • Hook for cmd renamed to setupConnectionCobraPreRunHook.
  • PingTiller grpc method is now generated from tiller.proto.
  • github.com/golang/protobuf in glide has been updated to the commit where generated comment DO NOT EDIT is in the right place.

Accepted PR #3238 problems:

  • PingTiller method adds directly into generated tiller.pb.go file (with DO NOT EDIT comment). Thus after running make protoc PingTiller disappears from tiller.pb.go file leading to make build errors.
  • Also this code is not working because client object doesn't have information about TillerHost so pingTiller uses default host and port and always returns error Error: could not ping Tiller: context deadline exceeded.

fixes #3379

@k8s-ci-robot

This comment has been minimized.

Copy link

k8s-ci-robot commented Feb 14, 2018

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@alexey-igrychev alexey-igrychev force-pushed the alexey-igrychev:fix_ping_service branch from 6bd48fb to b3db3fe Feb 14, 2018

@alexey-igrychev alexey-igrychev changed the title WIP: Fix pingTiller service WIP: Generate PingTiller rpc using protobuf Feb 14, 2018

@alexey-igrychev alexey-igrychev force-pushed the alexey-igrychev:fix_ping_service branch 5 times, most recently from 6dca812 to 63e9453 Feb 15, 2018

@alexey-igrychev alexey-igrychev changed the title WIP: Generate PingTiller rpc using protobuf Repair 'helm init --wait': context deadline exceeded Feb 16, 2018

@alexey-igrychev alexey-igrychev force-pushed the alexey-igrychev:fix_ping_service branch 6 times, most recently from c2adc13 to 87d03e7 Feb 16, 2018

@bacongobbler bacongobbler added this to the 2.8.2 - Bugfix milestone Feb 27, 2018

c, err := h.connect(ctx)
if err != nil {
return err
}
defer c.Close()

rlc := rls.NewReleaseServiceClient(c)
return rlc.PingTiller(ctx)
_, err = rlc.PingTiller(ctx, &rls.PingTillerRequest{})

This comment has been minimized.

@bacongobbler

bacongobbler Feb 27, 2018

Member

Hey @TheOriginalAlex, would you mind taking a look at this? Was there a way in proto code that we could actually send a request over gRPC without the need to send a request body? That would retain backwards compatibility, but I'm not too concerned given that this is a relatively new feature to Helm.

Thanks!

This comment has been minimized.

@TheOriginalAlex

TheOriginalAlex Mar 6, 2018

Contributor

I was just sending a "Ping" string earlier, sending this empty struct does the same job, I think.
https://github.com/kubernetes/helm/blob/master/pkg/proto/hapi/services/tiller.pb.go#L1084

And the server currently just returns "Pong" no matter what the input is, so I don't think this breaks backwards compatibility.
https://github.com/kubernetes/helm/blob/master/pkg/proto/hapi/services/tiller.pb.go#L1314

@michelleN
Copy link
Member

michelleN left a comment

I tried to test this manually but I got an Image Pull Err from Kubernetes when trying to install Tiller on the cluster.

  24s		8s		2	kubelet, minikube	spec.containers{tiller}	Warning		Failed			Failed to pull image "gcr.io/kubernetes-helm/tiller:v2.8": rpc error: code = Unknown desc = Error response from daemon: manifest for gcr.io/kubernetes-helm/tiller:v2.8 not found

I don't understand why it's trying to pull 2.8 vs 2.8.0 which exists. cc/ @bacongobbler

@@ -166,7 +166,28 @@ func markDeprecated(cmd *cobra.Command, notice string) *cobra.Command {
return cmd
}

func setupConnection(c *cobra.Command, args []string) error {
func waitTillerPodAndSetupConnection(waitTimeout int) error {
_, client, err := getKubeClient(settings.KubeContext)

This comment has been minimized.

@michelleN

michelleN Feb 28, 2018

Member

I think my only nit is that we're grabbing a kube client here and again during setupConnection() but I can understand how this is unavoidable given the way things are. Let me know if you have any ideas for improving that in the future given you're now familiar with the code.

This comment has been minimized.

@alexey-igrychev

alexey-igrychev Feb 28, 2018

Contributor

@michelleN I propose the following solution, c4a01b7.

@bacongobbler

This comment has been minimized.

Copy link
Member

bacongobbler commented Feb 28, 2018

@michelleN I've seen that issue happen before and that seems to happen with custom client builds IIRC. You need to set the tiller image manually via helm init --tiller-image for it to work on dev builds :(

@bacongobbler
Copy link
Member

bacongobbler left a comment

the connection handling refactor looks a LOT better. Thanks! I'll just give this a whirl before I give a final LGTM. :)

@bacongobbler
Copy link
Member

bacongobbler left a comment

Works as intended! Thanks!

><> helm init --wait --tiller-image quay.io/bacongobbler/tiller:canary
$HELM_HOME has been configured at /home/bacongobbler/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Happy Helming!

I observed the tiller pod come up in another window while waiting for Helm to receive the response.

@helgi

This comment has been minimized.

Copy link
Contributor

helgi commented Mar 6, 2018

Needs a rebase

Repair 'helm init --wait': context deadline exceeded
Wait now works that way:
* Connect to kubernetes api and wait till tiller pod appeared:
    * Poll every 0.5s for tiller pod name.
    * Timeout is 5 minutes.
* Create connection over grpc and execute PingTiller.

Also:
* `setupConnection` function is not taking arguments now to be used for `helm init --wait`.
* Hook for cmd renamed to `setupConnectionCobraPreRunHook`.
* PingTiller grpc method is now generated from `tiller.proto`.
* `github.com/golang/protobuf` in glide has been updated to the commit where generated comment `DO NOT EDIT` is in the right place.

Accepted PR #3238 problems:
* PingTiller method adds directly into generated tiller.pb.go file (with `DO NOT EDIT` comment). Thus after running `make protoc` PingTiller disappears from `tiller.pb.go` file leading to `make build` errors.
* Also this code is not working because client object doesn't have information about TillerHost so pingTiller uses default host and port and always returns error `Error: could not ping Tiller: context deadline exceeded`.

fixes #3379

@alexey-igrychev alexey-igrychev force-pushed the alexey-igrychev:fix_ping_service branch from c4a01b7 to b3e05f9 Mar 6, 2018

@alexey-igrychev

This comment has been minimized.

Copy link
Contributor

alexey-igrychev commented Mar 6, 2018

@helgi It's done

@fibonacci1729

This comment has been minimized.

Copy link
Member

fibonacci1729 commented Mar 6, 2018

As an alternative to generating the PingTiller gRPCs, it might be worth leveraging the health checking support builtin into gRPC. (see: https://github.com/grpc/grpc/blob/master/doc/health-checking.md)

Example:

// In k8s.io/helm/cmd/tiller/health.go
package main

import (
    healthpb "google.golang.org/grpc/health/grpc_health_v1"
    "google.golang.org/grpc/health"
)

func WithHealth(setup func() error) {
      s := health.NewServer()
      s.SetServingStatus("Tiller", healthpb.NOT_SERVING)
     if err := setup(); err != nil {
          // handler err....
     }
     s.SetServingStatus("Tiller", healthpb.SERVING)
}
@helgi

This comment has been minimized.

Copy link
Contributor

helgi commented Mar 6, 2018

@fibonacci1729 does that approach impact any of the functionalities in this PR?

@bacongobbler

This comment has been minimized.

Copy link
Member

bacongobbler commented Mar 6, 2018

@helgi from our own testing, no. The end result is the same. It's a much cleaner interface overall. We're hacking on the PR here: https://github.com/bacongobbler/helm/tree/init-wait-fix

@helgi

This comment has been minimized.

Copy link
Contributor

helgi commented Mar 6, 2018

@bacongobbler much cleaner, yeah. Nice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment