Release "prometheus-operator" failed: rpc error: code = Canceled #6130

Open
rnkhouse opened this issue Jul 31, 2019 · 33 comments

@rnkhouse

commented Jul 31, 2019

Describe the bug
When I try to install the Prometheus Operator on AKS with

helm install stable/prometheus-operator --name prometheus-operator -f prometheus-operator-values.yaml

I get this error:

Release "prometheus-operator" failed: rpc error: code = Canceled

I checked with history:

helm history prometheus-operator -o yaml
- chart: prometheus-operator-6.3.0
  description: 'Release "prometheus-operator" failed: rpc error: code = Canceled desc
    = grpc: the client connection is closing'
  revision: 1
  status: FAILED
  updated: Tue Jul 30 12:36:52 2019

Chart
[stable/prometheus-operator]

Additional Info
I applied the following CRDs before deploying the chart:

kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/alertmanager.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/prometheus.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/prometheusrule.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/servicemonitor.crd.yaml

In the values file, createCustomResource is set to false.

Output of helm version:
Client: &version.Version{SemVer:"v2.14.3", GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.3", GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}

Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-06T08:13:03Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.7", GitCommit:"4683545293d792934a7a7e12f2cc47d20b2dd01b", GitTreeState:"clean", BuildDate:"2019-06-06T01:39:30Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

Cloud Provider/Platform (AKS, GKE, Minikube etc.):
AKS

@janvdvegt

commented Aug 9, 2019

We have the same issue on minikube, so it does not seem to be specific to AKS.

@robinelfrink

commented Aug 23, 2019

We have the same issue on kubespray-deployed clusters.

@dlevene1

commented Sep 2, 2019

I'm also seeing the issue on both Kubernetes 1.12.x and 1.13.x kubespray-deployed clusters in our automated pipeline, with a 100% failure rate. The previous version of prometheus-operator (0.30.1) works without issues.
The funny thing is that if I run the command manually instead of via the CD pipeline, it works, so I'm a little confused about what the cause could be.

@dlevene1

commented Sep 2, 2019

I saw there was an update to the prometheus-operator chart today. I bumped it to

NAME                            CHART VERSION   APP VERSION
stable/prometheus-operator      6.8.0           0.32.0     

and I'm no longer seeing the issue.

@hickeyma

Contributor

commented Sep 2, 2019

@rnkhouse Can you check with the latest chart version as mentioned by @dlevene1 in #6130 (comment)?

@dpnl87

commented Sep 2, 2019

I have this same issue with version 6.8.1 on AKS.

NAME                      	CHART VERSION	APP VERSION
stable/prometheus-operator	6.8.1        	0.32.0
❯ helm version 
Client: &version.Version{SemVer:"v2.14.3", GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.3", GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
❯ helm install -f prd.yaml --name prometheus --namespace monitoring stable/prometheus-operator
Error: release prometheus failed: grpc: the client connection is closing
>>> elapsed time 1m56s

@ccc13

commented Sep 4, 2019

We have the same issue on kubespray-deployed clusters.

Kubernetes version: v1.4.1
Helm version:

Client: &version.Version{SemVer:"v2.14.3", GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.0", GitCommit:"05811b84a3f93603dd6c2fcfe57944dfa7ab7fd0", GitTreeState:"clean"}

Prometheus-operator version:

NAME                            CHART VERSION   APP VERSION
stable/prometheus-operator      6.8.1           0.32.0

@will-beta

commented Sep 6, 2019

I have the same issue on AKS.

@bacongobbler

Member

commented Sep 6, 2019

Can anyone reproduce this issue in Helm 3, or does it propagate as a different error? My assumption is that with the removal of tiller this should no longer be an issue.

@will-beta

commented Sep 7, 2019

@bacongobbler This is still an issue in Helm 3.

bash$ helm install r-prometheus-operator stable/prometheus-operator --version 6.8.2 -f prometheus-operator/helm/prometheus-operator.yaml

manifest_sorter.go:179: info: skipping unknown hook: "crd-install"
Error: apiVersion "monitoring.coreos.com/v1" in prometheus-operator/templates/exporters/kube-controller-manager/servicemonitor.yaml is not available

@bacongobbler

Member

commented Sep 7, 2019

That seems to be a different issue than the issue raised by the OP, though.

description: 'Release "prometheus-operator" failed: rpc error: code = Canceled desc
= grpc: the client connection is closing'

Can you check whether you're using the latest beta release as well? That error was seemingly addressed in #6332, which was released in 3.0.0-beta.3. If not, can you open a new issue?

@will-beta

commented Sep 7, 2019

@bacongobbler I'm using the latest Helm v3.0.0-beta.3.

@k8s-class

commented Sep 8, 2019

I had to go back to --version 6.7.3 to get it to install properly.

@robinelfrink

commented Sep 9, 2019

Our workaround is to keep prometheus operator image on v0.31.1.

@pyadminn

commented Sep 10, 2019

helm.log
I also just encountered this issue on a Docker EE Kubernetes install.

After some fiddling with install options (--debug and such), I am now getting:

Error: release prom failed: context canceled

Edit: I may try updating my Helm version, currently v2.12.3.
Edit2: Updated to 2.14.3 and it is still failing with:
grpc: the client connection is closing
Edit3: Installed chart version 6.7.3 per the suggestions above to get things going again.
Edit4: Attached the Tiller log for a failed install as helm.log.

related: helm/charts#15977

@vsliouniaev

commented Sep 12, 2019

After doing some digging with @cyp3d it appears that the issue could be caused by a helm delete timeout that's too short for some clusters. I cannot reproduce the issue anywhere, so if someone who is experiencing this could validate a potential fix in the linked pull request branch I would much appreciate it!

helm/charts#17090

@xvzf

commented Sep 13, 2019

Same here on several clusters created with kops on AWS.
No issues when running on k3s, though.

@vsliouniaev

commented Sep 13, 2019

@xvzf

Could you try the potential fix in this PR? helm/charts#17090

@pyadminn

commented Sep 13, 2019

I gave the PR a run-through and still get the same error: Error: release prom failed: context canceled
tiller.log

@xvzf

commented Sep 13, 2019

@vsliouniaev Nope, does not fix the issue here

@vsliouniaev

commented Sep 14, 2019

Thanks for checking @xvzf and @pyadminn. I have made another change in the same PR. Could you see if this helps?

@pyadminn

commented Sep 16, 2019

Just checked the updated PR and I'm still seeing the following on our infra:

Error: release prom failed: rpc error: code = Canceled desc = grpc: the client connection is closing

FYI, we are on Kubernetes 1.14.3 and Helm v2.14.3.

@lethalwire

commented Sep 20, 2019

I was able to get around this issue by following the 'Helm fails to create CRDs' section in readme.md. I'm not sure how they're related, but it worked.

Step 1: Manually create the CRDs

kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/alertmanager.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/prometheus.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/prometheusrule.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/servicemonitor.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd/podmonitor.crd.yaml

Step 2:
Wait for CRDs to be created, which should only take a few seconds

Step 3:
Install the chart, but disable the CRD provisioning by setting prometheusOperator.createCustomResource=false

$ helm install --name my-release stable/prometheus-operator --set prometheusOperator.createCustomResource=false
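
The three steps above can be sketched as one script. This is only a rough outline, not the chart's documented procedure: the `DRY_RUN` switch and the `run` and `install_with_manual_crds` helpers are hypothetical names, and the CRD resource names passed to `kubectl wait` are assumed to be the ones those manifests register. `kubectl wait` also needs a kubectl client of 1.11 or newer, so it would not work with a v1.10.x client.

```shell
#!/bin/sh
# Sketch of the workaround: create CRDs, wait for them, then install the chart.
# Prints the commands by default; set DRY_RUN=0 to run them for real.
DRY_RUN="${DRY_RUN:-1}"
CRD_BASE="https://raw.githubusercontent.com/coreos/prometheus-operator/master/example/prometheus-operator-crd"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

install_with_manual_crds() {
  # Step 1: create the CRDs by hand.
  for crd in alertmanager prometheus prometheusrule servicemonitor podmonitor; do
    run kubectl apply -f "${CRD_BASE}/${crd}.crd.yaml" || return 1
  done

  # Step 2: block until the API server reports each CRD as Established
  # (resource names are an assumption about what the manifests register).
  for name in alertmanagers prometheuses prometheusrules servicemonitors podmonitors; do
    run kubectl wait --for=condition=established --timeout=60s \
      "crd/${name}.monitoring.coreos.com" || return 1
  done

  # Step 3: install the chart with CRD provisioning disabled (Helm 2 syntax).
  run helm install --name my-release stable/prometheus-operator \
    --set prometheusOperator.createCustomResource=false
}

install_with_manual_crds
```

Running it as-is only prints the eleven commands; `DRY_RUN=0 sh script.sh` would execute them against the current kubectl context.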

@xvzf

commented Sep 23, 2019

@vsliouniaev Still the same issue! The workaround from @lethalwire works, though.

@pyadminn

commented Sep 25, 2019

The workaround from @lethalwire resolved it for me as well.

@Typositoire

commented Oct 2, 2019

So, four days apart, the workaround worked and then stopped working. I had to use the CRD files from 0.32.0, not master.

@JBosom

commented Oct 3, 2019

I just experienced the same issue with the CRDs currently on master. Thanks @Typositoire for your suggestion to use the previous release. Adapting the CRD install to the following worked for me:

kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/alertmanager.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/prometheus.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/prometheusrule.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/servicemonitor.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/podmonitor.crd.yaml

That's why pinning the version is often good practice.
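
One way to keep the git ref in a single place is a small helper that builds the manifest URLs. This is only a sketch: `crd_url` and `OPERATOR_REF` are hypothetical names, and the ref would need to match whatever operator version the chart in use deploys.

```shell
#!/bin/sh
# Hypothetical helper: build the raw URL of one CRD manifest for a given git ref.
crd_url() {
  printf 'https://raw.githubusercontent.com/coreos/prometheus-operator/%s/example/prometheus-operator-crd/%s.crd.yaml\n' "$1" "$2"
}

# Pin the ref once instead of tracking master (assumed default: release-0.32).
OPERATOR_REF="${OPERATOR_REF:-release-0.32}"

# Print the apply commands; pipe the output to sh to actually run them.
for crd in alertmanager prometheus prometheusrule servicemonitor podmonitor; do
  echo "kubectl apply -f $(crd_url "$OPERATOR_REF" "$crd")"
done
```

Changing `OPERATOR_REF` then switches all five manifests together, which avoids mixing CRDs from different releases.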

@cu12

commented Oct 3, 2019

I also had this issue. Try disabling admissionWebhooks; it helped in my case.
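
As a rough illustration, the setting could live in a small values file rather than on the command line. The key path `prometheusOperator.admissionWebhooks.enabled` is the one used with `--set` elsewhere in this thread; the file name and the release name `my-release` are placeholders.

```shell
#!/bin/sh
# Write a minimal values file that turns the admission webhooks off.
# VALUES_FILE is a hypothetical file name.
VALUES_FILE="${VALUES_FILE:-webhook-off-values.yaml}"

cat > "$VALUES_FILE" <<'EOF'
prometheusOperator:
  admissionWebhooks:
    enabled: false
EOF

# Print the install command that would consume the file (Helm 2 syntax).
echo "helm install --name my-release stable/prometheus-operator -f $VALUES_FILE"
```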

@FreezB

commented Oct 3, 2019

Install prometheus-operator chart 6.0.0 and then do a helm upgrade --force --version 6.11.0. This seems to work on Rancher Kubernetes 1.13.10 and Helm v2.14.3.
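
Spelled out with a release name and chart, the two steps might look like the sketch below. The release name, the `DRY_RUN` switch, and the `run` and `two_step_install` helpers are hypothetical; only the versions and the `--force` flag come from the description above.

```shell
#!/bin/sh
# Sketch: install an older chart version, then force-upgrade to a newer one.
# Prints the commands by default; set DRY_RUN=0 to run them for real.
DRY_RUN="${DRY_RUN:-1}"
RELEASE="${RELEASE:-prometheus-operator}"   # hypothetical release name
CHART="stable/prometheus-operator"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

two_step_install() {
  # First install the older chart version (Helm 2 syntax).
  run helm install --name "$RELEASE" "$CHART" --version 6.0.0 || return 1
  # Then force an upgrade of the same release to the newer chart version.
  run helm upgrade --force "$RELEASE" "$CHART" --version 6.11.0
}

two_step_install
```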

@alex-hempel

commented Oct 10, 2019

The workaround suggested by @Typositoire worked fine for me on a kops-generated 1.13.10 cluster.

@iMacX

commented Oct 15, 2019

Same issue here trying to install on Azure AKS with Kubernetes 1.13.10 and Helm v2.14.3 with prometheus-operator-6.18.0. Any suggestions?

The CRDs were installed manually.

This command failed:
helm install --name prometheus-operator stable/prometheus-operator --namespace=monitoring --set prometheusOperator.createCustomResource=false

It gives this error:

Error: release prometheus-operator failed: rpc error: code = Canceled desc = grpc: the client connection is closing

EDIT: Installing version 6.11.0 (as well as 6.7.3) of the chart works:

helm install --name prometheus-operator stable/prometheus-operator --namespace=monitoring --set prometheusOperator.createCustomResource=false --version 6.11.0

@waynekhan

commented Oct 16, 2019

@poochwashere

commented Oct 16, 2019

I was fighting the same issue. I had to manually install the CRDs specified by @JBosom and install with the admission webhook disabled.

kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/alertmanager.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/prometheus.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/prometheusrule.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/servicemonitor.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/podmonitor.crd.yaml

helm --tls --tiller-namespace=tiller install --namespace=monitoring --name prom-mfcloud stable/prometheus-operator --set prometheusOperator.createCustomResource=false --set prometheusOperator.admissionWebhooks.enabled=false --values values.yaml --version 6.18.0
