This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

metrics-server no longer working since upgrade #195

Closed
limed opened this issue Feb 6, 2019 · 5 comments


limed commented Feb 6, 2019

Since upgrading the cluster to v1.11.7, the metrics-server deployment is no longer working. The problem seems to revolve around the apiserver, and I'm seeing errors like this:

no response from https://100.65.46.196:443: Get https://100.65.46.196:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

We need to figure out what is going on here.
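
The error above is a client-side timeout while opening a connection to the metrics-server Service IP. As a first check, here is a minimal sketch, assuming Python is available on a host inside the cluster network, that tests whether a TCP connection to an address completes within a deadline. The function name can_connect and the 5-second default are illustrative choices, not anything taken from the cluster configuration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError (which covers timeouts and
        # refused connections) when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect("100.65.46.196", 443) from a master node would show whether the Service IP in the error message is reachable at the TCP level from where the apiserver runs. A True result only proves the connection opens; it says nothing about TLS or the HTTP response.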

limed added the infra label Feb 7, 2019

limed commented Feb 7, 2019

After some debugging, I found that metrics-server is able to scrape kubelet information from all the nodes. However, the kubectl top node and kubectl top pod commands fail because of the APIService that metrics-server creates.

I have narrowed this down to the metrics-server APIService. I looked the issue up and found similar problems that others were having in kubernetes-sigs/metrics-server#45, but none of the fixes there seem to work.

 kubectl describe apiservice v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"...
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2019-02-06T20:12:41Z
  Resource Version:    40554336
  Self Link:           /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  UID:                 8ea03d29-2a4b-11e9-a8a6-06e3f5cda842
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-system
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2019-02-06T20:12:41Z
    Message:               no response from https://100.65.46.196:443: Get https://100.65.46.196:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>
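
The part of this output that matters is the Available condition under Status. Below is a rough sketch of pulling that condition out programmatically, assuming the input is the JSON form of the same object (for example from kubectl get apiservice v1beta1.metrics.k8s.io -o json). The helper name unavailable_conditions is hypothetical, and the sample is trimmed to the fields used here, with the message shortened.

```python
import json


def unavailable_conditions(apiservice: dict) -> list:
    """Return (reason, message) pairs for Available conditions whose status is not "True"."""
    conditions = apiservice.get("status", {}).get("conditions", [])
    return [
        (c.get("reason", ""), c.get("message", ""))
        for c in conditions
        if c.get("type") == "Available" and c.get("status") != "True"
    ]


# Trimmed sample mirroring the describe output in this issue (message shortened).
SAMPLE = json.loads("""
{
  "metadata": {"name": "v1beta1.metrics.k8s.io"},
  "status": {"conditions": [{
    "type": "Available",
    "status": "False",
    "reason": "FailedDiscoveryCheck",
    "message": "no response from https://100.65.46.196:443"
  }]}
}
""")

for reason, message in unavailable_conditions(SAMPLE):
    print(f"{reason}: {message}")
```

An empty result would mean the apiserver considers the aggregated API available, so a failure of kubectl top would have to come from somewhere else.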

The Service for this APIService exists and appears to be correct. I'm still investigating.


limed commented Feb 7, 2019

Another data point: this seems to be working on the Frankfurt cluster but not on the Oregon cluster. They are both running the same version.
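
Since one cluster works and the other does not, comparing the same object from both is a natural next step. This is a small sketch, assuming each object has been exported as JSON and loaded into a dict; flatten and diff_objects are hypothetical helper names, and nothing here reflects the actual contents of either cluster.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into {"a.b.c": value} pairs."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_objects(left: dict, right: dict) -> dict:
    """Return {path: (left_value, right_value)} for every path whose values differ."""
    a, b = flatten(left), flatten(right)
    return {
        path: (a.get(path), b.get(path))
        for path in sorted(set(a) | set(b))
        if a.get(path) != b.get(path)
    }
```

Fields that are expected to differ between clusters, such as UIDs, timestamps, and resource versions, would need to be ignored when reading the result.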

@jwhitlock
Contributor

This could be useful for some other tasks, like #180, but it is still a mystery why it is broken. It is currently disabled because it appeared to interfere with a backup process. We agreed that @limed should work on it for a couple more days, and then we'll assume something is busted on the Kubernetes side, and maybe the next update will get it working. (Please edit if my understanding is wrong).


limed commented Mar 20, 2019

I briefly looked at this again and still can't fix it. I believe it might be a networking issue. I would like to propose that we backburner this and fix it outside of the current sprint.

limed removed this from the Grace Jones (S3 Q1 2019) milestone Mar 20, 2019

limed commented Sep 28, 2019

After a couple of kops updates and image updates, I managed to get metrics-server working by merely updating the image.

limed closed this as completed Sep 28, 2019