unable to fetch metrics from node c2: request failed - "403 Forbidden" #990

Closed
stevenlii opened this issue Mar 24, 2022 · 33 comments
Labels
kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@stevenlii

stevenlii commented Mar 24, 2022

What happened?
I ran this:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
with these args:

      - args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
            - --kubelet-use-node-status-port
            - --metric-resolution=15s
            - --kubelet-insecure-tls
       

The log shows:
unable to fetch metrics from node c2: request failed - "403 Forbidden"

What did you expect to happen?
How can I resolve it?

Anything else we need to know?
kubectl describe apiservice v1beta1.metrics.k8s.io

Status:
  Conditions:
    Last Transition Time:  2022-03-24T09:26:42Z
    Message:               all checks passed
    Reason:                Passed
    Status:                True
    Type:                  Available
Events:                    <none>

I've been stuck on this problem for a day.

kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.8", GitCommit:"4", GitTreeState:"clean", BuildDate:"2021", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.8", GitCommit:"4", GitTreeState:"clean", BuildDate:"2021", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}

/kind support

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Mar 24, 2022
@stevenlii
Author

stevenlii commented Mar 24, 2022

Adding a command doesn't work either:

        - args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
            - --kubelet-use-node-status-port
            - --metric-resolution=15s
            - --kubelet-insecure-tls
          command:
            - /metrics-server
            - --metric-resolution=15s
            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP

@yangjunmyfm192085
Contributor

yangjunmyfm192085 commented Mar 24, 2022

Sorry, could you share more of the metrics-server logs? I want to see where the fault is.

@stevenlii
Author

stevenlii commented Mar 24, 2022

Sorry, could you share more of the metrics-server logs? I want to see where the fault is.

Thanks for your reply. Actually, the logs all look the same.
Here are more logs:

unable to fully scrape metrics: [unable to fully scrape metrics from node c3: unable to fetch metrics from node c3: request failed - "403 Forbidden"., unable to fully scrape metrics from node d6: unable to fetch metrics from node d6: request failed - "403 Forbidden"., unable to fully scrape metrics from node c2: unable to fetch metrics from node c2: request failed - "403 Forbidden"., unable to fully scrape metrics from node d4: unable to fetch metrics from node d4: request failed - "403 Forbidden"., unable to fully scrape metrics from node b3: unable to fetch metrics from node b3: request failed - "403 Forbidden"., unable to fully scrape metrics from node c1: unable to fetch metrics from node c1: request failed - "403 Forbidden"., unable to fully scrape metrics from node d5: unable to fetch metrics from node d5: request failed - "403 Forbidden".]
E0324 11:37:46.780321       1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node c1: unable to fetch metrics from node c1: request failed - "403 Forbidden"., unable to fully scrape metrics from node d5: unable to fetch metrics from node d5: request failed - "403 Forbidden"., unable to fully scrape metrics from node d4: unable to fetch metrics from node d4: request failed - "403 Forbidden"., unable to fully scrape metrics from node c2: unable to fetch metrics from node c2: request failed - "403 Forbidden".,
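
A quick way to see whether the 403 affects every node or only some of them is to pull the node names out of the scrape errors. The following is a minimal sketch, not part of metrics-server: `failed_nodes` is a hypothetical helper name, and it assumes the log lines keep the "unable to fetch metrics from node <name>:" wording shown above.

```shell
# failed_nodes: read metrics-server log text on stdin and print, once each,
# every node name found in an "unable to fetch metrics from node" error.
# (Hypothetical helper; assumes the error wording quoted in this thread.)
failed_nodes() {
  grep -o 'unable to fetch metrics from node [^:]*' \
    | awk '{ print $NF }' \
    | sort -u
}
```

Assuming the default manifest, which creates a Deployment named metrics-server in kube-system, it could be fed with `kubectl -n kube-system logs deploy/metrics-server | failed_nodes`. If every node shows up, a cluster-wide authorization problem seems more likely than a fault on a single kubelet.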

@stevenlii
Author

stevenlii commented Mar 24, 2022

Sorry, could you share more of the metrics-server logs? I want to see where the fault is.

I also changed the service to NodePort, so I could access
https://myIp:nodePort/stats/summary
and it shows:
{ kind: "Status", apiVersion: "v1", metadata: { }, status: "Failure", message: "forbidden: User "system:anonymous" cannot get path "/stats/summary"", reason: "Forbidden", details: { }, code: 403 }
Is the cause this message: forbidden: User "system:anonymous"?
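
The "system:anonymous" user in that response comes from the request carrying no credentials, so it does not show what metrics-server itself is allowed to do. With Webhook authorization, the kubelet maps /stats/* to the nodes/stats subresource and /metrics/* to nodes/metrics, and asks the API server whether the caller may access it. Below is a sketch for checking that with `kubectl auth can-i`. The `can_scrape` helper name is hypothetical, and the kube-system and metrics-server defaults assume the official manifest. Impersonating with --as requires impersonation rights, for example cluster-admin.

```shell
# can_scrape: ask the API server whether a service account may "get" the
# given node subresource (for example "metrics" or "stats").
# Arguments: <subresource> [namespace] [serviceaccount]
# (Hypothetical helper; defaults assume the official metrics-server manifest.)
can_scrape() {
  subresource=$1
  namespace=${2:-kube-system}
  account=${3:-metrics-server}
  kubectl auth can-i get nodes \
    --subresource="$subresource" \
    --as="system:serviceaccount:${namespace}:${account}"
}
```

Running `can_scrape metrics` and `can_scrape stats` should each print yes or no for the corresponding subresource.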

@yangjunmyfm192085
Contributor

From the log, metrics-server has no permission to access the metrics. (Of course, you can't access it directly like that either.)
Could you go through the requirements below and see if that solves your problem?
https://github.com/kubernetes-sigs/metrics-server/blob/master/README.md#requirements

@stevenlii
Author

stevenlii commented Mar 24, 2022

From the log, metrics-server has no permission to access the metrics. (Of course, you can't access it directly like that either.) Could you go through the requirements below and see if that solves your problem? https://github.com/kubernetes-sigs/metrics-server/blob/master/README.md#requirements

I have verified the requirements:
The kube-apiserver must enable an aggregation layer.
My kube-apiserver configuration:

containers:
  - command:
    - kube-apiserver
    - --advertise-address=10.xx.xx.xx
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --insecure-port=0
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-issuer=https://kubernetes.default.svc.cluster.local
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.1.0.0/16
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key

Nodes must have Webhook authentication and authorization enabled.
My kubelet config:

apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.1.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s

Kubelet certificate needs to be signed by cluster Certificate Authority (or disable certificate validation by passing --kubelet-insecure-tls to Metrics Server)
I have --kubelet-insecure-tls in components.yaml.
Container runtime must implement container metrics RPCs
I have not checked this because it seems unrelated to this error.
Network should support
The network should be okay because I can access it through the NodePort, although it returns { kind: "Status", apiVersion: "v1", metadata: { }, status: "Failure", message: "forbidden: User "system:anonymous" cannot get path "/stats/summary"", reason: "Forbidden", details: { }, code: 403 }

@yangjunmyfm192085
Contributor

yangjunmyfm192085 commented Mar 24, 2022

OK, thanks for the verification. What is your environment? EKS?
Alternatively, you can get the metrics with the following commands:

kubectl proxy
kubectl get --raw /api/v1/nodes/$NODE_NAME/proxy/metrics/resource

@yangjunmyfm192085
Contributor

yangjunmyfm192085 commented Mar 24, 2022

NODE_NAME=<Name of node in your cluster>
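
To repeat this check for every node at once, a sketch along these lines could be used. `probe_nodes` is a hypothetical helper name. Note that the request goes through the API server proxy with your own kubectl credentials, so a success here shows that the kubelet endpoint works, not that the metrics-server service account is authorized to read it.

```shell
# probe_nodes: for every node, request the kubelet resource-metrics endpoint
# through the API server proxy and report whether the request succeeded.
# (Hypothetical helper; uses the caller's own kubectl credentials.)
probe_nodes() {
  for node in $(kubectl get nodes -o name); do
    # "kubectl get nodes -o name" prints "node/<name>"; strip the prefix.
    name=${node#node/}
    if kubectl get --raw "/api/v1/nodes/${name}/proxy/metrics/resource" >/dev/null 2>&1; then
      echo "${name}: ok"
    else
      echo "${name}: failed"
    fi
  done
}
```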

@stevenlii
Author

OK, thanks for the verification. What is your environment? EKS? Alternatively, you can get the metrics with the following commands:

kubectl proxy
kubectl get --raw /api/v1/nodes/$NODE_NAME/proxy/metrics/resource

Thank you very much. I ran it and got output like this, so the metrics seem OK:

container_cpu_usage_seconds_total{container="csi-cephfsplugin",namespace="rook-ceph",pod="csi-cephfsplugin-jvh2m"} 190.716201251 1648172450949
container_cpu_usage_seconds_total{container="csi-rbdplugin",namespace="rook-ceph",pod="csi-rbdplugin-8wppl"} 193.595610781 1648172462061
container_cpu_usage_seconds_total{container="xxx",namespace="xxx",pod="xxx-vkkdx"} 1301.340016177 1648172464222
container_cpu_usage_seconds_total{container="xxx-server-cpp-demo-v1",namespace="xxx-service-test",pod="xxx-server-cpp-demo-v1-64cd4bd84f-wmfqz"} 31.737046687 1648172457744

@stevenlii
Author

stevenlii commented Mar 25, 2022

OK, thanks for the verification. What is your environment? EKS? Alternatively, you can get the metrics with the following commands:

kubectl proxy
kubectl get --raw /api/v1/nodes/$NODE_NAME/proxy/metrics/resource

My environment: a self-built cluster in our own data center, running on an OpenStack cloud platform with CentOS 7.
Issue #95 seems to describe the same problem. I followed the description from the last commenter there, yueyongyue, and changed the kubelet from --authorization-mode=Webhook to --authorization-mode=AlwaysAllow. The 403 then disappeared and everything works!
But I am still not clear about what is wrong in my configuration.
Of course, AlwaysAllow is not best practice. Is yueyongyue's second approach, - --kubeconfig=/key/kubeconfig, a good idea? What is the official recommendation?
Thanks~

@manfredlift

I have the same issue. The only thing that changed was that I upgraded from 0.4.1 to 0.6.1. When I revert to the old version, everything works.

I am passing the --kubelet-insecure-tls flag.

@rmendal

rmendal commented Jun 15, 2022

Ran into the 403 Forbidden issue in our EKS cluster while upgrading from 0.5.1 to 0.6.1, and noticed the breaking change in the release notes for 0.6.0 that changes a resource in the cluster role. I added nodes/stats back to the cluster role and things work for me now.

We have multiple clusters, most running 1.19 and one running 1.21, and this made it work in both versions.

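apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  # Re-added by hand; per the v0.6.0 release notes, nodes/stats is no longer required.
  - nodes/stats
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch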

@manfredlift

Thanks @rmendal, I had the same issue.

@yangjunmyfm192085
Contributor

Why does metrics-server v0.6.x still need nodes/stats in an EKS cluster?

@yangjunmyfm192085
Contributor

yangjunmyfm192085 commented Jun 16, 2022

I don't understand why.
@serathius @stevehipwell Do you have any suggestions?

@stevehipwell
Contributor

@rmendal how do you provision your EKS cluster? If you're using the EKS community module v18, have you made sure you've added the correct security groups? Or, if using the Helm chart, have you set containerPort: 10250?

@rmendal

rmendal commented Jun 16, 2022

@stevehipwell - All of our clusters are provisioned using Terragrunt and the Cloud Posse EKS module, release v0.42.1. I can confirm it creates the default security group as defined here, and we've not altered that at this time.

As for the Metrics Server install, I'm using the default v0.6.1 manifest linked on the release page in this repo. The only change I've made is adding the aforementioned nodes/stats back to the cluster role.

@stevehipwell
Contributor

@rmendal this might be unrelated, but for Metrics Server to function correctly the control plane needs to be able to reach the node on port 4443, and the nodes need to be able to communicate with each other on port 10250. You can reduce this to just port 10250 by changing the container port and the --secure-port arg to 10250 in the manifest you're applying.

As an aside I'm not sure if the AWS document you linked is correct as I think the outbound 443 was for legacy Metrics Server.
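
A sketch of the port change described above, applied to a downloaded manifest before it is passed to kubectl. `use_port_10250` is a hypothetical helper. It assumes the manifest contains the literal strings --secure-port=4443 and containerPort: 4443, which should be checked against the file you are actually applying.

```shell
# use_port_10250: rewrite a metrics-server manifest read from stdin so that
# the server listens on 10250 instead of 4443, and print the result.
# (Hypothetical helper; assumes the manifest uses the literal values below.)
use_port_10250() {
  sed -e 's/--secure-port=4443/--secure-port=10250/' \
      -e 's/containerPort: 4443/containerPort: 10250/'
}
```

For example, `use_port_10250 < components.yaml | kubectl apply -f -`. If the Service targets the container port by number rather than by name, that value would need the same change.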

@rmendal

rmendal commented Jun 16, 2022

@stevehipwell so our security group for the cluster still has an inbound all/all rule, but I decided to add an explicit inbound rule for 10250, then removed nodes/stats from the cluster role and restarted the metrics server pod. Lo and behold, I still have metrics and no 403 Forbidden errors.

However, I then removed the nodes/stats line from the ClusterRole in another cluster, which DOESN'T have the explicit 10250 security group inbound rule, restarted the metrics server pod, and it's fine. No 403, metrics still being collected.

In conclusion, I have no idea what made this suddenly work without the security group rule and without the added line in the ClusterRole, but it does. If I encounter any other issues in the near term I'll reply here, but things suddenly seem to be working as intended.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 14, 2022
@ledroide

Adding the line nodes/stats solved the issue for me.

However, it had worked for months before I encountered the metrics-server failure. I don't understand why this line suddenly disappeared from the ClusterRole/system:metrics-server.

@mzaian

mzaian commented Sep 21, 2022

Adding the line nodes/stats solved the issue for me.

However, it had worked for months before I encountered the metrics-server failure. I don't understand why this line suddenly disappeared from the ClusterRole/system:metrics-server.

Because of the following:

Breaking changes
Metrics Server now requires access to nodes/metrics RBAC resource instead of nodes/stats. No changes needed if you use official manifests, however please update RBAC resources if you just use Metrics Server image with custom manifests.
From: https://github.com/kubernetes-sigs/metrics-server/releases/tag/v0.6.0
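
To see which of the two node subresources a cluster's ClusterRole currently grants, something like the following sketch may help. `role_node_subresources` is a hypothetical helper name, and the default role name system:metrics-server is taken from the manifest quoted earlier in this thread.

```shell
# role_node_subresources: print, once each, every "nodes/<subresource>"
# entry found in the given ClusterRole (default: system:metrics-server).
# (Hypothetical helper; it only greps the YAML, it does not evaluate RBAC.)
role_node_subresources() {
  kubectl get clusterrole "${1:-system:metrics-server}" -o yaml \
    | grep -o 'nodes/[a-z]*' \
    | sort -u
}
```

According to the release note quoted above, v0.6.x expects nodes/metrics to be in that list, while images older than v0.6.0 expect nodes/stats.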

@ledroide

@mzaian: We had to roll back from v0.6.1 to v0.5.2 because of the "Failed to scrape node" error reported in #1031. That is fixed for v0.6.2, which is not available yet.

@k8s-triage-robot

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 26, 2022
@k8s-triage-robot

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 25, 2022
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".


@ledroide

/reopen
The bug should be fixed in v0.6.2, but v0.6.2 is not yet released, so we cannot tell if it actually fixes the bug.

@k8s-ci-robot
Contributor

@ledroide: You can't reopen an issue/PR unless you authored it or you are a collaborator.


@zentavr

zentavr commented Jan 4, 2024

I still have this issue with v0.6.4 on my AWS EKS cluster (Fargate).
Metrics Server was installed with Helm.

@stevehipwell
Contributor

@zentavr for Fargate you need to be using a custom port and you'll need to make sure that you've got that port open in the security group(s).

@zentavr

zentavr commented Jan 4, 2024

I have 2 SGs: one is for the control plane and has no rules in it, and the other one is probably for the Fargate nodes.

The service for the metrics server listens on 443, while the deployment exposes tcp/10250.

Which group, which port, and which source address should I use for that rule?

@zentavr

zentavr commented Jan 4, 2024

@zentavr for Fargate you need to be using a custom port and you'll need to make sure that you've got that port open in the security group(s).

Have this right now:

$ kubectl get apiservices
NAME                                   SERVICE                              AVAILABLE                      AGE
v1.                                    Local                                True                           256d
v1.admissionregistration.k8s.io        Local                                True                           256d
v1.apiextensions.k8s.io                Local                                True                           256d
v1.apps                                Local                                True                           256d
v1.authentication.k8s.io               Local                                True                           256d
v1.authorization.k8s.io                Local                                True                           256d
v1.autoscaling                         Local                                True                           256d
v1.batch                               Local                                True                           256d
v1.certificates.k8s.io                 Local                                True                           256d
v1.coordination.k8s.io                 Local                                True                           256d
v1.discovery.k8s.io                    Local                                True                           256d
v1.events.k8s.io                       Local                                True                           256d
v1.networking.k8s.io                   Local                                True                           256d
v1.node.k8s.io                         Local                                True                           256d
v1.policy                              Local                                True                           256d
v1.rbac.authorization.k8s.io           Local                                True                           256d
v1.scheduling.k8s.io                   Local                                True                           256d
v1.storage.k8s.io                      Local                                True                           256d
v1alpha1.crd.k8s.amazonaws.com         Local                                True                           197d
v1alpha1.elbv2.k8s.aws                 Local                                True                           197d
v1alpha1.kubeapps.com                  Local                                True                           197d
v1alpha1.networking.k8s.aws            Local                                True                           116d
v1alpha1.vpcresources.k8s.aws          Local                                True                           143d
v1beta1.elbv2.k8s.aws                  Local                                True                           197d
v1beta1.metrics.k8s.io                 kube-metrics-server/metrics-server   False (FailedDiscoveryCheck)   19h
v1beta1.storage.k8s.io                 Local                                True                           256d
v1beta1.vpcresources.k8s.aws           Local                                True                           197d
v1beta2.flowcontrol.apiserver.k8s.io   Local                                True                           256d
v1beta3.flowcontrol.apiserver.k8s.io   Local                                True                           197d
v2.autoscaling                         Local                                True                           256d

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1
Error from server (ServiceUnavailable): the server is currently unable to handle the request

$ kubectl describe apiservice v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       app.kubernetes.io/instance=metrics-server
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=metrics-server
              app.kubernetes.io/version=0.6.4
              helm.sh/chart=metrics-server-3.11.0
Annotations:  meta.helm.sh/release-name: metrics-server
              meta.helm.sh/release-namespace: kube-metrics-server
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2024-01-04T00:58:32Z
  Managed Fields:
    API Version:  apiregistration.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:meta.helm.sh/release-name:
          f:meta.helm.sh/release-namespace:
        f:labels:
          .:
          f:app.kubernetes.io/instance:
          f:app.kubernetes.io/managed-by:
          f:app.kubernetes.io/name:
          f:app.kubernetes.io/version:
          f:helm.sh/chart:
      f:spec:
        f:group:
        f:groupPriorityMinimum:
        f:insecureSkipTLSVerify:
        f:service:
          .:
          f:name:
          f:namespace:
          f:port:
        f:version:
        f:versionPriority:
    Manager:      helm
    Operation:    Update
    Time:         2024-01-04T00:58:32Z
    API Version:  apiregistration.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          .:
          k:{"type":"Available"}:
            .:
            f:lastTransitionTime:
            f:message:
            f:reason:
            f:status:
            f:type:
    Manager:         kube-apiserver
    Operation:       Update
    Subresource:     status
    Time:            2024-01-04T02:20:39Z
  Resource Version:  125497215
  UID:               48f0249e-de24-4498-bf9c-b326520e6b47
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-metrics-server
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2024-01-04T00:58:32Z
    Message:               failing or missing response from https://172.31.182.189:10250/apis/metrics.k8s.io/v1beta1: bad status from https://172.31.182.189:10250/apis/metrics.k8s.io/v1beta1: 404
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

172.31.182.189 is the IP of the Fargate Node:

$ kubectl get nodes
NAME                                     STATUS   ROLES    AGE     VERSION
.....
fargate-ip-172-31-182-189.ec2.internal   Ready    <none>   18h     v1.26.10-eks-4f4795d
.....
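
The Available condition message is the most useful part of that output: the API server is probing https://172.31.182.189:10250 and getting a 404. It can be read on its own with a jsonpath query. A sketch, with `apiservice_message` as a hypothetical helper name:

```shell
# apiservice_message: print only the message of the "Available" condition
# of an APIService (default: v1beta1.metrics.k8s.io).
# (Hypothetical helper wrapping a plain kubectl jsonpath query.)
apiservice_message() {
  kubectl get apiservice "${1:-v1beta1.metrics.k8s.io}" \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].message}'
}
```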

@zentavr

zentavr commented Jan 4, 2024

Resolved on my EKS/Fargate cluster:

  • By default the Helm chart uses tcp/10250 as the containerPort. I changed that to tcp/8443.
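
With the Helm chart, that change could look like the sketch below. The release name metrics-server and the namespace kube-metrics-server come from the annotations in the describe output above. The chart reference metrics-server/metrics-server is an assumption about the local repo alias, and `set_container_port` is a hypothetical helper name.

```shell
# set_container_port: upgrade the existing metrics-server Helm release,
# keeping its current values and overriding only containerPort.
# (Hypothetical helper; the chart reference assumes a repo alias named
# "metrics-server" was added locally.)
set_container_port() {
  port=${1:-8443}
  helm upgrade metrics-server metrics-server/metrics-server \
    --namespace kube-metrics-server \
    --reuse-values \
    --set "containerPort=${port}"
}
```

A plausible explanation, not confirmed in this thread, is that on Fargate the pod shares the node's IP address, so port 10250 reaches the kubelet instead of Metrics Server. That would match the 404 from https://172.31.182.189:10250 shown above.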


10 participants