Unable to Access kops Cluster on AWS (Account Has Limited Permissions) #17400

@lmeandry

Description

/kind support

1. What kops version are you running? The command kops version will display
this information.

Client version: 1.31.0 (git-v1.31.0)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

My kubectl client version:

$ kubectl version --client
Client Version: v1.32.3
Kustomize Version: v5.5.0

My test kops cluster is unreachable: KUBECONFIG=~/.kube/config kubectl cluster-info hangs and never returns (see the verbose trace under question 8).
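
For reference, this is roughly how I exported the kubeconfig (a sketch from memory; --admin is the flag I believe I used to get client credentials):

$ kops export kubeconfig --name <REDACTED>platform-kops.<REDACTED_ORG>.com --admin
$ KUBECONFIG=~/.kube/config kubectl cluster-info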

3. What cloud provider are you using?
An AWS cloud account in us-east-1. Our VPCs and subnets are set up by our government agency sponsor. We do, however, have the ability to create and manage EC2/EBS, Security Group, S3, and Route53 resources.
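
To confirm which IAM principal kops and kubectl actually run as under those limits, a sketch of the check I use (output omitted here):

$ aws sts get-caller-identity --query Arn --output text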

4. What commands did you run? What is the simplest way to reproduce this issue?

KUBECONFIG=~/.kube/config kubectl cluster-info

5. What happened after the commands executed?

The command hung after the DNS lookup until I interrupted it with Ctrl-C (see the -v 10 trace under question 8).

6. What did you expect to happen?
That I could run kubectl cluster-info with the exported kubeconfig.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

---
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: '2025-05-08T20:06:57Z'
  generation: 2
  name: <REDACTED>platform-kops.<REDACTED_ORG>.com
spec:
  api:
    dns: {}
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://<REDACTED_ORG>-ocp-<REDACTED>-testing/<REDACTED>platform-kops.<REDACTED_ORG>.com
  dnsZone: Z0772711NMWNNROOWA4O
  etcdClusters:
    - cpuRequest: 200m
      etcdMembers:
        - encryptedVolume: true
          instanceGroup: control-plane-us-east-1b
          name: b
      manager:
        backupRetentionDays: 90
      memoryRequest: 100Mi
      name: main
    - cpuRequest: 100m
      etcdMembers:
        - encryptedVolume: true
          instanceGroup: control-plane-us-east-1b
          name: b
      manager:
        backupRetentionDays: 90
      memoryRequest: 100Mi
      name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
    - 0.0.0.0/0
    - ::/0
  kubernetesVersion: 1.31.7
  networkCIDR: 10.175.55.0/24
  networkID: vpc-0e17e4d5cab5c90db
  networking:
    amazonvpc: {}
  nonMasqueradeCIDR: 10.175.55.0/24
  sshAccess:
    - 0.0.0.0/0
    - ::/0
  subnets:
    - cidr: 10.175.55.32/27
      egress: External
      id: subnet-0e2a1ff61e1784941
      name: us-east-1b
      type: Private
      zone: us-east-1b
    - cidr: 10.175.55.0/27
      egress: External
      id: subnet-0871bf04b10f3c364
      name: us-east-1c
      type: Private
      zone: us-east-1c
  topology:
    dns:
      type: Private
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: '2025-05-08T20:06:57Z'
  labels:
    kops.k8s.io/cluster: <REDACTED>platform-kops.<REDACTED_ORG>.com
  name: control-plane-us-east-1b
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20250305
  machineType: t3.small
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
    - us-east-1b
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: '2025-05-08T20:06:57Z'
  labels:
    kops.k8s.io/cluster: <REDACTED>platform-kops.<REDACTED_ORG>.com
  name: nodes-us-east-1b
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20250305
  machineType: t3.small
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
    - us-east-1b
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: '2025-05-08T20:06:57Z'
  labels:
    kops.k8s.io/cluster: <REDACTED>platform-kops.<REDACTED_ORG>.com
  name: nodes-us-east-1c
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20250305
  machineType: t3.small
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
    - us-east-1c
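
A note on the manifest: the -v 10 trace in question 8 below shows kubectl targeting api.internal.<cluster>, which I believe lives in a private Route53 hosted zone given topology.dns is Private. A sketch of how I'm listing the API records kops created (the zone ID is the dnsZone value from the manifest above):

$ aws route53 list-resource-record-sets --hosted-zone-id Z0772711NMWNNROOWA4O \
    --query "ResourceRecordSets[?contains(Name, 'api')]"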

8. Please run the commands with the most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

$ KUBECONFIG=~/.kube/config kubectl cluster-info -v 10
I0508 16:17:46.854658   33423 loader.go:402] Config loaded from file:  /home/ec2-user/.kube/config
I0508 16:17:46.855281   33423 envvar.go:172] "Feature gate default state" feature="ClientsPreferCBOR" enabled=false
I0508 16:17:46.855319   33423 envvar.go:172] "Feature gate default state" feature="InformerResourceVersion" enabled=false
I0508 16:17:46.855341   33423 envvar.go:172] "Feature gate default state" feature="WatchListClient" enabled=false
I0508 16:17:46.855355   33423 envvar.go:172] "Feature gate default state" feature="ClientsAllowCBOR" enabled=false
I0508 16:17:46.855492   33423 discovery_client.go:253] "Request Body" body=""
I0508 16:17:46.855599   33423 round_trippers.go:473] curl -v -XGET  -H "Accept: application/json;g=apidiscovery.k8s.io;v=v2;as=APIGroupDiscoveryList,application/json;g=apidiscovery.k8s.io;v=v2beta1;as=APIGroupDiscoveryList,application/json" -H "User-Agent: kubectl/v1.32.3 (linux/amd64) kubernetes/32cc146" 'https://api.internal.<REDACTED>platform-kops.<REDACTED_ORG>.com/api?timeout=32s'
I0508 16:17:46.859084   33423 round_trippers.go:502] HTTP Trace: DNS Lookup for api.internal.<REDACTED>platform-kops.<REDACTED_ORG>.com resolved to [{203.0.113.123 }]
^C
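
The lookup succeeds (it resolved to the address shown, sanitized here as 203.0.113.123), but the request never progresses past that point, so I interrupted it. What I'm checking next from the same machine (a sketch; I believe kops serves the API on port 443 by default):

$ dig +short api.internal.<REDACTED>platform-kops.<REDACTED_ORG>.com   # confirm the record resolves
$ nc -vz -w 5 203.0.113.123 443                                        # raw TCP reachability to the resolved address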

9. Anything else do we need to know?
Additional background:
I work for a defense contractor supporting a US government agency. The agency allocates and manages the cloud resources and places some limits on them via IAM roles and policies.

Currently we build and deploy Kubernetes clusters with the Rancher/RKE2 distribution. We use Ansible to run Terraform that creates some AWS resources within the account, and the lablabs/ansible-role-rke2 Ansible role to install RKE2. The Terraform also creates the necessary load balancers pointing to the cluster's service ingresses. Once RKE2 is installed, an Ansible task creates the kubeconfig from /etc/rancher/rke2/rke2.yaml on one of the master nodes and updates the server line to point at the API server. This works out well and lets us use helm, kubectl, and the Ansible kubernetes.core collection.
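
That kubeconfig step boils down to roughly the following (a sketch; the master hostname and load balancer address are illustrative, and "default" is, as far as I recall, the cluster name RKE2 writes into rke2.yaml):

$ scp ec2-user@<master-node>:/etc/rancher/rke2/rke2.yaml ~/.kube/config
$ kubectl config set-cluster default --server=https://rke2-api.example.com:6443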

I'm investigating kops because we have to move away from Rancher/RKE2.

After a little trial and error, it seems that kops creates almost all the necessary AWS resources when it builds a cluster.

However, the newly built kops cluster is inaccessible to Kubernetes utilities such as kubectl.
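
One variant I plan to test, in case the private api.internal DNS record is the blocker: fronting the API with a load balancer instead of bare DNS (a sketch; I have not yet confirmed our IAM limits allow creating load balancers):

$ kops edit cluster --name <REDACTED>platform-kops.<REDACTED_ORG>.com
#   ...replacing spec.api with something like:
#   api:
#     loadBalancer:
#       class: Network
#       type: Internal
$ kops update cluster --name <REDACTED>platform-kops.<REDACTED_ORG>.com --yes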
