vSphere Provider setup is valid error on template deployment: please specify a datacenter #409

Closed
MasterWayZ opened this issue Oct 7, 2021 · 15 comments

@MasterWayZ

What happened:
Deploying a cluster to vSphere fails at the step that deploys the template to a Content Library, with:
Validation failed {"validation": "vsphere Provider setup is valid", "error": "failed deploying template: error deploying template: govc: please specify a datacenter\n", "remediation": ""}

What you expected to happen:
It succeeds. From what I can see, I have a datacenter specified. My vCenter has multiple datacenters in it with clusters.

How to reproduce it (as minimally and precisely as possible):
Attempt to deploy a cluster to vSphere. vCenter version is 7.0U3 build 18700403. My vCenter has multiple datacenters in it with clusters.

My config YAML:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: mmg-test
spec:
  clusterNetwork:
    cni: cilium
    pods:
      cidrBlocks:
      - 10.69.0.0/16
    services:
      cidrBlocks:
      - 10.112.0.0/12
  controlPlaneConfiguration:
    count: 2
    endpoint:
      host: "10.96.78.5"
    machineGroupRef:
      kind: VSphereMachineConfig
      name: mmg-test-cp
  datacenterRef:
    kind: VSphereDatacenterConfig
    name: mmg-test
  externalEtcdConfiguration:
    count: 3
    machineGroupRef:
      kind: VSphereMachineConfig
      name: mmg-test-etcd
  kubernetesVersion: "1.21"
  workerNodeGroupConfigurations:
  - count: 2
    machineGroupRef:
      kind: VSphereMachineConfig
      name: mmg-test

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereDatacenterConfig
metadata:
  name: mmg-test
spec:
  datacenter: "ZB"
  insecure: true
  network: "eks-mmgtest"
  server: "vcenter-fqdn"
  thumbprint: ""

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: mmg-test-cp
spec:
  datastore: "ESXi 3/ESXi3 SSD 3"
  diskGiB: 25
  folder: ""
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: "COMG3/Resources"
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa MYPUBLICKEY

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: mmg-test
spec:
  datastore: "ESXi 3/ESXi3 SSD 3"
  diskGiB: 25
  folder: ""
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: "COMG3/Resources"
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa MYPUBLICKEY

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: mmg-test-etcd
spec:
  datastore: "ESXi 3/ESXi3 SSD 3"
  diskGiB: 25
  folder: ""
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: "COMG3/Resources"
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa MYPUBLICKEY

---

Anything else we need to know?:
N/A

Environment:

  • EKS Anywhere Release: v0.5.0
  • EKS Distro Release: Unknown
@TerryHowe
Contributor

I wonder if the error message is incorrect. For one thing, it looks like you didn't specify the full path for network, and the server setting looks suspicious. Is your server really named vcenter-fqdn? My config looks like:

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereDatacenterConfig
metadata:
  name: tlhowe
spec:
  datacenter: SDDC-Datacenter
  insecure: false
  network: /SDDC-Datacenter/network/sddc-cgw-network-1
  server: vcenter.sddc-1-2-3-4.vmwarevmc.com
  thumbprint: ""

@MasterWayZ
Author

No, I only changed it for pasting here. Listing the FQDN isn't a problem, since it's not publicly resolvable or reachable anyway. The FQDN in my actual config, which is the correct one, is vcenter.internal.masterwayz.nl. The config passes the validation steps, but importing the template fails with the error listed in the issue.

@abhay-krishna
Member

Hi @MasterWayZ ! I notice that you've set insecure to true under VSphereDatacenterConfig. This is usually set if the vCenter server does not have a valid certificate. Is this the case here?

@MasterWayZ
Author

Yes, it has a self-signed certificate that vCenter generated itself during installation. It's my homelab's vCenter, so I never changed the cert and just click through the certificate warnings.

@abhay-krishna
Member

Could you retry the cluster creation providing the full path of the network in the cluster config?

@abhay-krishna abhay-krishna added this to the oncall milestone Oct 11, 2021
@TerryHowe
Contributor

Any progress with this?

If you run the CLI with full logging (-v9), it prints the command it runs to create the template, which might help determine what is going on. You might also be able to copy that command and check whether govc can push the template on its own.

@TerryHowe
Contributor

I wonder if providing a thumbprint would help.

@MasterWayZ
Author

My bad for the delay!
I forgot I have GitHub emails going into a folder I rarely read because I get too many.
Will try the above right now! Apologies again!

@abhinavmpandey08
Member

abhinavmpandey08 commented Oct 12, 2021

Seems like a bug in the code while creating the ova template here:

if err = f.createTemplate(ctx, machineConfig.Spec.Template, ovaURL, string(osFamily)); err != nil {

We use machineConfig.Spec.Template for the template path, but if the user doesn't specify one, it defaults to an empty string. When an empty string is passed as the folder path, govc falls back to /<datacenter>/vm, which works in a single-datacenter environment but not in a multi-datacenter one.

For now, can you try importing the OVA manually (instructions here: https://anywhere.eks.amazonaws.com/docs/reference/vsphere/vsphere-ovas/)
and then provide the full path to the template in the template field of each VSphereMachineConfig section of the cluster config: https://anywhere.eks.amazonaws.com/docs/reference/clusterspec/vsphere/#template-optional
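
As a rough illustration of the gap described in this comment, the Go sketch below qualifies an empty or relative template value with an explicitly named datacenter instead of leaving the choice to govc. It is not the EKS Anywhere code: resolveTemplatePath is a hypothetical helper, and the /<datacenter>/vm layout is taken from the explanation above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// resolveTemplatePath is a hypothetical helper, not part of EKS Anywhere.
// An absolute inventory path is returned unchanged; an empty or relative
// value is placed under /<datacenter>/vm so the datacenter is never implicit.
func resolveTemplatePath(datacenter, template string) (string, error) {
	if strings.HasPrefix(template, "/") {
		return template, nil // already fully qualified
	}
	if datacenter == "" {
		return "", errors.New("datacenter is required to qualify a template path")
	}
	folder := fmt.Sprintf("/%s/vm", datacenter)
	if template == "" {
		return folder, nil // no template given: fall back to the VM folder
	}
	return folder + "/" + template, nil
}

func main() {
	inputs := []string{"", "bottlerocket-vmware-k8s-1.21", "/ZB/vm/bottlerocket-vmware-k8s-1.21"}
	for _, tmpl := range inputs {
		path, err := resolveTemplatePath("ZB", tmpl)
		fmt.Printf("%q -> %q (err: %v)\n", tmpl, path, err)
	}
}
```

With the datacenter passed in explicitly, the same logic behaves identically whether vCenter has one datacenter or several.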

@MasterWayZ
Author

I tried this, but it fails.

2021-10-13T02:08:07.993+0200    V0      ✅ Resource pool validated
2021-10-13T02:08:07.993+0200    V6      Executing command       {"cmd": "/usr/bin/docker run -i --network host -v /home/michelle:/home/michelle -w /home/michelle -v /var/run/docker.sock:/var/run/docker.sock -e GOVC_INSECURE=true -e GOVC_USERNAME=***** -e GOVC_PASSWORD=***** -e GOVC_URL=vcenter.internal.masterwayz.nl --entrypoint govc public.ecr.aws/eks-anywhere/cli-tools:v0.1.0-eks-a-1 find -json /ZB -type VirtualMachine -name bottlerocket-vmware-k8s-1.21"}
2021-10-13T02:08:08.960+0200    V-3     odd number of arguments passed as key-value pairs for logging   {"ignored key": "/ZB/bottlerocket-vmware-k8s-1.21"}
2021-10-13T02:08:08.960+0200    V4      Tasks completed {"duration": "11.0467658s"}
panic: odd number of arguments passed as key-value pairs for logging

goroutine 1 [running]:
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc000fb4240, 0xc0003fe600, 0x1, 0x1)
        /go/pkg/mod/go.uber.org/zap@v1.16.1-0.20210329175301-c23abee72d19/zapcore/entry.go:234 +0x532
go.uber.org/zap.(*Logger).DPanic(0xc0002d9f80, 0x3a9c771, 0x3d, 0xc0003fe600, 0x1, 0x1)
        /go/pkg/mod/go.uber.org/zap@v1.16.1-0.20210329175301-c23abee72d19/logger.go:219 +0x85
github.com/go-logr/zapr.handleFields(0xc0002d9f80, 0xc000d004d0, 0x1, 0x1, 0x0, 0x0, 0x0, 0x10, 0x34a64a0, 0x36e4b01)
        /go/pkg/mod/github.com/go-logr/zapr@v0.4.0/zapr.go:100 +0x5e5
github.com/go-logr/zapr.(*zapLogger).Info(0xc000d004c0, 0x3a29758, 0x17, 0xc000d004d0, 0x1, 0x1)
        /go/pkg/mod/github.com/go-logr/zapr@v0.4.0/zapr.go:127 +0xb0
github.com/aws/eks-anywhere/pkg/executables.(*Govc).SearchTemplate(0xc00099a020, 0x3e92e50, 0xc000052078, 0xc000edf6c0, 0x2, 0xc000fa6540, 0xc000633c50, 0xc000e53800, 0x18, 0x100000000000600)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/executables/govc.go:111 +0x877
github.com/aws/eks-anywhere/pkg/providers/vsphere.(*vsphereProvider).validateTemplatePresence(0xc000fb40c0, 0x3e92e50, 0xc000052078, 0xc000edf6c0, 0x2, 0xc000fa6540, 0x0, 0xc000989a00)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/providers/vsphere/vsphere.go:754 +0x76
github.com/aws/eks-anywhere/pkg/providers/vsphere.(*vsphereProvider).validateTemplate(0xc000fb40c0, 0x3e92e50, 0xc000052078, 0xc0003012c0, 0xc000fa6540, 0xc000fb4108, 0x0)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/providers/vsphere/vsphere.go:742 +0x68
github.com/aws/eks-anywhere/pkg/providers/vsphere.(*vsphereProvider).setupAndValidateCluster(0xc000fb40c0, 0x3e92e50, 0xc000052078, 0xc0003012c0, 0x0, 0x1)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/providers/vsphere/vsphere.go:600 +0xa85
github.com/aws/eks-anywhere/pkg/providers/vsphere.(*vsphereProvider).SetupAndValidateCreateCluster(0xc000fb40c0, 0x3e92e50, 0xc000052078, 0xc0003012c0, 0x1, 0xc000a43ce0)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/providers/vsphere/vsphere.go:838 +0xf5
github.com/aws/eks-anywhere/pkg/workflows.(*SetAndValidateTask).providerValidation.func1(0xc000482410)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/workflows/create.go:157 +0x115
github.com/aws/eks-anywhere/pkg/validations.(*Runner).Run(0xc001007798, 0xc000482410, 0x0)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/validations/runner.go:24 +0x52
github.com/aws/eks-anywhere/pkg/workflows.(*SetAndValidateTask).Run(0x5538498, 0x3e92e50, 0xc000052078, 0xc000b42a00, 0xe, 0x2)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/workflows/create.go:144 +0x28c
github.com/aws/eks-anywhere/pkg/task.(*taskRunner).RunTask(0xc0010078d0, 0x3e92e50, 0xc000052078, 0xc000b42a00, 0x0, 0x0)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/task/task.go:123 +0x2da
github.com/aws/eks-anywhere/pkg/workflows.(*Create).Run(0xc001007b20, 0x3e92e50, 0xc000052078, 0xc0003012c0, 0x3e9f000, 0xc000f018c0, 0x0)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/pkg/workflows/create.go:54 +0x149
github.com/aws/eks-anywhere/cmd/eks-a/cmd.(*createClusterOptions).createCluster(0x54f23d0, 0x3e92e50, 0xc000052078, 0x0, 0x0)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/cmd/eks-a/cmd/createcluster.go:165 +0xf30
github.com/aws/eks-anywhere/cmd/eks-a/cmd.glob..func1(0x53752c0, 0xc000a1cea0, 0x0, 0x3, 0x0, 0x0)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/cmd/eks-a/cmd/createcluster.go:45 +0x93
github.com/spf13/cobra.(*Command).execute(0x53752c0, 0xc000a1ce70, 0x3, 0x3, 0x53752c0, 0xc000a1ce70)
        /go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:852 +0x472
github.com/spf13/cobra.(*Command).ExecuteC(0x5374640, 0x8, 0xc000000180, 0x32bf325)
        /go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:960 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
        /go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:897
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        /go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:890
github.com/aws/eks-anywhere/cmd/eks-a/cmd.Execute(0x0, 0x0)
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/cmd/eks-a/cmd/root.go:43 +0x53
main.main()
        /codebuild/output/src181274652/src/git-codecommit.us-west-2.amazonaws.com/v1/repos/aws.eks-anywhere/cmd/eks-a/main.go:29 +0xe5

Screenshot of the template in vCenter with tags
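
The panic in the trace above is raised by the logging call in SearchTemplate, not by vSphere: the log line shows a single value (/ZB/bottlerocket-vmware-k8s-1.21) being passed where key-value pairs were expected. The Go sketch below reproduces that kind of check using only the standard library. The pairs function is a hypothetical stand-in for illustration, not the zapr implementation, and it returns an error where the real logger panicked.

```go
package main

import (
	"errors"
	"fmt"
)

// pairs is a hypothetical stand-in for the validation a key-value logger
// performs on its variadic arguments. It is not the zapr implementation.
func pairs(keysAndValues ...interface{}) (map[string]interface{}, error) {
	if len(keysAndValues)%2 != 0 {
		return nil, errors.New("odd number of arguments passed as key-value pairs for logging")
	}
	fields := make(map[string]interface{}, len(keysAndValues)/2)
	for i := 0; i < len(keysAndValues); i += 2 {
		key, ok := keysAndValues[i].(string)
		if !ok {
			return nil, fmt.Errorf("key at position %d is not a string", i)
		}
		fields[key] = keysAndValues[i+1]
	}
	return fields, nil
}

func main() {
	// A lone value with no key, as in the failing call.
	_, err := pairs("/ZB/bottlerocket-vmware-k8s-1.21")
	fmt.Println("unpaired:", err)

	// The same value with a key attached ("path" is an illustrative name).
	fields, err := pairs("path", "/ZB/bottlerocket-vmware-k8s-1.21")
	fmt.Println("paired:", fields, err)
}
```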

@MasterWayZ
Author

Entry in the config yaml is: template: "/ZB/bottlerocket-vmware-k8s-1.21"

@abhinavmpandey08
Member

abhinavmpandey08 commented Oct 13, 2021

Do you mind posting the whole cluster config here? Also, vSphere keeps virtual machines under an implicit folder named vm, so you will have to set the template to template: "/ZB/vm/bottlerocket-vmware-k8s-1.21"
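
The path rule in this comment can be written as a small check. The Go sketch below is illustrative only: withVMFolder is a hypothetical helper, and it assumes the first path segment is the datacenter name, which would not hold if the datacenter itself sits inside an inventory folder.

```go
package main

import (
	"fmt"
	"strings"
)

// withVMFolder is a hypothetical helper. Given an absolute inventory path of
// the form /<datacenter>/..., it inserts the implicit "vm" folder after the
// datacenter segment when it is missing. Assumes the first segment is the
// datacenter name.
func withVMFolder(path string) (string, error) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if !strings.HasPrefix(path, "/") || len(parts) < 2 {
		return "", fmt.Errorf("expected /<datacenter>/<name>, got %q", path)
	}
	if parts[1] == "vm" {
		return path, nil // already includes the implicit folder
	}
	fixed := append([]string{parts[0], "vm"}, parts[1:]...)
	return "/" + strings.Join(fixed, "/"), nil
}

func main() {
	for _, p := range []string{
		"/ZB/bottlerocket-vmware-k8s-1.21",
		"/ZB/vm/bottlerocket-vmware-k8s-1.21",
	} {
		fixed, err := withVMFolder(p)
		fmt.Printf("%s -> %s (err: %v)\n", p, fixed, err)
	}
}
```

Both inputs end up as /ZB/vm/bottlerocket-vmware-k8s-1.21, the value suggested above.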

@MasterWayZ
Author

Oh, whoops.
After that edit it's deploying, still going as I type this.
This is my config file:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: mmg-test
spec:
  clusterNetwork:
    cni: cilium
    pods:
      cidrBlocks:
      - 10.69.0.0/16
    services:
      cidrBlocks:
      - 10.112.0.0/12
  controlPlaneConfiguration:
    count: 2
    endpoint:
      host: "10.96.78.5"
    machineGroupRef:
      kind: VSphereMachineConfig
      name: mmg-test-cp
  datacenterRef:
    kind: VSphereDatacenterConfig
    name: mmg-test
  externalEtcdConfiguration:
    count: 3
    machineGroupRef:
      kind: VSphereMachineConfig
      name: mmg-test-etcd
  kubernetesVersion: "1.21"
  workerNodeGroupConfigurations:
  - count: 2
    machineGroupRef:
      kind: VSphereMachineConfig
      name: mmg-test

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereDatacenterConfig
metadata:
  name: mmg-test
spec:
  datacenter: "ZB"
  insecure: true
  network: "eks-mmgtest"
  server: "vcenter.internal.masterwayz.nl"
  thumbprint: ""

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: mmg-test-cp
spec:
  datastore: "ESXi 3/ESXi3 SSD 3"
  template: "/ZB/vm/bottlerocket-vmware-k8s-1.21"
  diskGiB: 25
  folder: ""
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: "COMG3/Resources"
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-pubkey-here

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: mmg-test
spec:
  datastore: "ESXi 3/ESXi3 SSD 3"
  template: "/ZB/vm/bottlerocket-vmware-k8s-1.21"
  diskGiB: 25
  folder: ""
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: "COMG3/Resources"
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-pubkey-here

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: mmg-test-etcd
spec:
  datastore: "ESXi 3/ESXi3 SSD 3"
  template: "/ZB/vm/bottlerocket-vmware-k8s-1.21"
  diskGiB: 25
  folder: ""
  memoryMiB: 8192
  numCPUs: 2
  osFamily: bottlerocket
  resourcePool: "COMG3/Resources"
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-pubkey-here

---

@MasterWayZ
Author

So the deployment worked! :)

@abhay-krishna
Member

Awesome to hear that!
