diff --git a/README.md b/README.md index 2627af999..94379090a 100644 --- a/README.md +++ b/README.md @@ -30,7 +30,7 @@ ### What works -- Creation of worker nodes on AWS, Digitalocean, Openstack, Azure, Google Cloud Platform, Nutanix, VMWare Cloud Director, VMWare Vsphere, Linode, Hetzner cloud and Kubevirt (experimental) +- Creation of worker nodes on AWS, Digitalocean, Openstack, Azure, Google Cloud Platform, Nutanix, VMWare Cloud Director, VMWare Vsphere, Linode, Hetzner cloud, Kubevirt (technology preview) and Proxmox (technology preview) - Using Ubuntu, Flatcar or CentOS 7 distributions ([not all distributions work on all providers](/docs/operating-system.md)) ### Supported Kubernetes versions diff --git a/docs/operating-system.md b/docs/operating-system.md index 93f16a872..648ff3566 100644 --- a/docs/operating-system.md +++ b/docs/operating-system.md @@ -4,19 +4,20 @@ ### Cloud provider -| | Ubuntu | CentOS | Flatcar | RHEL | SLES | Amazon Linux 2 | Rocky Linux | -|---|---|---|---|---|---|---|---| -| AWS | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | -| Azure | ✓ | ✓ | ✓ | ✓ | x | x | ✓ | -| Digitalocean | ✓ | ✓ | x | x | x | x | ✓ | -| Equinix Metal | ✓ | ✓ | ✓ | x | x | x | ✓ | -| Google Cloud Platform | ✓ | x | x | x | x | x | x | -| Hetzner | ✓ | ✓ | x | x | x | x | ✓ | -| KubeVirt | ✓ | ✓ | ✓ | ✓ | x | x | ✓ | -| Nutanix | ✓ | ✓ | x | x | x | x | x | -| Openstack | ✓ | ✓ | ✓ | ✓ | x | x | ✓ | -| VMware Cloud Director | ✓ | x | x | x | x | x | x | -| VSphere | ✓ | ✓ | ✓ | ✓ | x | x | ✓ | +| | Ubuntu | CentOS | Flatcar | RHEL | SLES | Amazon Linux 2 | Rocky Linux | +| --- | --- | --- | --- | --- | --- | --- | --- | +| AWS | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | +| Azure | ✓ | ✓ | ✓ | ✓ | x | x | ✓ | +| Digitalocean | ✓ | ✓ | x | x | x | x | ✓ | +| Equinix Metal | ✓ | ✓ | ✓ | x | x | x | ✓ | +| Google Cloud Platform | ✓ | x | x | x | x | x | x | +| Hetzner | ✓ | ✓ | x | x | x | x | ✓ | +| KubeVirt | ✓ | ✓ | ✓ | ✓ | x | x | ✓ | +| Nutanix | ✓ | ✓ | x | x | x | x | x | +| Openstack | ✓ 
| ✓ | ✓ | ✓ | x | x | ✓ | +| Proxmox | ✓ | x | x | x | x | x | x | +| VMware Cloud Director | ✓ | x | x | x | x | x | x | +| VSphere | ✓ | ✓ | ✓ | ✓ | x | x | ✓ | ## Configuring a operating system @@ -38,11 +39,11 @@ OS specific settings can be set via `machine.spec.providerConfig.operatingSystem Note that the table below lists the OS versions that we are validating in our automated tests. Machine controller may work with other OS versions that are not listed in the table but support won’t be provided. -| | Versions | -|---|---| -| AmazonLinux2 | 2.x | -| CentOS | 7.4.x, 7.6.x, 7.7.x | -| RHEL | 8.x | -| Rocky Linux | 8.5 | -| SLES | SLES 15 SP3 | -| Ubuntu | 20.04 LTS, 22.04 LTS | +| | Versions | +| --- | --- | +| AmazonLinux2 | 2.x | +| CentOS | 7.4.x, 7.6.x, 7.7.x | +| RHEL | 8.x | +| Rocky Linux | 8.5 | +| SLES | SLES 15 SP3 | +| Ubuntu | 20.04 LTS, 22.04 LTS | diff --git a/docs/proxmox.md b/docs/proxmox.md new file mode 100644 index 000000000..ebc59af08 --- /dev/null +++ b/docs/proxmox.md @@ -0,0 +1,94 @@ +# Proxmox Virtual Environment + +## State of the implementation + +Support for Proxmox as a provider in the machine-controller is currently just a technical demo. It +is possible to create MachineDeployments using manually created VM templates as demonstrated below. +In this example the VM template is using local storage, which is why this template can only be +cloned on the same node it is located at. 
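The two credentials described under Authentication below are combined into a single `Authorization` header when talking to the Proxmox API; the Telmate client used by this provider does this internally via `SetAPIToken`. A minimal standalone sketch with dummy values (`apiTokenHeader` is an illustrative helper, not part of the provider; the header format is per the Proxmox API token documentation linked below):

```go
package main

import "fmt"

// apiTokenHeader builds the Authorization header value for Proxmox API
// token authentication: PVEAPIToken=USER@REALM!TOKENID=UUID.
// Illustrative helper only; the provider delegates this to the Telmate
// proxmox-api-go client's SetAPIToken.
func apiTokenHeader(userID, token string) string {
	return fmt.Sprintf("PVEAPIToken=%s=%s", userID, token)
}

func main() {
	// Dummy credentials in the shapes described under Prerequisites.
	userID := "machine-controller@pve!mc"
	token := "00000000-0000-0000-0000-000000000000"
	fmt.Println(apiTokenHeader(userID, token))
	// → PVEAPIToken=machine-controller@pve!mc=00000000-0000-0000-0000-000000000000
}
```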
+ +## Prerequisites + +### Authentication + +For authentication the following data is needed: + +- `user_id` is expected to be in the form `USER@REALM!TOKENID` +- `token` is just the UUID you get when initially creating the token + +See also: +* https://pve.proxmox.com/wiki/User_Management#pveum_tokens +* https://pve.proxmox.com/wiki/Proxmox_VE_API#API_Tokens + +#### User Privileges + +For the provider to properly function the user needs an API token with the following privileges: + +* `Datastore.AllocateSpace` +* `Pool.Allocate` +* `Pool.Audit` +* `VM.Allocate` +* `VM.Audit` +* `VM.Clone` +* `VM.Config.CDROM` +* `VM.Config.CPU` +* `VM.Config.Cloudinit` +* `VM.Config.Disk` +* `VM.Config.HWType` +* `VM.Config.Memory` +* `VM.Config.Network` +* `VM.Config.Options` +* `VM.Monitor` +* `Sys.Audit` +* `Sys.Console` + +### Cloud-init enabled VM Templates + +Although it is possible to upload cloud-init images in Proxmox VE and create VM disks directly from +these images via CLI tools on the nodes directly, there is no API endpoint yet to provide this +functionality externally. That's why the `proxmox` provider assumes there are VM templates in place +to clone new machines from. + +Proxmox recommends using either ready-to-use cloud-init images provided by many Linux distributions +(mostly designed for OpenStack) or to prepare the images yourself as you have full control over +what's in these images. + +For VM templates to be available on all nodes, they need to be added to the `ha-manager`. + +Example for creating a VM template: +```bash +# Download the cloud-image. +wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img +INSTANCE_ID=9000 +# Create the VM that will be turned into the template. +qm create $INSTANCE_ID -name ubuntu-18.04-LTS +# Import the downloaded cloud-image as disk. +qm importdisk $INSTANCE_ID bionic-server-cloudimg-amd64.img local-lvm +# Set the imported Disk as SCSI drive. 
qm set $INSTANCE_ID -scsihw virtio-scsi-pci -scsi0 local-lvm:vm-$INSTANCE_ID-disk-0 +# Create the cloud-init drive where the user-data is read from. +qm set $INSTANCE_ID -ide2 local-lvm:cloudinit +# Boot from the imported disk. +qm set $INSTANCE_ID -boot c -bootdisk scsi0 +# Set a serial console for better Proxmox access. +qm set $INSTANCE_ID -serial0 socket -vga serial0 +# Set up a bridged network for the VM. +qm set $INSTANCE_ID -net0 virtio,bridge=vmbr0 +# Enable QEMU Agent support for this VM (mandatory). +qm set $INSTANCE_ID -agent 1 +# Convert the VM to a template. +qm template $INSTANCE_ID +# Make the VM template available on any node, not just where it was created. +ha-manager add vm:$INSTANCE_ID -state stopped +``` + +### Cloud-init user-data + +Proxmox currently does not support the upload of "snippets" via API, but these snippets are used for +cloud-init user-data, which is required for the machine-controller to function. This provider +implementation needs to copy the generated user-data YAML file via SFTP to every Proxmox node where +a VM is created or migrated to. For this to work, make sure that: + +* A storage is enabled for content `snippets` (e.g. `local`) +* You have the SSH private key of a user that exists on all nodes and has write permission to the +path where snippets are stored (e.g. `/var/lib/vz/snippets`) diff --git a/examples/proxmox-machinedeployment.yaml b/examples/proxmox-machinedeployment.yaml new file mode 100644 index 000000000..9f44a5880 --- /dev/null +++ b/examples/proxmox-machinedeployment.yaml @@ -0,0 +1,89 @@ +apiVersion: v1 +kind: Secret +metadata: + # If you change the namespace/name, you must also + # adjust the rbac rules + name: machine-controller-proxmox + namespace: kube-system +type: Opaque +stringData: + token: << PM_API_TOKEN >> + sshPrivateKey: | + -----BEGIN OPENSSH PRIVATE KEY----- + ... 
+ -----END OPENSSH PRIVATE KEY----- +--- +apiVersion: "cluster.k8s.io/v1alpha1" +kind: MachineDeployment +metadata: + name: proxmox-machinedeployment + namespace: kube-system +spec: + paused: false + replicas: 1 + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + minReadySeconds: 0 + selector: + matchLabels: + foo: bar + template: + metadata: + labels: + foo: bar + spec: + providerSpec: + value: + sshPublicKeys: + - "<< YOUR_PUBLIC_KEY >>" + cloudProvider: "proxmox" + cloudProviderSpec: + # Can also be set via the env var 'PM_API_ENDPOINT' on the machine-controller + # example: 'https://10.0.1.5/api2/json' + endpoint: '<< PM_API_ENDPOINT >>' + # Can also be set via the env var 'PM_API_USER_ID' on the machine-controller + userID: '<< PM_API_USER_ID >>' + # Can also be set via the env var 'PM_API_TOKEN' on the machine-controller + token: + secretKeyRef: + namespace: kube-system + name: machine-controller-proxmox + key: token + # Optional: Allow insecure connections to endpoint if no valid TLS certificate is + # presented. Can also be set via the env var 'PM_TLS_INSECURE' on the machine-controller + allowInsecure: true + # Optional: Connect through a proxy + # Can also be set via the env var 'PM_PROXY_URL' on the machine-controller + # proxyURL: '<< PM_PROXY_URL >>' + # SSH private key of a user that has write access to local storage on all nodes. this + # is needed to push cloud-init userdata. + ciStorageSSHPrivateKey: + secretKeyRef: + namespace: kube-system + name: machine-controller-proxmox + key: sshPrivateKey + ciStorageName: local + ciStoragePath: /var/lib/vz + # Can also be set via the env + # ID of the VM template + vmTemplateID: 9000 + # Sets the CPU count for this VM + cpuSockets: 2 + # Sets the CPU cores per CPU + cpuCores: 1 + # Memory configuration in MiB + memoryMB: 2048 + # Optional: Set up system disk size in GB. If not set, will be based on disk size in vm + # template. 
Cannot be smaller than the initial disk size in vm template. + diskSizeGB: 20 + # Optional: Name of the disk used in the vm template. + diskName: scsi0 + operatingSystem: "ubuntu" + operatingSystemSpec: + distUpgradeOnBoot: false + disableAutoUpdate: true + versions: + kubelet: 1.22.5 diff --git a/go.mod b/go.mod index 2560f8da1..da35a3abe 100644 --- a/go.mod +++ b/go.mod @@ -11,6 +11,7 @@ require ( github.com/BurntSushi/toml v1.1.0 github.com/Masterminds/semver/v3 v3.1.1 github.com/Masterminds/sprig/v3 v3.2.2 + github.com/Telmate/proxmox-api-go v0.0.0-20220605094644-df18575a84d9 github.com/aliyun/alibaba-cloud-sdk-go v1.61.1645 github.com/aws/aws-sdk-go-v2 v1.16.12 github.com/aws/aws-sdk-go-v2/config v1.17.3 @@ -61,6 +62,8 @@ require ( sigs.k8s.io/yaml v1.3.0 ) +require github.com/pkg/sftp v1.13.5 + require ( cloud.google.com/go v0.100.2 // indirect cloud.google.com/go/compute v1.5.0 // indirect @@ -118,6 +121,7 @@ require ( github.com/jmespath/go-jmespath v0.4.0 // indirect github.com/josharian/intern v1.0.0 // indirect github.com/json-iterator/go v1.1.12 // indirect + github.com/kr/fs v0.1.0 // indirect github.com/kr/pretty v0.3.0 // indirect github.com/kr/text v0.2.0 // indirect github.com/mailru/easyjson v0.7.7 // indirect diff --git a/go.sum b/go.sum index 34adc05c4..fe55be108 100644 --- a/go.sum +++ b/go.sum @@ -112,6 +112,8 @@ github.com/PuerkitoBio/purell v1.1.1/go.mod h1:c11w/QuzBsJSee3cPx9rAFu61PvFxuPbt github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578/go.mod h1:uGdkoq3SwY9Y+13GIhn11/XLaGBb4BfwItxLd5jeuXE= github.com/Shopify/sarama v1.19.0/go.mod h1:FVkBWblsNy7DGZRfXLU0O9RCGt5g3g3yEuWXgklEdEo= github.com/Shopify/toxiproxy v2.1.4+incompatible/go.mod h1:OXgGpZ6Cli1/URJOF1DMxUHB2q5Ap20/P/eIdh4G0pI= +github.com/Telmate/proxmox-api-go v0.0.0-20220605094644-df18575a84d9 h1:iu4+7XxumrN8tWP1t2PFRMYyTGkCS6Ttg/Y7RFsg8+I= +github.com/Telmate/proxmox-api-go v0.0.0-20220605094644-df18575a84d9/go.mod h1:uHptTDYag2s4dJQWUYP4VczIeUPGGu3BNy4JGm9MSjg= 
github.com/VividCortex/gohistogram v1.0.0/go.mod h1:Pf5mBqqDxYaXu3hDrrU+w6nw50o/4+TcAqDqk/vUH7g= github.com/afex/hystrix-go v0.0.0-20180502004556-fa1af6a1f4f5/go.mod h1:SkGFH1ia65gfNATL8TAiHDNxPzPdmEL5uirI2Uyuz6c= github.com/agnivade/levenshtein v1.0.1/go.mod h1:CURSv5d9Uaml+FovSIICkLbAUZ9S4RqaHDIsdSBg7lM= @@ -616,6 +618,7 @@ github.com/klauspost/compress v1.15.1/go.mod h1:/3/Vjq9QcHkK5uEr5lBEmyoZ1iFhe47e github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= github.com/konsorten/go-windows-terminal-sequences v1.0.2/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= github.com/konsorten/go-windows-terminal-sequences v1.0.3/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= +github.com/kr/fs v0.1.0 h1:Jskdu9ieNAYnjxsi0LbQp1ulIKZV1LAFgK1tWhpZgl8= github.com/kr/fs v0.1.0/go.mod h1:FFnZGqtBN9Gxj7eW1uZ42v5BccTP0vu6NEaFoC2HwRg= github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFBFZlji/RkVcI2GknAs/DXo4wKdlNEc= github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo= @@ -793,6 +796,8 @@ github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4= github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= github.com/pkg/profile v1.2.1/go.mod h1:hJw3o1OdXxsrSjjVksARp5W95eeEaEfptyVZyv6JUPA= github.com/pkg/sftp v1.10.1/go.mod h1:lYOWFsE0bwd1+KfKJaKeuokY15vzFx25BLbzYYoAxZI= +github.com/pkg/sftp v1.13.5 h1:a3RLUqkyjYRtBTZJZ1VRrKbN3zhuPLlUc3sphVz81go= +github.com/pkg/sftp v1.13.5/go.mod h1:wHDZ0IZX6JcBYRK1TH9bcVq8G7TLpVHYIGJRFnmPfxg= github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= github.com/posener/complete v1.1.1/go.mod h1:em0nMJCgc9GFtwrmVmEMR/ZL6WyhyjMBndrE9hABlRI= diff --git a/pkg/cloudprovider/provider.go b/pkg/cloudprovider/provider.go index 8447be854..2af1af889 100644 --- 
a/pkg/cloudprovider/provider.go +++ b/pkg/cloudprovider/provider.go @@ -34,6 +34,7 @@ import ( "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/linode" "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/nutanix" "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/openstack" + "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/proxmox" "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/scaleway" vcd "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/vmwareclouddirector" "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/vsphere" @@ -105,6 +106,9 @@ var ( providerconfigtypes.CloudProviderNutanix: func(cvr *providerconfig.ConfigVarResolver) cloudprovidertypes.Provider { return nutanix.New(cvr) }, + providerconfigtypes.CloudProviderProxmox: func(cvr *providerconfig.ConfigVarResolver) cloudprovidertypes.Provider { + return proxmox.New(cvr) + }, providerconfigtypes.CloudProviderVMwareCloudDirector: func(cvr *providerconfig.ConfigVarResolver) cloudprovidertypes.Provider { return vcd.New(cvr) }, diff --git a/pkg/cloudprovider/provider/proxmox/client.go b/pkg/cloudprovider/provider/proxmox/client.go new file mode 100644 index 000000000..b9801c4e7 --- /dev/null +++ b/pkg/cloudprovider/provider/proxmox/client.go @@ -0,0 +1,226 @@ +/* +Copyright 2022 The Machine Controller Authors. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+*/ + +package proxmox + +import ( + "context" + "crypto/tls" + "encoding/json" + "errors" + "fmt" + "path/filepath" + "sort" + "strings" + + "github.com/Telmate/proxmox-api-go/proxmox" + "github.com/pkg/sftp" + "golang.org/x/crypto/ssh" + corev1 "k8s.io/api/core/v1" + "k8s.io/apimachinery/pkg/util/rand" + + "github.com/kubermatic/machine-controller/pkg/apis/cluster/common" + cloudprovidererrors "github.com/kubermatic/machine-controller/pkg/cloudprovider/errors" + proxmoxtypes "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/proxmox/types" +) + +const ( + taskTimeout = 300 + exitStatusSuccess = "OK" +) + +type ClientSet struct { + *proxmox.Client +} + +func GetClientSet(config *Config) (*ClientSet, error) { + if config == nil { + return nil, errors.New("no configuration passed") + } + + if config.UserID == "" { + return nil, errors.New("no user_id specified") + } + + if config.Token == "" { + return nil, errors.New("no token specified") + } + + if config.Endpoint == "" { + return nil, errors.New("no endpoint specified") + } + + client, err := proxmox.NewClient(config.Endpoint, nil, &tls.Config{InsecureSkipVerify: config.TLSInsecure}, config.ProxyURL, taskTimeout) + if err != nil { + return nil, fmt.Errorf("could not initiate proxmox client: %w", err) + } + + client.SetAPIToken(config.UserID, config.Token) + + return &ClientSet{client}, nil +} + +func (c ClientSet) getVMRefByName(name string) (*proxmox.VmRef, error) { + vmr, err := c.GetVmRefByName(name) + if err != nil { + if err.Error() == fmt.Sprintf("vm '%s' not found", name) { + return nil, cloudprovidererrors.ErrInstanceNotFound + } + return nil, err + } + + return vmr, nil +} + +func (c ClientSet) getNodeList() (*proxmoxtypes.NodeList, error) { + nodeList, err := c.GetNodeList() + if err != nil { + return nil, fmt.Errorf("cannot fetch nodes from cluster: %w", err) + } + + var nl *proxmoxtypes.NodeList + + nodeListJSON, err := json.Marshal(nodeList) + if err != nil { + return nil, 
fmt.Errorf("marshalling nodeList to JSON: %w", err) + } + err = json.Unmarshal(nodeListJSON, &nl) + if err != nil { + return nil, fmt.Errorf("unmarshalling JSON to NodeList: %w", err) + } + + return nl, nil +} + +func (c ClientSet) checkTemplateExists(vmID int) (bool, error) { + vmInfo, err := c.GetVmInfo(proxmox.NewVmRef(vmID)) + if err != nil { + return false, fmt.Errorf("failed to retrieve info for VM template %d: %w", vmID, err) + } + + return vmInfo["template"] == 1, nil +} + +func (c ClientSet) getIPsByVMRef(vmr *proxmox.VmRef) (map[string]corev1.NodeAddressType, error) { + addresses := map[string]corev1.NodeAddressType{} + netInterfaces, err := c.GetVmAgentNetworkInterfaces(vmr) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.CreateMachineError, + Message: fmt.Sprintf("failed to get network interfaces: %v", err), + } + } + for _, netIf := range netInterfaces { + if netIf.Name == "lo" { + continue + } + for _, ipAddr := range netIf.IPAddresses { + if len(ipAddr) > 0 { + ip := ipAddr.String() + addresses[ip] = corev1.NodeInternalIP + } + } + } + + return addresses, nil +} + +func (c ClientSet) copyUserdata(ctx context.Context, node, localStoragePath, userID, userdata, privateKey string, vmID int) (string, error) { + nodeIP, err := c.getNodeIP(node) + if err != nil { + return "", fmt.Errorf("failed to get node IP: %w", err) + } + + signer, err := ssh.ParsePrivateKey([]byte(privateKey)) + if err != nil { + return "", fmt.Errorf("could not parse private key: %w", err) + } + + username := strings.Split(userID, "@")[0] + sshConfig := ssh.ClientConfig{ + User: username, + Auth: []ssh.AuthMethod{ + ssh.PublicKeys(signer), + }, + HostKeyCallback: ssh.InsecureIgnoreHostKey(), + } + client, err := ssh.Dial("tcp", nodeIP+":22", &sshConfig) + if err != nil { + return "", fmt.Errorf("unable to connect: %w", err) + } + defer client.Close() + + sftpClient, err := sftp.NewClient(client) + if err != nil { + return "", fmt.Errorf("failed to create 
SFTP client: %w", err) + } + + filePath := filepath.Join("snippets", fmt.Sprintf("userdata-%d.yml", vmID)) + remoteFilePath := filepath.Join(localStoragePath, filePath) + remoteFile, err := sftpClient.Create(remoteFilePath) + if err != nil { + return "", fmt.Errorf("failed to create remote file: %w", err) + } + defer remoteFile.Close() + + _, err = remoteFile.ReadFrom(strings.NewReader(userdata)) + + return filePath, err +} + +func (c ClientSet) getNodeIP(node string) (string, error) { + var response map[string]interface{} + var devices proxmoxtypes.NodeNetworkDeviceList + err := c.GetJsonRetryable(fmt.Sprintf("/nodes/%s/network", node), &response, 3) + if err != nil { + return "", fmt.Errorf("could not get node network data: %w", err) + } + + // JSON roundtrip to transform map[string]interface{} to struct + devicesJSON, err := json.Marshal(response) + if err != nil { + return "", fmt.Errorf("marshalling response to JSON: %w", err) + } + err = json.Unmarshal(devicesJSON, &devices) + if err != nil { + return "", fmt.Errorf("unmarshalling JSON to NodeNetworkDeviceList: %w", err) + } + + sort.Slice(devices.Data, func(i, j int) bool { + return devices.Data[i].Priority < devices.Data[j].Priority + }) + + for _, d := range devices.Data { + if d.Address != nil { + return strings.Split(*d.Address, "/")[0], nil + } + } + + return "", fmt.Errorf("could not retrieve IP for node %q", node) +} + +func (c ClientSet) selectNode(cpuCores, memoryMB int) (string, error) { + nodeList, err := c.getNodeList() + if err != nil { + return "", fmt.Errorf("no nodes to select from: %w", err) + } + + // For the first tech demo just pick a random available node. Later more + // sophisticated approaches may be possible, like avoiding overutilized + // nodes or round robin. 
+ + return nodeList.Data[rand.Intn(len(nodeList.Data))].Name, nil +} diff --git a/pkg/cloudprovider/provider/proxmox/provider.go b/pkg/cloudprovider/provider/proxmox/provider.go new file mode 100644 index 000000000..1cfcb971d --- /dev/null +++ b/pkg/cloudprovider/provider/proxmox/provider.go @@ -0,0 +1,537 @@ +/* +Copyright 2022 The Machine Controller Authors. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +package proxmox + +import ( + "context" + "encoding/json" + "errors" + "fmt" + "time" + + "github.com/Telmate/proxmox-api-go/proxmox" + "github.com/kubermatic/machine-controller/pkg/apis/cluster/common" + clusterv1alpha1 "github.com/kubermatic/machine-controller/pkg/apis/cluster/v1alpha1" + cloudprovidererrors "github.com/kubermatic/machine-controller/pkg/cloudprovider/errors" + "github.com/kubermatic/machine-controller/pkg/cloudprovider/instance" + proxmoxtypes "github.com/kubermatic/machine-controller/pkg/cloudprovider/provider/proxmox/types" + cloudprovidertypes "github.com/kubermatic/machine-controller/pkg/cloudprovider/types" + "github.com/kubermatic/machine-controller/pkg/providerconfig" + providerconfigtypes "github.com/kubermatic/machine-controller/pkg/providerconfig/types" + + corev1 "k8s.io/api/core/v1" + "k8s.io/apimachinery/pkg/runtime" + "k8s.io/apimachinery/pkg/types" +) + +const ( + defaultCIStoragePath = "/var/lib/vz" + defaultCIStorageName = "local" + defaultDiskName = "virtio0" +) + +type Config struct { + Endpoint string + UserID string + Token string + 
TLSInsecure bool + ProxyURL string + + CIStorageName string + CIStoragePath string + CIStorageSSHPrivateKey string + + VMTemplateID int + CPUSockets *int + CPUCores *int + MemoryMB int + DiskSizeGB *int + DiskName *string +} + +type provider struct { + configVarResolver *providerconfig.ConfigVarResolver +} + +// Server holds the proxmox VM information. +type Server struct { + configQemu *proxmox.ConfigQemu + vmRef *proxmox.VmRef + status instance.Status + addresses map[string]corev1.NodeAddressType +} + +// Ensures that Server implements Instance interface. +var _ instance.Instance = &Server{} + +// Ensures that provider implements Provider interface. +var _ cloudprovidertypes.Provider = &provider{} + +// Name returns the instance name. +func (server *Server) Name() string { + return server.configQemu.Name +} + +// ID returns the instance identifier. +func (server *Server) ID() string { + return fmt.Sprintf("node-%s-vm-%d", server.vmRef.Node(), server.vmRef.VmId()) +} + +// Addresses returns a list of addresses associated with the instance. +func (server *Server) Addresses() map[string]corev1.NodeAddressType { + return server.addresses +} + +// Status returns the instance status. +func (server *Server) Status() instance.Status { + return server.status +} + +// ProviderID is n/a for Proxmox. 
+func (*Server) ProviderID() string { + return "" +} + +func New(configVarResolver *providerconfig.ConfigVarResolver) cloudprovidertypes.Provider { + provider := &provider{configVarResolver: configVarResolver} + return provider +} + +func (p *provider) getConfig(provSpec clusterv1alpha1.ProviderSpec) (*Config, *proxmoxtypes.RawConfig, error) { + if provSpec.Value == nil { + return nil, nil, fmt.Errorf("machine.spec.providerconfig.value is nil") + } + + pconfig, err := providerconfigtypes.GetConfig(provSpec) + if err != nil { + return nil, nil, err + } + + if pconfig.OperatingSystemSpec.Raw == nil { + return nil, nil, errors.New("operatingSystemSpec in the MachineDeployment cannot be empty") + } + + rawConfig, err := proxmoxtypes.GetConfig(*pconfig) + if err != nil { + return nil, nil, err + } + + config := Config{} + + config.Endpoint, err = p.configVarResolver.GetConfigVarStringValueOrEnv(rawConfig.Endpoint, "PM_API_ENDPOINT") + if err != nil { + return nil, nil, err + } + + config.UserID, err = p.configVarResolver.GetConfigVarStringValueOrEnv(rawConfig.UserID, "PM_API_USER_ID") + if err != nil { + return nil, nil, err + } + + config.Token, err = p.configVarResolver.GetConfigVarStringValueOrEnv(rawConfig.Token, "PM_API_TOKEN") + if err != nil { + return nil, nil, err + } + + config.TLSInsecure, err = p.configVarResolver.GetConfigVarBoolValueOrEnv(rawConfig.AllowInsecure, "PM_TLS_INSECURE") + if err != nil { + return nil, nil, err + } + + config.ProxyURL, err = p.configVarResolver.GetConfigVarStringValueOrEnv(rawConfig.ProxyURL, "PM_PROXY_URL") + if err != nil { + return nil, nil, err + } + + config.CIStorageSSHPrivateKey, err = p.configVarResolver.GetConfigVarStringValue(rawConfig.CIStorageSSHPrivateKey) + if err != nil { + return nil, nil, err + } + + config.CIStorageName = *rawConfig.CIStorageName + config.CIStoragePath = *rawConfig.CIStoragePath + + config.VMTemplateID = rawConfig.VMTemplateID + config.CPUCores = rawConfig.CPUCores + config.CPUSockets = 
rawConfig.CPUSockets + config.MemoryMB = rawConfig.MemoryMB + config.DiskName = rawConfig.DiskName + config.DiskSizeGB = rawConfig.DiskSizeGB + + return &config, rawConfig, nil +} + +// AddDefaults will read the MachineSpec and apply defaults for provider specific fields. +func (p *provider) AddDefaults(spec clusterv1alpha1.MachineSpec) (clusterv1alpha1.MachineSpec, error) { + _, rawConfig, err := p.getConfig(spec.ProviderSpec) + if err != nil { + return spec, err + } + + if rawConfig.CIStorageName == nil { + rawConfig.CIStorageName = proxmox.PointerString(defaultCIStorageName) + } + + if rawConfig.CIStoragePath == nil { + rawConfig.CIStoragePath = proxmox.PointerString(defaultCIStoragePath) + } + + if rawConfig.DiskName == nil { + rawConfig.DiskName = proxmox.PointerString(defaultDiskName) + } + + spec.ProviderSpec.Value, err = setProviderSpec(*rawConfig, spec.ProviderSpec) + return spec, err +} + +func (p *provider) Validate(ctx context.Context, spec clusterv1alpha1.MachineSpec) error { + config, _, err := p.getConfig(spec.ProviderSpec) + if err != nil { + return cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to parse machineSpec: %v", err), + } + } + + c, err := GetClientSet(config) + if err != nil { + return cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to construct client: %v", err), + } + } + + templateExists, err := c.checkTemplateExists(config.VMTemplateID) + if err != nil { + return err + } + if !templateExists { + return cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("%q is not a VM template", config.VMTemplateID), + } + } + + return nil +} + +func (p *provider) Get(ctx context.Context, machine *clusterv1alpha1.Machine, data *cloudprovidertypes.ProviderData) (instance.Instance, error) { + config, _, err := p.getConfig(machine.Spec.ProviderSpec) + if err != 
nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to parse machineSpec: %v", err), + } + } + + c, err := GetClientSet(config) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to construct client: %v", err), + } + } + + vmr, err := c.getVMRefByName(machine.Name) + if err != nil { + return nil, err + } + + configQemu, err := proxmox.NewConfigQemuFromApi(vmr, c.Client) + if err != nil { + return nil, fmt.Errorf("failed to fetch config of VM: %w", err) + } + + addresses, err := c.getIPsByVMRef(vmr) + if err != nil { + return nil, fmt.Errorf("failed to get IP addresses of VM: %w", err) + } + + var status instance.Status + vmState, err := c.GetVmState(vmr) + if err != nil { + return nil, fmt.Errorf("failed to get state of VM: %w", err) + } + switch vmState["status"] { + case "running": + status = instance.StatusRunning + case "stopped": + status = instance.StatusCreating + default: + status = instance.StatusUnknown + } + + return &Server{ + vmRef: vmr, + configQemu: configQemu, + addresses: addresses, + status: status, + }, nil +} + +func (*provider) GetCloudConfig(spec clusterv1alpha1.MachineSpec) (config string, name string, err error) { + return "", "", nil +} + +func (p *provider) Create(ctx context.Context, machine *clusterv1alpha1.Machine, data *cloudprovidertypes.ProviderData, userdata string) (instance.Instance, error) { + config, _, err := p.getConfig(machine.Spec.ProviderSpec) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to parse machineSpec: %v", err), + } + } + + c, err := GetClientSet(config) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to construct client: %v", err), + 
} + } + + vm, err := p.create(ctx, c, config, machine, userdata) + if err != nil { + _, cleanupErr := p.Cleanup(ctx, machine, data) + if cleanupErr != nil { + return nil, fmt.Errorf("cleaning up failed with err %v after creation failed with err %w", cleanupErr, err) + } + return nil, err + } + + return vm, nil +} + +func (p *provider) create(ctx context.Context, c *ClientSet, config *Config, machine *clusterv1alpha1.Machine, userdata string) (*Server, error) { + sourceVmr := proxmox.NewVmRef(config.VMTemplateID) + nodes, err := c.getNodeList() + if err != nil || len(nodes.Data) == 0 { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: "failed to retrieve any nodes", + } + } + // The template needs to be set as HA. This makes it available on any node, + // so it's sufficient to pick any existing node as the source node. + sourceVmr.SetNode(nodes.Data[0].Name) + + if err := c.CheckVmRef(sourceVmr); err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to retrieve VM template %d", config.VMTemplateID), + } + } + + vmID, err := c.GetNextID(0) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to get next available VM ID: %v", err), + } + } + + configQemu := &proxmox.ConfigQemu{ + Name: machine.Name, + VmID: vmID, + FullClone: proxmox.PointerInt(0), + } + + targetVmr := proxmox.NewVmRef(vmID) + targetNode, err := c.selectNode(*config.CPUSockets, config.MemoryMB) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to select target node: %v", err), + } + } + targetVmr.SetNode(targetNode) + + err = configQemu.CloneVm(sourceVmr, targetVmr, c.Client) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: 
common.CreateMachineError, + Message: fmt.Sprintf("failed to create VM: %v", err), + } + } + + configClone, err := proxmox.NewConfigQemuFromApi(targetVmr, c.Client) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.CreateMachineError, + Message: fmt.Sprintf("failed to fetch config of newly created VM: %v", err), + } + } + + configClone.VmID = vmID + configClone.QemuSockets = *config.CPUSockets + configClone.QemuCores = *config.CPUCores + configClone.Memory = config.MemoryMB + + filePath, err := c.copyUserdata(ctx, targetNode, config.CIStoragePath, config.UserID, userdata, config.CIStorageSSHPrivateKey, vmID) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.CreateMachineError, + Message: fmt.Sprintf("failed to upload cloud-init userdata: %v", err), + } + } + configClone.CIcustom = fmt.Sprintf("user=%s:%s", config.CIStorageName, filePath) + configClone.Ipconfig0 = "ip=dhcp" + + err = configClone.UpdateConfig(targetVmr, c.Client) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.CreateMachineError, + Message: fmt.Sprintf("failed to update VM size: %v", err), + } + } + + _, err = c.ResizeQemuDiskRaw(targetVmr, *config.DiskName, fmt.Sprintf("%dG", *config.DiskSizeGB)) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.CreateMachineError, + Message: fmt.Sprintf("failed to update disk size: %v", err), + } + } + + exitStatus, err := c.StartVm(targetVmr) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.CreateMachineError, + Message: fmt.Sprintf("failed to start VM: %v", err), + } + } + if exitStatus != exitStatusSuccess { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.CreateMachineError, + Message: fmt.Sprintf("starting VM returned unexpected status: %q", exitStatus), + } + } + + deadline := time.Now().Add(time.Second * taskTimeout) + for time.Now().Before(deadline) { + if _, err = 
c.QemuAgentPing(targetVmr); err == nil { + break + } + time.Sleep(time.Second) + } + + addresses, err := c.getIPsByVMRef(targetVmr) + if err != nil { + return nil, cloudprovidererrors.TerminalError{ + Reason: common.CreateMachineError, + Message: fmt.Sprintf("failed to get IP addresses of VM: %v", err), + } + } + + return &Server{ + vmRef: targetVmr, + configQemu: configClone, + addresses: addresses, + status: instance.StatusRunning, + }, nil +} + +func (p *provider) Cleanup(ctx context.Context, machine *clusterv1alpha1.Machine, data *cloudprovidertypes.ProviderData) (bool, error) { + config, _, err := p.getConfig(machine.Spec.ProviderSpec) + if err != nil { + return false, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to parse machineSpec: %v", err), + } + } + + c, err := GetClientSet(config) + if err != nil { + return false, cloudprovidererrors.TerminalError{ + Reason: common.InvalidConfigurationMachineError, + Message: fmt.Sprintf("failed to construct client: %v", err), + } + } + + vmr, err := c.getVMRefByName(machine.Name) + if err != nil { + if cloudprovidererrors.IsNotFound(err) { + // VM is already gone + return true, nil + } + return false, err + } + + exitStatusStop, err := c.StopVm(vmr) + if err != nil { + return false, fmt.Errorf("failed to stop VM: %w", err) + } + if exitStatusStop != exitStatusSuccess { + return false, fmt.Errorf("stopping VM returned unexpected status: %q", exitStatusStop) + } + + deleteParams := map[string]interface{}{ + // Clean up all disks matching this VM ID, even if they are not referenced in the current VM config.
+ "destroy-unreferenced-disks": true, + // Remove all traces of this VM ID (backup, replication, HA) + "purge": true, + } + exitStatusDelete, err := c.DeleteVmParams(vmr, deleteParams) + + return exitStatusDelete == exitStatusSuccess, err +} + +func (p *provider) MachineMetricsLabels(machine *clusterv1alpha1.Machine) (map[string]string, error) { + labels := make(map[string]string) + + config, _, err := p.getConfig(machine.Spec.ProviderSpec) + if err != nil { + return labels, fmt.Errorf("failed to parse config: %w", err) + } + + labels["size"] = fmt.Sprintf("%d-cpus-%d-mb", *config.CPUSockets, config.MemoryMB) + labels["templateID"] = fmt.Sprintf("%d", config.VMTemplateID) + + return labels, nil +} + +func (*provider) MigrateUID(ctx context.Context, machine *clusterv1alpha1.Machine, newUID types.UID) error { + return nil +} + +func (*provider) SetMetricsForMachines(machines clusterv1alpha1.MachineList) error { + return nil +} + +func setProviderSpec(rawConfig proxmoxtypes.RawConfig, s clusterv1alpha1.ProviderSpec) (*runtime.RawExtension, error) { + if s.Value == nil { + return nil, fmt.Errorf("machine.spec.providerconfig.value is nil") + } + + pconfig, err := providerconfigtypes.GetConfig(s) + if err != nil { + return nil, err + } + + rawCloudProviderSpec, err := json.Marshal(rawConfig) + if err != nil { + return nil, err + } + + pconfig.CloudProviderSpec = runtime.RawExtension{Raw: rawCloudProviderSpec} + rawPconfig, err := json.Marshal(pconfig) + if err != nil { + return nil, err + } + + return &runtime.RawExtension{Raw: rawPconfig}, nil +} diff --git a/pkg/cloudprovider/provider/proxmox/types/types.go b/pkg/cloudprovider/provider/proxmox/types/types.go new file mode 100644 index 000000000..724af7ba6 --- /dev/null +++ b/pkg/cloudprovider/provider/proxmox/types/types.go @@ -0,0 +1,90 @@ +/* +Copyright 2022 The Machine Controller Authors.
+ +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +package types + +import ( + "github.com/kubermatic/machine-controller/pkg/jsonutil" + providerconfigtypes "github.com/kubermatic/machine-controller/pkg/providerconfig/types" +) + +type RawConfig struct { + Endpoint providerconfigtypes.ConfigVarString `json:"endpoint"` + UserID providerconfigtypes.ConfigVarString `json:"userID"` + Token providerconfigtypes.ConfigVarString `json:"token"` + AllowInsecure providerconfigtypes.ConfigVarBool `json:"allowInsecure"` + ProxyURL providerconfigtypes.ConfigVarString `json:"proxyURL,omitempty"` + + CIStorageSSHPrivateKey providerconfigtypes.ConfigVarString `json:"ciStorageSSHPrivateKey"` + CIStorageName *string `json:"ciStorageName,omitempty"` + CIStoragePath *string `json:"ciStoragePath,omitempty"` + + VMTemplateID int `json:"vmTemplateID"` + CPUSockets *int `json:"cpuSockets"` + CPUCores *int `json:"cpuCores,omitempty"` + MemoryMB int `json:"memoryMB"` + DiskSizeGB *int `json:"diskSizeGB,omitempty"` + DiskName *string `json:"diskName,omitempty"` +} + +func GetConfig(pconfig providerconfigtypes.Config) (*RawConfig, error) { + rawConfig := &RawConfig{} + + return rawConfig, jsonutil.StrictUnmarshal(pconfig.CloudProviderSpec.Raw, rawConfig) +} + +// NodeList represents the response body of GET /api2/json/nodes. +type NodeList struct { + Data []Node `json:"data"` +} + +// Node is one single node in the response of GET /api2/json/nodes. 
+type Node struct { + CPUCount int `json:"maxcpu,omitempty"` + CPUUtilization float64 `json:"cpu,omitempty"` + MemoryAvailable int `json:"maxmem,omitempty"` + MemoryUsed int `json:"mem,omitempty"` + Name string `json:"node"` + SSLFingerprint string `json:"ssl_fingerprint,omitempty"` + Status string `json:"status"` + SupportLevel string `json:"level,omitempty"` + Uptime int `json:"uptime"` +} + +// NodeNetworkDeviceList represents the response body of GET /api2/json/nodes/{node}/network. +type NodeNetworkDeviceList struct { + Data []NodeNetworkDevice `json:"data"` +} + +// NodeNetworkDevice is one single network device of a node in the response of GET /api2/json/nodes/{node}/network. +type NodeNetworkDevice struct { + Active *int `json:"active"` + Address *string `json:"address"` + Autostart *int `json:"autostart"` + BridgeFD *string `json:"bridge_fd"` + BridgePorts *string `json:"bridge_ports"` + BridgeSTP *string `json:"bridge_stp"` + CIDR *string `json:"cidr"` + Exists *int `json:"exists"` + Families []string `json:"families"` + Gateway *string `json:"gateway"` + Iface string `json:"iface"` + MethodIPv4 *string `json:"method"` + MethodIPv6 *string `json:"method6"` + Netmask *string `json:"netmask"` + Priority int `json:"priority"` + Type string `json:"type"` +} diff --git a/pkg/providerconfig/types/types.go b/pkg/providerconfig/types/types.go index b4e12ad73..571435f60 100644 --- a/pkg/providerconfig/types/types.go +++ b/pkg/providerconfig/types/types.go @@ -58,6 +58,7 @@ const ( CloudProviderLinode CloudProvider = "linode" CloudProviderNutanix CloudProvider = "nutanix" CloudProviderOpenstack CloudProvider = "openstack" + CloudProviderProxmox CloudProvider = "proxmox" CloudProviderVsphere CloudProvider = "vsphere" CloudProviderVMwareCloudDirector CloudProvider = "vmware-cloud-director" CloudProviderFake CloudProvider = "fake" diff --git a/pkg/userdata/ubuntu/provider.go b/pkg/userdata/ubuntu/provider.go index 5a83a8a1b..9c9cb8a2e 100644 --- a/pkg/userdata/ubuntu/provider.go
+++ b/pkg/userdata/ubuntu/provider.go @@ -216,6 +216,9 @@ write_files: {{- if eq .CloudProviderName "nutanix" }} open-iscsi \ {{- end }} + {{- if eq .CloudProviderName "proxmox" }} + qemu-guest-agent \ + {{- end }} ipvsadm {{- /* iscsid service is required on Nutanix machines for CSI driver to attach volumes. */}} @@ -223,6 +226,11 @@ write_files: systemctl enable --now iscsid {{ end }} + {{- /* qemu-guest-agent is required for proxmox VMs to obtain their IP addresses */}} + {{- if eq .CloudProviderName "proxmox" }} + systemctl enable --now qemu-guest-agent + {{ end }} + # Update grub to include kernel command options to enable swap accounting. # Exclude alibaba cloud until this is fixed https://github.com/kubermatic/machine-controller/issues/682 {{ if eq .CloudProviderName "alibaba" }} diff --git a/pkg/userdata/ubuntu/provider_test.go b/pkg/userdata/ubuntu/provider_test.go index 14c230c61..1cc171db0 100644 --- a/pkg/userdata/ubuntu/provider_test.go +++ b/pkg/userdata/ubuntu/provider_test.go @@ -559,6 +559,32 @@ func TestUserDataGeneration(t *testing.T) { DistUpgradeOnBoot: false, }, }, + { + name: "proxmox", + providerSpec: &providerconfigtypes.Config{ + CloudProvider: "proxmox", + SSHPublicKeys: []string{"ssh-rsa AAABBB"}, + OverwriteCloudConfig: stringPtr("custom\ncloud\nconfig"), + }, + spec: clusterv1alpha1.MachineSpec{ + ObjectMeta: metav1.ObjectMeta{ + Name: "node1", + }, + Versions: clusterv1alpha1.MachineVersionInfo{ + Kubelet: "1.23.5", + }, + }, + ccProvider: &fakeCloudConfigProvider{ + name: "proxmox", + config: "{proxmox-config:true}", + err: nil, + }, + DNSIPs: []net.IP{net.ParseIP("10.10.10.10")}, + kubernetesCACert: "CACert", + osConfig: &Config{ + DistUpgradeOnBoot: false, + }, + }, }...) 
for _, test := range tests { diff --git a/pkg/userdata/ubuntu/testdata/proxmox.yaml b/pkg/userdata/ubuntu/testdata/proxmox.yaml new file mode 100644 index 000000000..a65db3b4c --- /dev/null +++ b/pkg/userdata/ubuntu/testdata/proxmox.yaml @@ -0,0 +1,453 @@ +#cloud-config + +hostname: node1 + + +ssh_pwauth: false +ssh_authorized_keys: +- "ssh-rsa AAABBB" + +write_files: + +- path: "/etc/systemd/journald.conf.d/max_disk_use.conf" + content: | + [Journal] + SystemMaxUse=5G + + +- path: "/opt/load-kernel-modules.sh" + permissions: "0755" + content: | + #!/usr/bin/env bash + set -euo pipefail + + modprobe ip_vs + modprobe ip_vs_rr + modprobe ip_vs_wrr + modprobe ip_vs_sh + + if modinfo nf_conntrack_ipv4 &> /dev/null; then + modprobe nf_conntrack_ipv4 + else + modprobe nf_conntrack + fi + + +- path: "/etc/sysctl.d/k8s.conf" + content: | + net.bridge.bridge-nf-call-ip6tables = 1 + net.bridge.bridge-nf-call-iptables = 1 + kernel.panic_on_oops = 1 + kernel.panic = 10 + net.ipv4.ip_forward = 1 + vm.overcommit_memory = 1 + fs.inotify.max_user_watches = 1048576 + fs.inotify.max_user_instances = 8192 + + +- path: "/etc/default/grub.d/60-swap-accounting.cfg" + content: | + # Added by kubermatic machine-controller + # Enable cgroups memory and swap accounting + GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1" + +- path: "/opt/bin/setup" + permissions: "0755" + content: | + #!/bin/bash + set -xeuo pipefail + if systemctl is-active ufw; then systemctl stop ufw; fi + systemctl mask ufw + systemctl restart systemd-modules-load.service + sysctl --system + apt-get update + + DEBIAN_FRONTEND=noninteractive apt-get -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install -y \ + curl \ + ca-certificates \ + ceph-common \ + cifs-utils \ + conntrack \ + e2fsprogs \ + ebtables \ + ethtool \ + glusterfs-client \ + iptables \ + jq \ + kmod \ + openssh-client \ + nfs-common \ + socat \ + util-linux \ + qemu-guest-agent \ + ipvsadm + systemctl enable --now 
qemu-guest-agent + + + # Update grub to include kernel command options to enable swap accounting. + # Exclude alibaba cloud until this is fixed https://github.com/kubermatic/machine-controller/issues/682 + + + apt-get update + apt-get install -y apt-transport-https ca-certificates curl software-properties-common lsb-release + curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - + add-apt-repository "deb https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" + + mkdir -p /etc/systemd/system/containerd.service.d /etc/systemd/system/docker.service.d + + cat <"$kube_sum_file" + + for bin in kubelet kubeadm kubectl; do + curl -Lfo "$kube_dir/$bin" "$kube_base_url/$bin" + chmod +x "$kube_dir/$bin" + sum=$(curl -Lf "$kube_base_url/$bin.sha256") + echo "$sum $kube_dir/$bin" >>"$kube_sum_file" + done + sha256sum -c "$kube_sum_file" + + for bin in kubelet kubeadm kubectl; do + ln -sf "$kube_dir/$bin" "$opt_bin"/$bin + done + + if [[ ! -x /opt/bin/health-monitor.sh ]]; then + curl -Lfo /opt/bin/health-monitor.sh https://raw.githubusercontent.com/kubermatic/machine-controller/7967a0af2b75f29ad2ab227eeaa26ea7b0f2fbde/pkg/userdata/scripts/health-monitor.sh + chmod +x /opt/bin/health-monitor.sh + fi + + # set kubelet nodeip environment variable + /opt/bin/setup_net_env.sh + + systemctl enable --now kubelet + systemctl enable --now --no-block kubelet-healthcheck.service + systemctl disable setup.service + +- path: "/opt/bin/supervise.sh" + permissions: "0755" + content: | + #!/bin/bash + set -xeuo pipefail + while ! 
"$@"; do + sleep 1 + done + +- path: "/opt/disable-swap.sh" + permissions: "0755" + content: | + sed -i.orig '/.*swap.*/d' /etc/fstab + swapoff -a + +- path: "/etc/systemd/system/kubelet.service" + content: | + [Unit] + After=docker.service + Requires=docker.service + + Description=kubelet: The Kubernetes Node Agent + Documentation=https://kubernetes.io/docs/home/ + + [Service] + User=root + Restart=always + StartLimitInterval=0 + RestartSec=10 + CPUAccounting=true + MemoryAccounting=true + + Environment="PATH=/opt/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin/" + EnvironmentFile=-/etc/environment + + ExecStartPre=/bin/bash /opt/load-kernel-modules.sh + + ExecStartPre=/bin/bash /opt/disable-swap.sh + + ExecStartPre=/bin/bash /opt/bin/setup_net_env.sh + ExecStart=/opt/bin/kubelet $KUBELET_EXTRA_ARGS \ + --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ + --config=/etc/kubernetes/kubelet.conf \ + --cert-dir=/etc/kubernetes/pki \ + --cloud-provider=proxmox \ + --cloud-config=/etc/kubernetes/cloud-config \ + --hostname-override=node1 \ + --exit-on-lock-contention \ + --lock-file=/tmp/kubelet.lock \ + --container-runtime=docker \ + --container-runtime-endpoint=unix:///var/run/dockershim.sock \ + --network-plugin=cni \ + --node-ip ${KUBELET_NODE_IP} + + [Install] + WantedBy=multi-user.target + +- path: "/etc/systemd/system/kubelet.service.d/extras.conf" + content: | + [Service] + Environment="KUBELET_EXTRA_ARGS=--resolv-conf=/run/systemd/resolve/resolv.conf" + +- path: "/etc/kubernetes/cloud-config" + permissions: "0600" + content: | + custom + cloud + config + +- path: "/opt/bin/setup_net_env.sh" + permissions: "0755" + content: | + #!/usr/bin/env bash + echodate() { + echo "[$(date -Is)]" "$@" + } + + # get the default interface IP address + DEFAULT_IFC_IP=$(ip -o route get 1 | grep -oP "src \K\S+") + + # get the full hostname + FULL_HOSTNAME=$(hostname -f) + + if [ -z "${DEFAULT_IFC_IP}" ] + 
then + echodate "Failed to get IP address for the default route interface" + exit 1 + fi + + # write the nodeip_env file + # we need the line below because flatcar has the same string "coreos" in that file + if grep -q coreos /etc/os-release + then + echo -e "KUBELET_NODE_IP=${DEFAULT_IFC_IP}\nKUBELET_HOSTNAME=${FULL_HOSTNAME}" > /etc/kubernetes/nodeip.conf + elif [ ! -d /etc/systemd/system/kubelet.service.d ] + then + echodate "Can't find kubelet service extras directory" + exit 1 + else + echo -e "[Service]\nEnvironment=\"KUBELET_NODE_IP=${DEFAULT_IFC_IP}\"\nEnvironment=\"KUBELET_HOSTNAME=${FULL_HOSTNAME}\"" > /etc/systemd/system/kubelet.service.d/nodeip.conf + fi + + +- path: "/etc/kubernetes/bootstrap-kubelet.conf" + permissions: "0600" + content: | + apiVersion: v1 + clusters: + - cluster: + certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUVXakNDQTBLZ0F3SUJBZ0lKQUxmUmxXc0k4WVFITUEwR0NTcUdTSWIzRFFFQkJRVUFNSHN4Q3pBSkJnTlYKQkFZVEFsVlRNUXN3Q1FZRFZRUUlFd0pEUVRFV01CUUdBMVVFQnhNTlUyRnVJRVp5WVc1amFYTmpiekVVTUJJRwpBMVVFQ2hNTFFuSmhaR1pwZEhwcGJtTXhFakFRQmdOVkJBTVRDV3h2WTJGc2FHOXpkREVkTUJzR0NTcUdTSWIzCkRRRUpBUllPWW5KaFpFQmtZVzVuWVM1amIyMHdIaGNOTVRRd056RTFNakEwTmpBMVdoY05NVGN3TlRBME1qQTAKTmpBMVdqQjdNUXN3Q1FZRFZRUUdFd0pWVXpFTE1Ba0dBMVVFQ0JNQ1EwRXhGakFVQmdOVkJBY1REVk5oYmlCRwpjbUZ1WTJselkyOHhGREFTQmdOVkJBb1RDMEp5WVdSbWFYUjZhVzVqTVJJd0VBWURWUVFERXdsc2IyTmhiR2h2CmMzUXhIVEFiQmdrcWhraUc5dzBCQ1FFV0RtSnlZV1JBWkdGdVoyRXVZMjl0TUlJQklqQU5CZ2txaGtpRzl3MEIKQVFFRkFBT0NBUThBTUlJQkNnS0NBUUVBdDVmQWpwNGZUY2VrV1VUZnpzcDBreWloMU9ZYnNHTDBLWDFlUmJTUwpSOE9kMCs5UTYySHlueStHRndNVGI0QS9LVThtc3NvSHZjY2VTQUFid2ZieEZLLytzNTFUb2JxVW5PUlpyT29UClpqa1V5Z2J5WERTSzk5WUJiY1IxUGlwOHZ3TVRtNFhLdUx0Q2lnZUJCZGpqQVFkZ1VPMjhMRU5HbHNNbm1lWWsKSmZPRFZHblZtcjVMdGI5QU5BOElLeVRmc25ISjRpT0NTL1BsUGJVajJxN1lub1ZMcG9zVUJNbGdVYi9DeWtYMwptT29MYjR5SkpReUEvaVNUNlp4aUlFajM2RDR5V1o1bGc3WUpsK1VpaUJRSEdDblBkR3lpcHFWMDZleDBoZVlXCmNhaVc4TFdaU1VROTNqUStXVkNIOGhUN0RRTzFkbXN2VW1YbHEvSmVBbHdRL1FJREFRQUJvNEhnTUlIZE1CMEcK
QTFVZERnUVdCQlJjQVJPdGhTNFA0VTd2VGZqQnlDNTY5UjdFNkRDQnJRWURWUjBqQklHbE1JR2lnQlJjQVJPdApoUzRQNFU3dlRmakJ5QzU2OVI3RTZLRi9wSDB3ZXpFTE1Ba0dBMVVFQmhNQ1ZWTXhDekFKQmdOVkJBZ1RBa05CCk1SWXdGQVlEVlFRSEV3MVRZVzRnUm5KaGJtTnBjMk52TVJRd0VnWURWUVFLRXd0Q2NtRmtabWwwZW1sdVl6RVMKTUJBR0ExVUVBeE1KYkc5allXeG9iM04wTVIwd0d3WUpLb1pJaHZjTkFRa0JGZzVpY21Ga1FHUmhibWRoTG1OdgpiWUlKQUxmUmxXc0k4WVFITUF3R0ExVWRFd1FGTUFNQkFmOHdEUVlKS29aSWh2Y05BUUVGQlFBRGdnRUJBRzZoClU5ZjlzTkgwLzZvQmJHR3kyRVZVMFVnSVRVUUlyRldvOXJGa3JXNWsvWGtEalFtKzNsempUMGlHUjRJeEUvQW8KZVU2c1FodWE3d3JXZUZFbjQ3R0w5OGxuQ3NKZEQ3b1pOaEZtUTk1VGIvTG5EVWpzNVlqOWJyUDBOV3pYZllVNApVSzJabklOSlJjSnBCOGlSQ2FDeEU4RGRjVUYwWHFJRXE2cEEyNzJzbm9MbWlYTE12Tmwza1lFZG0ramU2dm9ECjU4U05WRVVzenR6UXlYbUpFaENwd1ZJMEE2UUNqelhqK3F2cG13M1paSGk4SndYZWk4WlpCTFRTRkJraThaN24Kc0g5QkJIMzgvU3pVbUFONFFIU1B5MWdqcW0wME9BRThOYVlEa2gvYnpFNGQ3bUxHR01XcC9XRTNLUFN1ODJIRgprUGU2WG9TYmlMbS9reGszMlQwPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t + server: https://server:443 + name: "" + contexts: null + current-context: "" + kind: Config + preferences: {} + users: + - name: "" + user: + token: my-token + + +- path: "/etc/kubernetes/pki/ca.crt" + content: | + -----BEGIN CERTIFICATE----- + MIIEWjCCA0KgAwIBAgIJALfRlWsI8YQHMA0GCSqGSIb3DQEBBQUAMHsxCzAJBgNV + BAYTAlVTMQswCQYDVQQIEwJDQTEWMBQGA1UEBxMNU2FuIEZyYW5jaXNjbzEUMBIG + A1UEChMLQnJhZGZpdHppbmMxEjAQBgNVBAMTCWxvY2FsaG9zdDEdMBsGCSqGSIb3 + DQEJARYOYnJhZEBkYW5nYS5jb20wHhcNMTQwNzE1MjA0NjA1WhcNMTcwNTA0MjA0 + NjA1WjB7MQswCQYDVQQGEwJVUzELMAkGA1UECBMCQ0ExFjAUBgNVBAcTDVNhbiBG + cmFuY2lzY28xFDASBgNVBAoTC0JyYWRmaXR6aW5jMRIwEAYDVQQDEwlsb2NhbGhv + c3QxHTAbBgkqhkiG9w0BCQEWDmJyYWRAZGFuZ2EuY29tMIIBIjANBgkqhkiG9w0B + AQEFAAOCAQ8AMIIBCgKCAQEAt5fAjp4fTcekWUTfzsp0kyih1OYbsGL0KX1eRbSS + R8Od0+9Q62Hyny+GFwMTb4A/KU8mssoHvcceSAAbwfbxFK/+s51TobqUnORZrOoT + ZjkUygbyXDSK99YBbcR1Pip8vwMTm4XKuLtCigeBBdjjAQdgUO28LENGlsMnmeYk + JfODVGnVmr5Ltb9ANA8IKyTfsnHJ4iOCS/PlPbUj2q7YnoVLposUBMlgUb/CykX3 + mOoLb4yJJQyA/iST6ZxiIEj36D4yWZ5lg7YJl+UiiBQHGCnPdGyipqV06ex0heYW + 
caiW8LWZSUQ93jQ+WVCH8hT7DQO1dmsvUmXlq/JeAlwQ/QIDAQABo4HgMIHdMB0G + A1UdDgQWBBRcAROthS4P4U7vTfjByC569R7E6DCBrQYDVR0jBIGlMIGigBRcAROt + hS4P4U7vTfjByC569R7E6KF/pH0wezELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAkNB + MRYwFAYDVQQHEw1TYW4gRnJhbmNpc2NvMRQwEgYDVQQKEwtCcmFkZml0emluYzES + MBAGA1UEAxMJbG9jYWxob3N0MR0wGwYJKoZIhvcNAQkBFg5icmFkQGRhbmdhLmNv + bYIJALfRlWsI8YQHMAwGA1UdEwQFMAMBAf8wDQYJKoZIhvcNAQEFBQADggEBAG6h + U9f9sNH0/6oBbGGy2EVU0UgITUQIrFWo9rFkrW5k/XkDjQm+3lzjT0iGR4IxE/Ao + eU6sQhua7wrWeFEn47GL98lnCsJdD7oZNhFmQ95Tb/LnDUjs5Yj9brP0NWzXfYU4 + UK2ZnINJRcJpB8iRCaCxE8DdcUF0XqIEq6pA272snoLmiXLMvNl3kYEdm+je6voD + 58SNVEUsztzQyXmJEhCpwVI0A6QCjzXj+qvpmw3ZZHi8JwXei8ZZBLTSFBki8Z7n + sH9BBH38/SzUmAN4QHSPy1gjqm00OAE8NaYDkh/bzE4d7mLGGMWp/WE3KPSu82HF + kPe6XoSbiLm/kxk32T0= + -----END CERTIFICATE----- + +- path: "/etc/systemd/system/setup.service" + permissions: "0644" + content: | + [Install] + WantedBy=multi-user.target + + [Unit] + Requires=network-online.target + After=network-online.target + + [Service] + Type=oneshot + RemainAfterExit=true + EnvironmentFile=-/etc/environment + ExecStart=/opt/bin/supervise.sh /opt/bin/setup + +- path: "/etc/profile.d/opt-bin-path.sh" + permissions: "0644" + content: | + export PATH="/opt/bin:$PATH" + +- path: /etc/docker/daemon.json + permissions: "0644" + content: | + {"exec-opts":["native.cgroupdriver=systemd"],"storage-driver":"overlay2","log-driver":"json-file","log-opts":{"max-file":"5","max-size":"100m"}} + +- path: "/etc/kubernetes/kubelet.conf" + content: | + apiVersion: kubelet.config.k8s.io/v1beta1 + authentication: + anonymous: + enabled: false + webhook: + cacheTTL: 0s + enabled: true + x509: + clientCAFile: /etc/kubernetes/pki/ca.crt + authorization: + mode: Webhook + webhook: + cacheAuthorizedTTL: 0s + cacheUnauthorizedTTL: 0s + cgroupDriver: systemd + clusterDNS: + - 10.10.10.10 + clusterDomain: cluster.local + containerLogMaxSize: 100Mi + cpuManagerReconcilePeriod: 0s + evictionHard: + imagefs.available: 15% + memory.available: 100Mi + 
nodefs.available: 10% + nodefs.inodesFree: 5% + evictionPressureTransitionPeriod: 0s + featureGates: + RotateKubeletServerCertificate: true + fileCheckFrequency: 0s + httpCheckFrequency: 0s + imageMinimumGCAge: 0s + kind: KubeletConfiguration + kubeReserved: + cpu: 200m + ephemeral-storage: 1Gi + memory: 200Mi + logging: + flushFrequency: 0 + options: + json: + infoBufferSize: "0" + verbosity: 0 + memorySwap: {} + nodeStatusReportFrequency: 0s + nodeStatusUpdateFrequency: 0s + protectKernelDefaults: true + rotateCertificates: true + runtimeRequestTimeout: 0s + serverTLSBootstrap: true + shutdownGracePeriod: 0s + shutdownGracePeriodCriticalPods: 0s + staticPodPath: /etc/kubernetes/manifests + streamingConnectionIdleTimeout: 0s + syncFrequency: 0s + systemReserved: + cpu: 200m + ephemeral-storage: 1Gi + memory: 200Mi + tlsCipherSuites: + - TLS_AES_128_GCM_SHA256 + - TLS_AES_256_GCM_SHA384 + - TLS_CHACHA20_POLY1305_SHA256 + - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 + - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 + - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305 + - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 + - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 + - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 + volumePluginDir: /var/lib/kubelet/volumeplugins + volumeStatsAggPeriod: 0s + + +- path: /etc/systemd/system/kubelet-healthcheck.service + permissions: "0644" + content: | + [Unit] + Requires=kubelet.service + After=kubelet.service + + [Service] + ExecStart=/opt/bin/health-monitor.sh kubelet + + [Install] + WantedBy=multi-user.target + + +runcmd: +- systemctl enable --now setup.service
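The JSON tags on `RawConfig` in types.go imply the shape of the `cloudProviderSpec` a MachineDeployment would carry for this provider. A minimal sketch that unmarshals such a spec, using plain Go types as stand-ins for the `ConfigVarString`/`ConfigVarBool` wrappers (endpoint, user ID, and template ID values here are hypothetical):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// proxmoxSpec mirrors a subset of the JSON tags of proxmox/types.RawConfig;
// plain string/int fields replace the ConfigVar* wrappers for illustration.
type proxmoxSpec struct {
	Endpoint     string `json:"endpoint"`
	UserID       string `json:"userID"`
	Token        string `json:"token"`
	VMTemplateID int    `json:"vmTemplateID"`
	CPUSockets   *int   `json:"cpuSockets"`
	MemoryMB     int    `json:"memoryMB"`
	DiskSizeGB   *int   `json:"diskSizeGB,omitempty"`
}

// parseSpec decodes a raw cloudProviderSpec payload.
func parseSpec(raw []byte) (*proxmoxSpec, error) {
	spec := &proxmoxSpec{}
	if err := json.Unmarshal(raw, spec); err != nil {
		return nil, err
	}
	return spec, nil
}

func main() {
	raw := []byte(`{"endpoint":"https://pve.example.com:8006","userID":"kubermatic@pve!mc","token":"secret","vmTemplateID":9000,"cpuSockets":2,"memoryMB":4096,"diskSizeGB":30}`)
	spec, err := parseSpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("template %d: %d sockets, %d MB\n", spec.VMTemplateID, *spec.CPUSockets, spec.MemoryMB)
}
```

Note that `cpuSockets` is a pointer in the real struct as well, so callers (like `selectNode` and the metrics labels) must dereference it rather than format the pointer itself.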