Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[poc] kubevirt: Add hypershift requirements to primary interface #3364

Closed
wants to merge 2 commits into from

Conversation

qinqon
Copy link
Contributor

@qinqon qinqon commented Jan 19, 2023

- What this PR does and why is it needed
This is a draft PR to implement features needed for hypershift hosted workers:

  • seamless kubevirt live migration
    • stable IPs
    • stable connections
  • east/west and north/south communication
  • access to services

TODO

openshift issues

"kube-dns" not found`

problem getting dns server addLogicalPort failed for cluster1/virt-launcher-worker1-chdqx: failed composing DHCP options: services "kube-dns" not found

For the poc we just fallback to openshift one, but we shoiuld "mimic" what kubevirt dhcp server is doing and read for virt-launcher pod resolv.conf, this can be done by the CNI annotating the pod then ovnkube-master will use those to generate DHCP options.

node-valid-hostname.service

Failed Units: 1
  node-valid-hostname.service

cannot migrate VMI: PVC test1-k7m4b-rhcos is not shared

You are using a client virtctl version that is different from the KubeVirt version running in the cluster
Client Version: v0.58.0
Server Version: v0.59.0-rc.1-19-ga5e3a1b9b
Error migrating VirtualMachine Internal error occurred: admission webhook "migration-create-validator.kubevirt.io" denied the request: Cannot migrate VMI, Reason: DisksNotLiveMigratable, Message: cannot migrate VMI: PVC test1-k7m4b-rhcos is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)

We have to use the patch from other hypershift pocs to change the mode

We have to pass "hostname" same as kubevirt DHCP server and set it with the vm name labels from virt-launcher pod.

Using "hostname" at DHCP options is causing problem with DHCP and the VM end up without IP address

hostname option has to be quoted

Cannot connect after live-migration

Looks like the endpoint is referencing the old virt-launcher pod wich has the same ip address, removing old virt-launchers fix
the issue

apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    endpoints.kubernetes.io/last-change-trigger-time: "2023-02-16T14:24:00Z"
  creationTimestamp: "2023-02-16T14:17:08Z"
  name: ssh-test1-2ghnh
  namespace: clusters-test1
  resourceVersion: "220258"
  uid: c7d4a953-f876-40b9-bfa7-a1ce5d3e43d0
subsets:
- addresses:
  - ip: 10.129.2.25
    nodeName: worker-1
    targetRef:
      kind: Pod
      name: virt-launcher-test1-2ghnh-tkhsb
      namespace: clusters-test1
      uid: d779f8a5-ba16-4146-bb4d-0b489ffee47b
  notReadyAddresses:
  - ip: 10.129.2.25
    nodeName: worker-2
    targetRef:
      kind: Pod
      name: virt-launcher-test1-2ghnh-t2x28
      namespace: clusters-test1
      uid: 650dfac8-00af-4f3f-8c14-c40cc3feb778
  ports:
  - port: 22
    protocol: TCP

Sometime the notReadyAddress is not there, looks like ovn-k Recocnile bug that appears when both pods has the same IP

Nice to have

  • Investigate solutions for VM restart

Capk demo

https://asciinema.org/a/hEqzoFveibsuHKynQxZ3RxCHZ

- How to verify it
This is verify with the umbrella project https://github.com/qinqon/ovn-kubevirt

Docs:

@qinqon qinqon force-pushed the spike-kubevirt-migration branch 5 times, most recently from 54aa588 to f883ad9 Compare January 23, 2023 17:32
Copy link

@davidvossel davidvossel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! I made a few comments in line.

go-controller/pkg/cni/helper_linux.go Outdated Show resolved Hide resolved
go-controller/pkg/cni/helper_linux.go Outdated Show resolved Hide resolved
go-controller/pkg/cni/helper_linux.go Show resolved Hide resolved
@qinqon qinqon force-pushed the spike-kubevirt-migration branch 6 times, most recently from 89eabea to 86601e6 Compare January 30, 2023 16:05
@qinqon qinqon force-pushed the spike-kubevirt-migration branch 3 times, most recently from 4c06f2f to f857c9e Compare February 9, 2023 11:34
@qinqon qinqon marked this pull request as ready for review February 9, 2023 12:18
@qinqon
Copy link
Contributor Author

qinqon commented Feb 9, 2023

/hold

@qinqon
Copy link
Contributor Author

qinqon commented Feb 9, 2023

/retest-requiered

@qinqon
Copy link
Contributor Author

qinqon commented Feb 9, 2023

/retest

@ovn-robot
Copy link
Collaborator

Oops, something went wrong:

This workflow is already running

@qinqon qinqon force-pushed the spike-kubevirt-migration branch 5 times, most recently from 0c86f68 to 20d207b Compare February 13, 2023 07:04
@coveralls
Copy link

coveralls commented Feb 13, 2023

Coverage Status

Coverage: 51.713% (-0.3%) from 51.984% when pulling 174c0b7 on qinqon:spike-kubevirt-migration into b1ded23 on ovn-org:master.

@martinkennelly martinkennelly self-assigned this Feb 13, 2023
@qinqon qinqon force-pushed the spike-kubevirt-migration branch 2 times, most recently from e3fd6c1 to 1090b81 Compare February 14, 2023 07:41
@qinqon qinqon force-pushed the spike-kubevirt-migration branch 2 times, most recently from 73959aa to dc1c1eb Compare February 16, 2023 11:01
@qinqon qinqon force-pushed the spike-kubevirt-migration branch 17 times, most recently from 7884add to 0aac346 Compare March 2, 2023 14:53
This allow to use kind.sh and `--deploy` with kind local registry.
@qinqon qinqon force-pushed the spike-kubevirt-migration branch 2 times, most recently from e584c39 to d8ac37b Compare March 8, 2023 09:30
@qinqon qinqon force-pushed the spike-kubevirt-migration branch 2 times, most recently from 174c0b7 to 64d831f Compare March 14, 2023 12:16
Signed-off-by: Enrique Llorente <ellorent@redhat.com>
@qinqon qinqon closed this Apr 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants