Migrate dispatch #177
Conversation
Nice
migrationPod, exists, err := md.vmService.GetMigrationJob(migration)

if err != nil {
	logger.Error().Reason(err).Msg("Checking for an existing migration job failed.")
Minor point, but it causes cognitive dissonance for me when we refer to this as a "migration job". It's understandable why we call it that, but the word "job" has a specific meaning in Kubernetes that doesn't really apply here.
Agreed. I didn't want to change it yet, but I am starting to refer to this as the migration pod, and the others as the source and destination pods. I'd like to formalize these terms, and move the code for the pod handlers into the vm and migration.go files.
migrationKey interface{}
srcIp clientv1.NodeAddress
destIp kubev1.NodeAddress
srcNodeIp kubev1.Node
This variable name is a bit misleading. It's a node which has an IP defined.
Good point. I'll adjust the name.
Force-pushed from 1da58be to d1f8e25
	setMigrationPhase(migration, v1.MigrationInProgress)
	return
}
No matter whether you had to create the targetPod or it was left over from a previous failed attempt, you have to set the migration to InProgress here and then return.
Then on subsequent loop entries you can start right after this line:
switch migration.Status.Phase {
case "":
    do everything up to this line and set the state to MigrationInProgress
case MigrationInProgress:
    1) check if the target pod is running
    2) set the migration to failed if the pod is in a failed/succeeded state
    3) if it is not running, you are done, since the secondary pod watch loop has not informed us yet (it will come later, so no re-enqueue)
    4) if the pod is running, continue with the logic starting above this comment
}
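For illustration, a minimal Go sketch of that phase-driven re-entry, assuming the helpers quoted elsewhere in this review (setMigrationPhase, setMigrationFailed, investigateTargetPodSituation) and the k8sv1 pod phases; this is a sketch of the suggestion, not the merged code:

switch migration.Status.Phase {
case "":
	// First pass: create or adopt the target pod, record progress, and return.
	setMigrationPhase(migration, v1.MigrationInProgress)
	return
case v1.MigrationInProgress:
	_, targetPod := investigateTargetPodSituation(migration, podList)
	if targetPod == nil {
		// The pod watch loop has not reported anything yet; it will requeue us later.
		return
	}
	switch targetPod.Status.Phase {
	case k8sv1.PodFailed, k8sv1.PodSucceeded:
		setMigrationFailed(migration)
		return
	case k8sv1.PodRunning:
		// Target pod is up: continue with the migration logic below this comment.
	default:
		// Not running yet; no re-enqueue needed, the pod informer will wake us up.
		return
	}
}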
// Successful run, so forget the history
forget(key)
done(key)
Also think about the possibility to save conditions on a controller object, to have controller-internal substates ...
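Purely as a hypothetical illustration of such controller-internal substates (the field and names below are invented for the example, not part of this PR):

type migrationCondition string

const (
	conditionTargetPodRequested migrationCondition = "TargetPodRequested" // hypothetical
	conditionVMRedirected       migrationCondition = "VMRedirected"       // hypothetical
)

// Conditions kept on the dispatch object itself, keyed by the workqueue key,
// so substates survive between loop entries without bloating the API phase.
type MigrationDispatch struct {
	// ... existing fields from this PR ...
	conditions map[string]map[migrationCondition]bool
}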
		queue.AddRateLimited(key)
		return
	}
}
setMigrationPhase(migration, v1.MigrationInProgress)
Setting the state in the else branch is wrong, since we could hit an error before we reach this line. On a subsequent run the pod might not be scheduled yet, and below you are not testing whether the pod is running or scheduled, which can confuse components that rely on the correctness of that field.
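A sketch of what is being asked for here (again assuming the helpers quoted in this review), so the phase is only recorded once the pod is actually observed running:

if targetPod == nil || targetPod.Status.Phase != k8sv1.PodRunning {
	// Not scheduled/running yet (or an earlier step failed): do not claim
	// progress; the pod watch will trigger another pass when things change.
	return
}
setMigrationPhase(migration, v1.MigrationInProgress)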
switch migration.Status.Phase {
case v1.MigrationUnknown:
	// Fetch vm which we want to migrate
Leftover
_, targetPod := investigateTargetPodSituation(migration, podList)

if targetPod == nil {
	setMigrationFailed(migration)
In the case of a final return we always have to do a queue.Forget(key). Otherwise, if a migration with the same name is posted again, its processing might be delayed, and the controller would still see a failure history ...
Done here. What about elsewhere?
Everywhere you return without AddRateLimited. In the next refactoring iteration we can finally get rid of that duplication.
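A small sketch of the pattern being asked for; queue is the client-go workqueue.RateLimitingInterface already passed into Execute, while the helper name is only illustrative:

import "k8s.io/client-go/util/workqueue"

// Illustrative helper for terminal exits: clear the rate-limiter history so a
// later migration posted under the same key is not delayed by old failures.
func finish(queue workqueue.RateLimitingInterface, key interface{}) {
	queue.Forget(key)
	queue.Done(key)
}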
switch targetPod.Status.Phase {
case k8sv1.PodRunning:
	break
	//Figure out why. report.
The comment is confusing
Removed. Was a reminder to myself that should not have been left there.
if vm.Status.MigrationNodeName != targetPod.Spec.NodeName {
	vm.Status.Phase = v1.Migrating
	vm.Status.MigrationNodeName = targetPod.Spec.NodeName
	if err = md.updateVm(vm); err != nil {
		queue.AddRateLimited(key)
Is there a return missing?
logger.Error().Reason(err).Msgf("updating migration state failed : %v ", err)
//TODO add a new state that more accurately reflects the process up the this point
//and then use MigrationInProgress to accurately indicate the actual migration pod is running
//setMigrationPhase(migration, v1.MigrationInProgress)
Any suggestions already?
MigrationTargetRequested when we ask to start the target pod; MigrationRunning when we start the migration process itself.
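As a sketch of how those suggested phases might look (the MigrationPhase type name and the string values are assumptions; these constants are only the suggestion from this thread, not merged code):

const (
	// MigrationTargetRequested: the target pod has been requested, but the
	// migration itself has not started yet.
	MigrationTargetRequested MigrationPhase = "TargetRequested"
	// MigrationRunning: the actual migration process has been started.
	MigrationRunning MigrationPhase = "Running"
)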
switch migrationPod.Status.Phase {
case k8sv1.PodFailed:
	setMigrationPhase(migration, v1.MigrationFailed)
We have to first set the VM back and then update the migration state, otherwise we would not reach this point again in an error case.
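A sketch of that ordering, with helper names as quoted elsewhere in this review and the exact VM fields assumed: revert the VM first, then record the migration failure, so a failed VM update still brings us back to this branch on the next pass.

vm.Status.MigrationNodeName = "" // set the VM back first
if err := md.updateVm(vm); err != nil {
	queue.AddRateLimited(key)
	return
}
setMigrationPhase(migration, v1.MigrationFailed)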
Done
}
setMigrationPhase(migration, v1.MigrationSucceeded)
Here the order is good
Force-pushed from a5e8f8f to 87e76c3
pkg/virt-controller/watch/job.go (outdated)
jd.migrationQueue.Add(migrationKey)
} else {
	logger := logging.DefaultLogger().Object(migration)
	logger.Error().Reason(err).Msgf("Updating migration queue failed.", migrationLabel)
Presumably you meant to include the migrationLabel in the error message.
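Presumably something along these lines was intended (a guess at the format string, not the merged fix):

logger.Error().Reason(err).Msgf("Updating migration queue failed for %s.", migrationLabel)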
}

if !exists {
	logger.Info().Msgf("VM with name %s does not exist, marking migration as failed", migration.Spec.Selector.Name)
Did you mean to use logger.Info(), or should this condition also be a call to logger.Error()?
Yes. Yes indeed.
Force-pushed from b378053 to 0bbde71
I think some forget calls are missing and one state. The rest looks good.
func (md *MigrationDispatch) updateVm(vmCopy *v1.VM) error {
	if _, err := md.vmService.PutVm(vmCopy); err != nil {
		logger := logging.DefaultLogger().Object(vmCopy)
		logger.V(3).Info().Msg("Enqueuing VM again.")
Could you move that message out of the method? It is not requeued here.
Put a better message in its place.
@@ -219,8 +231,8 @@ func (md *MigrationDispatch) Execute(store cache.Store, queue workqueue.RateLimi
	return
}
//TODO add a new state that more accurately reflects the process up the this point
//and then use MigrationInProgress to accurately indicate the actual migration pod is running
//setMigrationPhase(migration, v1.MigrationInProgress)
//and then use MigrationScheduled to accurately indicate the actual migration pod is running
You are not introducing a new state here; so far you just renamed MigrationPending and MigrationInProgress.
At this point we have scheduled the pod which will perform the migration, and that is where the state transition is missing.
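A sketch of the missing transition; MigrationScheduled is the name floated in the quoted TODO, and its existence as a phase constant is assumed here:

// Once the pod that will perform the migration has been created and scheduled,
// record that step explicitly instead of jumping straight to the running state.
setMigrationPhase(migration, v1.MigrationScheduled)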
	queue.AddRateLimited(key)
	return
}
_, targetPod := investigateTargetPodSituation(migration, podList)
Here we are missing a check if targetPod is nil.
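A sketch of the missing guard, reusing the failure handling already quoted earlier in this review:

_, targetPod := investigateTargetPodSituation(migration, podList)
if targetPod == nil {
	setMigrationFailed(migration)
	queue.Forget(key)
	return
}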
Done
We need to fix our setup, affinity does not seem to work anymore ...
Force-pushed from 26b2984 to 9bcc124
Aside from the hardcoded 10 in the call (at least make it a symbolic constant), this looks good.
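Something along these lines would address it; the constant name is made up, and what the 10 actually controls is not quoted in this thread (a retry limit is assumed):

// Hypothetical: name the magic number so the call site documents itself.
const defaultMigrationRetryLimit = 10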
Instead of work being performed in the pod and job dispatches, the Migration is requeued, which triggers the Migration controller to reevaluate the migration. This unifies the migration logic in the migration controller's dispatch.
Force-pushed from 9bcc124 to f8b0019
Force-pushed from f8b0019 to 95cc007
Functional tests succeed on my workstation. Still some work to do to make Jenkins work. Looks good to me.
WIP: Unified dispatch function is written, but not yet triggered by queue update.