
Migrate dispatch #177

Merged (2 commits, Apr 28, 2017)

Conversation

@admiyo (Contributor) commented on Mar 30, 2017:

WIP: Unified dispatch function is written, but not yet triggered by queue update.

@stu-gott (Member) left a comment:

Nice

migrationPod, exists, err := md.vmService.GetMigrationJob(migration)

if err != nil {
logger.Error().Reason(err).Msg("Checking for an existing migration job failed.")

Member:

Minor point, but it causes cognitive dissonance for me when we refer to this as a "migration job". It's understandable why we call it that, but the word "job" has a specific meaning in Kubernetes which doesn't really apply.

Contributor Author:

Agreed. I didn't want to change it yet, but I am starting to refer to this as the migration pod, and the others as the source and destination pods. I'd like to formalize these terms, and move the code for the pod handlers into the vm and migration.go files.

migrationKey interface{}
srcIp clientv1.NodeAddress
destIp kubev1.NodeAddress
srcNodeIp kubev1.Node

Member:

This variable name is a bit misleading. It's a node which has an IP defined.

Contributor Author:

Good point. I'll adjust the name.

setMigrationPhase(migration, v1.MigrationInProgress)
return
}

Member:

It doesn't matter whether you had to create the targetPod or it was already there from a previous failed attempt: here you have to set the migration to InProgress and then return.

Then on subsequent loop entries, you can start right after this line:

switch migration.Status.Phase {

case "":
    do everything until this line and set state to MigrationInProgress
case MigrationInProgress:
   1) check if the target pod is running
   2) set migration to failed if it is in failed/succeeded state
   3) if it is not running, you are done, since we did not yet get informed by the secondary pod watch loop (it will come later, so no reenqueue)
   4) if pod is running, continue with the logic starting from above this comment.

}
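
A rough Go sketch of that switch, for illustration only (investigateTargetPodSituation, setMigrationPhase, setMigrationFailed, podList and the phase constants are the names quoted elsewhere in this review; the exact checks are not the final implementation):

switch migration.Status.Phase {
case "":
    // First pass: create or find the target pod (the code above this comment),
    // then record progress and stop; the pod watch will wake us up again.
    setMigrationPhase(migration, v1.MigrationInProgress)
    return
case v1.MigrationInProgress:
    _, targetPod := investigateTargetPodSituation(migration, podList)
    if targetPod == nil {
        setMigrationFailed(migration)
        return
    }
    switch targetPod.Status.Phase {
    case k8sv1.PodFailed, k8sv1.PodSucceeded:
        // A migration target that already exited is a failure either way.
        setMigrationFailed(migration)
        return
    case k8sv1.PodRunning:
        // Pod is up: continue with the logic below this comment.
    default:
        // Not running yet: the secondary pod watch loop will requeue us later.
        return
    }
}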
// Successful run, so forget the history
forget(key)
done(key)

Member:

Also think about the possibility of saving conditions on a controller object, to have controller-internal substates ...

queue.AddRateLimited(key)
return
}
}
setMigrationPhase(migration, v1.MigrationInProgress)

Member:

Setting the state in the else-if is wrong, since we could hit an error before we reach this line. On a subsequent run the pod might not be scheduled yet. Below you are not testing whether the pod is running or scheduled, which can confuse components relying on the correctness of that field.
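
For illustration only (whether targetPod is in scope at that point depends on the surrounding code), the guard could look like:

// Only record InProgress once the target pod has actually been observed running;
// otherwise leave the phase untouched and let a later loop iteration retry.
if targetPod.Status.Phase == k8sv1.PodRunning {
    setMigrationPhase(migration, v1.MigrationInProgress)
}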


switch migration.Status.Phase {
case v1.MigrationUnknown:
// Fetch vm which we want to migrate

Member:

Leftover

_, targetPod := investigateTargetPodSituation(migration, podList)

if targetPod == nil {
setMigrationFailed(migration)

Member:

In the case of a final return, we always have to do a queue.Forget(key). Otherwise, if a migration with the same name is posted again, it might be processed with a delay. The controller would also see a failure history ...

Contributor Author:

Done here. What about elsewhere?

Member:

Everywhere you return without AddRateLimited.

In the next iteration of refactoring we can finally get rid of that duplication.
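
A minimal sketch of that pattern (isTerminal is a stand-in, not a name from the diff; key, queue and the helpers are):

if isTerminal {
    setMigrationFailed(migration)
    // Drop the rate-limit history so a re-posted migration with the same
    // name is not delayed and the controller does not see old failures.
    queue.Forget(key)
    return
}
// Transient problem: keep the history and retry with backoff.
queue.AddRateLimited(key)
return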

switch targetPod.Status.Phase {
case k8sv1.PodRunning:
break
//Figure out why. report.

Member:

The comment is confusing

Contributor Author:

Removed. Was a reminder to myself that should not have been left there.

if vm.Status.MigrationNodeName != targetPod.Spec.NodeName {
vm.Status.Phase = v1.Migrating
vm.Status.MigrationNodeName = targetPod.Spec.NodeName
if err = md.updateVm(vm); err != nil {
queue.AddRateLimited(key)

Member:

Is there a return missing?
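
If so, the fix would presumably be (same names as the snippet above; the comment is mine):

if err = md.updateVm(vm); err != nil {
    queue.AddRateLimited(key)
    return // without this, execution falls through even though the update failed
}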

logger.Error().Reason(err).Msgf("updating migration state failed : %v ", err)
//TODO add a new state that more accurately reflects the process up the this point
//and then use MigrationInProgress to accurately indicate the actual migration pod is running
//setMigrationPhase(migration, v1.MigrationInProgress)

Member:

Any suggestions already?

Contributor Author:

MigrationTargetRequested when we make the call to start the target pod; MigrationRunning when we make the call to start the migration process itself.
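
As a sketch, the proposed phases might sit next to the existing ones, something like this (the constant names are the ones suggested here; the type name and string values are assumptions, not the real API):

const (
    // Target pod has been requested but the migration itself has not started.
    MigrationTargetRequested MigrationPhase = "TargetRequested"
    // The actual migration pod is running.
    MigrationRunning MigrationPhase = "Running"
)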


switch migrationPod.Status.Phase {
case k8sv1.PodFailed:
setMigrationPhase(migration, v1.MigrationFailed)

Member:

We have to set the VM back first and then update the migration state; otherwise we would not reach this point again in an error case.

Contributor Author:

Done
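
A sketch of that ordering, reusing names from the snippets above (the reset of MigrationNodeName is illustrative; the point is only that the terminal migration phase is written last):

case k8sv1.PodFailed:
    // Reset the VM first. If this update errors we return before touching the
    // migration phase, so the next loop iteration lands in this branch again.
    vm.Status.MigrationNodeName = ""
    if err := md.updateVm(vm); err != nil {
        queue.AddRateLimited(key)
        return
    }
    // Only once the VM is set back do we record the terminal migration phase.
    setMigrationPhase(migration, v1.MigrationFailed)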

}
setMigrationPhase(migration, v1.MigrationSucceeded)

Member:

Here the order is good

@admiyo force-pushed the migrate-dispatch branch 2 times, most recently from a5e8f8f to 87e76c3 on April 7, 2017 at 15:56.
jd.migrationQueue.Add(migrationKey)
} else {
logger := logging.DefaultLogger().Object(migration)
logger.Error().Reason(err).Msgf("Updating migration queue failed.", migrationLabel)

Member:

Presumably you meant to include the migrationLabel in the error message.
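
If so, the fix is presumably just a format verb for it:

logger.Error().Reason(err).Msgf("Updating migration queue failed for %v.", migrationLabel)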

}

if !exists {
logger.Info().Msgf("VM with name %s does not exist, marking migration as failed", migration.Spec.Selector.Name)

Member:

Did you mean to use logger.Info(), or should this condition also be a call to logger.Error()?

Contributor Author:

Yes. Yes indeed.

@rmohr (Member) left a comment:

I think some forget calls and one state are missing. The rest looks good.

func (md *MigrationDispatch) updateVm(vmCopy *v1.VM) error {
if _, err := md.vmService.PutVm(vmCopy); err != nil {
logger := logging.DefaultLogger().Object(vmCopy)
logger.V(3).Info().Msg("Enqueuing VM again.")

Member:

Could you move that message out of the method? It is not requeued here.

Contributor Author:

Put a better message in its place.
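
A sketch of the split being suggested (the message text is illustrative; PutVm and the logger calls are the ones from the diff):

func (md *MigrationDispatch) updateVm(vmCopy *v1.VM) error {
    if _, err := md.vmService.PutVm(vmCopy); err != nil {
        // Report the failure here; whether to requeue (and log that) is the
        // caller's decision, so the "Enqueuing VM again" message moves there.
        logging.DefaultLogger().Object(vmCopy).Error().Reason(err).Msg("Updating the VM failed.")
        return err
    }
    return nil
}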

@@ -219,8 +231,8 @@ func (md *MigrationDispatch) Execute(store cache.Store, queue workqueue.RateLimi
return
}
//TODO add a new state that more accurately reflects the process up the this point
//and then use MigrationInProgress to accurately indicate the actual migration pod is running
//setMigrationPhase(migration, v1.MigrationInProgress)
//and then use MigrationScheduled to accurately indicate the actual migration pod is running

Member:

You were not introducing a new state here; so far you have just renamed MigrationPending and MigrationInProgress.
At this point we have scheduled the pod which will do the migration; that is where we are missing the state transition.
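
In other words, right after the pod that performs the migration has been created there would be an explicit transition, roughly as follows (MigrationScheduled is the name from the TODO above and does not exist yet, which is exactly the open point):

// Once the migration pod has been scheduled:
setMigrationPhase(migration, v1.MigrationScheduled)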


queue.AddRateLimited(key)
return
}
_, targetPod := investigateTargetPodSituation(migration, podList)

Member:

Here we are missing a check if targetPod is nil.

Contributor Author:

Done
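
For reference, the guard mirrors the pattern used earlier in the diff, plus the Forget discussed above:

_, targetPod := investigateTargetPodSituation(migration, podList)
if targetPod == nil {
    setMigrationFailed(migration)
    queue.Forget(key) // terminal outcome, so drop the retry history
    return
}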

@rmohr (Member) left a comment:

We need to fix our setup; affinity does not seem to work anymore ...

@admiyo (Contributor Author) left a comment:

Aside from the hardcoded 10 in the call (at least make it a symbolic constant), this looks good.

Instead of work being performed in the pod and job dispatches,
the Migration is requeued, which triggers the Migration controller
to reevaluate the migration. This unifies the migration logic in
the migration controller's dispatch.

@rmohr (Member) left a comment:

Functional tests succeed on my workstation. Still some work to do to make Jenkins work. Looks good to me.

@rmohr merged commit fe497ba into kubevirt:master on Apr 28, 2017.