Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ephemeral Registry Disk Rewrite #460

Merged
merged 9 commits into from Oct 4, 2017
Merged

Conversation

@davidvossel
Copy link
Member

davidvossel commented Sep 21, 2017

  • Renames feature from ContainerRegistryDisk to RegistryDisk
  • Removes ISCSI from feature and uses direct file access now, resolving #373
  • Adds unit testing for registry-disk package
  • Adds direct qcow2 support without converting to raw format
  • Updates documentation

These changes greatly simplify the features implementation and improve overall test coverage as well.

@davidvossel davidvossel changed the title Registry Disk Rewrite Ephemeral Registry Disk Rewrite Sep 21, 2017
@davidvossel davidvossel requested review from rmohr and fabiand Sep 21, 2017
Copy link
Member

stu-gott left a comment

Nice!

return result, err
}

result = hash.Sum(result)

This comment has been minimized.

Copy link
@stu-gott

stu-gott Sep 21, 2017

Member

Wait a sec. when did result get initialized? I'm pretty sure the intent was to pass in nil here.

This comment has been minimized.

Copy link
@davidvossel

davidvossel Sep 21, 2017

Author Member

hash is an object with state. The Sum() function takes in a byte slice that it appends the result to.

this similar to something like.

mySlice = append(mySlice, otherSlice...)

This comment has been minimized.

Copy link
@stu-gott

stu-gott Sep 21, 2017

Member

Understood. I think the thing that's confusing is the file was copied in earlier then on this line a blank array (result) was passed in.

After looking at it closer, this code probably works as intended, but it feels like it's an accidental side effect of result already being empty.

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member
has.Sum(nil)

should allow making this clearer.

Also note that nil and an empty array are exchangeable for as long as you treat it read-only: https://play.golang.org/p/N0kJ3vvubD, just use nil in the whole function.

This comment has been minimized.

Copy link
@davidvossel

davidvossel Sep 28, 2017

Author Member

alrighty, fixed in an update i'm working on.

func SetFileOwnership(username string, file string) error {
usrObj, err := user.Lookup(username)
if err != nil {
logging.DefaultLogger().V(2).Error().Reason(err).Msg(fmt.Sprintf("unable to look up username %s", username))

This comment has been minimized.

Copy link
@stu-gott

stu-gott Sep 21, 2017

Member

I think the V() modifier only has an effect with info messages.

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member

Yes, that is intentional. We can improve the log builder to not allow building us that. If it is an error which is not worth being printed always, it is kind of not an error then. Then it is probably more info() or debug() level, and for info we have the V().

This comment has been minimized.

Copy link
@davidvossel

davidvossel Sep 28, 2017

Author Member

fixed in update i'll be pushing soon.

Copy link
Member

rmohr left a comment

Looks really much simpler now. Looking forward to sharing the images between VMs and/or writing to backing stores, which we would migrate ...

Just minor comments.

@@ -28,7 +28,7 @@ docker push vmdisks/fedora25:latest
## Assigning Ephemeral Disks to VMs

Assign an ephemeral disk backed by an image in the container registry by
adding a ContainerRegistryDisk:v1alpha disk to the VM definition and supplying
adding a RegistryDisk:v1alpha disk to the VM definition and supplying

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member

Good new name!

return result, err
}

result = hash.Sum(result)

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member
has.Sum(nil)

should allow making this clearer.

Also note that nil and an empty array are exchangeable for as long as you treat it read-only: https://play.golang.org/p/N0kJ3vvubD, just use nil in the whole function.

func SetFileOwnership(username string, file string) error {
usrObj, err := user.Lookup(username)
if err != nil {
logging.DefaultLogger().V(2).Error().Reason(err).Msg(fmt.Sprintf("unable to look up username %s", username))

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member

Yes, that is intentional. We can improve the log builder to not allow building us that. If it is an error which is not worth being printed always, it is kind of not an error then. Then it is probably more info() or debug() level, and for info we have the V().

}
func generateVolumeMountDir(vm *v1.VirtualMachine, diskCount int) string {
baseDir := generateVMBaseDir(vm)
return fmt.Sprintf("%s/disk%d", baseDir, diskCount)
}

func k8sSecretName(vm *v1.VirtualMachine) string {

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member

This seems to be unused now.

This comment has been minimized.

Copy link
@davidvossel

davidvossel Sep 28, 2017

Author Member

ah, good catch. I stripped out the functionality that used that

"kubevirt.io/kubevirt/pkg/precond"
)

const registryDiskV1Alpha = "ContainerRegistryDisk:v1alpha"
const registryDiskV1Alpha = "RegistryDisk:v1alpha"
const defaultIqn = "iqn.2017-01.io.kubevirt:wrapper/1"

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member

These constants are unused

This comment has been minimized.

Copy link
@davidvossel

davidvossel Sep 28, 2017

Author Member

oops, i'll fix

vm, err = MapRegistryDisks(vm)
Expect(err).ToNot(HaveOccurred())

// verify file gets renamed by virt-handler to prevent container from

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member

I don't fully understand that comment.

This comment has been minimized.

Copy link
@davidvossel

davidvossel Sep 28, 2017

Author Member

Ha, you hit a deep topic. I'll explain what's happening and why.

When the RegistryDisk container copies the disk file to a shared directory, that container is responsible for removing the file before it is allowed to exit.

So, the logic is

  • copy disk
  • go into wait loop periodically checking disk copy exists
  • on exit signal, remove disk if it exists before exiting
  • when disk is no longer present, exit.

Now while all that is going on in the RegistryDisk container, when virt-handler gains knowledge of the presence of a local RegistryDisk, virt-handler takes ownership of the disk from the container by renaming the file. At that point, virt-handler owns the lifecycle of the disk. The RegistryDisk container will detect the disk it was watching "disappeared" because of the rename and exit. If the ownership change never occurs for whatever reason, the RegistryDisk container will still own the disk and remove the disk when it is asked to terminate.

Here's why we do this.

For graceful shutdown, it's important that virt-handler takes ownership of the disk from the RegistryDisk container before the VM starts so the disk's lifecycle is not tied to the RegistryDisk container's lifecycle. Otherwise we have to synchronize guaranteeing the RegistryDisk's container will not terminate before the VM process or the virt-launcher container. This is easily solved by having virt-launcher own the disk before starting the VM.

Essentially, having virt-handler take ownership of the disk ensures it has complete control over the disk's lifecycle without having to compete (or attempt to synchronize) with the timing of the RegistryDisk containers termination.

})

It("by verifying mapping of qcow2 disk", func() {
VerifyDiskType("qcow2")

This comment has been minimized.

Copy link
@rmohr

rmohr Sep 27, 2017

Member

You can also use a table to desciribe these two "It"

table.DescribeTable("whatever", func(type string) {

}, 
table.Entry("whatever ", "qcow2"),
table.Entry("whatever ", "raw"),
)

This comment has been minimized.

Copy link
@davidvossel

davidvossel Sep 28, 2017

Author Member

good idea

@davidvossel davidvossel force-pushed the davidvossel:registry-disk-rewrite branch 2 times, most recently from 6ab9fd4 to 6c2b0eb Sep 28, 2017
@davidvossel

This comment has been minimized.

Copy link
Member Author

davidvossel commented Sep 28, 2017

Update rebases to latest KubeVirt and addresses latest round of feedback.

@davidvossel

This comment has been minimized.

Copy link
Member Author

davidvossel commented Sep 28, 2017

updated with new functional test that validates starting/stopping the same RegsitryDisk backed VM multiple times

@rmohr

This comment has been minimized.

Copy link
Member

rmohr commented Sep 29, 2017

• Failure [48.874 seconds]
RegistryDisk
/var/lib/jenkins/workspace/kubevirt-functional-tests/go/src/kubevirt.io/kubevirt/tests/registry_disk_test.go:112
  Ephemeral RegistryDisk
  /var/lib/jenkins/workspace/kubevirt-functional-tests/go/src/kubevirt.io/kubevirt/tests/registry_disk_test.go:111
    should be able to start and stop the same VM multiple times. [It]
    /var/lib/jenkins/workspace/kubevirt-functional-tests/go/src/kubevirt.io/kubevirt/tests/registry_disk_test.go:92

    Timed out after 20.168s.
    Expected
        <v1.VMPhase>: Scheduled
    to equal
        <v1.VMPhase>: Running

    /var/lib/jenkins/workspace/kubevirt-functional-tests/go/src/kubevirt.io/kubevirt/tests/utils.go:702
@rmohr

This comment has been minimized.

Copy link
Member

rmohr commented Sep 29, 2017

retest this please

@rmohr

This comment has been minimized.

Copy link
Member

rmohr commented Sep 29, 2017

@davidvossel could it be that the test or related code is not ok?

@rmohr

This comment has been minimized.

Copy link
Member

rmohr commented Sep 29, 2017

retest this please

@rmohr
rmohr approved these changes Sep 29, 2017
Copy link
Member

rmohr left a comment

@davidvossel could you check if the timeout on the one test wich failed two times in a row should be increased? Aside from that it looks fine now.

@rmohr

This comment has been minimized.

Copy link
Member

rmohr commented Sep 29, 2017

@davidvossel Could you also go through the user-guide and prepare a PR because of the CRD to RD rename?

@davidvossel davidvossel force-pushed the davidvossel:registry-disk-rewrite branch from 2e74ee1 to 8ff66c3 Sep 29, 2017
@davidvossel

This comment has been minimized.

Copy link
Member Author

davidvossel commented Sep 29, 2017

@rmohr yep, increased the timeout. I'll give it a couple of retests throughout the day to see if that improves the situation.

The inconsistent timing is caused by the race condition between shutting down a VM process and the termination/deletion of the corresponding POD. That's something we're already aware of and will solve with graceful shutdown. Even with this race, the test should be eventually consistent. It just takes longer some runs than others I believe.

@davidvossel

This comment has been minimized.

Copy link
Member Author

davidvossel commented Sep 29, 2017

Could you also go through the user-guide and prepare a PR because of the CRD to RD rename?

It actually doesn't appear we reference RegistryDisks or the ContainerRegistryDisks in the user-guide at all. I'll submit a PR to address that

@davidvossel

This comment has been minimized.

Copy link
Member Author

davidvossel commented Sep 29, 2017

retest this please

3 similar comments
@davidvossel

This comment has been minimized.

Copy link
Member Author

davidvossel commented Sep 29, 2017

retest this please

@davidvossel

This comment has been minimized.

Copy link
Member Author

davidvossel commented Sep 29, 2017

retest this please

@davidvossel

This comment has been minimized.

Copy link
Member Author

davidvossel commented Oct 2, 2017

retest this please

davidvossel added 7 commits Sep 21, 2017
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
davidvossel added 2 commits Sep 28, 2017
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
@davidvossel davidvossel force-pushed the davidvossel:registry-disk-rewrite branch from 8ff66c3 to 1cfa57d Oct 2, 2017
@rmohr rmohr merged commit 0515df7 into kubevirt:master Oct 4, 2017
3 checks passed
3 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.3%) to 55.85%
Details
kubevirt-functional-tests/jenkins/pr All is well
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.