Skip to content

[WIP][DO_NOT_MERGE] KVM implementation of tasks#978

Merged
rvs merged 1 commit into
lf-edge:masterfrom
cshari-zededa:qemu_container
Jun 2, 2020
Merged

[WIP][DO_NOT_MERGE] KVM implementation of tasks#978
rvs merged 1 commit into
lf-edge:masterfrom
cshari-zededa:qemu_container

Conversation

@cshari-zededa
Copy link
Copy Markdown
Contributor

Signed-off-by: Hariharasubramanian C S cshari@zededa.com

Comment thread pkg/pillar/containerd/containerd.go Outdated
_ = CtrStop(containerID, true)

stdinDir := path.Join("/run", "containers-stdin", containerID)
if err := os.RemoveAll(stdinDir); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you doing this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just cleaning it up, since the next CtrCreate expects these directories to be not present. This got exposed when LKTaskLaunch() deletes existing container (if any) before creating new container with the same domainID

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to dig into this more @cshari-zededa -- since it may expose an issue with how we're using containerd and I'd rather fix the root cause (since containerd becomes a center piece of our design). Is there an easy way to reproduce this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not reproducible, I think I added this change unnecessarily. Removed it!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rvs,
With the #containerd issue fixed,
we no longer need this stdin workaround, so removed that piece of code, and this cleanup
code as well. Please let me know if you think we should retain it.

config := "/containers/services" + linuxkit + "/config.json"
rootfs := "/containers/services" + linuxkit + "/lower"
config := "/containers/services/" + linuxkit + "/config.json"
rootfs := "/containers/services/" + linuxkit + "/rootfs"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need a R/W filesystem you need to prepare a volume for it. A better question is: why DO you need a R/W root filesystem? All the R/W bits are given to you via /run and /persist anyway -- so what's missing?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to make rootfs as R/W because of /dev/, /sys, /proc systems, since qemu/kernel is creating /dev/pts/ and /proc/ files during the VM launch.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If those are the only ones -- we need to dig deeper into this as well. Those are supposed to be R/W mounted from the outside when you ask runc to run the container. Lets start with seeing what gets mounted on those endpoint when you call LKTaskLaunch() and we can proceed from there.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rvs, you are right, I unnecessarily marked this as R/W, whereas Readonly rootfs works just fine, as I verified it today.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @cshari-zededa -- do you plan to update your PR to reflect that going back to a R/O rootfs?

Copy link
Copy Markdown
Contributor Author

@cshari-zededa cshari-zededa May 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rvs, I already reverted the Root.Readonly = false part.

Copy link
Copy Markdown
Contributor Author

@cshari-zededa cshari-zededa May 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only issue with read-only root partition is that, we can't use "screen" with default settings,
since it tries to create /tmp/.screens directory, and it fails. But we can always override that by
explicitly mentioning "SCREENDIR" env variable. So we are all good with the Readonly root fs


spec.Root.Path = rootfs
spec.Root.Readonly = true
if domSettings != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to make rootfs as R/W because of /dev/, /sys, /proc systems, since qemu/kernel is creating /dev/pts/ and /proc/ files during the VM launch.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted the Readonly change to true.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only issue with read-only root partition is that, we can't use "screen" with default settings,
since it tries to create /tmp/.screens directory, and it fails. But we can always override that by
explicitly mentioning "SCREENDIR" env variable. So we are all good with the Readonly root fs.

Comment thread pkg/pillar/hypervisor/kvm.go
Copy link
Copy Markdown
Contributor

@rvs rvs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @cshari-zededa -- sorry for the delay -- lets make sure we commit it this week. A few things left to do here:

  1. rebase on the latest EVE
  2. take care of the feedback I left just now
  3. lets see if we can reduce overhead to less than 800M (800M really seems excessive)

For #3 I suggest we do the memory pinning and start passing -overcommit mem-lock=on and -overcommit cpu-pm=on to the qemu we're running.

I think that will help us arrive at a more reasonable overhead figure

config := "/containers/services" + linuxkit + "/config.json"
rootfs := "/containers/services" + linuxkit + "/lower"
config := "/containers/services/" + linuxkit + "/config.json"
rootfs := "/containers/services/" + linuxkit + "/rootfs"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @cshari-zededa -- do you plan to update your PR to reflect that going back to a R/O rootfs?

Comment thread pkg/pillar/containerd/oci.go Outdated
}

m := int64(dom.Memory * 1024)
m := int64(dom.Memory*1024) + int64(qemuOverHead)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite right @cshari-zededa -- since this function is going to be used for much more than just launching qemu tasks -- we need a more generic way to accommodate the overhead. Please move it one level up to at least LKTaskLaunch(). IOW, make LKTaskLaunch do a generic UpdateFromDomain() and later manually tweak the memory setting at that level based on the overhead argument passed to LKTaskLaunch

@cshari-zededa
Copy link
Copy Markdown
Contributor Author

Hey @cshari-zededa -- sorry for the delay -- lets make sure we commit it this week. A few things left to do here:

  1. rebase on the latest EVE

Done

  1. take care of the feedback I left just now

Done, please let me know if you think I have missed anything.

  1. lets see if we can reduce overhead to less than 800M (800M really seems excessive)

Arrived at 500MB with some trail-and-error experiments with PCI passthrough settings.

For #3 I suggest we do the memory pinning and start passing -overcommit mem-lock=on and -overcommit cpu-pm=on to the qemu we're running.

Done

I think that will help us arrive at a more reasonable overhead figure

@cshari-zededa cshari-zededa changed the title [WIP] Initial review of KVM implementation of tasks [WIP][DO_NOT_MERGE] Initial review of KVM implementation of tasks May 29, 2020
@cshari-zededa cshari-zededa changed the title [WIP][DO_NOT_MERGE] Initial review of KVM implementation of tasks [WIP][DO_NOT_MERGE] KVM implementation of tasks May 29, 2020
Copy link
Copy Markdown
Contributor

@rvs rvs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @cshari-zededa -- this is great, but there's a tiny, tiny nit left (see below). I would've pulled as-is but it pushes the size of rootfs over the 250Mb limit on ARM -- hence it breaks the ARM build.

Comment thread pkg/xen-tools/Dockerfile.in Outdated
pixman=0.34.0-r5 \
yajl=2.1.0-r0 \
keyutils=1.5.10-r0 \
screen=4.6.2-r1 \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets not pull screen in for now -- anyone debugging the system can always apk add screen (and in fact, we should probably move towards using a socket for stdio anyway -- but that shouldn't be part of this PR).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that you don't have to be in the same R/O rootfs/container for using screen

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rvs, excluded this change to pull in screen. I've updated the PR. Please take a look.

Signed-off-by: Hariharasubramanian C S <cshari@zededa.com>
@rvs rvs merged commit 50c4122 into lf-edge:master Jun 2, 2020
@rvs
Copy link
Copy Markdown
Contributor

rvs commented Jun 2, 2020

Thanks @cshari-zededa -- merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants