This repository has been archived by the owner on Apr 3, 2018. It is now read-only.

Add multi OS support #439

Closed
sameo opened this issue Oct 23, 2017 · 7 comments

@sameo
Collaborator

sameo commented Oct 23, 2017

Problem statement

One key advantage of Clear Containers over runc containers is the ability for each container/pod to run on top of its own kernel and guest image.
For customized workloads such as NFV, this is very important, as they sometimes need specific kernel features that can be mutually exclusive with the needs of other workloads.

Unfortunately, the OCI runtime specifications do not yet allow for specifying a kernel and a guest image, so virtcontainers callers can only use a default path for each of those and pass it through the HypervisorConfig structure.

Proposed solutions

We want to be able to define a kernel path and a guest image path per container/pod. Both should be absolute paths, and both should be optional. In other words, there must be a default value for each path, which will be passed to virtcontainers through the HypervisorConfig structure. Clear Containers, for example, gets the default paths for its kernel and guest image from the system-wide configuration.toml configuration file.

Short term solution: Annotations

As a short term solution we are going to use virtcontainers namespaced pod annotations for passing a kernel and guest image absolute path:

// KernelPath is a pod annotation for passing a per container path pointing at the kernel needed to boot the container VM.
KernelPath = "com.github.containers.virtcontainers.KernelPath"

// ImagePath is a pod annotation for passing a per container path pointing at the guest image that will run the container VM.
ImagePath = "com.github.containers.virtcontainers.ImagePath"

Both annotations can be set through virtcontainers PodConfig annotations. Kernel or image path ContainerConfig annotations will be ignored.
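From the caller's side, setting these annotations could look like the sketch below. The podConfig type is a hypothetical, reduced stand-in for virtcontainers' PodConfig, and withAssets is an illustrative helper, not part of the virtcontainers API. Only the two annotation keys come from this proposal.

```go
package main

// Annotation keys as proposed in this issue.
const (
	KernelPath = "com.github.containers.virtcontainers.KernelPath"
	ImagePath  = "com.github.containers.virtcontainers.ImagePath"
)

// podConfig is a hypothetical stand-in for virtcontainers' PodConfig,
// reduced to the single field this sketch needs.
type podConfig struct {
	Annotations map[string]string
}

// withAssets (hypothetical helper) returns a copy of cfg carrying the
// given asset paths as pod annotations. An empty argument adds no
// annotation, so the default path keeps applying for that asset.
func withAssets(cfg podConfig, kernel, image string) podConfig {
	out := podConfig{Annotations: make(map[string]string, len(cfg.Annotations)+2)}
	for k, v := range cfg.Annotations {
		out.Annotations[k] = v
	}
	if kernel != "" {
		out.Annotations[KernelPath] = kernel
	}
	if image != "" {
		out.Annotations[ImagePath] = image
	}
	return out
}
```

A caller that only needs a custom kernel would call withAssets(cfg, "/opt/nfv/vmlinuz", "") (the path is made up) and keep the default guest image.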

Long term solution: OCI specification changes

See opencontainers/runtime-spec#405

Code path

The virtcontainers code path should be identical for both solutions: pod.go will trap the pod creation request and modify its config.HypervisorConfig structure if its PodConfig carries an ImagePath and/or KernelPath annotation.

The pkg/oci/utils.go file will be responsible for adding the right pod annotation, either by parsing the OCI config.json annotations or by parsing the future VM specific section in that file. The latter should take precedence over the former.
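A minimal sketch of that override step follows. The hypervisorConfig type is a reduced, hypothetical stand-in for HypervisorConfig, its field names are assumptions, and applyAssetAnnotations is an illustrative name rather than an existing virtcontainers function.

```go
package main

// Annotation keys as proposed in this issue.
const (
	KernelPath = "com.github.containers.virtcontainers.KernelPath"
	ImagePath  = "com.github.containers.virtcontainers.ImagePath"
)

// hypervisorConfig is a hypothetical, reduced stand-in for
// virtcontainers' HypervisorConfig. Field names are assumptions.
type hypervisorConfig struct {
	KernelPath string
	ImagePath  string
}

// applyAssetAnnotations (hypothetical helper) returns base with its
// kernel and image paths replaced by the pod annotations when they are
// present. An annotation that is present but empty still overrides the
// default, so that later validation can reject it instead of silently
// falling back.
func applyAssetAnnotations(base hypervisorConfig, annotations map[string]string) hypervisorConfig {
	if p, ok := annotations[KernelPath]; ok {
		base.KernelPath = p
	}
	if p, ok := annotations[ImagePath]; ok {
		base.ImagePath = p
	}
	return base
}
```

Because the function takes and returns the configuration by value, the system-wide defaults are left untouched and each pod gets its own copy.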

Error handling

If either the kernel or the guest image path is invalid (an empty or nonexistent file), virtcontainers will error out and stop creating the pod's VM. Passing a kernel and/or a guest image into the container configuration file is a very specific requirement from the user, so switching to the default value when one of those paths is invalid could be misleading and error prone.

Security

Optionally, virtcontainers callers can require integrity checking of the kernel and/or image binaries. For that purpose, virtcontainers will handle 2 additional annotations:

// KernelHash is a pod annotation for passing a container kernel image SHA-512 hash value.
KernelHash = "com.github.containers.virtcontainers.KernelHash"

// ImageHash is a pod annotation for passing a container guest image SHA-512 hash value.
ImageHash = "com.github.containers.virtcontainers.ImageHash"

By default, virtcontainers will not verify the integrity of the kernel and image binaries unless one or both of the hash annotations are passed in the PodConfig annotations. In that case, virtcontainers will compute SHA-512 hashes of the kernel and/or image binaries and compare them against the passed annotations.

@sameo sameo self-assigned this Oct 23, 2017
@sameo sameo changed the title [WIP] Add multi OS support Add multi OS support Oct 24, 2017
@sameo
Collaborator Author

sameo commented Oct 24, 2017

cc @sboeuf @mcastelino @jodh-intel

@jodh-intel
Collaborator

A few thoughts:

Definitions

We should specify what we mean by "path" (I vote for only accepting absolutes).

Behaviour

I think we should document the expected behaviour if either of those annotations is blank or invalid in other ways (immediate error or fallback to defaults with a big fat log warning message :)

Kernel/Image selection

As mentioned on the original OCI PR, we need a way to select between images and kernels. Currently for CC 3.0, the runtime handles that via its configuration file. These annotations suggest a broadening of the ability to switch between images / kernels by allowing the caller (the user running, say, docker run) to specify the image/kernel. I guess the basic question is:

  • Who decides which image+kernel combo the workload gets to run in?

If it's the runtime, I wonder if we need to find a way for workloads to be tagged as needing particular features (such as NFV) to allow the runtime to decide automatically which kernel to choose? That might require we mandate another annotation for KernelConfig = /path/to/kernel/config. Of course, the problem is there are a significant number of kernel config options so finding a way to specify the feature set with kernel config options may not be ideal for various reasons.

Security

We could add checksum annotations for the kernel + image, but that would significantly slow down startup time (if they were being checked). Hence, we should probably document that other facilities (such as package-management features) are used to ensure the images have not been modified.

Stale component versions

Whatever we decide, we need to ensure the runtime is able to identify stale containers. If the kernel+image selection is not going to be decided by the runtime, the runtime will still need a way to identify the range of available images+kernels, and be able to order them by age to be able to identify "stale" containers.

@sameo
Collaborator Author

sameo commented Oct 24, 2017

@jodh-intel Thanks for the feedback. Some comments:

Who decides which image+kernel combo the workload gets to run in?

I'd like to look at the problem only from a virtcontainers perspective for now, and I think virtcontainers should not try to be smart about it, but simply work on kernel and/or image paths. Then runtime implementations can decide if they want to resolve higher-level requests into actual paths or not.

We could add checksum annotations for the kernel + image, but that would significantly slow down startup time

That would be an interesting feature that we could disable by default unless the caller explicitly asks for it. And this could also apply to the default kernel + image files. I'll add that to the issue.

Whatever we decide, we need to ensure the runtime is able to identify stale containers.

With this new approach we will still be able to have cc-runtime list --cc-all report the kernel and image currently being used by any pod. Now let's make an assumption here: if pod A is running kernel K, upgrading kernel K means replacing the K binary with another one (called either K or K'). It does not mean having both K and K' installed. Starting from this assumption, we could add some kernel and image metadata (a timestamp) to our PodConfig and declare a container stale if either of the following 2 conditions is met, where I is the pod's guest image:

  1. K or I are no longer present on the filesystem.
  2. K or I is still present on the filesystem, but its PodConfig timestamp is older than the file's latest access timestamp.

@mcastelino
Collaborator

@sameo @jodh-intel this feature change should be accompanied by the ability to retrieve the kernel and runtime in use for containers/pods on a given system using the cc-runtime binary. This may be the same as what we are planning to do to report the version of the kernel and image in use.

Another thing we may want to consider is allowing module loading in clear containers. If we allow for module loading, we will be able to support significantly more combinations and allow the workload to be self-contained, as the workload itself can carry the modules it needs. Module loading does not need to be supported in the default kernel.

Also we should explore if node tagging can be easily extended to report the availability of various kernels/rootfs combinations.

Today there is no hard linkage between the kernel and rootfs, as they are loosely coupled. But keeping this loose coupling may be an issue in the future, when we allow the kernels to change.

@sameo
Collaborator Author

sameo commented Oct 24, 2017

@mcastelino

@sameo @jodh-intel this feature change should be accompanied by the ability to retrieve the kernel and runtime in use for containers/pods on a given system using the cc-runtime binary. This may be the same as what we are planning to do to report the version of the kernel and image in use.

With this new approach we will still be able to know which kernel+image each pod is using by calling cc-runtime list --cc-all.

Another thing we may want to consider is allowing module loading in clear containers.

Agreed. This will have little impact on us, except maybe that we will need to add systemd-udevd to our userspace image.

Today there is no hard linkage between the kernel and rootfs, as they are loosely coupled. But keeping this loose coupling may be an issue in the future, when we allow the kernels to change.

So here what we'd most likely want to know is which kernels are compatible with any given image, I think. Just a thought: We could have osbuilder generate a metadata file for any generated image that would point to the compatible kernel(s) for it.

@grahamwhaley
Contributor

If we do make a module-enabled kernel then I'd recommend ensuring we have module versioning turned on to help ensure we only load modules that were built for that actual kernel:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/kbuild/modules.txt#n426

We don't want to be tracking down issues where somebody has loaded an older or newer module which causes 'strange things to happen'.

@sameo
Collaborator Author

sameo commented Oct 25, 2017

cc @eadamsintel

sameo pushed a commit that referenced this issue Oct 30, 2017
We add an annotation package with all virtcontainers public annotations.
For now we will add the assets paths and hashes annotations.

Fixes #439

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
sameo pushed a commit that referenced this issue Oct 30, 2017
Based on the pod config annotations, we change the underlying
hypervisor configuration. If we find a valid path for a kernel
or an image, we change the hypervisor config to point to them.

Fixes #439

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
@sameo sameo mentioned this issue Oct 30, 2017
2 tasks
sameo pushed a commit that referenced this issue Oct 30, 2017
Based on the pod config annotations, we change the underlying
hypervisor configuration. Pod config annotations can carry a
kernel path, a guest image path or both.
Those annotations, if valid, will overwrite the hypervisor
configuration paths. If not valid, virtcontainers will error
out.

Optionally, asset SHA-512 hashes can be passed through pod
annotations as well. In that case virtcontainers will verify
that the asset binary matches the hash annotation and error
out when it does not.

Fixes #439

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
sameo pushed further commits that referenced this issue on Oct 31, Nov 6, Nov 7 and Nov 9, 2017, each repeating verbatim either the annotation package message or the pod config annotations message above.
@sameo sameo closed this as completed in #460 Nov 9, 2017
@sameo sameo removed the in progress label Nov 9, 2017
4 participants