Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

packer: Enable support for adding nvidia gpu to podvm image #1602

Merged
merged 1 commit into from
Dec 13, 2023

Conversation

bpradipt
Copy link
Member

@bpradipt bpradipt commented Nov 24, 2023

This commit enables NVIDIA gpu support for aws and azure podvm image. For example enabling gpus etc.

NVIDIA GPU support is enabled by default when building packer based
image for aws and azure
A default build of podvm in aws took 10 min

  ==> Wait completed after 10 minutes 16 seconds

  ==> Builds finished. The artifacts of successful builds are:
  --> peer-pod-ubuntu.amazon-ebs.ubuntu: AMIs were created:
  us-east-2: ami-0463ae5aa8d5b3606

  rm -fr toupload

  real    10m34.352s
  user    0m18.919s
  sys     0m10.044s

If you want to disable, then run with ENABLE_NVIDIA_GPU=no
For example:
cd azure/image
PODVM_DISTRO=ubuntu ENABLE_NVIDIA_GPU=no make image

Few additional work is required going forward which I plan to handle in subsequent PRs.

  1. Add gpu verification test case
  2. Add gpu support in mkosi build

@bpradipt
Copy link
Member Author

@katexochen @mkulke @snir911 @surajssd ptal and let me know what you think about this interim method to enable some additional features via the podvm image.
I have included two examples - gpu and blobfuse and both uses oci prestart hook.
Tested blobfuse with confidential VM and gpu with non-confidential gpu instance.

@katexochen
Copy link
Contributor

katexochen commented Nov 24, 2023

To my understanding, we plan to remove these images at the end of this release cycle, so I'm not sure why we are adding new features.

What are the security considerations of mounting blob fuse? How is this secure in the CC attacker model?

@bpradipt
Copy link
Member Author

bpradipt commented Nov 24, 2023

To my understanding, we plan to remove these images at the end of this release cycle, so I'm not sure why we are adding new features.

Right and hence I want to hear everyone's thoughts. I can start exploring how to have similar functionality with mkosi.
But unfortunately there is no way to work on other use cases till that time. Also I have made an attempt to keep it isolated and only include in packer builds. By default nothing gets included in the packer generated image.

What are the security considerations of mounting blob fuse? How is this secure in the CC attacker model?

Blob can have encrypted data which will be decrypted inside the CVM. Blob may also have non-encrypted data for processing by the pod.

@mkulke
Copy link
Contributor

mkulke commented Nov 24, 2023

To my understanding, we plan to remove these images at the end of this release cycle, so I'm not sure why we are adding new features.

Right and hence I want to hear everyone's thoughts. I can start exploring how to have similar functionality with mkosi. But unfortunately there is no way to work on other use cases till that time. Also I have made an attempt to keep it isolated and only include in packer builds. By default nothing gets included in the packer generated image.

IMO it's ok to conduct experiments like until we don't formally declare the packer-related code deprecated and freeze it. I'm not entirely sure we're committed to dropping packer, as it might be required for certain archs. We should reach and document a consensus about that,

What I think we have reached an agreement on so far: we're committed to provide mkosi-based image builds, possibly as a reference architecture. what we do with the existing packer-based infra needs to be decided (like deprecate or move it to a downstream project after 0.9). We should keep discussing this topic in #1326

This PR probably covers 2 aspects:

  1. a facility for having addons in the podvm
  2. sample addons using that facility

I think we can look at those individually.

For 1) I think we should probably consider a similar mechanism for configuring mkosi builds. maybe we can have something like the debug profile? maybe we don't want that level of configurability of mkosi builds, because it leads to an explosion in image permutations or in a large tcb?

In any case, 2) looks intriguing. I'll have to check those out. I guess they will also set the bar for what we have to support with fedora-based mkosi builds to make a switch feasible.

@katexochen
Copy link
Contributor

I guess they will also set the bar for what we have to support with fedora-based mkosi builds to make a switch feasible.

Continuously raising the bar for mkosi images by continuing adding features to the legacy approach does not seem reasonable to me.

@bpradipt
Copy link
Member Author

Let me summarise few things here. We can also in slack and the community meeting on the way forward.

  1. We would like to provide a bare minimum podvm (raw) image which is measured and integrity protected and part of CI tests. This will be based on mkosi in a future release.
  2. We will deprecate/remove packer based image builds in a future release.
  3. We would like to provide a way for end users to create custom images to support GPU and any other extensions. This will also be needed for experiments, prototype so that we make it easy for end users to continue innovating.
  4. Instructions, examples etc to create custom images need to be provided for mkosi and packer to help end users.

I think moving out the packer image build templates to a separate project (as suggested by Magnus) is a reasonable approach to balance between flexibility, maintenance and upstream test support matrix.
On similar lines, the mkosi custom templates (I'm using the generic term template to indicate custom configs) can also be moved to a separate project.
This can also help going forward when adding support for other providers which require native image creation. And if there are other tools to build such images, those can go into that separate repo.

@mkulke
Copy link
Contributor

mkulke commented Nov 24, 2023

I guess they will also set the bar for what we have to support with fedora-based mkosi builds to make a switch feasible.

Continuously raising the bar for mkosi images by continuing adding features to the legacy approach does not seem reasonable to me.

admittedly that's annoying, but I think that's tolerable if temporary. We probably want to decide on concrete timeline/criteria on when to switch over in a coming community call, so there's less ambiguity. OTOH, at this point of the project, I also wouldn't want to curb/hamper exciting tinkering, like the GPU integration, by enforcing a sanctioned-only way to do images. It's a balance, I guess.

@surajssd
Copy link
Member

@bpradipt can we add addon-support to mkosi build process as well?

What are the blockers to support the GPU artifacts for mkosi based image builds?

Copy link
Contributor

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Tested a build without an addon => built successfully, was able to start a pod with it on AKS
  • Tested a build with gpu addon => built successfully, couldn't start pod due to missing gpu quota

So, in general things seem to at least not break. A couple of questions/comments:

  • afaict we have no test-coverage for those, so the addon-feature or individual addons might be broken inadvertently by future code changes. manual testing is very cumbersome and in the GPU case, depends on available quota.
  • do we intend to support non-confidential GPU instances in the project?
  • does it make sense to introduce a "*-contrib" structure (maybe not a repo, can be a folder) in which can stage experimental features that are not supported/systematically tested yet?

podvm/addons/README.md Outdated Show resolved Hide resolved
podvm/addons/opa/README.md Outdated Show resolved Hide resolved
labels:
app: test
annotations:
io.katacontainers.config.hypervisor.machine_type: Standard_NC4as_T4_v3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
io.katacontainers.config.hypervisor.machine_type: Standard_NC4as_T4_v3
io.katacontainers.config.hypervisor.machine_type: Standard_NC4as_T4_v3
io.containerd.cri.runtime-handler: kata-remote

don't we need this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my testing it was not needed. The annotation will be used by CAA to spin up the specific instance and the runtimeclass is part of of the pod spec itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will break once we switch to kata#main. AFAIU the nydus-snapshotter requires this.

@bpradipt
Copy link
Member Author

bpradipt commented Dec 1, 2023

@bpradipt can we add addon-support to mkosi build process as well?

What are the blockers to support the GPU artifacts for mkosi based image builds?

This is next in my plan to include both GPU and opa for mkosi based image builds. The only catch is testing the Fedora packages as most of my GPU testing have been with either RHEL or Ubuntu

@bpradipt
Copy link
Member Author

bpradipt commented Dec 1, 2023

  • Tested a build without an addon => built successfully, was able to start a pod with it on AKS
  • Tested a build with gpu addon => built successfully, couldn't start pod due to missing gpu quota

So, in general things seem to at least not break. A couple of questions/comments:

  • afaict we have no test-coverage for those, so the addon-feature or individual addons might be broken inadvertently by future code changes. manual testing is very cumbersome and in the GPU case, depends on available quota.
  • do we intend to support non-confidential GPU instances in the project?
  • does it make sense to introduce a "*-contrib" structure (maybe not a repo, can be a folder) in which can stage experimental features that are not supported/systematically tested yet?

Yeah. I don't have a good idea on how that should be. This applies for packer (which will be removed going forward) and mkosi as well.
Is it ok if the restructuring can be taken up later on after more discussions on the best way forward - like separate directory or a project ?

@bpradipt
Copy link
Member Author

bpradipt commented Dec 1, 2023

do we intend to support non-confidential GPU instances in the project?

I think having this option will be good as access to confidential GPU is hard and only available with Azure today. Having the option to use non-confidential GPU will benefit two things imho - ability for users to try their AI workload with peer-pods (non confidential) and for developers to continue working on integrating the common building blocks for GPU support

@bpradipt
Copy link
Member Author

bpradipt commented Dec 1, 2023

I have removed the opa addon from this series as I feel it's best suited to be used with the policy enablement pr - #1607.

Both gpu and opa addons need to be enabled for mkosi builds. This is next in my todo list.

Copy link
Contributor

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I don't have a good idea on how that should be. This applies for packer (which will be removed going forward) and mkosi as well.
Is it ok if the restructuring can be taken up later on after more discussions on the best way forward - like separate directory or a project ?

I'm not sure I grok the intent of this proposal yet. Is it purely provisional or is intended to be a future-proof image extension mechanism?

In general, we probably want to reduce the configurable surface of podvm images to make testing more manageable and have some confidence about working configurations. This PR would go in an opposite direction. Maybe it's warranted for GPU-enabled podvm images, as a special case. But then I would cover this specific case with some make INCLUDE_GPU_STACK=1 image parameterization, rather than introducing a generic addon mechanism.

labels:
app: test
annotations:
io.katacontainers.config.hypervisor.machine_type: Standard_NC4as_T4_v3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will break once we switch to kata#main. AFAIU the nydus-snapshotter requires this.

podvm/addons/README.md Outdated Show resolved Hide resolved
podvm/addons/.gitignore Outdated Show resolved Hide resolved
@bpradipt
Copy link
Member Author

bpradipt commented Dec 1, 2023

Yeah. I don't have a good idea on how that should be. This applies for packer (which will be removed going forward) and mkosi as well.
Is it ok if the restructuring can be taken up later on after more discussions on the best way forward - like separate directory or a project ?

I'm not sure I grok the intent of this proposal yet. Is it purely provisional or is intended to be a future-proof image extension mechanism?

I would not say future proof. But at least a workable method to add extensions. I am open for other ideas for extensibility.

In general, we probably want to reduce the configurable surface of podvm images to make testing more manageable and have some confidence about working configurations. This PR would go in an opposite direction. Maybe it's warranted for GPU-enabled podvm images, as a special case. But then I would cover this specific case with some make INCLUDE_GPU_STACK=1 image parameterization, rather than introducing a generic addon mechanism.

Agree. I would start with the premise that anything which is turned on by default should be testable in CI.
Also parameterization will soon become unmanageable for different cases. Like the existing misc script with conditionals. But nonetheless required. Please see my previous comment on using addon and parameter both. Using the parameter will enable the addon via Makefile which we want to include by default and test it. Rest will be for users to try out and enable explicitly.

@mkulke
Copy link
Contributor

mkulke commented Dec 1, 2023

I would not say future proof. But at least a workable method to add extensions. I open for other ideas for extensibility.

Do we need such an extension mechanism, or is it sufficient to document how an image can be customized?

In Systemd there's Sysext and Confext: https://www.freedesktop.org/software/systemd/man/latest/systemd-sysext.html to extend os images in a systematic fashion (like updating a bundled container runtime or system library) but it's not package manager, things easily break if conflicting system extensions are applied.

@bpradipt
Copy link
Member Author

bpradipt commented Dec 1, 2023

https://www.freedesktop.org/software/systemd/man/latest/systemd-sysext.html

I would not say future proof. But at least a workable method to add extensions. I open for other ideas for extensibility.

Do we need such an extension mechanism, or is it sufficient to document how an image can be customized?

Hmm, challenge is the packer config. Will it be possible via only documentation to provide the changes to packer hcl ?

In Systemd there's Sysext and Confext: https://www.freedesktop.org/software/systemd/man/latest/systemd-sysext.html to extend os images in a systematic fashion (like updating a bundled container runtime or system library) but it's not package manager, things easily break if conflicting system extensions are applied.

Are you suggesting enabling this option in the podvm image to allow extensions ?
Also reading about it, I don't think this will work for rpms/debs. Might work for binaries and configs, but even then I'm not sure what's the mechanism to provide these extensions. Do you have some ideas ?

@mkulke
Copy link
Contributor

mkulke commented Dec 1, 2023

Hmm, challenge is the packer config. Will it be possible via only documentation to provide the changes to packer hcl ?

I don't think it's gets a lot easier to customize a build than what is supported right now in packer + hcl if you copy a script and execute it:

  provisioner "file" {
    source      = "/install-some-deps.sh"
    destination = "~/install-some-deps.sh"
  }

  provisioner "shell" {
    remote_folder = "~"
    inline = [
      "sudo bash ~/install-some-deps.sh"
    ]
  }

Are you suggesting enabling this option in the podvm image to allow extensions ? Also reading about it, I don't think this will work for rpms/debs. Might work for binaries and configs, but even then I'm not sure what's the mechanism to provide these extensions. Do you have some ideas ?

No indeed. sys/confext are only for binaries and configs, nothing is being executed here. rpms/debs you would install in the mkosi or packer build-process. sysext would merely allow extending an image by overlaying some files, without interfering in the build process itself.

Again, I probably don't understand the use case well, to understand the requirements atm. like who would want to extend a podvm image with what?

Is an extension something like a debug image, like a fixed, known-by-us build-time option? could it be make image-gpu like make image-debug or do we require something more composable make image DEBUG=1 GPU=1 then? I think either would work without developing our own extension mechanism.

Or is an extension something unforeseeable, like a bespoke monitoring agent that a customer has to install and configure in their environment? In this case both packer and mkosi provide convenient ways to do this kind of customization, we can hardly improve on those in terms of ux.

@bpradipt
Copy link
Member Author

bpradipt commented Dec 1, 2023

Hmm, challenge is the packer config. Will it be possible via only documentation to provide the changes to packer hcl ?

I don't think it's gets a lot easier to customize a build than what is supported right now in packer + hcl if you copy a script and execute it:

  provisioner "file" {
    source      = "/install-some-deps.sh"
    destination = "~/install-some-deps.sh"
  }

  provisioner "shell" {
    remote_folder = "~"
    inline = [
      "sudo bash ~/install-some-deps.sh"
    ]
  }

Are you suggesting enabling this option in the podvm image to allow extensions ? Also reading about it, I don't think this will work for rpms/debs. Might work for binaries and configs, but even then I'm not sure what's the mechanism to provide these extensions. Do you have some ideas ?

No indeed. sys/confext are only for binaries and configs, nothing is being executed here. rpms/debs you would install in the mkosi or packer build-process. sysext would merely allow extending an image by overlaying some files, without interfering in the build process itself.

Again, I probably don't understand the use case well, to understand the requirements atm. like who would want to extend a podvm image with what?

Is an extension something like a debug image, like a fixed, known-by-us build-time option? could it be make image-gpu like make image-debug or do we require something more composable make image DEBUG=1 GPU=1 then? I think either would work without developing our own extension mechanism.

Or is an extension something unforeseeable, like a bespoke monitoring agent that a customer has to install and configure in their environment? In this case both packer and mkosi provide convenient ways to do this kind of customization, we can hardly improve on those in terms of ux.

Let me ask a different question - what if we add gpu and opa in the default podvm image ? And instead of .enable I can include the addons based on AGENT_POLICY and ENABLE_GPU options. By default both these will be enabled.
Using flexible instance types we can always test gpu in CI if needed (although cost might be a factor).
Other than image build time and size, do you see any drawback ?

And for extensions we can rethink in the context of mkosi and not spend too much time in doing it with packer. Wdyt ?

@mkulke
Copy link
Contributor

mkulke commented Dec 1, 2023

Let me ask a different question - what if we add gpu and opa in the default podvm image ? And instead of .enable I can include the addons based on AGENT_POLICY and ENABLE_GPU options. By default both these will be enabled. Using flexible instance types we can always test gpu in CI if needed (although cost might be a factor). Other than image build time and size, do you see any drawback ?

And for extensions we can rethink in the context of mkosi and not spend too much time in doing it with packer. Wdyt ?

yes, as built-time options those do make sense to me. About what feature is supposed to be opt-in and opt-out there can be a separate discussion once we port these to mkosi.

@bpradipt bpradipt changed the title packer: Enable misc addons for podvm packer: Enable support for adding nvidia gpu to podvm image Dec 5, 2023
@bpradipt
Copy link
Member Author

bpradipt commented Dec 5, 2023

@mkulke I changed this PR to add support for NVIDIA GPU in packer based podvm images. Updated the main description as well.

NVIDIA gpu support is enabled by default, meaning the packer built images will have the NVIDIA gpu drivers and should be usable depending on the instance type selected.
Let me know what you think and if we can move ahead with this one for now.

Few additional work is required going forward which I plan to handle in subsequent PRs.

  • Add gpu verification test case
  • Add gpu support in mkosi build

Copy link
Contributor

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense to me, thanks for accommodating.

nit: I doubt we need the addon folder. the enable_nvidia_gpu flag would be enough. since we copy the necessary scripts on the build image anyway, we can just check for the ENV in misc-settings.sh and invoke nvidia_gpu/setup.sh if necessary.

podvm/files/etc/agent-config.toml Outdated Show resolved Hide resolved
@bpradipt
Copy link
Member Author

that makes sense to me, thanks for accommodating.

No problems. I'm glad we could arrive at a consensus.

nit: I doubt we need the addon folder. the enable_nvidia_gpu flag would be enough. since we copy the necessary scripts on the build image anyway, we can just check for the ENV in misc-settings.sh and invoke nvidia_gpu/setup.sh if necessary.

I thought about whether to have a separate folder or not, and felt it's cleaner to have a separate folder and keep the config files under it to accommodate support for other GPUs (eg Intel ones) or other accelerators requiring podvm image customisation. The naming is a challenge though. Thought about misc, contrib etc but then went with addons since there is already misc-settings :-).

NVIDIA GPU support is enabled by default when building packer based
image for aws and azure
A default build of podvm in aws took 10 min

    ```
    ==> Wait completed after 10 minutes 16 seconds

    ==> Builds finished. The artifacts of successful builds are:
    --> peer-pod-ubuntu.amazon-ebs.ubuntu: AMIs were created:
    us-east-2: ami-0463ae5aa8d5b3606

    rm -fr toupload

    real    10m34.352s
    user    0m18.919s
    sys     0m10.044s
    ```

If you want to disable, then run with ENABLE_NVIDIA_GPU=no
For example:
cd azure/image
PODVM_DISTRO=ubuntu ENABLE_NVIDIA_GPU=no make image

This results in the qcow2/setup_addons.sh script executing the addons/nvidia_gpu/setup.sh
to setup NVIDIA drivers, libraries and prestart hook into the podvm image

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Copy link
Contributor

@snir911 snir911 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, added minor comment, thanks!

@bpradipt bpradipt merged commit ddbddb7 into confidential-containers:main Dec 13, 2023
16 checks passed
@bpradipt bpradipt deleted the packer-addons branch December 13, 2023 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants