
feat: block-device mounts #2168

Open
wants to merge 1 commit into base: main

Conversation

@anmaxvl (Contributor) commented Jun 13, 2024

This PR adds the capability to mount virtual and passthrough disks as block devices inside containers.

We add a new "blockdev://" prefix for the OCI `Mount.ContainerPath`, which indicates that the source should be mounted as a block device.
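
A minimal sketch of how the shim side could recognize and strip this prefix; the helper name and exact handling below are illustrative, not the code in this PR:

```go
package main

import (
	"fmt"
	"strings"
)

// blockDevPrefix marks an OCI Mount.ContainerPath that should be exposed to
// the container as a block device rather than a mounted filesystem.
const blockDevPrefix = "blockdev://"

// parseContainerPath is a hypothetical helper: it reports whether the
// ContainerPath carries the blockdev:// prefix and returns the in-container
// path with the prefix stripped.
func parseContainerPath(containerPath string) (string, bool) {
	if strings.HasPrefix(containerPath, blockDevPrefix) {
		return strings.TrimPrefix(containerPath, blockDevPrefix), true
	}
	return containerPath, false
}

func main() {
	path, isBlockDev := parseContainerPath("blockdev:///my/block/mount")
	fmt.Println(path, isBlockDev) // prints: /my/block/mount true
}
```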

A new `BlockDev` field has been added to the `mountConfig` used by `mountManager`; it indicates that the SCSI attachment should be mounted as a block device.
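
For orientation only, a sketch of what this flag could look like on the mount configuration; apart from `BlockDev`, the field names below are assumptions made for illustration:

```go
// Hypothetical shape of the mount configuration consumed by the mount
// manager; only the BlockDev flag is described by this PR, the other
// fields are placeholders for illustration.
type mountConfig struct {
	HostPath      string // host disk, e.g. \\.\PHYSICALDRIVE1 or a .vhdx path
	ContainerPath string // target path inside the container
	ReadOnly      bool
	BlockDev      bool // attach the SCSI disk as a block device instead of mounting its filesystem
}
```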

The GCS has also been updated to handle `BlockDev`. Instead of mounting the filesystem, the GCS creates a symlink to the block device corresponding to the SCSI attachment. The shim sets the symlink path as the source of a bind mount in the OCI container spec. The GCS resolves the symlink and adds the corresponding device cgroup rule; without it, the container would not be able to access the block device.
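
A rough sketch of the GCS-side steps using the OCI runtime-spec types (function and variable names are illustrative, not the PR's actual code): create the symlink, resolve it back to the device node, and add an allow rule to the container's device cgroup:

```go
package gcs

import (
	"os"
	"path/filepath"

	specs "github.com/opencontainers/runtime-spec/specs-go"
	"golang.org/x/sys/unix"
)

// exposeBlockDevice is an illustrative sketch. devPath is the device node of
// the SCSI attachment inside the UVM (e.g. /dev/sdb); linkPath is the symlink
// the shim uses as the bind-mount source in the OCI spec; resources is the
// container's OCI Linux resources section.
func exposeBlockDevice(resources *specs.LinuxResources, devPath, linkPath string) error {
	// Symlink rather than bind-mount the device directly, since the shim
	// doesn't know where the device will appear inside the UVM.
	if err := os.Symlink(devPath, linkPath); err != nil {
		return err
	}

	// Resolve the symlink back to the real device node.
	resolved, err := filepath.EvalSymlinks(linkPath)
	if err != nil {
		return err
	}

	// Look up the device's major/minor numbers.
	var st unix.Stat_t
	if err := unix.Stat(resolved, &st); err != nil {
		return err
	}
	major := int64(unix.Major(uint64(st.Rdev)))
	minor := int64(unix.Minor(uint64(st.Rdev)))

	// Allow the block device in the container's device cgroup; without this
	// rule the container cannot open the device.
	resources.Devices = append(resources.Devices, specs.LinuxDeviceCgroup{
		Allow:  true,
		Type:   "b",
		Major:  &major,
		Minor:  &minor,
		Access: "rwm",
	})
	return nil
}
```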

We chose the symlink approach instead of bind mounting the device directly because the shim doesn't know the path at which the device will appear inside the UVM. For bind mounting to work, we would either need to encode the SCSI controller/LUN in the OCI mount's HostPath or update the communication protocol between the shim and the GCS, so that the GCS either returns the device path or lets the shim query for it.

Below are some CRI container config examples for physical and virtual disks:

Passthrough physical disk:

```json
{
    ...
    "mounts": [
        {
            "host_path": "\\\\.\\PHYSICALDRIVE1",
            "container_path": "blockdev:///my/block/mount",
            "readonly": false
        }
    ]
    ...
}
```

Virtual VHD disk:

```json
{
    ...
    "mounts": [
        {
            "host_path": "C:\\path\\to\\my\\disk.vhdx",
            "container_path": "blockdev:///my/block/mount",
            "readonly": false
        }
    ]
    ...
}
```

anmaxvl requested a review from a team as a code owner, June 13, 2024 18:51