feature: block-device mounts #2168

Merged 1 commit on Jun 26, 2024

Commits on Jun 24, 2024

  1. feature: block-device mounts

    This PR adds the capability to mount virtual and passthrough disks
    as block devices inside containers.
    
    We add a new "blockdev://" prefix to OCI `Mount.ContainerPath`,
    which indicates that the source should be mounted as a block
    device.
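
As an illustration of the prefix convention, here is a minimal Go sketch of how a container path could be split into the real mount path and a block-device flag. The helper name `parseContainerPath` and the constant `blockDevPrefix` are hypothetical and are not taken from this PR's code.

```go
package main

import (
	"fmt"
	"strings"
)

// blockDevPrefix is the container-path prefix described in the commit message.
// The constant name is an assumption made for this sketch.
const blockDevPrefix = "blockdev://"

// parseContainerPath is a hypothetical helper. It strips the prefix when it is
// present and reports whether a block-device mount was requested.
func parseContainerPath(containerPath string) (path string, blockDev bool) {
	if strings.HasPrefix(containerPath, blockDevPrefix) {
		return strings.TrimPrefix(containerPath, blockDevPrefix), true
	}
	return containerPath, false
}

func main() {
	for _, p := range []string{"blockdev:///my/block/mount", "/my/fs/mount"} {
		path, blockDev := parseContainerPath(p)
		fmt.Printf("%q -> path=%q blockDev=%v\n", p, path, blockDev)
	}
}
```

With this convention, `blockdev:///my/block/mount` yields the path `/my/block/mount`, while a path without the prefix is passed through unchanged as a regular filesystem mount.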
    
    A new `BlockDev` field has been added to the `mountConfig` used by
    `mountManager`. It indicates that the SCSI attachment should be
    mounted as a block device.
    
    The GCS has also been updated to handle `BlockDev`. Instead of
    mounting the filesystem, GCS creates a symlink to the block device
    corresponding to the SCSI attachment. The shim sets the symlink
    path as the source of a bind mount in the OCI container spec. GCS
    then resolves the symlink and adds the corresponding device
    cgroup. Without the cgroup, the container cannot access the block
    device.
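
The symlink-then-resolve flow can be sketched in Go as follows. This is only an illustration under stated assumptions: the function names are hypothetical, a regular file stands in for the real device node, and the rule string uses the cgroup v1 `devices.allow` syntax (`b MAJOR:MINOR rwm`), which may not be how GCS actually programs the device cgroup.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// devMajorMinor decodes a Linux dev_t into major and minor numbers using the
// same bit layout as glibc's major()/minor() macros.
func devMajorMinor(rdev uint64) (major, minor uint32) {
	major = uint32((rdev&0x00000000000fff00)>>8) | uint32((rdev&0xfffff00000000000)>>32)
	minor = uint32(rdev&0x00000000000000ff) | uint32((rdev&0x00000ffffff00000)>>12)
	return major, minor
}

// deviceCgroupRule formats a block-device rule in cgroup v1 devices.allow
// syntax. Using this syntax here is an assumption of the sketch.
func deviceCgroupRule(rdev uint64, access string) string {
	major, minor := devMajorMinor(rdev)
	return fmt.Sprintf("b %d:%d %s", major, minor, access)
}

func main() {
	dir, err := os.MkdirTemp("", "blockdev-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// A regular file stands in for the device node (e.g. /dev/sde).
	target := filepath.Join(dir, "sde")
	if err := os.WriteFile(target, nil, 0o600); err != nil {
		panic(err)
	}

	// Instead of mounting a filesystem, create a symlink to the device.
	link := filepath.Join(dir, "m4")
	if err := os.Symlink(target, link); err != nil {
		panic(err)
	}

	// Later, resolve the symlink to find the device behind the bind source.
	resolved, err := filepath.EvalSymlinks(link)
	if err != nil {
		panic(err)
	}
	fmt.Println(link, "->", resolved)

	// 0x840 encodes major 8, minor 64, the usual numbers for /dev/sde.
	fmt.Println(deviceCgroupRule(0x840, "rwm"))
}
```

On a real device node, the `dev_t` value would come from a `stat` call on the resolved path rather than from a hard-coded constant.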
    
    We chose the symlink approach over bind mounting the device
    directly because the shim doesn't know the path at which the
    device will appear inside the UVM. For a direct bind mount to
    work, we would need to either encode the SCSI controller/LUN in
    the OCI mount's HostPath or update the communication protocol
    between the shim and GCS, so that GCS either returns the device
    path or lets the shim query for it.
    
    Below are some CRI container config examples for physical and
    virtual disks:
    
    Passthrough physical disk:
    ```json
    {
        ...
        "mounts": [
            {
                "host_path": "\\\\.\\PHYSICALDRIVE1",
                "container_path": "blockdev:///my/block/mount",
                "readonly": false
            }
        ]
        ...
    }
    ```
    
    Virtual VHD disk:
    ```json
    {
        ...
        "mounts": [
            {
                "host_path": "C:\\path\\to\\my\\disk.vhdx",
                "container_path": "blockdev:///my/block/mount",
                "readonly": false
            }
        ]
        ...
    }
    ```
    
    The mount manager differentiates between a block device mount and
    a filesystem mount. Two containers can use the same managed disk
    inside the UVM as a block device and as a filesystem at the same
    time. For a block device mount, a symlink is created; for a
    filesystem mount, the block device is mounted in the UVM.
    ```
    bash-5.0# ls -l /run/mounts/scsi/
    total 16
    drwxr-xr-x    3 root     root          4096 Jan  1  1970 m0
    drwxr-xr-x    4 root     root          4096 Jun 20 23:20 m1
    drwxr-xr-x   18 root     root          4096 Jan  1  1970 m2
    drwxr-xr-x    3 root     root          4096 Jun 20 23:20 m3
    lrwxrwxrwx    1 root     root             8 Jun 20 23:22 m4 -> /dev/sde
    bash-5.0# mount | grep sde
    /dev/sde on /run/mounts/scsi/m3 type ext4 (rw,relatime)
    ```
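
One way to picture how the same disk can be tracked as both a filesystem mount and a block device mount is to make the block-device flag part of the lookup key. The Go sketch below assumes this design; the types `mountKey` and `sketchMountManager` are hypothetical and do not reflect the actual `mountManager` implementation. Only the `/run/mounts/scsi/mN` path layout is taken from the listing above.

```go
package main

import "fmt"

// mountKey identifies a mount by SCSI location and mount kind. Because
// blockDev is part of the key, one disk can have two independent entries.
type mountKey struct {
	controller uint
	lun        uint
	blockDev   bool
}

// sketchMountManager is a hypothetical, simplified stand-in for a mount manager.
type sketchMountManager struct {
	next   int
	mounts map[mountKey]string
}

func newSketchMountManager() *sketchMountManager {
	return &sketchMountManager{mounts: make(map[mountKey]string)}
}

// mount returns the existing guest path for the key, or allocates a new one.
func (m *sketchMountManager) mount(controller, lun uint, blockDev bool) string {
	key := mountKey{controller: controller, lun: lun, blockDev: blockDev}
	if p, ok := m.mounts[key]; ok {
		return p
	}
	p := fmt.Sprintf("/run/mounts/scsi/m%d", m.next)
	m.next++
	m.mounts[key] = p
	return p
}

func main() {
	m := newSketchMountManager()
	fs := m.mount(0, 4, false) // filesystem mount of the disk
	blk := m.mount(0, 4, true) // block device mount of the same disk
	fmt.Println("filesystem:", fs)
	fmt.Println("block device:", blk)
}
```

Requesting the same disk twice with the same kind reuses the existing path, while requesting it with the other kind allocates a separate entry, which matches the `m3` and `m4` pair in the listing.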
    
    Signed-off-by: Maksim An <maksiman@microsoft.com>
    anmaxvl committed Jun 24, 2024
    92ca394