[BUG] Extension Runs Under the Wrong Slice When Installed on a VMSS - Ubuntu 24.04 #254

Open
@jraycroft

Most of the discussion about this issue is in the actions/runner-images repo, in actions/runner-images#11790.

In summary, after upgrading to Ubuntu 24.04, others and I noticed significant performance problems with Azure Pipelines Agents running in a Virtual Machine Scale Set (VMSS). After a lot of digging, I found that when installed this way, the Azure Pipelines Agent runs as a child of azure.slice/walinuxagent.service.
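
One way to check this on a VM is to read each agent process's cgroup path from /proc. The following is a minimal sketch, assuming a cgroup v2 (unified) hierarchy as on Ubuntu 24.04; the helper names are hypothetical, and the process pattern you pass (for example "Agent.Listener") is an assumption about how the agent appears in the process list.

```shell
# Print the cgroup v2 path recorded in a /proc/<pid>/cgroup file.
# On a unified hierarchy the file holds one line of the form "0::<path>".
cgroup_v2_path() {
  # $1: path to a cgroup file, e.g. /proc/1234/cgroup
  sed -n 's/^0:://p' "$1"
}

# Succeed if the given cgroup path is walinuxagent.service or a child of it.
under_walinuxagent() {
  case "$1" in
    /azure.slice/walinuxagent.service|/azure.slice/walinuxagent.service/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report the cgroup of every process whose command line matches a pattern.
# $1: pattern for pgrep -f (assumption: pick one that matches your agent).
report_agent_cgroups() {
  local pid path
  for pid in $(pgrep -f "$1"); do
    path="$(cgroup_v2_path "/proc/$pid/cgroup")"
    if under_walinuxagent "$path"; then
      echo "$pid: $path (inside walinuxagent.service)"
    else
      echo "$pid: $path"
    fi
  done
}
```

Calling report_agent_cgroups "Agent.Listener" on an affected VM should flag the agent processes; systemd-cgls shows the same information as a tree.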


This is problematic because WALinuxAgent recently started enforcing cgroup limits, which in turn limits the Azure Pipelines Agent to 50% CPU utilization. I haven't been able to pinpoint the exact commit where this enforcement started. The issue can be worked around by creating a systemd drop-in unit file, as I described in the actions/runner-images issue linked above.
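
To see the limit that is actually in effect, the cgroup's cpu.max file can be read directly. This is a rough sketch, assuming cgroup v2 mounted at /sys/fs/cgroup; the function name is hypothetical. On cgroup v2, cpu.max contains "<quota> <period>" in microseconds, or "max <period>" when no limit is set.

```shell
# Convert the contents of a cgroup v2 cpu.max file into a percentage
# of one CPU, or "unlimited" when no quota is set.
cpu_quota_percent() {
  # $1: contents of cpu.max, e.g. "50000 100000" or "max 100000"
  local quota period
  quota="${1%% *}"    # text before the first space
  period="${1##* }"   # text after the last space
  if [ "$quota" = "max" ]; then
    echo "unlimited"
  else
    echo "$(( quota * 100 / period ))%"
  fi
}
```

For example, cpu_quota_percent "$(cat /sys/fs/cgroup/azure.slice/walinuxagent.service/cpu.max)" would print 50% if the file contains "50000 100000". The path is an assumption derived from the cgroup name above.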

I believe, though, that the Azure Pipelines Agent isn't running in the correct context when installed on a VMSS. As far as I can tell, the WALinuxAgent maintainers don't intend for VM extensions to run as children of walinuxagent.service.

See here:

  """
  The agent creates "azure.slice" for use by extensions and the agent. The agent runs under "azure.slice" directly and each
  extension runs under its own slice ("Microsoft.CPlat.Extension.slice" in the example below). All the slices for
  extensions are grouped under "vmextensions.slice".

  Example:  -.slice
            ├─user.slice
            ├─system.slice
            └─azure.slice
              ├─walinuxagent.service
              │ ├─5759 /usr/bin/python3 -u /usr/sbin/waagent -daemon
              │ └─5764 python3 -u bin/WALinuxAgent-2.2.53-py2.7.egg -run-exthandlers
              └─azure-vmextensions.slice
                └─Microsoft.CPlat.Extension.slice
                    └─5894 /usr/bin/python3 /var/lib/waagent/Microsoft.CPlat.Extension-1.0.0.0/enable.py

  This method ensures that the "azure" and "vmextensions" slices are created. Setup should create those slices
  under /lib/systemd/system; but if they do not exist, __ensure_azure_slices_exist will create them.
  """

https://github.com/Azure/WALinuxAgent/blob/6a01e43b71795dba1c7078fb89fd423e7f986097/azurelinuxagent/ga/cgroupconfigurator.py#L278

To recap, the workaround is to:

  1. Create a script that generates a systemd drop-in unit file with this content:
       [Service]
       CPUQuota=
  2. Run that script on the VM.

We've implemented it with the following script, saved as systemd-conf.sh:

sudo mkdir -p /usr/lib/systemd/system/walinuxagent.service.d && sudo touch /usr/lib/systemd/system/walinuxagent.service.d/13-CPUQuota.conf
cat <<EOT | sudo tee /usr/lib/systemd/system/walinuxagent.service.d/13-CPUQuota.conf
[Service]
CPUQuota=
EOT

The script is then run on the VMSS instances through a CustomScript extension, defined in Bicep:

  resource customScript 'extensions' = {
    name: 'CustomScript'
    properties: {
      autoUpgradeMinorVersion: true
      publisher: 'Microsoft.Azure.Extensions'
      type: 'CustomScript'
      typeHandlerVersion: '2.1'
      settings: {
        fileUris: [
          '${storage.properties.primaryEndpoints.blob}scripts/ubuntu2404/systemd-conf.sh'
        ]
      }
      protectedSettings: {
        commandToExecute: './systemd-conf.sh'
        managedIdentity: {}
      }
    }
  }
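
After the extension has run, the result can be checked on an instance. The sketch below is an assumption-laden example, not part of our deployment: the function names are hypothetical, and the drop-in path is the one written by the script above. An empty CPUQuota= assignment resets the setting to its default (no quota), so systemctl show should report CPUQuotaPerSecUSec=infinity once systemd has picked up the drop-in.

```shell
# Succeed if the file contains an empty "CPUQuota=" assignment,
# which resets the quota to systemd's default (no limit).
dropin_resets_cpu_quota() {
  # $1: path to a drop-in .conf file
  grep -Eq '^[[:space:]]*CPUQuota=[[:space:]]*$' "$1"
}

# Check the drop-in written by systemd-conf.sh and print the quota
# systemd currently has loaded for walinuxagent.service.
verify_walinuxagent_quota() {
  local dropin=/usr/lib/systemd/system/walinuxagent.service.d/13-CPUQuota.conf
  if ! dropin_resets_cpu_quota "$dropin"; then
    echo "drop-in missing or does not reset CPUQuota: $dropin"
    return 1
  fi
  systemctl show walinuxagent.service -p CPUQuotaPerSecUSec
}
```

If the reported value is still a finite quota, systemd may not have re-read the unit files yet; sudo systemctl daemon-reload does that, though whether a reload alone is enough on a given image is not something I have confirmed.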
