blkio limits don't get applied on attached BeegFS volume #6277

Closed
maran opened this issue Oct 7, 2019 · 4 comments · Fixed by #6288

Comments

@maran

commented Oct 7, 2019

Required information

  • Distribution: Ubuntu 18.04
  • The output of "lxc info" or if that fails:
config: {}
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIICGzCCAaKgAwIBAgIRAJaAdO+HvgrkGbCq34yKPVcwCgYIKoZIzj0EAwMwPjEc
    MBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzEeMBwGA1UEAwwVcm9vdEAxODA0
    LmRldi5ieXNoLm1lMB4XDTE5MTAwNDA3MzUwMVoXDTI5MTAwMTA3MzUwMVowPjEc
    MBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzEeMBwGA1UEAwwVcm9vdEAxODA0
    LmRldi5ieXNoLm1lMHYwEAYHKoZIzj0CAQYFK4EEACIDYgAEhsMR682NdOy37MmB
    KUFO33ElBdopi0DKHgGntL6KcLT612TVJkY8hIKeQ8Arh7UVfBHfgzeUbQLgoxUO
    Nz+1uzvzOg2euvNc++opFsFIIlifSVba0niQHEDIwJ/vD7gCo2QwYjAOBgNVHQ8B
    Af8EBAMCBaAwEwYDVR0lBAwwCgYIKwYBBQUHAwEwDAYDVR0TAQH/BAIwADAtBgNV
    HREEJjAkghAxODA0LmRldi5ieXNoLm1lhwRf06HChwQKCAABhwSsEQABMAoGCCqG
    SM49BAMDA2cAMGQCMFrqHrF3wytHd5kDsDF/cMTnj1DsjTKB3vgdGyulctL0vF41
    hTjNGdunnBc13GV4MgIwMcEjfgNumzZ17YP8Of1nREbDQ4fVtkNCzHU1Q7L635EG
    spaU9UN60c/FO4PgKUa9
    -----END CERTIFICATE-----
  certificate_fingerprint: 57e6d38c216b0a4c8790940f5b7450a45aa4f68b450c2609c103476c2ac06e32
  driver: lxc
  driver_version: 3.0.3
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    shiftfs: "true"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 5.0.0-29-generic
  lxc_features:
    mount_injection_file: "false"
    network_gateway_device_route: "false"
    network_ipvlan: "false"
    network_l2proxy: "false"
    network_phys_macvlan_mtu: "false"
    seccomp_notify: "false"
  project: default
  server: lxd
  server_clustered: false
  server_name: 1804.dev.bysh.me
  server_pid: 362974
  server_version: "3.18"
  storage: ceph
  storage_version: ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4)
    mimic (stable)

Issue description

As discussed here, when working with an RBD volume it appears that limits.read and limits.write are not translated into valid cgroup limits.

Steps to reproduce

  1. Set up a Ceph RBD backend
  2. Create a container and attach an RBD volume to it
  3. Set limits in the config file, in my example:
devices:
  maran-test:
    limits.read: 1MB
    limits.write: 1MB
    path: /mnt/external
    pool: default
    source: maran-test
    type: disk
  4. Expect to see data in blkio.throttle.write_bps_device or blkio.throttle.read_bps_device; instead, both are empty.
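
For anyone checking step 4 by hand, here is a minimal sketch of reading those files. It assumes cgroup v1, where each line of a blkio.throttle.*_bps_device file has the form `major:minor bytes_per_second`. The function name `parseThrottle`, the device number 252:0 and the rate are illustrative assumptions, not values taken from this setup.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseThrottle parses the contents of a cgroup v1
// blkio.throttle.*_bps_device file. Each non-empty line is expected to be
// "MAJOR:MINOR BYTES_PER_SECOND". An empty file yields an empty map, which
// is the symptom described in this issue.
func parseThrottle(content string) (map[string]uint64, error) {
	limits := map[string]uint64{}
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if len(fields) != 2 {
			return nil, fmt.Errorf("unexpected throttle line %q", scanner.Text())
		}
		bps, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("bad rate in %q: %v", scanner.Text(), err)
		}
		limits[fields[0]] = bps
	}
	return limits, scanner.Err()
}

func main() {
	// Hypothetical file contents; 252:0 is an invented device number.
	limits, err := parseThrottle("252:0 1000000\n")
	fmt.Println(limits, err)
}
```

With a working limit the map has one entry per throttled device; in the failing case reported here it would come back empty.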

Let me know how I can help with this.

@stgraber

Member

commented Oct 7, 2019

So I'm pretty confused as to how limits would actually ever have worked since @tomponline's rework of devices.

The issue is that the Start() function on the device is called before the container starts. It generates the list of needed mounts but doesn't mount anything yet. This in turn means that we don't yet know what device backs the source path, so we can't compute the needed cgroup entries.

I suspect what we need to do is move the limit calculation to a PostHook, which then allows us to inspect the mounted disks. That PostHook should then return a RunConfig with the cgroup entries we expect and we can then have LXD apply them through LXC.

@tomponline does that sound right to you?
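
The ordering proposed above could be sketched roughly as follows. Every type and function name here (`CgroupEntry`, `RunConfig`, `deviceStart`, `startContainer`) is hypothetical and only mirrors the terms used in this thread; none of it is LXD's actual API, and the mount table is simulated with a map.

```go
package main

import "fmt"

// Every name here is hypothetical. The types only mirror the terms used in
// the discussion (RunConfig, PostHooks) and are not LXD's real API.
type CgroupEntry struct{ Key, Value string }

type RunConfig struct {
	Mounts    []string                     // mount targets to set up at start
	Cgroups   []CgroupEntry                // cgroup entries to apply
	PostHooks []func() (*RunConfig, error) // run after the container started
}

// deviceStart runs before the container starts. The mount does not exist
// yet, so the limit is deferred to a post hook instead of computed here.
func deviceStart(path string, bps uint64, mounted map[string]string) *RunConfig {
	hook := func() (*RunConfig, error) {
		dev, ok := mounted[path]
		if !ok {
			return nil, fmt.Errorf("%s is not mounted", path)
		}
		entry := CgroupEntry{
			Key:   "blkio.throttle.read_bps_device",
			Value: fmt.Sprintf("%s %d", dev, bps),
		}
		return &RunConfig{Cgroups: []CgroupEntry{entry}}, nil
	}
	return &RunConfig{
		Mounts:    []string{path},
		PostHooks: []func() (*RunConfig, error){hook},
	}
}

// startContainer simulates a start: mounts appear first, then post hooks run
// and the cgroup entries they return are collected.
func startContainer(pre *RunConfig, mounted map[string]string) ([]CgroupEntry, error) {
	for _, m := range pre.Mounts {
		mounted[m] = "252:0" // invented device number standing in for a real mount
	}
	applied := append([]CgroupEntry{}, pre.Cgroups...)
	for _, hook := range pre.PostHooks {
		post, err := hook()
		if err != nil {
			return nil, err
		}
		applied = append(applied, post.Cgroups...)
	}
	return applied, nil
}

func main() {
	mounted := map[string]string{}
	pre := deviceStart("/mnt/external", 1000000, mounted)
	entries, err := startContainer(pre, mounted)
	fmt.Println(entries, err)
}
```

The point of the sketch is only the ordering: the pre-start config carries no cgroup entries, and the hook can produce one only after the mount exists.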

@tomponline

Member

commented Oct 7, 2019

@stgraber the cgroup settings are returned as part of the run config by Start(). When Start() is called as part of container start, the cgroup rules are translated into liblxc settings so they are applied when the container actually starts, here: https://github.com/lxc/lxd/blob/master/lxd/container_lxc.go#L2243-L2251. This is the same technique used for the actual mounts.

I'll take a look and check whether it's working on other disk types, to see if it's specific to RBD.

@stgraber

Member

commented Oct 7, 2019

@tomponline the problem is that those rules cannot be generated until a mount entry exists and that mount entry won't exist until RunConfig is applied.

That's why I'm now adding a PostRunConfig which can be filled through PostHooks and gets applied after the container has started.

This then allows us to resolve the mounts to block devices and figure out the limits.
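
As a rough illustration of resolving a mount to its block device, the sketch below reads text in the /proc/<pid>/mountinfo format, where the third field is `major:minor` and the fifth is the mount point. The function name `deviceForMount` and the sample line (including 252:0 and /dev/rbd0) are assumptions for illustration; this is not the code from the fix.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// deviceForMount scans text in /proc/<pid>/mountinfo format and returns the
// "major:minor" field of the last entry mounted at mountPoint.
// Note: mountinfo escapes spaces in paths as \040; this sketch compares the
// raw field and does not unescape it.
func deviceForMount(mountinfo, mountPoint string) (string, bool) {
	dev, found := "", false
	scanner := bufio.NewScanner(strings.NewReader(mountinfo))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		// Fields: mount ID, parent ID, major:minor, root, mount point, ...
		if len(fields) >= 5 && fields[4] == mountPoint {
			dev, found = fields[2], true // later entries shadow earlier ones
		}
	}
	return dev, found
}

func main() {
	// Invented sample line; a real caller would read /proc/<pid>/mountinfo.
	sample := "36 25 252:0 / /mnt/external rw,relatime shared:1 - ext4 /dev/rbd0 rw\n"
	if dev, ok := deviceForMount(sample, "/mnt/external"); ok {
		// The value a throttle file expects: "major:minor bytes_per_second".
		fmt.Printf("%s %d\n", dev, 1000000)
	}
}
```

This only works once the mount exists, which is why the lookup has to happen after the container has started.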

@stgraber

Member

commented Oct 7, 2019

I've got a branch which does this now and that's fixed the limits for the root device here.
There's still a problem resolving the mount for devices that are hotplugged though, so looking into those still.

stgraber added a commit to stgraber/lxd that referenced this issue Oct 7, 2019
Closes lxc#6277

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
@stgraber stgraber added the Bug label Oct 7, 2019
@stgraber stgraber self-assigned this Oct 7, 2019
@stgraber stgraber added this to the lxd-3.19 milestone Oct 7, 2019
@brauner brauner closed this in #6288 Oct 7, 2019