performance regression in dracut-install 060 #316

Open
bdrung opened this issue May 24, 2024 · 13 comments · May be fixed by #332
@bdrung
Contributor

bdrung commented May 24, 2024

Describe the bug

Compared to Ubuntu 23.10 (dracut 059), creating initramfs files with update-initramfs in Ubuntu 24.04 (dracut 060) takes 2 to 5 times longer on ARM devices.

Distribution used
Ubuntu

Dracut version
060

To Reproduce
Run dracut-install on an ARM device.

Additional context

@bdrung bdrung added the bug Our bugs label May 24, 2024
@LaszloGombos
Collaborator

LaszloGombos commented May 24, 2024

1./ > first bad commit - dracutdevs/dracut@3de4c73

dracutdevs/dracut#2075

CC @athierry1

2./
Recent performance bug report on Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=2278534 (might be a separate issue).

From @Conan-Kudo

It looks like this commit may improve things: 80f2caf

@LaszloGombos LaszloGombos added drm Issues related to the drm module and removed drm Issues related to the drm module labels May 24, 2024
@bdrung
Contributor Author

bdrung commented May 24, 2024

From the Ubuntu bug:

first bad commit 3de4c73

@alpernebbi
Contributor

alpernebbi commented May 24, 2024

I think dracut-install parses every fw_devlink supplier to get a dependency graph (resulting in a lot of /sys and modules.alias.bin reads), then uses that to install dependencies for modules given as arguments. Maybe it could parse suppliers for only the given modules? Or maybe it could cache the supplier dependency information across multiple runs on the same system?

FWIW, I get this on a RK3399 system:

$ time /usr/lib/dracut/dracut-install -D "$(mktemp -d)" --kerneldir "/lib/modules/$(uname -r)" -v -o -m nonexistent
dracut-install: Failed to find module 'nonexistent'

real    0m2.596s
user    0m1.728s
sys     0m0.864s
Some of the syscalls it makes:
$ strace -e trace=readlinkat,openat,newfstatat /usr/lib/dracut/dracut-install -D "$(mktemp -d)" --kerneldir "/lib/modules/$(uname -r)" -v -o -m nonexistent | uniq -c | sort -nr | head -20
   4984 newfstatat(4, "", {st_mode=S_IFDIR|0755, st_size=0, ...}, AT_EMPTY_PATH) = 0
   1673 readlinkat(AT_FDCWD, "/sys/devices/platform", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
   1429 openat(AT_FDCWD, "..", O_RDONLY)        = 4
   1272 readlinkat(AT_FDCWD, "/sys/devices", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
   1272 readlinkat(AT_FDCWD, "/sys", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    550 openat(AT_FDCWD, "power", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 4
    550 newfstatat(AT_FDCWD, "power", {st_mode=S_IFDIR|0755, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
    401 readlinkat(AT_FDCWD, "/sys/devices/virtual/devlink", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    401 readlinkat(AT_FDCWD, "/sys/devices/virtual", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    291 readlinkat(AT_FDCWD, "/sys/devices/platform/ff200000.spi", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    239 readlinkat(AT_FDCWD, "/sys/devices/platform/ff200000.spi/spi_master/spi5/spi5.0", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    239 readlinkat(AT_FDCWD, "/sys/devices/platform/ff200000.spi/spi_master/spi5", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    239 readlinkat(AT_FDCWD, "/sys/devices/platform/ff200000.spi/spi_master", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    202 openat(AT_FDCWD, "/lib/modules/6.9+unreleased-arm64/modules.alias.bin", O_RDONLY|O_CLOEXEC) = 5
    202 newfstatat(5, "", {st_mode=S_IFREG|0644, st_size=1342225, ...}, AT_EMPTY_PATH) = 0
    195 readlinkat(AT_FDCWD, "/sys/devices/platform/pinctrl", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    174 readlinkat(AT_FDCWD, "/sys/devices/platform/usb@fe900000", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    156 readlinkat(AT_FDCWD, "/sys/devices/platform/usb@fe900000/fe900000.usb", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
    110 readlinkat(AT_FDCWD, "/sys/devices/platform/usb@fe900000/fe900000.usb/xhci-hcd.13.auto", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
     98 readlinkat(AT_FDCWD, "/sys/devices/platform/ff200000.spi/spi_master/spi5/spi5.0/cros-ec-dev.0.auto", 0xffffec4b0a40, 1023) = -1 EINVAL (Invalid argument)
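
For anyone reproducing this tally: strace writes its trace to stderr, and `uniq -c` only merges adjacent identical lines, so it is usually easier to save the trace to a file and sort it before counting. A minimal sketch, assuming the trace was saved first (for example with strace's `-o` option); `tally_syscalls` is a hypothetical helper name and not part of dracut:

```shell
#!/bin/sh
# Hypothetical helper: print the N most frequent lines of a saved strace log.
# Usage: tally_syscalls LOGFILE [N]
tally_syscalls() {
    log=$1
    top=${2:-20}
    # Sort first so identical lines become adjacent, count them with uniq -c,
    # then order by count, highest first.
    sort "$log" | uniq -c | sort -nr | head -n "$top"
}
```

Calling `tally_syscalls trace.log 20` on a saved trace should give a table in the same shape as the one above, although the counts depend on the machine and kernel.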

Recent performance bug report on Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=2278534 (might be a separate issue).

Looks similar if not the same. Except we have initramfs-tools calling dracut-install a lot instead of /usr/lib/dracut/modules.d/50drm/module-setup.sh calling dracut-install a lot (via dracut_instmods).

@bdrung
Contributor Author

bdrung commented May 24, 2024

I just annotated manual_add_modules in /usr/share/initramfs-tools/hook-functions and ran it on my amd64 laptop:

/usr/lib/dracut/dracut-install -o -m -P /hid-(a4tech|cypress|dr|elecom|gyration|icade|kensington|kye|lcpower|magicmouse|ntrig|petalynx|picolcd|pl|ps3remote|quanta|roccat-ko.*|roccat-pyra|saitek|sensor-hub|sony|speedlink|tivo|twinhan|uclogic|wacom|waltop|wiimote|zydacron|.*ff)\.ko =drivers/hid
/usr/lib/dracut/dracut-install -o -m -P /((cdc_mbim|ipheth|qmi_wwan|sierra_net|veth|xen-netback)\.ko|(isdn|net/ethernet|net/phy|net/team|uwb|wan|wireless)/) -s eth_type_trans|register_virtio_device|usbnet_open =drivers/net
/usr/lib/dracut/dracut-install -o -m -s ahci_platform_get_resources|ata_scsi_ioctl|scsi_add_host|blk_cleanup_queue|register_mtd_blktrans|scsi_esp_register|register_virtio_device|usb_stor_disconnect|mmc_add_host|sdhci_add_host|scsi_add_host_with_dma|blk_mq_alloc_disk|blk_mq_alloc_request|blk_mq_destroy_queue|blk_cleanup_disk|iscsi_register_transport =drivers/scsi =drivers/ufs
/usr/lib/dracut/dracut-install -o -m -s ahci_platform_get_resources|ata_scsi_ioctl|scsi_add_host|blk_cleanup_queue|register_mtd_blktrans|scsi_esp_register|register_virtio_device|usb_stor_disconnect|mmc_add_host|sdhci_add_host|scsi_add_host_with_dma|blk_mq_alloc_disk|blk_mq_alloc_request|blk_mq_destroy_queue|blk_cleanup_disk =drivers/block =drivers/nvme =drivers/dax vmd
/usr/lib/dracut/dracut-install -o -m -s nvdimm_bus_register =drivers/nvdimm =drivers/acpi
/usr/lib/dracut/dracut-install -o -m -s ahci_platform_get_resources|ata_scsi_ioctl|scsi_add_host|blk_cleanup_queue|register_mtd_blktrans|scsi_esp_register|register_virtio_device|usb_stor_disconnect|mmc_add_host|sdhci_add_host|scsi_add_host_with_dma|blk_mq_alloc_disk|blk_mq_alloc_request|blk_mq_destroy_queue|blk_cleanup_disk =drivers/ata
/usr/lib/dracut/dracut-install -o -m btrfs ext2 ext3 ext4 f2fs isofs jfs reiserfs squashfs udf xfs nfs nfsv2 nfsv3 nfsv4 af_packet atkbd i8042 psmouse virtio_pci virtio_mmio vfat nls_cp437 nls_iso8859-1 ehci-hcd ehci-pci ehci-platform ohci-hcd ohci-pci uhci-hcd usbhid xhci-hcd xhci-pci xhci-plat-hcd =drivers/usb/typec =drivers/usb/c67x00 =drivers/usb/renesas_usbhs extcon-usb-gpio extcon-usbc-cros-ec =drivers/input/keyboard cros_ec_spi intel_lpss_pci spi_pxa2xx_platform surface_aggregator_registry =drivers/tty/serial =drivers/bus =drivers/i2c/muxes =drivers/pci/controller =drivers/pinctrl =drivers/char/hw_random =drivers/net/ethernet =drivers/net/mdio =drivers/net/phy 8021q ipvlan =drivers/ide be2iscsi bnx2i cxgb3i cxgb4i qedi qla4xxx scsi_dh_alua scsi_dh_emc scsi_dh_rdac mptfc mptsas mptscsih mptspi zfcp scsi_transport_srp dax_pmem nd_pmem dasd_diag_mod dasd_eckd_mod dasd_fba_mod firewire-ohci firewire-sbp2 =drivers/mmc =drivers/usb/storage rockchipdrm pwm-cros-ec pwm_bl pwm-rockchip panel-simple analogix-anx6345 pwm-sun4i sun4i-drm sun8i-mixer panel-edp pwm_imx27 nwl-dsi ti-sn65dsi86 imx-dcss mux-mmio mxsfb imx8mq-interconnect hv_vmbus hv_utils hv_netvsc hv_mouse hv_storvsc hyperv-keyboard nx-compress nx-compress-crypto nx-compress-platform nx-compress-pseries nx-compress-powernv 842-decompress
/usr/lib/dracut/dracut-install -o -m psmouse crc32c mlx4_ib mlx5_ib vmd crc32 crc32c
/usr/lib/dracut/dracut-install -o -m dm_mod
/usr/lib/dracut/dracut-install -o -m dm_crypt
/usr/lib/dracut/dracut-install -o -m serpent-avx-x86_64 poly1305-x86_64 aria-aesni-avx-x86_64 sm4-aesni-avx2-x86_64 twofish-x86_64 sm3-avx-x86_64 twofish-x86_64-3way sm4-aesni-avx-x86_64 des3_ede-x86_64 aria-gfni-avx512-x86_64 crct10dif-pclmul serpent-avx2 aria-aesni-avx2-x86_64 camellia-x86_64 twofish-avx-x86_64 polyval-clmulni ghash-clmulni-intel cast5-avx-x86_64 sha256-ssse3 serpent-sse2-x86_64 nhpoly1305-avx2 chacha-x86_64 camellia-aesni-avx2 sha1-ssse3 blowfish-x86_64 cast6-avx-x86_64 crc32-pclmul aegis128-aesni nhpoly1305-sse2 aesni-intel curve25519-x86_64 camellia-aesni-avx-x86_64
/usr/lib/dracut/dracut-install -o -m sm2_generic pcrypt md4 ansi_cprng nhpoly1305 chacha20poly1305 sm3_generic keywrap echainiv 842 xctr lrw twofish_common ccm blowfish_generic algif_hash adiantum xcbc algif_rng crypto_user ecdh_generic pkcs8_key_parser pkcs7_test_key camellia_generic blake2b_generic vmac authenc aria_generic rmd160 ecrdsa_generic hctr2 poly1305_generic ecdsa_generic blowfish_common xxhash_generic af_alg crc32_generic sm4_generic twofish_generic authencesn cast5_generic des_generic aegis128 algif_aead tcrypt pcbc wp512 ecc chacha_generic curve25519-generic lz4 serpent_generic xor sm4 algif_skcipher cast_common michael_mic zstd polyval-generic crypto_simd cast6_generic async_tx async_pq async_raid6_recov async_memcpy async_xor essiv sm3 crypto_engine streebog_generic cryptd cmac lz4hc aes_ti fcrypt
/usr/lib/dracut/dracut-install -o -m -s drm_privacy_screen_register =drivers/platform/x86
/usr/lib/dracut/dracut-install -o -m efifb fbcon simplefb vesafb vga16fb =drivers/gpu/drm/tiny vboxvideo virtio-gpu
/usr/lib/dracut/dracut-install -o -m fuse
/usr/lib/dracut/dracut-install -o -m fan thermal
/usr/lib/dracut/dracut-install -o -m dm-cache
/usr/lib/dracut/dracut-install -o -m dm-cache-smq
/usr/lib/dracut/dracut-install -o -m dm-thin-pool
/usr/lib/dracut/dracut-install -o -m dm_mod
/usr/lib/dracut/dracut-install -o -m dm_snapshot
/usr/lib/dracut/dracut-install -o -m dm_mirror
/usr/lib/dracut/dracut-install -o -m dm_raid
/usr/lib/dracut/dracut-install -o -m raid0
/usr/lib/dracut/dracut-install -o -m raid1
/usr/lib/dracut/dracut-install -o -m raid10
/usr/lib/dracut/dracut-install -o -m raid456
/usr/lib/dracut/dracut-install -o -m dm_mod

On that setup initramfs-tools had 28 dracut-install calls.

@bdrung
Contributor Author

bdrung commented May 24, 2024

I could combine some of those calls, reducing the number from 28 to 18:

/usr/lib/dracut/dracut-install -o -m -P /hid-(a4tech|cypress|dr|elecom|gyration|icade|kensington|kye|lcpower|magicmouse|ntrig|petalynx|picolcd|pl|ps3remote|quanta|roccat-ko.*|roccat-pyra|saitek|sensor-hub|sony|speedlink|tivo|twinhan|uclogic|wacom|waltop|wiimote|zydacron|.*ff)\.ko =drivers/hid
/usr/lib/dracut/dracut-install -o -m -P /((cdc_mbim|ipheth|qmi_wwan|sierra_net|veth|xen-netback)\.ko|(isdn|net/ethernet|net/phy|net/team|uwb|wan|wireless)/) -s eth_type_trans|register_virtio_device|usbnet_open =drivers/net
/usr/lib/dracut/dracut-install -o -m -s ahci_platform_get_resources|ata_scsi_ioctl|scsi_add_host|blk_cleanup_queue|register_mtd_blktrans|scsi_esp_register|register_virtio_device|usb_stor_disconnect|mmc_add_host|sdhci_add_host|scsi_add_host_with_dma|blk_mq_alloc_disk|blk_mq_alloc_request|blk_mq_destroy_queue|blk_cleanup_disk|iscsi_register_transport =drivers/scsi =drivers/ufs
/usr/lib/dracut/dracut-install -o -m -s ahci_platform_get_resources|ata_scsi_ioctl|scsi_add_host|blk_cleanup_queue|register_mtd_blktrans|scsi_esp_register|register_virtio_device|usb_stor_disconnect|mmc_add_host|sdhci_add_host|scsi_add_host_with_dma|blk_mq_alloc_disk|blk_mq_alloc_request|blk_mq_destroy_queue|blk_cleanup_disk =drivers/block =drivers/nvme =drivers/dax vmd
/usr/lib/dracut/dracut-install -o -m -s nvdimm_bus_register =drivers/nvdimm =drivers/acpi
/usr/lib/dracut/dracut-install -o -m -s ahci_platform_get_resources|ata_scsi_ioctl|scsi_add_host|blk_cleanup_queue|register_mtd_blktrans|scsi_esp_register|register_virtio_device|usb_stor_disconnect|mmc_add_host|sdhci_add_host|scsi_add_host_with_dma|blk_mq_alloc_disk|blk_mq_alloc_request|blk_mq_destroy_queue|blk_cleanup_disk =drivers/ata
/usr/lib/dracut/dracut-install -o -m btrfs ext2 ext3 ext4 f2fs isofs jfs reiserfs squashfs udf xfs nfs nfsv2 nfsv3 nfsv4 af_packet atkbd i8042 psmouse virtio_pci virtio_mmio vfat nls_cp437 nls_iso8859-1 ehci-hcd ehci-pci ehci-platform ohci-hcd ohci-pci uhci-hcd usbhid xhci-hcd xhci-pci xhci-plat-hcd =drivers/usb/typec =drivers/usb/c67x00 =drivers/usb/renesas_usbhs extcon-usb-gpio extcon-usbc-cros-ec =drivers/input/keyboard cros_ec_spi intel_lpss_pci spi_pxa2xx_platform surface_aggregator_registry =drivers/tty/serial =drivers/bus =drivers/i2c/muxes =drivers/pci/controller =drivers/pinctrl =drivers/char/hw_random =drivers/net/ethernet =drivers/net/mdio =drivers/net/phy 8021q ipvlan =drivers/ide be2iscsi bnx2i cxgb3i cxgb4i qedi qla4xxx scsi_dh_alua scsi_dh_emc scsi_dh_rdac mptfc mptsas mptscsih mptspi zfcp scsi_transport_srp dax_pmem nd_pmem dasd_diag_mod dasd_eckd_mod dasd_fba_mod firewire-ohci firewire-sbp2 =drivers/mmc =drivers/usb/storage rockchipdrm pwm-cros-ec pwm_bl pwm-rockchip panel-simple analogix-anx6345 pwm-sun4i sun4i-drm sun8i-mixer panel-edp pwm_imx27 nwl-dsi ti-sn65dsi86 imx-dcss mux-mmio mxsfb imx8mq-interconnect hv_vmbus hv_utils hv_netvsc hv_mouse hv_storvsc hyperv-keyboard nx-compress nx-compress-crypto nx-compress-platform nx-compress-pseries nx-compress-powernv 842-decompress
/usr/lib/dracut/dracut-install -o -m psmouse crc32c mlx4_ib mlx5_ib vmd crc32 crc32c
/usr/lib/dracut/dracut-install -o -m dm_mod dm_crypt
/usr/lib/dracut/dracut-install -o -m serpent-avx-x86_64 poly1305-x86_64 aria-aesni-avx-x86_64 sm4-aesni-avx2-x86_64 twofish-x86_64 sm3-avx-x86_64 twofish-x86_64-3way sm4-aesni-avx-x86_64 des3_ede-x86_64 aria-gfni-avx512-x86_64 crct10dif-pclmul serpent-avx2 aria-aesni-avx2-x86_64 camellia-x86_64 twofish-avx-x86_64 polyval-clmulni ghash-clmulni-intel cast5-avx-x86_64 sha256-ssse3 serpent-sse2-x86_64 nhpoly1305-avx2 chacha-x86_64 camellia-aesni-avx2 sha1-ssse3 blowfish-x86_64 cast6-avx-x86_64 crc32-pclmul aegis128-aesni nhpoly1305-sse2 aesni-intel curve25519-x86_64 camellia-aesni-avx-x86_64
/usr/lib/dracut/dracut-install -o -m sm2_generic pcrypt md4 ansi_cprng nhpoly1305 chacha20poly1305 sm3_generic keywrap echainiv 842 xctr lrw twofish_common ccm blowfish_generic algif_hash adiantum xcbc algif_rng crypto_user ecdh_generic pkcs8_key_parser pkcs7_test_key camellia_generic blake2b_generic vmac authenc aria_generic rmd160 ecrdsa_generic hctr2 poly1305_generic ecdsa_generic blowfish_common xxhash_generic af_alg crc32_generic sm4_generic twofish_generic authencesn cast5_generic des_generic aegis128 algif_aead tcrypt pcbc wp512 ecc chacha_generic curve25519-generic lz4 serpent_generic xor sm4 algif_skcipher cast_common michael_mic zstd polyval-generic crypto_simd cast6_generic async_tx async_pq async_raid6_recov async_memcpy async_xor essiv sm3 crypto_engine streebog_generic cryptd cmac lz4hc aes_ti fcrypt
/usr/lib/dracut/dracut-install -o -m -s drm_privacy_screen_register =drivers/platform/x86
/usr/lib/dracut/dracut-install -o -m efifb fbcon simplefb vesafb vga16fb =drivers/gpu/drm/tiny vboxvideo virtio-gpu
/usr/lib/dracut/dracut-install -o -m fuse
/usr/lib/dracut/dracut-install -o -m fan thermal
/usr/lib/dracut/dracut-install -o -m dm-cache dm-cache-smq dm-thin-pool
/usr/lib/dracut/dracut-install -o -m dm_mod dm_snapshot dm_mirror dm_raid raid0 raid1 raid10 raid456
/usr/lib/dracut/dracut-install -o -m dm_mod

The remaining ones will be more complicated, because some of them use different parameters and initramfs-tools hooks can call dracut-install too.

@alpernebbi
Contributor

alpernebbi commented May 24, 2024

I wonder if we can combine them at the manual_add_modules function instead. Like, have it write new arguments to a file, and later call dracut-install once with arguments read from that file?

(Also, I'd say let's keep this bug focused on what we can do in dracut-install, but salsa.debian.org seems down at the moment.)
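
The deferred-call idea above could look roughly like the following. This is only a sketch under stated assumptions: `queue_modules`, `flush_modules`, `MODULE_QUEUE` and `DRACUT_INSTALL` are hypothetical names, not the actual initramfs-tools implementation, and only plain module arguments are batched. Calls that pass `-P` or `-s` filters would still need their own invocation, since those options apply to the whole call.

```shell
#!/bin/sh
# Sketch only: all names below are hypothetical, and the -o -m options are
# copied from the annotated calls earlier in this thread.
DRACUT_INSTALL=${DRACUT_INSTALL:-/usr/lib/dracut/dracut-install}
MODULE_QUEUE=${MODULE_QUEUE:-$(mktemp)}

# Record the requested modules, one per line, instead of installing them now.
queue_modules() {
    for mod in "$@"; do
        printf '%s\n' "$mod" >> "$MODULE_QUEUE"
    done
}

# Install everything queued so far with a single dracut-install call.
flush_modules() {
    [ -s "$MODULE_QUEUE" ] || return 0
    # sort -u drops repeated requests (dm_mod appears several times in the
    # listings earlier in this thread). Module names contain no whitespace,
    # so the unquoted expansion is intentional.
    $DRACUT_INSTALL -o -m $(sort -u "$MODULE_QUEUE")
    status=$?
    : > "$MODULE_QUEUE"
    return "$status"
}
```

With this shape, each hook would call `queue_modules` where it calls `manual_add_modules` today, and `flush_modules` would run once near the end, so the sysfs scan in dracut-install is paid once rather than once per hook.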

@bdrung
Contributor Author

bdrung commented May 28, 2024

Proposed initramfs-tools change to reduce the number of dracut-install calls: https://salsa.debian.org/kernel-team/initramfs-tools/-/merge_requests/114

@alpernebbi
Contributor

I've tried to fix this in #328 (merged) and #332, along with a few other issues I noticed. Please have a look.

@bdrung
Contributor Author

bdrung commented Jun 4, 2024

#328 improves the performance on ARM and marginally on amd64, but #332 does not (see comment there).

@marcan
Contributor

marcan commented Jun 23, 2024

#408 is a further perf improvement here, which I think gets us close to the original performance before the regression (at least on ARM64 systems like MacBooks).

@alpernebbi
Contributor

#332 does not [improve performance].

That has become more about fixing missing cases in the implementation than improving performance. It does new work which makes things slower, but it's vital for some devices and I'd say correctness is worth the performance hit there. To make things clearer, I've split the performance fix there to #479.

@alpernebbi
Contributor

OTOH, I have a better idea. Parsing sysfs on a device is slow and gives machine-dependent results. These supplier relations originate from the device-tree as far as I can tell, so we might be able to parse DTB files to get a reproducible, full list of these dependencies. We would be able to do it once per installed kernel for all its DTB files and cache the result across runs.

Even better, there's a new "weakdep" thing on the kernel side. If we can convince kernel people to parse device-tree files to generate these weak dependencies at compile-time, we can just copy the existing logic we have for usual dependencies.
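
To illustrate the "compute once per installed kernel and cache the result" part of this idea, here is a minimal sketch. Everything in it is an assumption made for illustration: `CACHE_DIR`, `cached_suppliers` and the generator command are hypothetical, and nothing in this thread says dracut has such a cache. The generator stands in for whatever would produce the supplier list (for example a DTB parser).

```shell
#!/bin/sh
# Sketch only: the cache location and helper name are hypothetical.
CACHE_DIR=${CACHE_DIR:-/var/cache/dracut-suppliers}

# Usage: cached_suppliers KERNEL_RELEASE GENERATOR [ARGS...]
# Prints the supplier list for the kernel, running GENERATOR only on a miss.
cached_suppliers() {
    kver=$1
    shift
    cache="$CACHE_DIR/$kver.suppliers"
    if [ ! -f "$cache" ]; then
        mkdir -p "$CACHE_DIR" || return 1
        # Write to a temporary file first so an interrupted run never leaves
        # a truncated cache behind.
        tmp=$(mktemp "$cache.XXXXXX") || return 1
        if "$@" > "$tmp"; then
            mv "$tmp" "$cache"
        else
            rm -f "$tmp"
            return 1
        fi
    fi
    cat "$cache"
}
```

Keying the cache on the kernel release only makes sense if the list comes from files shipped with that kernel, such as its DTBs. A list derived from the running machine's sysfs would be machine-dependent and would need a different invalidation rule.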
