NVMe host buffer optimisation #7314

Merged
pelwell merged 2 commits into raspberrypi:rpi-6.18.y from P33M:nvme_hmb_arm64
Apr 15, 2026
@P33M P33M commented Apr 14, 2026

As per the commit descriptions, this avoids allocating chunks of CMA, probably entirely. With two drives wanting 64MiB (and a cmdline.txt override for the nvme module parameter), CMA usage after boot is identical to the size=0 case:

CmaTotal:          65536 kB
CmaFree:           55232 kB
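
For anyone wanting to repeat the comparison, here is a small sketch that pulls the two CMA counters out of /proc/meminfo-style text and reports what is in use. The function name `cma_usage_kib` is made up for illustration; the sample text is the output quoted above.

```python
def cma_usage_kib(meminfo_text):
    """Return (total, free, used) CMA in kB from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() in ("CmaTotal", "CmaFree"):
            # The value is the first token after the colon; the unit follows it.
            fields[name.strip()] = int(rest.split()[0])
    total, free = fields["CmaTotal"], fields["CmaFree"]
    return total, free, total - free


sample = """CmaTotal:          65536 kB
CmaFree:           55232 kB"""

total, free, used = cma_usage_kib(sample)
print(f"CMA used after boot: {used} kB of {total} kB")
```

On a live system the same function could be fed the contents of /proc/meminfo, once with the HMB enabled and once with size=0, to check that the used figure matches.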

The HMB can be enabled/disabled at runtime, and under memory pressure I've seen the chunking try MAX_ORDER / 4 and succeed.

P33M added 2 commits on April 14, 2026 at 16:07

The Host Memory Buffer allocation algorithm interacts badly with arm64
platforms with no IOMMU for PCIe devices, such as BCM2711/BCM2712.

The discontiguous VA allocation in nvme_alloc_host_mem_single() always
fails, so nvme_alloc_host_mem_multi() falls back to the DMA coherent
allocation scheme. On arm64, this will come out of CMA by default.

Recent DRAM-less SSDs will request significant amounts of host memory -
up to 128MB. As NVMe devices are set up early in boot, CMA is
mostly-free so it ends up being claimed by a driver using it for opaque
device-exclusive buffers. The divide-and-conquer allocation strategy
also paradoxically results in increased CMA pressure if portions are
already reserved.

PCIe NVMe controllers implement a variably-sized HMB descriptor table,
typically ranging from 32 to 256 entries in size. Therefore, aside from
implementation-specific costs in the controller doing more granular
look-ups, providing smaller orders is acceptable. Failing to provide an
HMB does not prevent the controller from functioning.
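
To put numbers on the descriptor-table limit described above, the following sketch works out the smallest page order that still lets a given HMB size fit in a table of a given length. It assumes each descriptor covers one physically contiguous chunk and that all chunks are the same size. The helper name `min_chunk_order` and the 4 KiB default page size are illustrative assumptions, not taken from the patch.

```python
MiB = 1024 * 1024


def min_chunk_order(hmb_bytes, max_descriptors, page_size=4096):
    """Smallest page order whose chunks cover hmb_bytes within max_descriptors."""
    order = 0
    # Grow the chunk size until max_descriptors chunks are enough.
    while (page_size << order) * max_descriptors < hmb_bytes:
        order += 1
    return order


for entries in (32, 256):
    order = min_chunk_order(64 * MiB, entries)
    chunk_kib = (4096 << order) // 1024
    print(f"{entries} entries: order {order} ({chunk_kib} KiB chunks)")
```

Under these assumptions a 64 MiB buffer needs 2 MiB chunks with a 32-entry table but only 256 KiB chunks with a 256-entry table, which is why dropping to smaller orders is usually tolerable. Pass a different `page_size` for kernels built with larger pages.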

Create an alternate implementation for arm64 that creates a scatterlist
and directly assigns contiguous pages from the buddy allocator, retrying
with smaller orders on failure. This will avoid CMA by default.
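
The retry-with-smaller-orders idea can be modelled in userspace as below. This is only a sketch of the strategy, not the code in this commit: `allocate_hmb` and the `try_alloc` callback are hypothetical names, with `try_alloc(order)` standing in for a buddy-allocator request that either succeeds or fails. The real implementation works on pages and a scatterlist inside the kernel.

```python
def allocate_hmb(total_pages, start_order, max_descriptors, try_alloc):
    """Collect contiguous chunks, dropping to smaller orders when try_alloc fails.

    Returns the list of chunk orders obtained, or None if the request cannot
    be satisfied within max_descriptors chunks.
    """
    chunks, remaining, order = [], total_pages, start_order
    while remaining > 0 and len(chunks) < max_descriptors:
        # Never ask for a chunk larger than what is still needed.
        while order > 0 and (1 << order) > remaining:
            order -= 1
        if try_alloc(order):
            chunks.append(order)
            remaining -= 1 << order
        elif order == 0:
            return None  # even single pages are unavailable
        else:
            order -= 1  # retry with a smaller contiguous block
    return chunks if remaining == 0 else None


def fragmented(order):
    """Simulated allocator: pretend nothing above order 8 is available."""
    return order <= 8


# 16384 pages is 64 MiB with 4 KiB pages.
chunks = allocate_hmb(16384, start_order=10, max_descriptors=256,
                      try_alloc=fragmented)
print(len(chunks), "chunks, orders used:", sorted(set(chunks)))
```

In this model a controller with only 32 descriptors cannot be served from order-8 chunks, so the function returns None. That corresponds to booting without an HMB rather than to a failure of the drive.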

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

In raspberrypi#6504 the nvme HMB
allocation was found to come from CMA so was restricted to zero on the
basis of a superficial amount of testing showing no performance impact.

This is not the case for high-density low-capacity DRAM-less drives,
where random-read performance can degrade by up to 50%.

With the addition of commit 6686634 ("nvme-pci: manually allocate
Host Memory Buffer segments on arm64") this restriction is no longer
necessary.

Bump the default to 32MiB, which is a fair compromise for increasing
random read performance on smaller drives that typically have less
internal parallelism and request smaller HMBs.
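
As a rough illustration of what the new default means for different drives, the sketch below models how a size cap interacts with the preferred and minimum sizes a drive advertises. The clamping rule follows my understanding of the mainline driver and should be treated as an assumption, as should the parameter name `max_host_mem_size_mb`. `effective_hmb_bytes` is a hypothetical helper.

```python
MiB = 1024 * 1024


def effective_hmb_bytes(preferred_bytes, minimum_bytes, max_host_mem_size_mb=32):
    """Size to attempt for the HMB, or 0 if the drive's minimum exceeds the cap."""
    cap = max_host_mem_size_mb * MiB
    if minimum_bytes > cap:
        return 0  # the drive cannot use a buffer this small, so none is set up
    return min(preferred_bytes, cap)


# A drive preferring 64 MiB is capped at the 32 MiB default.
print(effective_hmb_bytes(64 * MiB, 8 * MiB) // MiB)
# A smaller drive preferring 16 MiB gets its full request.
print(effective_hmb_bytes(16 * MiB, 4 * MiB) // MiB)
# A cap of zero corresponds to the previous default of no HMB.
print(effective_hmb_bytes(64 * MiB, 8 * MiB, max_host_mem_size_mb=0))
```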

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
@pelwell pelwell merged commit 7797665 into raspberrypi:rpi-6.18.y Apr 15, 2026
11 of 12 checks passed
@P33M P33M deleted the nvme_hmb_arm64 branch April 15, 2026 08:24