Commit
lib/lz4: enable LZ4_FAST_DEC_LOOP on aarch64 Clang builds
Upstream lz4 mentioned a performance regression on Qualcomm SoCs when
built with Clang, but not with GCC [1]. However, according to my testing
on sdm845 with LLVM Clang 15, this patch does offer a nice 5-10% boost in
decompression, so enable the fast dec loop for Clang as well.

Testing procedure:
- pre-fill zram with 1GB of real-world zram data dumped under memory
  pressure, for example
  $ dd if=/sdcard/zram.test of=/dev/block/zram0 bs=1m count=1000
- $ fio --readonly --name=randread --direct=1 --rw=randread \
    --ioengine=psync --randrepeat=0 --numjobs=4 --iodepth=1 \
    --group_reporting=1 --filename=/dev/block/zram0 --bs=4K --size=1000M

Results:
- vanilla lz4:  read: IOPS=1282k, BW=5006MiB/s (5249MB/s)(4000MiB/799msec)
- lz4 fast dec: read: IOPS=1382k, BW=5398MiB/s (5660MB/s)(4000MiB/741msec)

[1] lz4/lz4#707

Signed-off-by: Chenyang Zhong <zhongcy95@gmail.com>
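For readers without the diff at hand, the selection logic this change is aiming for might look roughly like the sketch below. It is modeled on the compile-time guard in upstream lz4's lz4.c and is an assumption about the shape of the code in this tree, not a copy of the actual patch. The helper lz4_fast_dec_loop_enabled() is hypothetical and exists only so the result can be inspected.

```c
#include <assert.h>  /* static_assert (C11) */

/*
 * Sketch only: modeled on the LZ4_FAST_DEC_LOOP guard in upstream lz4.c.
 * The exact conditions in this kernel tree are an assumption.
 */
#ifndef LZ4_FAST_DEC_LOOP
#  if defined(__i386__) || defined(_M_IX86) || defined(__x86_64__) || defined(_M_X64)
#    define LZ4_FAST_DEC_LOOP 1
#  elif defined(__aarch64__)
     /*
      * Upstream additionally requires !defined(__clang__) on non-Apple
      * aarch64; dropping that check enables the fast loop for Clang too.
      */
#    define LZ4_FAST_DEC_LOOP 1
#  else
#    define LZ4_FAST_DEC_LOOP 0
#  endif
#endif

/* The macro is used as a boolean, so it must be exactly 0 or 1. */
static_assert(LZ4_FAST_DEC_LOOP == 0 || LZ4_FAST_DEC_LOOP == 1,
              "LZ4_FAST_DEC_LOOP must be 0 or 1");

/* Hypothetical helper (not part of lz4): reports the compile-time choice. */
int lz4_fast_dec_loop_enabled(void)
{
    return LZ4_FAST_DEC_LOOP;
}
```

Because the guard is wrapped in #ifndef, a build can still override the choice, for example by passing -DLZ4_FAST_DEC_LOOP=0 to the compiler if a particular SoC shows the regression described in [1].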