OS performance tuning
cacao runs under Linux on x86 systems.
Note: turbostat, cpupower, and linux-tools will not install on custom compiled kernels, making this section not applicable.
To change CPU settings, you can also read and write the files under /sys/devices/system/cpu directly. For example:
echo "performance" | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
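To confirm that the governor change took effect, the same sysfs files can be read back. The following is a minimal sketch; the function name `show_governors` and its optional base-directory argument are assumptions made for illustration, not part of cacao.

```shell
#!/bin/sh
# Print the scaling governor of every CPU found under a sysfs-style tree.
# The base directory is a parameter so the function can be pointed at a
# copy of the tree; the default is the real sysfs location.
show_governors() {
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        # If the glob matched nothing (no cpufreq driver loaded), skip.
        [ -r "$f" ] || continue
        cpu="${f#"$base"/}"   # strip the base prefix -> cpuN/cpufreq/...
        cpu="${cpu%%/*}"      # keep the first path component -> cpuN
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}

show_governors
```

Each line of output pairs a CPU name with its current governor, so any core that did not switch to performance is easy to spot.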
For more info on performance tune ups with RT kernels, check the latest test page System Latency (kernel 5.6.10 realtime).
Install packages:
sudo apt-get install linux-tools-common linux-tools-generic
sudo apt install linux-tools-$(uname -r)  # e.g. linux-tools-4.18.0-25-lowlatency
sudo modprobe msr
sudo depmod -a
To check current CPU clock speed:
sudo turbostat
Set CPU frequency to maximize performance:
sudo cpupower frequency-set --governor performance
Install C program setlatency.
Install rt-tests
cd ~/src
git clone git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
cd rt-tests
git checkout -b stable/v1.0 origin/stable/v1.0
sudo apt-get install build-essential libnuma-dev
make
To view results on the local computer, install:
sudo apt install imagemagick-6.q16 gnuplot
Create a symlink:
sudo ln -s /home/scexao/src/rt-tests/cyclictest /usr/local/bin/
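Once cyclictest is on the path, a latency histogram run can be assembled as sketched below. The numeric values are illustrative rather than tuned recommendations, and the variable names (`LOOPS`, `PRIO`, `INTERVAL_US`, `HIST_US`) and the helper `cyclictest_cmd` are hypothetical. The script only prints the command, so nothing runs with real-time priority until you launch it yourself.

```shell
#!/bin/sh
# Build a cyclictest command line. Values are illustrative, not tuned.
LOOPS="${LOOPS:-100000}"            # number of measurement cycles (-l)
PRIO="${PRIO:-90}"                  # real-time priority of the test threads (-p)
INTERVAL_US="${INTERVAL_US:-200}"   # wake-up interval in microseconds (-i)
HIST_US="${HIST_US:-400}"           # histogram upper bound in microseconds (-h)

cyclictest_cmd() {
    # -m locks memory, -S runs one thread per CPU, -q prints only at the end
    echo "cyclictest -m -S -p $PRIO -i $INTERVAL_US -l $LOOPS -h $HIST_US -q"
}

# Print the command. To run it, use sudo and redirect to a file, e.g.
#   sudo $(cyclictest_cmd) > histogram.txt
cyclictest_cmd
```

The histogram written to the output file is plain text and can then be plotted with gnuplot.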
KernelShark is a frontend reader/visualizer for trace-cmd output.
Disable all forms of sleep mode:
sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
Tone down journaling by making the following changes in /etc/systemd/journald.conf:
MaxLevelStore=warning
MaxLevelSyslog=warning
MaxLevelKMsg=warning
MaxLevelConsole=notice
MaxLevelWall=crit
source: https://coreos.com/blog/eliminating-journald-delays-part-1.html
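As an alternative to editing the main file, systemd also reads drop-in files from /etc/systemd/journald.conf.d/. The sketch below writes the same five settings to such a file. The file name `10-quiet.conf`, the variable `DEST`, and the function name are assumptions for illustration. By default the file goes to a scratch directory so the script can be tried without root.

```shell
#!/bin/sh
# Write the log-level limits to a journald drop-in file.
# DEST defaults to a scratch directory; on a real system the drop-in
# directory is /etc/systemd/journald.conf.d/.
DEST="${DEST:-${TMPDIR:-/tmp}/journald.conf.d}"

write_journald_dropin() {
    dir="$1"
    mkdir -p "$dir" || return 1
    # Settings must sit under the [Journal] section header.
    cat > "$dir/10-quiet.conf" <<'EOF'
[Journal]
MaxLevelStore=warning
MaxLevelSyslog=warning
MaxLevelKMsg=warning
MaxLevelConsole=notice
MaxLevelWall=crit
EOF
}

write_journald_dropin "$DEST" && echo "wrote $DEST/10-quiet.conf"
# After installing the file under /etc/systemd/journald.conf.d/, apply with:
#   sudo systemctl restart systemd-journald
```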
cacao provides a generic config script to set up a computer for high-performance real-time operation. The script must be edited to match your system configuration and contains notes to help you do so. Please read the notes and adapt the script before running it.
Consider the following kernel options
- Disable Indirect Branch Restricted Speculation (noibrs)
- Disable Indirect Branch Prediction Barriers (noibpb)
- Disable Page Table Isolation (nopti)
- Disable Spectre v1 mitigation (nospectre_v1)
- Disable Spectre v2 mitigation (nospectre_v2)
- Disable L1 Cache Flushing (l1tf=off)
- Disable speculative execution store bypass mitigation (nospec_store_bypass_disable)
- Disable Store-buffer forwarding barrier (no_stf_barrier)
- Disable Microarchitectural Data Sampling (mds=off)
Notes and links:
- Mitigating the Performance Impact of Meltdown/Spectre Kernel Patches
- L1 cache flushing
- CVE-2018-3639: Systems with microprocessors utilizing speculative execution and speculative execution of memory reads before the addresses of all prior memory writes are known may allow unauthorized disclosure of information to an attacker with local user access via a side-channel analysis, aka Speculative Store Bypass (SSB), Variant 4.
- Microarchitectural Data Sampling
Some of the settings can be done at runtime. For example:
echo 0 | sudo tee /sys/kernel/debug/x86/ibrs_enabled
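Before and after changing any of these settings, the kernel's own view of each mitigation can be read from /sys/devices/system/cpu/vulnerabilities (present on kernels that carry the Spectre/Meltdown patches). A minimal read-only sketch follows; the function name `show_mitigations` and its optional directory argument are assumptions for illustration.

```shell
#!/bin/sh
# List the kernel's reported status for each known CPU vulnerability.
# The directory argument exists only so the function can be exercised on
# a copy of the tree; by default it reads the real sysfs location.
show_mitigations() {
    dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue     # directory is absent on older kernels
        printf '%-24s %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

show_mitigations
```

Entries reported as "Vulnerable" are those whose mitigation is currently disabled.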
Full recommended list:
noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off
On recent kernels:
no_stf_barrier mds=off mitigations=off
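On Debian and Ubuntu systems using GRUB, boot parameters like these are typically appended to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by `sudo update-grub` and a reboot. Your bootloader setup may differ. The sketch below checks which of the parameters are present on the running kernel's command line. The helper `has_param` is hypothetical and matches whole space-separated words only.

```shell
#!/bin/sh
# Report which of the listed parameters are present on a kernel command line.
has_param() {
    # $1 = command line string, $2 = parameter to look for
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Command line of the running kernel (empty if /proc is unavailable).
cmdline="$(cat /proc/cmdline 2>/dev/null)"
for p in no_stf_barrier mds=off mitigations=off; do
    if has_param "$cmdline" "$p"; then
        echo "$p: set"
    else
        echo "$p: not set"
    fi
done
```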
mitigations=
[X86,PPC,S390,ARM64] Control optional mitigations for
CPU vulnerabilities. This is a set of curated,
arch-independent options, each of which is an
aggregation of existing arch-specific options.
off
Disable all optional CPU mitigations. This
improves system performance, but it may also
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
kpti=0 [ARM64]
nospectre_v1 [PPC]
nobp=0 [S390]
nospectre_v2 [X86,PPC,S390,ARM64]
spectre_v2_user=off [X86]
spec_store_bypass_disable=off [X86,PPC]
ssbd=force-off [ARM64]
l1tf=off [X86]
nohalt [IA-64] Tells the kernel not to use the power saving
function PAL_HALT_LIGHT when idle. This increases
power-consumption. On the positive side, it reduces
interrupt wake-up latency, which may improve performance
in certain environments such as networked servers or
real-time systems.
pcie_bus_perf Set device MPS to the largest allowable MPS
based on its parent bus. Also set MRRS (Max
Read Request Size) to the largest supported
value (no larger than the MPS that the device
or bus can support) for best performance.
schedstats= [KNL,X86] Enable or disable scheduled statistics.
Allowed values are enable and disable. This feature
incurs a small amount of overhead in the scheduler
but is useful for debugging and performance tuning.
workqueue.power_efficient
Per-cpu workqueues are generally preferred because
they show better performance thanks to cache
locality; unfortunately, per-cpu workqueues tend to
be more power hungry than unbound workqueues.
Enabling this makes the per-cpu workqueues which
were observed to contribute significantly to power
consumption unbound, leading to measurably lower
power usage at the cost of small performance
overhead.
The default value of this parameter is determined by
the config option CONFIG_WQ_POWER_EFFICIENT_DEFAULT.
compute and control for adaptive optics (cacao) - https://github.com/cacao-org/cacao