Getting PerfSpect | Running PerfSpect | Building PerfSpect
Intel® PerfSpect is a command-line tool designed to help you analyze and optimize Linux servers and the software running on them. Whether you’re a system administrator, a developer, or a performance engineer, PerfSpect provides comprehensive insights and actionable recommendations to enhance performance and efficiency.
We welcome bug reports and enhancement requests, which can be submitted via the Issues section on GitHub. For those interested in contributing to the code, please refer to the guidelines outlined in the CONTRIBUTING.md file.
Pre-built PerfSpect releases are available in the repository's Releases. Download and extract perfspect.tgz.
wget -qO- https://github.com/intel/PerfSpect/releases/latest/download/perfspect.tgz | tar xvz
cd perfspect
PerfSpect includes a suite of commands designed to analyze and optimize both system and software performance.
Usage: perfspect [command] [flags]
Command | Description |
---|---|
metrics | CPU core and uncore metrics |
report | System configuration and health |
telemetry | System telemetry |
flame | Software call-stacks as flamegraphs |
lock | Software hot spot, cache-to-cache and lock contention |
config | Modify system configuration |
Tip
Run perfspect [command] -h to view command-specific help text.
The metrics command generates reports containing CPU architectural performance characterization metrics in HTML and CSV formats. Run perfspect metrics.
The metrics command supports two modes: default and "live". Default mode behaves as described above: metrics are collected and saved into report files for review. The "live" mode prints the metrics to stdout, where they can be viewed in the console and/or redirected into a file or observability pipeline. Run perfspect metrics --live.
If neither sudo nor root access is available, an administrator must apply the following configuration to the target system(s):
- sysctl -w kernel.perf_event_paranoid=0
- sysctl -w kernel.nmi_watchdog=0
- write '125' to all perf_event_mux_interval_ms files found under /sys/devices/*, for example:
for i in $(find /sys/devices -name perf_event_mux_interval_ms); do echo 125 > "$i"; done
Once the configuration changes are applied, use the --noroot flag on the command line, for example, perfspect metrics --noroot.
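Before passing --noroot, it can help to confirm that the settings took effect. The following is a minimal sketch and not part of PerfSpect: the function name check_noroot_prereqs and its optional ROOT prefix argument are hypothetical, and it only compares the current values against the exact ones listed above.

```shell
# check_noroot_prereqs [ROOT]
# ROOT is a hypothetical path prefix (handy for testing against a fake tree);
# when omitted, the real /proc and /sys are inspected.
check_noroot_prereqs() {
    root="${1:-}"
    status=0

    # kernel.perf_event_paranoid is exposed at /proc/sys/kernel/perf_event_paranoid
    paranoid=$(cat "$root/proc/sys/kernel/perf_event_paranoid" 2>/dev/null || echo missing)
    if [ "$paranoid" != "0" ]; then
        echo "kernel.perf_event_paranoid is '$paranoid', expected 0"
        status=1
    fi

    # kernel.nmi_watchdog is exposed at /proc/sys/kernel/nmi_watchdog
    watchdog=$(cat "$root/proc/sys/kernel/nmi_watchdog" 2>/dev/null || echo missing)
    if [ "$watchdog" != "0" ]; then
        echo "kernel.nmi_watchdog is '$watchdog', expected 0"
        status=1
    fi

    # Every perf_event_mux_interval_ms file should contain 125
    for f in $(find "$root/sys/devices" -name perf_event_mux_interval_ms 2>/dev/null); do
        value=$(cat "$f" 2>/dev/null || echo unreadable)
        if [ "$value" != "125" ]; then
            echo "$f is '$value', expected 125"
            status=1
        fi
    done

    return "$status"
}

# Usage (assumes the perfspect binary is in the current directory):
#   check_noroot_prereqs && ./perfspect metrics --noroot
```

The function prints one line per mismatch and returns non-zero if any setting differs, so it can gate the perfspect invocation.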
See perfspect metrics -h for the extensive set of options and examples.
The report command generates system configuration reports in a variety of formats. All categories of information are collected by default. See perfspect report -h for all options.
It's possible to report a subset of information by providing command line options. Note that when only the txt format is specified, the report is printed to stdout as well as written to a report file.
$ ./perfspect report --bios --format txt
BIOS
====
Vendor: Intel Corporation
Version: EGSDCRB1.SYS.1752.P05.2401050248
Release Date: 01/05/2024
To assist in evaluating the health of target systems, the report command can run a series of micro-benchmarks when the --benchmark flag is applied, for example, perfspect report --benchmark all. The benchmark results are reported along with the target's configuration details.
Important
Benchmarks should be run on idle systems to ensure accurate measurements and to avoid interfering with active workloads.
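One way to guard against benchmarking a busy system is to check the load average first. This is a rough sketch under stated assumptions, not a PerfSpect feature: the is_idle helper and the 0.5 threshold are hypothetical, and the 1-minute load average from /proc/loadavg is only an approximation of "idle".

```shell
# is_idle [MAX_LOAD] [LOADAVG_FILE]
# Succeeds when the 1-minute load average is at or below MAX_LOAD.
# MAX_LOAD (default 0.5) is an arbitrary, hypothetical threshold;
# LOADAVG_FILE defaults to /proc/loadavg and is overridable for testing.
is_idle() {
    awk -v max="${1:-0.5}" \
        '{ exit (($1 + 0 <= max + 0) ? 0 : 1) }' \
        "${2:-/proc/loadavg}"
}

# Usage (assumes the perfspect binary is in the current directory):
#   is_idle 0.5 && ./perfspect report --benchmark all
```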
benchmark | Description |
---|---|
all | runs all benchmarks |
speed | runs each stress-ng cpu-method for 1s, reports the geometric mean of all results. |
power | runs stress-ng to load all cpus to 100% for 60s. Uses turbostat to measure power. |
temperature | runs the same micro benchmark as 'power', but extracts maximum temperature from turbostat output. |
frequency | runs avx-turbo to measure scalar and AVX frequencies across processor's cores. Note: Runtime increases with core count. |
memory | runs Intel(r) Memory Latency Checker (MLC) to measure memory bandwidth and latency across a load range. Note: MLC is not included with PerfSpect. It can be downloaded from Intel's website. Once downloaded, extract the Linux executable and place it in the perfspect/tools/x86_64 directory. |
numa | runs Intel(r) Memory Latency Checker (MLC) to measure bandwidth between NUMA nodes. See Note above about downloading MLC. |
storage | runs fio for 2 minutes in read/write mode with a single worker to measure single-thread read and write bandwidth. Use the --storage-dir flag to override the default location. Minimum 5GB disk space required to run test. |
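The speed benchmark summarizes many per-method results as a geometric mean, which keeps a single very large result from dominating the summary the way it would in an arithmetic mean. As an illustration only (the geomean helper and the sample numbers are made up, and this is not necessarily how PerfSpect computes the value), the calculation can be sketched with awk:

```shell
# geomean: reads one positive number per line on stdin and prints the
# geometric mean, computed as exp(mean(log(x))) to avoid overflow from
# multiplying many large values together.
geomean() {
    awk '{ s += log($1); n++ } END { if (n > 0) printf "%.2f\n", exp(s / n) }'
}

# Usage with made-up values:
#   printf '%s\n' 2 8 | geomean    # prints 4.00 (the arithmetic mean would be 5)
```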
The telemetry command reports CPU utilization, instruction mix, disk stats, network stats, and more on the specified target(s). All telemetry types are collected by default. To choose telemetry types, see the additional command line options (perfspect telemetry -h).
Software flamegraphs are useful in diagnosing software performance bottlenecks. Run perfspect flame to capture a system-wide software flamegraph.
Note
Perl is required on the target system to process the data needed for flamegraphs.
As systems contain more and more cores, it can be useful to analyze the Linux kernel lock overhead and potential false-sharing that impacts system scalability. Run perfspect lock to collect system-wide hot spot, cache-to-cache and lock contention information. Experienced performance engineers can analyze the collected information to identify bottlenecks.
The config command provides a method to view and change various system configuration parameters. Run perfspect config -h to view the parameters that can be modified.
Warning
Misconfiguring the system may cause it to stop functioning. In some cases, a reboot may be required to restore default settings.
Example:
$ ./perfspect config --cores 24 --llc 2.0 --uncore-max 1.8 ...
By default, PerfSpect targets the local host, that is, the host where PerfSpect is running. Remote systems can also be targeted if they are reachable via SSH from the local host.
Important
Ensure the remote user has password-less sudo access (or root privileges) to fully utilize PerfSpect's capabilities.
To target a single remote system with a pre-configured private key:
$ ./perfspect report --target 192.168.1.42 --user fred --key ~/.ssh/fredkey ...
To target a single remote system with a password:
$ ./perfspect report --target 192.168.1.42 --user fred
fred@192.168.1.42's password: ******
...
To target more than one remote system, a YAML file with the necessary connection parameters is provided to PerfSpect. Refer to the example YAML file: targets.yaml.
$ ./perfspect report --targets mytargets.yaml ...
Note
All PerfSpect commands support remote targets, but some command options are limited to the local target.
By default, PerfSpect writes to a log file (perfspect.log) in the user's current working directory. Optionally, PerfSpect can direct logs to the local system's syslog daemon.
$ ./perfspect metrics --syslog
By default, PerfSpect creates a unique directory in the user's current working directory to store output files. Users can specify a custom output directory, but the directory provided must exist; PerfSpect will not create it.
$ ./perfspect telemetry --output /home/elaine/perfspect/telemetry
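Because PerfSpect will not create a custom output directory, a small wrapper can create it first. This is a sketch, not part of PerfSpect: the run_with_output_dir function and the PERFSPECT environment variable (used to locate the binary) are hypothetical names; only the --output flag comes from the text above.

```shell
# run_with_output_dir DIR COMMAND [FLAGS...]
# Creates DIR if needed, then runs the given perfspect command with
# --output DIR appended. PERFSPECT is a hypothetical override for the
# binary location and defaults to ./perfspect.
run_with_output_dir() {
    out_dir="$1"
    shift
    mkdir -p "$out_dir" || return 1
    "${PERFSPECT:-./perfspect}" "$@" --output "$out_dir"
}

# Usage:
#   run_with_output_dir /home/elaine/perfspect/telemetry telemetry
```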
Tip
Skip the build. Pre-built PerfSpect releases are available in the repository's Releases. Download and extract perfspect.tgz.
Use builder/build.sh to build the dependencies and the application in Docker containers with the required build environments. Ensure Docker is properly configured on your build system before running the script.
Running make builds the app. It assumes the dependencies have been built previously and that you have Go installed on your development system. See go.mod for the minimum Go version.