perf stat: Enable iostat mode for x86 platforms
This functionality is based on the recently introduced sysfs attributes for
the Intel® Xeon® Scalable processor family (code name Skylake-SP):

Commit bb42b3d ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")

The mode provides four I/O performance metrics, in MB, for each PCIe root
port:

 - Inbound Read: I/O devices below root port read from the host memory
 - Inbound Write: I/O devices below root port write to the host memory
 - Outbound Read: CPU reads from I/O devices below root port
 - Outbound Write: CPU writes to I/O devices below root port

Each metric requires only one uncore event, which increments on every 4-byte
transfer in the corresponding direction. The formula is the same for every
metric:
    #EventCount * 4B / (1024 * 1024)
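
As a minimal sketch of the formula above (not the perf implementation), the conversion can be written in Python. The names `events_to_mb`, `BYTES_PER_EVENT` and `BYTES_PER_MB` are hypothetical and introduced only for illustration. Note that the formula divides by 1024 * 1024, so "MB" here means 2^20 bytes.

```python
# Illustrative only: mirrors "#EventCount * 4B / (1024 * 1024)" from the
# commit message. All names below are assumptions, not perf identifiers.
BYTES_PER_EVENT = 4          # one event increment per 4-byte transfer
BYTES_PER_MB = 1024 * 1024   # the formula's definition of a megabyte


def events_to_mb(event_count: int) -> float:
    """Convert a raw uncore event count into megabytes transferred."""
    return event_count * BYTES_PER_EVENT / BYTES_PER_MB


if __name__ == "__main__":
    # 262144 events * 4 bytes = 1048576 bytes, i.e. exactly 1 MB.
    print(events_to_mb(262144))
```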

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Alexander Antonov authored and acmel committed Apr 20, 2021
1 parent 19776d3 commit f9ed693
Showing 6 changed files with 466 additions and 1 deletion.
88 changes: 88 additions & 0 deletions tools/perf/Documentation/perf-iostat.txt
@@ -0,0 +1,88 @@
perf-iostat(1)
===============

NAME
----
perf-iostat - Show I/O performance metrics

SYNOPSIS
--------
[verse]
'perf iostat' list
'perf iostat' <ports> -- <command> [<options>]

DESCRIPTION
-----------
The mode provides four I/O performance metrics for each PCIe root port:

- Inbound Read - I/O devices below root port read from the host memory, in MB

- Inbound Write - I/O devices below root port write to the host memory, in MB

- Outbound Read - CPU reads from I/O devices below root port, in MB

- Outbound Write - CPU writes to I/O devices below root port, in MB

OPTIONS
-------
<command>...::
Any command you can specify in a shell.

list::
List all PCIe root ports.

<ports>::
Select the root ports to monitor. A comma-separated list is supported.

EXAMPLES
--------

1. List all PCIe root ports (example for a 2-socket platform):

$ perf iostat list
S0-uncore_iio_0<0000:00>
S1-uncore_iio_0<0000:80>
S0-uncore_iio_1<0000:17>
S1-uncore_iio_1<0000:85>
S0-uncore_iio_2<0000:3a>
S1-uncore_iio_2<0000:ae>
S0-uncore_iio_3<0000:5d>
S1-uncore_iio_3<0000:d7>
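
A script that consumes this listing might split each line into its parts. The sketch below is hypothetical and not part of perf. It assumes, based only on the sample output above, that each line has the shape `S<socket>-<uncore PMU name><<domain>:<bus>>`; the `RootPort` type and `parse_list` function are names invented for illustration.

```python
import re
from typing import List, NamedTuple


class RootPort(NamedTuple):
    socket: int       # assumed meaning of the "S<n>" prefix
    pmu: str          # e.g. "uncore_iio_0"
    domain_bus: str   # e.g. "0000:00"


# Pattern inferred from the sample listing; the real format may differ.
_LINE = re.compile(
    r"^S(\d+)-(uncore_iio_\d+)<([0-9a-fA-F]{4}:[0-9a-fA-F]{2})>$"
)


def parse_list(text: str) -> List[RootPort]:
    """Parse lines shaped like 'S0-uncore_iio_0<0000:00>'; skip others."""
    ports = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            ports.append(
                RootPort(int(match.group(1)), match.group(2), match.group(3))
            )
    return ports


SAMPLE_LIST = """\
S0-uncore_iio_0<0000:00>
S1-uncore_iio_0<0000:80>
S0-uncore_iio_1<0000:17>
S1-uncore_iio_1<0000:85>
"""

if __name__ == "__main__":
    for port in parse_list(SAMPLE_LIST):
        print(port)
```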

2. Collect metrics for all PCIe root ports:

$ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
357708+0 records in
357707+0 records out
375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s

Performance counter stats for 'system wide':

port     Inbound Read(MB)  Inbound Write(MB)  Outbound Read(MB)  Outbound Write(MB)
0000:00                 1                  0                  2                   3
0000:80                 0                  0                  0                   0
0000:17            352552                 43                  0                  21
0000:85                 0                  0                  0                   0
0000:3a                 3                  0                  0                   0
0000:ae                 0                  0                  0                   0
0000:5d                 0                  0                  0                   0
0000:d7                 0                  0                  0                   0

3. Collect metrics for a comma-separated list of PCIe root ports:

$ perf iostat 0000:17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
357708+0 records in
357707+0 records out
375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s

Performance counter stats for 'system wide':

port     Inbound Read(MB)  Inbound Write(MB)  Outbound Read(MB)  Outbound Write(MB)
0000:17            358559                 44                  0                  22
0000:3a                 3                  2                  0                   0

197.081983474 seconds time elapsed
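
For post-processing, the per-port rows can be pulled out of output like the above with a few lines of Python. This is a sketch under assumptions, not a perf feature: it treats any line with a `domain:bus` token followed by exactly four integers as a data row, and the names `parse_iostat_table`, `COLUMNS` and `SAMPLE_OUTPUT` are hypothetical.

```python
from typing import Dict

# Column keys chosen for illustration; they follow the header order above.
COLUMNS = ("inbound_read", "inbound_write", "outbound_read", "outbound_write")


def parse_iostat_table(text: str) -> Dict[str, Dict[str, int]]:
    """Return {port: {metric: MB}} for every data row found in text."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row is a port ("0000:17") plus four integer columns.
        if len(fields) != 5 or ":" not in fields[0]:
            continue
        try:
            values = [int(value) for value in fields[1:]]
        except ValueError:
            continue
        rows[fields[0]] = dict(zip(COLUMNS, values))
    return rows


SAMPLE_OUTPUT = """\
Performance counter stats for 'system wide':

port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB)
0000:17 358559 44 0 22
0000:3a 3 2 0 0

197.081983474 seconds time elapsed
"""

if __name__ == "__main__":
    table = parse_iostat_table(SAMPLE_OUTPUT)
    elapsed_s = 197.081983474  # taken from the sample output above
    # Average inbound-read rate for the port in front of the NVMe device.
    print(table["0000:17"]["inbound_read"] / elapsed_s, "MB/s")
```

Splitting on whitespace keeps the parser indifferent to column alignment, so it accepts both aligned and single-space-separated rows.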

SEE ALSO
--------
linkperf:perf-stat[1]
5 changes: 4 additions & 1 deletion tools/perf/Makefile.perf
@@ -283,6 +283,7 @@ SCRIPT_SH =

SCRIPT_SH += perf-archive.sh
SCRIPT_SH += perf-with-kcore.sh
+SCRIPT_SH += perf-iostat.sh

grep-libs = $(filter -l%,$(1))
strip-libs = $(filter-out -l%,$(1))
@@ -948,6 +949,8 @@ endif
$(INSTALL) $(OUTPUT)perf-archive -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
$(call QUIET_INSTALL, perf-with-kcore) \
$(INSTALL) $(OUTPUT)perf-with-kcore -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
+$(call QUIET_INSTALL, perf-iostat) \
+$(INSTALL) $(OUTPUT)perf-iostat -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
ifndef NO_LIBAUDIT
$(call QUIET_INSTALL, strace/groups) \
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(STRACE_GROUPS_INSTDIR_SQ)'; \
@@ -1042,7 +1045,7 @@ bpf-skel-clean:
$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)

clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
-$(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
+$(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
$(Q)$(RM) $(OUTPUT).config-detected
$(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32 $(OUTPUT)pmu-events/jevents $(OUTPUT)$(LIBJVMTI).so
1 change: 1 addition & 0 deletions tools/perf/arch/x86/util/Build
@@ -9,6 +9,7 @@ perf-y += event.o
perf-y += evlist.o
perf-y += mem-events.o
perf-y += evsel.o
+perf-y += iostat.o

perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o