Commit 0ba6477

rmurphy-arm authored and willdeacon committed
perf: Add Arm CMN-600 PMU driver
Initial driver for PMU event counting on the Arm CMN-600 interconnect.
CMN sports an obnoxiously complex distributed PMU system as part of
its debug and trace features, which can do all manner of things like
sampling, cross-triggering and generating CoreSight trace. This driver
covers the PMU functionality, plus the relevant aspects of watchpoints
for simply counting matching flits.

Tested-by: Tsahi Zidenberg <tsahee@amazon.com>
Tested-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
1 parent c8fdbbf commit 0ba6477

5 files changed: +1715 -0 lines changed

Documentation/admin-guide/perf/arm-cmn.rst

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+=============================
+Arm Coherent Mesh Network PMU
+=============================
+
+CMN-600 is a configurable mesh interconnect consisting of a rectangular
+grid of crosspoints (XPs), with each crosspoint supporting up to two
+device ports to which various AMBA CHI agents are attached.
+
+CMN implements a distributed PMU design as part of its debug and trace
+functionality. This consists of a local monitor (DTM) at every XP, which
+counts up to 4 event signals from the connected device nodes and/or the
+XP itself. Overflow from these local counters is accumulated in up to 8
+global counters implemented by the main controller (DTC), which provides
+overall PMU control and interrupts for global counter overflow.
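
The two-level counting scheme described above can be pictured with a small model. This is only a sketch: the counter widths below are assumptions chosen for illustration (the text does not state them), and `CascadedCounter` is a hypothetical name, not something from the driver.

```python
# Illustrative model of one DTM local counter feeding one DTC global counter.
# The widths are assumptions for the sake of the example, not from the text.
LOCAL_BITS = 16   # assumed width of a DTM local counter
GLOBAL_BITS = 32  # assumed width of a DTC global counter


class CascadedCounter:
    """Hypothetical model: local overflow is accumulated in a global counter."""

    def __init__(self):
        self.local = 0
        self.global_ = 0
        self.overflow_irqs = 0  # how often the global counter itself wrapped

    def count(self, events):
        # Add events to the local counter; each wrap carries into the global one.
        carries, self.local = divmod(self.local + events, 1 << LOCAL_BITS)
        # A wrap of the global counter is what would raise the DTC interrupt.
        wraps, self.global_ = divmod(self.global_ + carries, 1 << GLOBAL_BITS)
        self.overflow_irqs += wraps

    def value(self):
        # Full count since the last global wrap: global part above local part.
        return (self.global_ << LOCAL_BITS) | self.local


if __name__ == "__main__":
    c = CascadedCounter()
    c.count(70000)
    print(c.local, c.global_, c.value())
```

Under these assumed widths, software only needs to service an interrupt when the global counter wraps, since local overflows are absorbed by the cascade.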
+
+PMU events
+----------
+
+The PMU driver registers a single PMU device for the whole interconnect,
+see /sys/bus/event_source/devices/arm_cmn. Multi-chip systems may link
+more than one CMN together via external CCIX links - in this situation,
+each mesh counts its own events entirely independently, and additional
+PMU devices will be named arm_cmn_{1..n}.
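
To see which CMN PMU devices a given system exposes, the sysfs directory named above can be scanned for the `arm_cmn` and `arm_cmn_{1..n}` names. The helper below is a minimal sketch; `cmn_pmu_names` is a hypothetical name introduced here, and the `root` parameter exists only so the function can be pointed at another directory.

```python
import os
import re


def cmn_pmu_names(root="/sys/bus/event_source/devices"):
    """Return CMN PMU device names found under root, first mesh first."""
    # Matches "arm_cmn" and "arm_cmn_<n>", per the naming described above.
    pattern = re.compile(r"arm_cmn(?:_(\d+))?")
    try:
        entries = os.listdir(root)
    except FileNotFoundError:
        return []  # no event_source bus at this path
    found = []
    for name in entries:
        m = pattern.fullmatch(name)
        if m:
            # Sort numerically so that arm_cmn_10 follows arm_cmn_2.
            found.append((int(m.group(1) or 0), name))
    return [name for _, name in sorted(found)]


if __name__ == "__main__":
    for name in cmn_pmu_names():
        print(name)
```

On a system without the driver loaded, or without a CMN, the function simply returns an empty list.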
+
+Most events are specified in a format based directly on the TRM
+definitions - "type" selects the respective node type, and "eventid" the
+event number. Some events require an additional occupancy ID, which is
+specified by "occupid".
+
+  * Since RN-D nodes do not have any distinct events from RN-I nodes, they
+    are treated as the same type (0xa), and the common event templates are
+    named "rnid_*".
+
+  * The cycle counter is treated as a synthetic event belonging to the DTC
+    node ("type" == 0x3, "eventid" is ignored).
+
+  * XP events also encode the port and channel in the "eventid" field, to
+    match the underlying pmu_event0_id encoding for the pmu_event_sel
+    register. The event templates are named with prefixes to cover all
+    permutations.
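
The field names above map directly onto perf's `pmu/term=value,.../` event syntax. As a rough sketch, the hypothetical helper `cmn_event` below formats such a string from keyword arguments. Only the type values 0xa (RN-I/RN-D) and 0x3 (DTC cycle counter) come from the text; the "eventid" and "occupid" numbers are placeholders that would have to be looked up in the TRM.

```python
def cmn_event(pmu="arm_cmn", **fields):
    """Format a perf event spec from format-field names and integer values."""
    terms = ",".join(f"{name}={value:#x}" for name, value in fields.items())
    return f"{pmu}/{terms}/"


if __name__ == "__main__":
    # RN-I / RN-D event; eventid 0x1 is a placeholder, not a documented event.
    print(cmn_event(type=0xA, eventid=0x1))
    # Cycle counter: synthetic DTC event, "eventid" is ignored.
    print(cmn_event(type=0x3))
    # Second mesh, with a placeholder occupancy ID.
    print(cmn_event(pmu="arm_cmn_1", type=0xA, eventid=0x2, occupid=0x1))
```

The resulting strings are the kind of argument that could be passed to `perf stat -e`. In practice the named event templates under the PMU's sysfs `events` directory are likely more convenient than raw numbers.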
+
+By default each event provides an aggregate count over all nodes of the
+given type. To target a specific node, "bynodeid" must be set to 1 and
+"nodeid" to the appropriate value derived from the CMN configuration
+(as defined in the "Node ID Mapping" section of the TRM).
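
To break an aggregate count down per node, one event can be requested for each node ID with "bynodeid" set to 1. The sketch below builds such a list and a matching `perf stat` command line. The function names are hypothetical, and the node IDs and event number in the example are placeholders; real values depend on the mesh configuration and the TRM.

```python
def per_node_events(node_type, eventid, nodeids, pmu="arm_cmn"):
    """Return one event spec per node ID, each targeting a single node."""
    return [
        f"{pmu}/type={node_type:#x},eventid={eventid:#x},bynodeid=1,nodeid={n:#x}/"
        for n in nodeids
    ]


def perf_stat_command(events, workload=("sleep", "1")):
    """Build an argv list for a system-wide perf stat run over the events."""
    return ["perf", "stat", "-a", "-e", ",".join(events), "--", *workload]


if __name__ == "__main__":
    # Placeholder node IDs; derive real ones from the CMN configuration.
    events = per_node_events(0xA, 0x1, [0x8, 0x48])
    print(" ".join(perf_stat_command(events)))
```

Omitting "bynodeid" and "nodeid" from the same event gives the default aggregate over all nodes of that type.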
+
+Watchpoints
+-----------
+
+The PMU can also count watchpoint events to monitor specific flit
+traffic. Watchpoints are treated as a synthetic event type, and like PMU
+events can be global or targeted with a particular XP's "nodeid" value.
+Since the watchpoint direction is otherwise implicit in the underlying
+register selection, separate events are provided for flit uploads and
+downloads.
+
+The flit match value and mask are passed in config1 and config2 ("val"
+and "mask" respectively). "wp_dev_sel", "wp_chn_sel", "wp_grp" and
+"wp_exclusive" are specified per the TRM definitions for dtm_wp_config0.
+Where a watchpoint needs to match fields from both match groups on the
+REQ or SNP channel, it can be specified as two events - one for each
+group - with the same nonzero "combine" value. The count for such a
+pair of combined events will be attributed to the primary match.
+Watchpoint events with a "combine" value of 0 are considered independent
+and will count individually.
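
A combined watchpoint is thus a pair of events that differ in "wp_grp", "val" and "mask" but share a nonzero "combine" value. The sketch below generates such a pair. It assumes the two match groups are numbered 0 and 1, which the text does not state, and `combined_watchpoint` is a hypothetical helper. The `base` terms, which must select the upload or download watchpoint event and the shared "wp_dev_sel" and "wp_chn_sel" settings, are left to the caller because their values come from the TRM.

```python
def combined_watchpoint(base, groups, combine, pmu="arm_cmn"):
    """Return two event specs, one per match group, linked by "combine".

    base    -- dict of terms shared by both events (caller-supplied)
    groups  -- {wp_grp: (val, mask)}; assumed here to have keys 0 and 1
    combine -- nonzero tag shared by the pair
    """
    if combine == 0:
        raise ValueError("combine must be nonzero to pair two watchpoints")
    if sorted(groups) != [0, 1]:
        raise ValueError("expected one (val, mask) entry per match group")
    events = []
    for grp in (0, 1):
        val, mask = groups[grp]
        terms = dict(base, wp_grp=grp, val=val, mask=mask, combine=combine)
        spec = ",".join(f"{name}={value:#x}" for name, value in terms.items())
        events.append(f"{pmu}/{spec}/")
    return events


if __name__ == "__main__":
    # All numeric values below are placeholders for illustration only.
    base = {"wp_dev_sel": 0x0, "wp_chn_sel": 0x0}
    for ev in combined_watchpoint(base, {0: (0x1, 0xFF), 1: (0x2, 0xF0)}, combine=1):
        print(ev)
```

Rejecting a "combine" value of 0 mirrors the rule that such events are treated as independent and counted individually.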

Documentation/admin-guide/perf/index.rst

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ Performance monitor support
    qcom_l2_pmu
    qcom_l3_pmu
    arm-ccn
+   arm-cmn
    xgene-pmu
    arm_dsu_pmu
    thunderx2-pmu

drivers/perf/Kconfig

Lines changed: 7 additions & 0 deletions
@@ -41,6 +41,13 @@ config ARM_CCN
 	  PMU (perf) driver supporting the ARM CCN (Cache Coherent Network)
 	  interconnect.
 
+config ARM_CMN
+	tristate "Arm CMN-600 PMU support"
+	depends on ARM64 || (COMPILE_TEST && 64BIT)
+	help
+	  Support for PMU events monitoring on the Arm CMN-600 Coherent Mesh
+	  Network interconnect.
+
 config ARM_PMU
 	depends on ARM || ARM64
 	bool "ARM PMU framework"

drivers/perf/Makefile

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ARM_CCI_PMU) += arm-cci.o
 obj-$(CONFIG_ARM_CCN) += arm-ccn.o
+obj-$(CONFIG_ARM_CMN) += arm-cmn.o
 obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
 obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
