
Conversation

@Avenger-285714

upstream commits:
GNR:
6d64273,perf/x86/intel/uncore: Support more units on Granite Rapids,2025-01-10 18:16:50,Kan Liang kan.liang@linux.intel.com,v6.14-rc1
3f710be,perf/x86/intel/uncore: Clean up func_id,2025-01-10 18:16:50,Kan Liang kan.liang@linux.intel.com,v6.14-rc1

Clearwater Forest:
CWF events:
e415c14,perf vendor events: Add Clearwaterforest events,2025-02-12 19:54:38,Ian Rogers irogers@google.com,v6.15-rc1,v6.15-rc1

CWF uncore:
fca24bf,perf/x86/intel/uncore: Support customized MMIO map size,2025-07-09 13:40:19,Kan Liang kan.liang@linux.intel.com,v6.17-rc1,v6.17-rc1
cf002da,perf/x86/intel/uncore: Support MSR portal for discovery tables,2025-07-09 13:40:19,Kan Liang kan.liang@linux.intel.com,v6.17-rc1,v6.17-rc1
b6ccddd,perf/x86/intel/uncore: Add Clearwater Forest support,2024-12-17 17:47:23,Kan Liang kan.liang@linux.intel.com,v6.13-rc5,v6.13-rc5
9828a1c,perf/x86/intel/uncore: Switch to new Intel CPU model defines,2024-04-29 10:30:39,Tony Luck tony.luck@intel.com,v6.10-rc1

CWF core:
3e830f6,perf/x86: Optimize the is_x86_event,2025-04-25 14:55:22,Kan Liang kan.liang@linux.intel.com,v6.16-rc1,v6.16-rc1
efd4485,perf/x86/intel: Check the X86 leader for ACR group,2025-04-25 14:55:22,Kan Liang kan.liang@linux.intel.com,v6.16-rc1,v6.16-rc1
e9988ad,perf/x86/intel: Check the X86 leader for pebs_counter_event_group,2025-04-25 14:55:19,Kan Liang kan.liang@linux.intel.com,v6.15-rc5,v6.15-rc5
75aea4b,perf/x86/intel: Only check the group flag for X86 leader,2025-04-25 14:55:19,Kan Liang kan.liang@linux.intel.com,v6.15-rc5,v6.15-rc5
25c623f,perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs,2025-04-17 14:21:24,Dapeng Mi dapeng1.mi@linux.intel.com,v6.16-rc1,v6.16-rc1
48d66c8,perf/x86/intel: Add PMU support for Clearwater Forest,2025-04-17 14:21:23,Dapeng Mi dapeng1.mi@linux.intel.com,v6.16-rc1,v6.16-rc1
a5f5e12,perf/x86/intel: Don't clear perf metrics overflow bit unconditionally,2025-04-17 14:19:07,Dapeng Mi dapeng1.mi@linux.intel.com,v6.15-rc3,v6.15-rc3
ec980e4,perf/x86/intel: Support auto counter reload,2025-04-08 20:55:49,Kan Liang kan.liang@linux.intel.com,v6.16-rc1,v6.16-rc1
1856c6c,perf/x86/intel: Add CPUID enumeration for the auto counter reload,2025-04-08 20:55:49,Kan Liang kan.liang@linux.intel.com,v6.16-rc1,v6.16-rc1
c9449c8,perf: Extend the bit width of the arch-specific flag,2025-04-08 20:55:49,Kan Liang kan.liang@linux.intel.com,v6.16-rc1,v6.16-rc1
0a65579,perf/x86/intel: Track the num of events needs late setup,2025-04-08 20:55:48,Kan Liang kan.liang@linux.intel.com,v6.16-rc1,v6.16-rc1
4dfe323,perf/x86: Add dynamic constraint,2025-04-08 20:55:48,Kan Liang kan.liang@linux.intel.com,v6.16-rc1,v6.16-rc1
47a973f,perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF,2025-02-08 15:47:25,Kan Liang kan.liang@linux.intel.com,v6.14-rc3
e02e9b0,perf/x86/intel: Support PEBS counters snapshotting,2025-02-05 10:29:45,Kan Liang kan.liang@linux.intel.com,v6.15-rc1,v6.15-rc1
0e45818,perf/x86/intel: Support RDPMC metrics clear mode,2024-12-20 15:31:22,Kan Liang kan.liang@linux.intel.com,v6.14-rc1,v6.14-rc1
b8c3a25,perf/x86/intel/ds: Add PEBS format 6,2024-12-17 17:47:23,Kan Liang kan.liang@linux.intel.com,v6.13-rc5,v6.13-rc5
ae55e30,perf/x86/intel/ds: Simplify the PEBS records processing for adaptive PEBS,2024-12-02 12:01:34,Kan Liang kan.liang@linux.intel.com,v6.14-rc1,v6.14-rc1
3c00ed3,perf/x86/intel/ds: Factor out functions for PEBS records processing,2024-12-02 12:01:34,Kan Liang kan.liang@linux.intel.com,v6.14-rc1,v6.14-rc1
7087bfb,perf/x86/intel/ds: Clarify adaptive PEBS processing,2024-12-02 12:01:34,Kan Liang kan.liang@linux.intel.com,v6.14-rc1,v6.14-rc1
149fd47,perf/x86/intel: Support Perfmon MSRs aliasing,2024-07-04 16:00:40,Kan Liang kan.liang@linux.intel.com,v6.11-rc1,v6.11-rc1
dce0c74,perf/x86/intel: Support PERFEVTSEL extension,2024-07-04 16:00:40,Kan Liang kan.liang@linux.intel.com,v6.11-rc1,v6.11-rc1
e8fb5d6,perf/x86: Add config_mask to represent EVENTSEL bitmask,2024-07-04 16:00:39,Kan Liang kan.liang@linux.intel.com,v6.11-rc1,v6.11-rc1
608f697,perf/x86/intel: Support new data source for Lunar Lake,2024-07-04 16:00:38,Kan Liang kan.liang@linux.intel.com,v6.11-rc1,v6.11-rc1
0902624,perf/x86/intel: Rename model-specific pebs_latency_data functions,2024-07-04 16:00:38,Kan Liang kan.liang@linux.intel.com,v6.11-rc1,v6.11-rc1
a932aa0,perf/x86: Add Lunar Lake and Arrow Lake support,2024-07-04 16:00:37,Kan Liang kan.liang@linux.intel.com,v6.11-rc1,v6.11-rc1
722e42e,perf/x86: Support counter mask,2024-07-04 16:00:36,Kan Liang kan.liang@linux.intel.com,v6.11-rc1,v6.11-rc1
a23eb2f,perf/x86/intel: Support the PEBS event mask,2024-07-04 16:00:36,Kan Liang kan.liang@linux.intel.com,v6.11-rc1,v6.11-rc1
d142df1,perf/x86/intel: Switch to new Intel CPU model defines,2024-05-28 10:59:02,Tony Luck tony.luck@intel.com,v6.11-rc1

TEST RESULT - all pass

uncore test
ls /sys/devices/*
/sys/devices/uncore_b2cmi_0:
/sys/devices/uncore_b2cmi_1:
/sys/devices/uncore_b2cmi_10:
/sys/devices/uncore_b2cmi_11:
/sys/devices/uncore_b2cmi_2:
/sys/devices/uncore_b2cmi_3:
/sys/devices/uncore_b2cmi_4:
/sys/devices/uncore_b2cmi_5:
/sys/devices/uncore_b2cmi_6:
/sys/devices/uncore_b2cmi_7:
/sys/devices/uncore_b2cmi_8:
/sys/devices/uncore_b2cmi_9:
/sys/devices/uncore_b2cxl_0:
/sys/devices/uncore_b2cxl_1:
/sys/devices/uncore_b2cxl_10:
/sys/devices/uncore_b2cxl_11:
/sys/devices/uncore_b2cxl_12:
/sys/devices/uncore_b2cxl_13:
/sys/devices/uncore_b2cxl_14:
/sys/devices/uncore_b2cxl_15:
/sys/devices/uncore_b2cxl_2:
/sys/devices/uncore_b2cxl_3:
/sys/devices/uncore_b2cxl_4:
/sys/devices/uncore_b2cxl_5:
/sys/devices/uncore_b2cxl_6:
/sys/devices/uncore_b2cxl_7:
/sys/devices/uncore_b2cxl_8:
/sys/devices/uncore_b2cxl_9:
/sys/devices/uncore_b2hot_0:
/sys/devices/uncore_b2hot_1:
/sys/devices/uncore_b2hot_10:
/sys/devices/uncore_b2hot_11:
/sys/devices/uncore_b2hot_12:
/sys/devices/uncore_b2hot_13:
/sys/devices/uncore_b2hot_14:
/sys/devices/uncore_b2hot_15:
/sys/devices/uncore_b2hot_16:
/sys/devices/uncore_b2hot_17:
/sys/devices/uncore_b2hot_18:
/sys/devices/uncore_b2hot_19:
/sys/devices/uncore_b2hot_2:
/sys/devices/uncore_b2hot_3:
/sys/devices/uncore_b2hot_4:
/sys/devices/uncore_b2hot_5:
/sys/devices/uncore_b2hot_6:
/sys/devices/uncore_b2hot_7:
/sys/devices/uncore_b2hot_8:
/sys/devices/uncore_b2hot_9:
/sys/devices/uncore_b2upi_0:
/sys/devices/uncore_b2upi_1:
/sys/devices/uncore_b2upi_2:
/sys/devices/uncore_b2upi_3:
/sys/devices/uncore_b2upi_4:
/sys/devices/uncore_b2upi_5:
/sys/devices/uncore_cha_0:
/sys/devices/uncore_cha_1:
/sys/devices/uncore_cha_10:
/sys/devices/uncore_cha_11:
/sys/devices/uncore_cha_12:
/sys/devices/uncore_cha_13:
/sys/devices/uncore_cha_14:
/sys/devices/uncore_cha_15:
/sys/devices/uncore_cha_16:
/sys/devices/uncore_cha_17:
/sys/devices/uncore_cha_18:
/sys/devices/uncore_cha_19:
/sys/devices/uncore_cha_2:
/sys/devices/uncore_cha_20:
/sys/devices/uncore_cha_21:
/sys/devices/uncore_cha_22:
/sys/devices/uncore_cha_23:
/sys/devices/uncore_cha_24:
/sys/devices/uncore_cha_25:
/sys/devices/uncore_cha_26:
/sys/devices/uncore_cha_27:
/sys/devices/uncore_cha_28:
/sys/devices/uncore_cha_29:
/sys/devices/uncore_cha_3:
/sys/devices/uncore_cha_30:
/sys/devices/uncore_cha_31:
/sys/devices/uncore_cha_32:
/sys/devices/uncore_cha_33:
/sys/devices/uncore_cha_34:
/sys/devices/uncore_cha_35:
/sys/devices/uncore_cha_36:
/sys/devices/uncore_cha_37:
/sys/devices/uncore_cha_38:
/sys/devices/uncore_cha_39:
/sys/devices/uncore_cha_4:
/sys/devices/uncore_cha_40:
/sys/devices/uncore_cha_41:
/sys/devices/uncore_cha_42:
/sys/devices/uncore_cha_43:
/sys/devices/uncore_cha_44:
/sys/devices/uncore_cha_45:
/sys/devices/uncore_cha_46:
/sys/devices/uncore_cha_47:
/sys/devices/uncore_cha_48:
/sys/devices/uncore_cha_49:
/sys/devices/uncore_cha_5:
/sys/devices/uncore_cha_50:
/sys/devices/uncore_cha_51:
/sys/devices/uncore_cha_52:
/sys/devices/uncore_cha_53:
/sys/devices/uncore_cha_54:
/sys/devices/uncore_cha_55:
/sys/devices/uncore_cha_56:
/sys/devices/uncore_cha_57:
/sys/devices/uncore_cha_58:
/sys/devices/uncore_cha_59:
/sys/devices/uncore_cha_6:
/sys/devices/uncore_cha_60:
/sys/devices/uncore_cha_61:
/sys/devices/uncore_cha_62:
/sys/devices/uncore_cha_63:
/sys/devices/uncore_cha_64:
/sys/devices/uncore_cha_65:
/sys/devices/uncore_cha_7:
/sys/devices/uncore_cha_8:
/sys/devices/uncore_cha_9:
/sys/devices/uncore_cxlcm_16:
/sys/devices/uncore_cxlcm_18:
/sys/devices/uncore_cxlcm_2:
/sys/devices/uncore_cxlcm_4:
/sys/devices/uncore_cxlcm_6:
/sys/devices/uncore_cxlcm_8:
/sys/devices/uncore_cxldp_17:
/sys/devices/uncore_cxldp_19:
/sys/devices/uncore_cxldp_3:
/sys/devices/uncore_cxldp_5:
/sys/devices/uncore_cxldp_7:
/sys/devices/uncore_cxldp_9:
/sys/devices/uncore_iio_1:
/sys/devices/uncore_iio_11:
/sys/devices/uncore_iio_12:
/sys/devices/uncore_iio_14:
/sys/devices/uncore_iio_2:
/sys/devices/uncore_iio_3:
/sys/devices/uncore_iio_4:
/sys/devices/uncore_iio_5:
/sys/devices/uncore_iio_6:
/sys/devices/uncore_iio_9:
/sys/devices/uncore_iio_free_running_0:
/sys/devices/uncore_iio_free_running_1:
/sys/devices/uncore_iio_free_running_10:
/sys/devices/uncore_iio_free_running_11:
/sys/devices/uncore_iio_free_running_12:
/sys/devices/uncore_iio_free_running_13:
/sys/devices/uncore_iio_free_running_14:
/sys/devices/uncore_iio_free_running_2:
/sys/devices/uncore_iio_free_running_3:
/sys/devices/uncore_iio_free_running_4:
/sys/devices/uncore_iio_free_running_5:
/sys/devices/uncore_iio_free_running_6:
/sys/devices/uncore_iio_free_running_7:
/sys/devices/uncore_iio_free_running_8:
/sys/devices/uncore_iio_free_running_9:
/sys/devices/uncore_imc:
/sys/devices/uncore_irp_1:
/sys/devices/uncore_irp_11:
/sys/devices/uncore_irp_12:
/sys/devices/uncore_irp_14:
/sys/devices/uncore_irp_2:
/sys/devices/uncore_irp_3:
/sys/devices/uncore_irp_4:
/sys/devices/uncore_irp_5:
/sys/devices/uncore_irp_6:
/sys/devices/uncore_irp_9:
/sys/devices/uncore_mdf_sbo_0:
/sys/devices/uncore_mdf_sbo_1:
/sys/devices/uncore_mdf_sbo_10:
/sys/devices/uncore_mdf_sbo_11:
/sys/devices/uncore_mdf_sbo_12:
/sys/devices/uncore_mdf_sbo_13:
/sys/devices/uncore_mdf_sbo_14:
/sys/devices/uncore_mdf_sbo_15:
/sys/devices/uncore_mdf_sbo_16:
/sys/devices/uncore_mdf_sbo_17:
/sys/devices/uncore_mdf_sbo_18:
/sys/devices/uncore_mdf_sbo_19:
/sys/devices/uncore_mdf_sbo_2:
/sys/devices/uncore_mdf_sbo_20:
/sys/devices/uncore_mdf_sbo_21:
/sys/devices/uncore_mdf_sbo_22:
/sys/devices/uncore_mdf_sbo_23:
/sys/devices/uncore_mdf_sbo_24:
/sys/devices/uncore_mdf_sbo_25:
/sys/devices/uncore_mdf_sbo_26:
/sys/devices/uncore_mdf_sbo_27:
/sys/devices/uncore_mdf_sbo_28:
/sys/devices/uncore_mdf_sbo_29:
/sys/devices/uncore_mdf_sbo_3:
/sys/devices/uncore_mdf_sbo_30:
/sys/devices/uncore_mdf_sbo_31:
/sys/devices/uncore_mdf_sbo_32:
/sys/devices/uncore_mdf_sbo_33:
/sys/devices/uncore_mdf_sbo_34:
/sys/devices/uncore_mdf_sbo_35:
/sys/devices/uncore_mdf_sbo_36:
/sys/devices/uncore_mdf_sbo_37:
/sys/devices/uncore_mdf_sbo_38:
/sys/devices/uncore_mdf_sbo_39:
/sys/devices/uncore_mdf_sbo_4:
/sys/devices/uncore_mdf_sbo_40:
/sys/devices/uncore_mdf_sbo_41:
/sys/devices/uncore_mdf_sbo_42:
/sys/devices/uncore_mdf_sbo_43:
/sys/devices/uncore_mdf_sbo_44:
/sys/devices/uncore_mdf_sbo_45:
/sys/devices/uncore_mdf_sbo_46:
/sys/devices/uncore_mdf_sbo_47:
/sys/devices/uncore_mdf_sbo_48:
/sys/devices/uncore_mdf_sbo_49:
/sys/devices/uncore_mdf_sbo_5:
/sys/devices/uncore_mdf_sbo_50:
/sys/devices/uncore_mdf_sbo_51:
/sys/devices/uncore_mdf_sbo_52:
/sys/devices/uncore_mdf_sbo_53:
/sys/devices/uncore_mdf_sbo_54:
/sys/devices/uncore_mdf_sbo_55:
/sys/devices/uncore_mdf_sbo_56:
/sys/devices/uncore_mdf_sbo_57:
/sys/devices/uncore_mdf_sbo_58:
/sys/devices/uncore_mdf_sbo_59:
/sys/devices/uncore_mdf_sbo_6:
/sys/devices/uncore_mdf_sbo_60:
/sys/devices/uncore_mdf_sbo_61:
/sys/devices/uncore_mdf_sbo_62:
/sys/devices/uncore_mdf_sbo_63:
/sys/devices/uncore_mdf_sbo_64:
/sys/devices/uncore_mdf_sbo_65:
/sys/devices/uncore_mdf_sbo_66:
/sys/devices/uncore_mdf_sbo_67:
/sys/devices/uncore_mdf_sbo_68:
/sys/devices/uncore_mdf_sbo_69:
/sys/devices/uncore_mdf_sbo_7:
/sys/devices/uncore_mdf_sbo_70:
/sys/devices/uncore_mdf_sbo_71:
/sys/devices/uncore_mdf_sbo_72:
/sys/devices/uncore_mdf_sbo_73:
/sys/devices/uncore_mdf_sbo_74:
/sys/devices/uncore_mdf_sbo_75:
/sys/devices/uncore_mdf_sbo_76:
/sys/devices/uncore_mdf_sbo_77:
/sys/devices/uncore_mdf_sbo_78:
/sys/devices/uncore_mdf_sbo_79:
/sys/devices/uncore_mdf_sbo_8:
/sys/devices/uncore_mdf_sbo_9:
/sys/devices/uncore_pciex16_2:
/sys/devices/uncore_pciex16_3:
/sys/devices/uncore_pciex16_8:
/sys/devices/uncore_pciex16_9:
/sys/devices/uncore_pciex8:
/sys/devices/uncore_pcu_0:
/sys/devices/uncore_pcu_1:
/sys/devices/uncore_pcu_2:
/sys/devices/uncore_pcu_3:
/sys/devices/uncore_pcu_4:
/sys/devices/uncore_ubox:
/sys/devices/uncore_upi_0:
/sys/devices/uncore_upi_1:
/sys/devices/uncore_upi_2:
/sys/devices/uncore_upi_3:
/sys/devices/uncore_upi_4:
/sys/devices/uncore_upi_5:
./perf stat -e uncore_upi/event=0x1/,uncore_cha/event=0x1/,uncore_imc/event=0x1/ -a sleep 1

Performance counter stats for 'system wide':

10,432,220,068 uncore_upi/event=0x1/
75,878,604,600 uncore_cha/event=0x1/
840,576,348 uncore_imc/event=0x1/

1.002999881 seconds time elapsed
core test
./perf stat -a sleep 1

Performance counter stats for 'system wide':

392,913.63 msec cpu-clock                        #  384.673 CPUs utilized
     1,345      context-switches                 #    3.423 /sec
       386      cpu-migrations                   #    0.982 /sec
        85      page-faults                      #    0.216 /sec

   816,854,365      cycles                           #    0.002 GHz
   214,062,498      instructions                     #    0.26  insn per cycle
    45,335,327      branches                         #  115.382 K/sec
       992,884      branch-misses                    #    2.19% of all branches

1.021422273 seconds time elapsed
./perf record -e instructions -Iax,bx -b -c 100000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.040 MB perf.data (17 samples) ]
./perf record -e branches -Iax,bx -b -c 10000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.053 MB perf.data (39 samples) ]

event test
./perf stat -e LONGEST_LAT_CACHE.MISS,LONGEST_LAT_CACHE.REFERENCE -a sleep 1

Performance counter stats for 'system wide':

   770,872      LONGEST_LAT_CACHE.MISS
10,448,146      LONGEST_LAT_CACHE.REFERENCE

1.019260018 seconds time elapsed
./perf record -e instructions -Iax,bx -b -c 100000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.036 MB perf.data (17 samples) ]
./perf record -e branches -Iax,bx -b -c 10000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.054 MB perf.data (39 samples) ]

Kan Liang and others added 8 commits December 6, 2025 16:07
commit 3f710be upstream.

The below warning may be triggered on GNR when the PCIE uncore units are
exposed.

WARNING: CPU: 4 PID: 1 at arch/x86/events/intel/uncore.c:1169 uncore_pci_pmu_register+0x158/0x190

The current uncore driver assumes that all the devices in the same PMU
have the exact same devfn. It's true for the previous platforms. But it
doesn't work for the new PCIE uncore units on GNR.

The assumption doesn't make sense. There is no reason to limit the
devices from the same PMU to the same devfn. Also, the current code just
throws the warning, but still registers the device. The WARN_ON_ONCE()
should be removed.

The func_id is used by the later event_init() to check if an event->pmu
has valid devices. For cpu and mmio uncore PMUs, they are always valid.
For pci uncore PMUs, it's set when the PMU is registered. It can be
replaced by the pmu->registered. Clean up the func_id.

Intel-SIG: commit 3f710be perf/x86/intel/uncore: Clean up func_id.
PMU GNR support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Eric Hu <eric.hu@intel.com>
Link: https://lkml.kernel.org/r/20250108143017.1793781-1-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 6d64273 upstream.

The same CXL PMON support is also available on GNR. Apply
spr_uncore_cxlcm and spr_uncore_cxldp to GNR as well.

The other units were broken on early HW samples, so they were ignored in
the early enabling patch. The issue has been fixed and verified on the
later production HW. Add UPI, B2UPI, B2HOT, PCIEX16 and PCIEX8 for GNR.

Intel-SIG: commit 6d64273 perf/x86/intel/uncore: Support more units on Granite Rapids.
PMU GNR support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Eric Hu <eric.hu@intel.com>
Link: https://lkml.kernel.org/r/20250108143017.1793781-2-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit e415c14 upstream.

Add events v1.00.

Bring in the events from:
https://github.com/intel/perfmon/tree/main/CWF/events

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Intel-SIG: commit e415c14 perf vendor events: Add Clearwaterforest events.
PMU Clearwater Forest support

Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit b6ccddd upstream.

From the perspective of the uncore PMU, Clearwater Forest is the
same as the previous Sierra Forest. The only difference is the event
list, which will be supported in the perf tool later.

Intel-SIG: commit b6ccddd perf/x86/intel/uncore: Add Clearwater Forest support.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20241211161146.235253-1-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit cf002da upstream.

Starting from Panther Lake, the discovery table mechanism is also
supported on client platforms. The difference is that the portal of the
global discovery table is retrieved from an MSR.

The layout of the discovery tables is the same as on the server
platforms. Factor out __parse_discovery_table() to parse the discovery
tables.

The uncore PMON is die scoped. The discovery tables need to be parsed
for each die.

Intel-SIG: commit cf002da perf/x86/intel/uncore: Support MSR portal for discovery tables.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250707201750.616527-2-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit fca24bf upstream.

For a server platform, the MMIO map size is always 0x4000. However, a
client platform may have a smaller map size.

Make the map size customizable.

Intel-SIG: commit fca24bf perf/x86/intel/uncore: Support customized MMIO map size.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250707201750.616527-3-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit a23eb2f upstream.

The current perf assumes that the counters that support PEBS are
contiguous, but that is no longer guaranteed once the new CPUID leaf
0x23 is introduced. The counters are enumerated with a counter mask,
and there may be holes in the counter mask on future platforms or in a
virtualization environment.

Store the PEBS event mask rather than the maximum number of PEBS
counters in the x86 PMU structures.

Intel-SIG: commit a23eb2f perf/x86/intel: Support the PEBS event mask.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-2-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
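The shift from "maximum number of PEBS counters" to a PEBS event mask can be sketched in userspace C as follows. This is a minimal illustration of the idea only; the structure and helper names here are hypothetical, not the kernel's actual ones:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: record which counters support PEBS as a bitmask,
 * so holes in the counter space are representable (unlike a plain
 * "max number of PEBS counters" integer). */
struct pmu_caps {
    uint64_t pebs_events_mask;  /* bit i set => counter i supports PEBS */
};

static int counter_supports_pebs(const struct pmu_caps *caps, unsigned int idx)
{
    return (int)((caps->pebs_events_mask >> idx) & 1);
}

static unsigned int pebs_counter_count(const struct pmu_caps *caps)
{
    /* the count is now derived from the mask, not stored separately */
    return (unsigned int)__builtin_popcountll(caps->pebs_events_mask);
}
```

With a mask like 0xB (counters 0, 1, and 3), counter 2 is correctly reported as not supporting PEBS, which a contiguous-count representation cannot express.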
commit d142df1 upstream.

New CPU #defines encode vendor and family as well as model.

Intel-SIG: commit d142df1 perf/x86/intel: Switch to new Intel CPU model defines.
PMU Clearwater Forest support

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20240520224620.9480-32-tony.luck%40intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
Copilot AI review requested due to automatic review settings December 6, 2025 08:46

@sourcery-ai sourcery-ai bot left a comment


Sorry @Avenger-285714, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery


Copilot AI left a comment


Pull request overview

This pull request backports critical Intel PMU infrastructure improvements and adds Clearwater Forest platform support to the Linux 6.6-y stable kernel. The changes refactor the x86 PMU subsystem to use bitmask-based counter representation instead of simple counter counts, enabling more flexible hardware configurations and advanced features like Auto Counter Reload (ACR) and PEBS counter snapshotting.

Key Changes:

  • Architectural refactoring: Converted counter tracking from integer counts to 64-bit bitmasks for general-purpose and fixed counters
  • Added support for Intel Clearwater Forest (Darkmont) and Lunar Lake platforms with new PMU capabilities
  • Implemented ACR (Auto Counter Reload) and PEBS counter snapshotting features for v6+ PMU architectures
  • Added support for MSR aliasing, extended EVENTSEL fields (EQ, UMASK2), and dynamic constraints

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
arch/x86/events/perf_event.h Core header changes: counter masks, ACR/PEBS support structures, helper functions
arch/x86/events/core.c Core counter handling refactored from counts to bitmasks
arch/x86/events/intel/core.c Intel PMU driver: ACR support, new platforms (Lunarlake, Clearwater Forest), event constraints
arch/x86/events/intel/ds.c PEBS format 6 support, counter snapshotting, latency data handling for new platforms
arch/x86/events/intel/uncore*.c Uncore driver updates: MSR portal support, GNR unit additions
arch/x86/events/zhaoxin/core.c CRITICAL BUG: Typo in macro name (ENMASK_ULL → GENMASK_ULL)
arch/x86/events/amd/core.c AMD driver updated to use bitmask representation
arch/x86/include/asm/perf_event.h New MSR definitions, PEBS/ACR constants, data structure updates
tools/perf/pmu-events/arch/x86/clearwaterforest/*.json Performance monitoring event definitions for Clearwater Forest



x86_pmu.version = version;
x86_pmu.num_counters = eax.split.num_counters;
x86_pmu.cntr_mask64 = ENMASK_ULL(eax.split.num_counters - 1, 0);

Copilot AI Dec 6, 2025


Typo in macro name: ENMASK_ULL should be GENMASK_ULL.

Suggested change
x86_pmu.cntr_mask64 = ENMASK_ULL(eax.split.num_counters - 1, 0);
x86_pmu.cntr_mask64 = GENMASK_ULL(eax.split.num_counters - 1, 0);

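For reference, the kernel's GENMASK_ULL(h, l) (from include/linux/bits.h) produces a 64-bit value with bits l through h set, which is why GENMASK_ULL(num_counters - 1, 0) yields a mask covering counters 0 through num_counters - 1. A userspace sketch of the same semantics:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace re-implementation of the kernel's GENMASK_ULL(h, l):
 * a 64-bit value with bits l..h (inclusive) set.
 * Assumes 0 <= l <= h <= 63. */
static uint64_t genmask_ull(unsigned int h, unsigned int l)
{
    return (~0ULL >> (63 - h)) & (~0ULL << l);
}
```

So with eax.split.num_counters == 8, genmask_ull(7, 0) is 0xFF: a counter mask with bits 0 through 7 set.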
Kan Liang added 16 commits December 9, 2025 00:14
commit 722e42e upstream.

The current perf assumes that both GP and fixed counters are contiguous.
But it's not guaranteed on newer Intel platforms or in a virtualization
environment.

Use the counter mask to replace the number of counters for both GP and
the fixed counters. For the other archs or older platforms which don't
support a counter mask, use GENMASK_ULL(num_counter - 1, 0) as a
replacement. There is no functional change for them.

The interface to KVM is not changed. The number of counters is still
passed to KVM. It can be updated later separately.

Intel-SIG: commit 722e42e perf/x86: Support counter mask.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-3-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
Signed-off-by: WangYuli <wangyuli@aosc.io>
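The fallback described above, deriving a mask from a plain counter count and then walking set bits instead of assuming the counters are contiguous, can be illustrated with a small sketch (names here are hypothetical, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/* Platforms without an architectural counter mask derive one from a
 * plain count: GENMASK_ULL(num_counter - 1, 0) equivalent. */
static uint64_t cntr_mask_from_count(unsigned int num_counters)
{
    return (num_counters >= 64) ? ~0ULL : ((1ULL << num_counters) - 1);
}

/* Loops over counters then walk set bits, tolerating holes in the
 * mask instead of iterating 0..num_counters-1. */
static unsigned int count_available(uint64_t mask)
{
    unsigned int n = 0;
    for (uint64_t m = mask; m; m &= m - 1)  /* clear lowest set bit */
        n++;
    return n;
}
```

A mask such as 0xF0F (counters 0-3 and 8-11) has a hole at counters 4-7 yet still reports eight available counters, which is exactly the case the old "number of counters" representation could not handle.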
commit a932aa0 upstream.

From the PMU's perspective, Lunar Lake and Arrow Lake are similar to the
previous-generation Meteor Lake. Both are hybrid platforms, with e-cores
and p-cores.

The key differences include:
- The e-core supports 3 new fixed counters
- The p-core supports an updated PEBS Data Source format
- More GP counters (Updated event constraint table)
- New Architectural performance monitoring V6
  (New Perfmon MSRs aliasing, umask2, eq).
- New PEBS format V6 (Counters Snapshotting group)
- New RDPMC metrics clear mode

The legacy features, the 3 new fixed counters, and the updated event
constraint table are enabled in this patch.

The new PEBS data source format, the architectural performance
monitoring V6, the PEBS format V6, and the new RDPMC metrics clear mode
are supported in the following patches.

Intel-SIG: commit a932aa0 perf/x86: Add Lunar Lake and Arrow Lake support.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-4-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 0902624 upstream.

The model-specific pebs_latency_data functions of ADL and MTL use
"small" as a postfix to indicate the e-core. The postfix is too generic
for a model-specific function: it cannot be directly mapped to a
specific uarch, which would facilitate development and maintenance.
Use the abbreviation of the uarch to rename the model-specific
functions.

Intel-SIG: commit 0902624 perf/x86/intel: Rename model-specific pebs_latency_data functions.
PMU Clearwater Forest support

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-5-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 608f697 upstream.

A new PEBS data source format is introduced for the p-core of Lunar
Lake. The data source field is extended to 8 bits with new encodings.

A new layout is introduced into the union intel_x86_pebs_dse.
Introduce the lnl_latency_data() to parse the new format.
Enlarge the pebs_data_source[] accordingly to include new encodings.

Only the mem load and the mem store events can generate the data source.
Introduce INTEL_HYBRID_LDLAT_CONSTRAINT and
INTEL_HYBRID_STLAT_CONSTRAINT to mark them.

Add two new bits for the new cache-related data src, L2_MHB and MSC.
The L2_MHB is short for L2 Miss Handling Buffer, which is similar to
LFB (Line Fill Buffer), but to track the L2 Cache misses.
The MSC stands for the memory-side cache.

Intel-SIG: commit 608f697 perf/x86/intel: Support new data source for Lunar Lake.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-6-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit e8fb5d6 upstream.

Different vendors may support different fields in the EVENTSEL MSR. For
example, Intel introduces the new umask2 and eq fields in the EVENTSEL
MSR starting from Perfmon version 6. However, a fixed mask,
X86_RAW_EVENT_MASK, is used to filter attr.config.

Introduce a new config_mask to record the real supported EVENTSEL
bitmask.
Only apply it to the existing code now. No functional change.

Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Intel-SIG: commit e8fb5d6 perf/x86: Add config_mask to represent EVENTSEL bitmask.
PMU Clearwater Forest support

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-7-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
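The config_mask idea can be sketched as a per-PMU validity check. This is a hedged illustration with hypothetical names; the real kernel code filters attr.config differently in its details:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: each PMU records the EVENTSEL bits it actually
 * supports, instead of every PMU being filtered through one fixed
 * X86_RAW_EVENT_MASK.  PMUs that support extra fields (e.g. umask2/eq)
 * simply carry a wider mask. */
struct pmu_sketch {
    uint64_t config_mask;  /* bits this PMU's EVENTSEL supports */
};

static int config_is_valid(const struct pmu_sketch *pmu, uint64_t config)
{
    /* reject configs that set bits outside the supported mask */
    return (config & ~pmu->config_mask) == 0;
}
```

The benefit is that widening support for a new EVENTSEL field is a per-PMU data change (extending config_mask) rather than a change to a global constant shared by all vendors.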
commit dce0c74 upstream.

Two new fields (the unit mask2 and the equal flag) are added in the
IA32_PERFEVTSELx MSRs. They can be enumerated via CPUID.23H.0.EBX.

Update the config_mask in x86_pmu and x86_hybrid_pmu for the true layout
of the PERFEVTSEL.
Expose the new formats into sysfs if they are available. The umask
extension reuses the same format attr name "umask" as the previous
umask. Add umask2_show to determine/display the correct format
for the current machine.

Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Intel-SIG: commit dce0c74 perf/x86/intel: Support PERFEVTSEL extension.
PMU Clearwater Forest support

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-8-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 149fd47 upstream.

The architectural performance monitoring V6 supports a new range of
counters' MSRs in the 19xxH address range. They include all the GP
counter MSRs, the GP control MSRs, and the fixed counter MSRs.

The step between each sibling counter is 4. Add intel_pmu_addr_offset()
to calculate the correct offset.

Add fixedctr in struct x86_pmu to store the address of the fixed counter
0. It can be used to calculate the rest of the fixed counters.

The MSR address of the fixed counter control is not changed.

Intel-SIG: commit 149fd47 perf/x86/intel: Support Perfmon MSRs aliasing.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-9-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 7087bfb upstream.

Modify the pebs_basic and pebs_meminfo structs to make the bitfields
more explicit to ease readability of the code.

Co-developed-by: Stephane Eranian <eranian@google.com>
Intel-SIG: commit 7087bfb perf/x86/intel/ds: Clarify adaptive PEBS processing.
PMU Clearwater Forest support

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241119135504.1463839-3-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 3c00ed3 upstream.

Factor out functions to process normal and the last PEBS records, which
can be shared with the later patch.

Move the event updating related code (intel_pmu_save_and_restart())
to the end, where all samples have been processed.
For the current usage, it doesn't matter when perf updates event counts
and resets the counter, because all counters are stopped when the PEBS
buffer is drained.
Drop the return of the !intel_pmu_save_and_restart(event) check,
because it can never happen. The intel_pmu_save_and_restart(event)
only returns 0 when !hwc->event_base or the period_left > 0.
- The !hwc->event_base is impossible for the PEBS event, since the PEBS
  event is only available on GP and fixed counters, which always have
  a valid hwc->event_base.
- The check only happens for the case of non-AUTO_RELOAD and single
  PEBS, which implies that the event must be overflowed. The period_left
  must always be <= 0 for an overflowed event after the
  x86_pmu_update().

Co-developed-by: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Intel-SIG: commit 3c00ed3 perf/x86/intel/ds: Factor out functions for PEBS records processing.
PMU Clearwater Forest support

Signed-off-by: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241119135504.1463839-4-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
…PEBS

commit ae55e30 upstream.

The current code may iterate all the PEBS records in the DS area several
times. The first loop is to find all active events and calculate the
available records for each event. Then iterate the whole buffer again
and again to process available records until all active events are
processed.

The algorithm is inherited from the old generations. The old PEBS
hardware does not deal well with the situation when events happen near
each other. SW has to drop the error records. Multiple iterations are
required.

The hardware limit has been addressed on newer platforms with adaptive
PEBS. A simple one-iteration algorithm is introduced.

With the patch, the samples are output in record order rather than
event order. It doesn't impact the post-processing. The perf tool always
sorts the records by time before presenting them to the end user.

In an NMI, the last record has to be specially handled. Add a last[]
variable to track the last unprocessed record of each event.

Test:

11 PEBS events are used in the perf test. Only the basic information is
collected.
perf record -e instructions:up,...,instructions:up -c 2000003 benchmark

ftrace is used to record the duration of the
intel_pmu_drain_pebs_icl().

The average duration reduced from 62.04us to 57.94us.

A small improvement can be observed with the new algorithm.
Also, the implementation becomes simpler and more straightforward.

Intel-SIG: commit ae55e30 perf/x86/intel/ds: Simplify the PEBS records processing for adaptive PEBS.
PMU Clearwater Forest support

Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20241119135504.1463839-5-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit b8c3a25 upstream.

The only difference between formats 5 and 6 is the new counters
snapshotting group. Without the following counters snapshotting enabling
patches, it's impossible to utilize the feature in a PEBS record, so
it's safe to share the same code path with format 5.

Add format 6, so the end user can at least utilize the legacy PEBS
features.

Fixes: a932aa0 ("perf/x86: Add Lunar Lake and Arrow Lake support")
Intel-SIG: commit b8c3a25 perf/x86/intel/ds: Add PEBS format 6.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241216204505.748363-1-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 0e45818 upstream.

The new RDPMC enhancement, metrics clear mode, clears the
PERF_METRICS-related resources as well as the fixed-function performance
monitoring counter 3 after the read is performed. It is available for
ring 3. The feature is enumerated by the
IA32_PERF_CAPABILITIES.RDPMC_CLEAR_METRICS[bit 19]. To enable the
feature, the IA32_FIXED_CTR_CTRL.METRICS_CLEAR_EN[bit 14] must be set.

Two ways were considered to enable the feature.
- Expose a knob in the sysfs globally. One user may affect the
  measurement of other users when changing the knob. The solution is
  dropped.
- Introduce a new event format, metrics_clear, for the slots event to
  disable/enable the feature only for the current process. Users can
  utilize the feature as needed.
The latter solution is implemented in the patch.

The current KVM doesn't support the perf metrics yet. For
virtualization, the feature can be enabled later separately.

Intel-SIG: commit 0e45818 perf/x86/intel: Support RDPMC metrics clear mode.
PMU Clearwater Forest support

Suggested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20241211160318.235056-1-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit e02e9b0 upstream.

The counters snapshotting is a new adaptive PEBS extension, which can
capture programmable counters, fixed-function counters, and performance
metrics in a PEBS record. The feature is available in the PEBS format
V6.

The target counters can be configured in the new fields of MSR_PEBS_CFG.
Then the PEBS HW will generate the bit mask of counters (Counters Group
Header) followed by the content of all the requested counters into a
PEBS record.

The current Linux perf sample read feature can read all events in the
group when any event in the group is overflowed. But the rdpmc in the
NMI/overflow handler has a small gap from overflow. Also, there is some
overhead for each rdpmc read. The counters snapshotting feature can be
used as an accurate and low-overhead replacement.

Extend intel_update_topdown_event() to accept the value from PEBS
records.

Add a new PEBS_CNTR flag to indicate a sample read group that utilizes
the counters snapshotting feature. When the group is scheduled, the
PEBS configuration can be updated accordingly.

To prevent the case that a PEBS record value might be in the past
relative to what is already in the event, perf always stops the PMU and
drains the PEBS buffer before updating the corresponding event->count.

Intel-SIG: commit e02e9b0 perf/x86/intel: Support PEBS counters snapshotting.
PMU Clearwater Forest support

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250121152303.3128733-4-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 4dfe323 upstream.

More and more features require a dynamic event constraint, e.g., branch
counter logging, auto counter reload, Arch PEBS, etc.

Add a generic flag, PMU_FL_DYN_CONSTRAINT, to indicate the case. It
avoids adding individual flags to intel_cpuc_prepare() one by one.

Add a variable dyn_constraint in the struct hw_perf_event to track the
dynamic constraint of the event. Apply it if it's updated.

Apply the generic dynamic constraint for branch counter logging.
Many features on and after V6 require a dynamic constraint, so
unconditionally set the flag for V6+.

Intel-SIG: commit 4dfe323 perf/x86: Add dynamic constraint.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lkml.kernel.org/r/20250327195217.2683619-2-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 0a65579 upstream.

When a machine supports PEBS v6, perf unconditionally searches the
cpuc->event_list[] for every event and checks whether the late setup is
required, which is unnecessary.

The late setup is only required for special events, e.g., events that
support the counters snapshotting feature. Add n_late_setup to track the
number of events that need the late setup.

Other features, e.g., the auto counter reload feature, require the late
setup as well. Add a wrapper, intel_pmu_pebs_late_setup, for the events
that support the counters snapshotting feature.

Intel-SIG: commit 0a65579 perf/x86/intel: Track the num of events needs late setup.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lkml.kernel.org/r/20250327195217.2683619-3-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit c9449c8 upstream.

The auto counter reload feature requires an event flag to indicate an
auto counter reload group, which can only be scheduled on specific
counters that are enumerated in CPUID. However, hw_perf_event.flags has
run out of bits on X86.

Two solutions were considered to address the issue.
- Currently, 20 bits are reserved for the architecture-specific flags.
  Only the bit 31 is used for the generic flag. There is still plenty
  of space left. Reserve 8 more bits for the arch-specific flags.
- Add a new X86 specific hw_perf_event.flags1 to support more flags.

The former is implemented. Enough room is still left in the global
generic flag.

Intel-SIG: commit c9449c8 perf: Extend the bit width of the arch-specific flag.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lkml.kernel.org/r/20250327195217.2683619-4-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
Kan Liang and others added 8 commits December 9, 2025 00:14
commit 1856c6c upstream.

The counters that support the auto counter reload feature can be
enumerated in the CPUID Leaf 0x23 sub-leaf 0x2.

Add acr_cntr_mask to store the mask of counters which are reloadable.
Add acr_cause_mask to store the mask of counters which can cause reload.
Since the e-core and p-core may have different numbers of counters,
track the masks in the struct x86_hybrid_pmu as well.

Intel-SIG: commit 1856c6c perf/x86/intel: Add CPUID enumeration for the auto counter reload.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lkml.kernel.org/r/20250327195217.2683619-5-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit ec980e4 upstream.

The relative rates among two or more events are useful for performance
analysis, e.g., a high branch miss rate may indicate a performance
issue. Usually, the samples with a relative rate that exceeds some
threshold are more useful. However, the traditional sampling takes
samples of events separately. To get the relative rates among two or
more events, a high sample rate is required, which can bring high
overhead. Many samples taken in the non-hotspot area are also dropped
(useless) in the post-process.

The auto counter reload (ACR) feature takes samples when the relative
rate of two or more events exceeds some threshold, which provides the
fine-grained information at a low cost.
To support the feature, two sets of MSRs are introduced. For a given
counter IA32_PMC_GPn_CTR/IA32_PMC_FXm_CTR, bit fields in the
IA32_PMC_GPn_CFG_B/IA32_PMC_FXm_CFG_B MSR indicate which counter(s)
can cause a reload of that counter. The reload value is stored in the
IA32_PMC_GPn_CFG_C/IA32_PMC_FXm_CFG_C.
The details can be found at Intel SDM (085), Volume 3, 21.9.11 Auto
Counter Reload.

In the hw_config(), an ACR event is specially configured, because the
cause/reloadable counter mask has to be applied to the dyn_constraint.
Besides the HW limits (e.g., no support for perf metrics, PDist, etc.),
a SW limit is applied as well: ACR events in a group must be contiguous.
It facilitates the later conversion from the event idx to the counter
idx. Otherwise, the intel_pmu_acr_late_setup() has to traverse the whole
event list again to find the "cause" event.
Also, add a new flag PERF_X86_EVENT_ACR to indicate an ACR group; it is
set on the group leader.

The late setup() is also required for an ACR group. It converts the
event idx to the counter idx and saves it in hw.config1.

The ACR configuration MSRs are only updated in the enable_event().
The disable_event() doesn't clear the ACR CFG register.
Add acr_cfg_b/acr_cfg_c in the struct cpu_hw_events to cache the MSR
values. It can avoid an MSR write if the value is not changed.

Expose an acr_mask to the sysfs. The perf tool can utilize the new
format to configure the relation of events in the group. The bit
sequence of the acr_mask follows the events enabled order of the group.

Example:

Here is a snippet of mispredict.c. Since the array holds random
numbers, jumps are random and often mispredicted.
The mispredicted rate depends on the compared value.

For the Loop1, ~11% of all branches are mispredicted.
For the Loop2, ~21% of all branches are mispredicted.

main()
{
...
        for (i = 0; i < N; i++)
                data[i] = rand() % 256;
...
        /* Loop 1 */
        for (k = 0; k < 50; k++)
                for (i = 0; i < N; i++)
                        if (data[i] >= 64)
                                sum += data[i];
...

...
        /* Loop 2 */
        for (k = 0; k < 50; k++)
                for (i = 0; i < N; i++)
                        if (data[i] >= 128)
                                sum += data[i];
...
}

Usually, code with a high branch miss rate means bad performance.
To understand the branch miss rate of the code, the traditional method
usually samples both the branches and branch-misses events. E.g.,
perf record -e "{cpu_atom/branch-misses/ppu, cpu_atom/branch-instructions/u}"
               -c 1000000 -- ./mispredict

[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.925 MB perf.data (5106 samples) ]
The 5106 samples are from both events and spread in both Loops.
In the post-process stage, a user can know that the Loop 2 has a 21%
branch miss rate. Then they can focus on the samples of branch-misses
events for the Loop 2.

With this patch, the user can generate the samples only when the branch
miss rate > 20%. For example,
perf record -e "{cpu_atom/branch-misses,period=200000,acr_mask=0x2/ppu,
                 cpu_atom/branch-instructions,period=1000000,acr_mask=0x3/u}"
                -- ./mispredict

(Two different periods are applied to branch-misses and
branch-instructions. The ratio is set to 20%.
If the branch-instructions is overflowed first, the branch-miss
rate < 20%. No samples should be generated. All counters should be
automatically reloaded.
If the branch-misses is overflowed first, the branch-miss rate > 20%.
A sample triggered by the branch-misses event should be
generated. Just the counter of the branch-instructions should be
automatically reloaded.

The branch-misses event should only be automatically reloaded when
the branch-instructions is overflowed. So the "cause" event is the
branch-instructions event. The acr_mask is set to 0x2, since the
event index in the group of branch-instructions is 1.

The branch-instructions event is automatically reloaded no matter which
events are overflowed. So the "cause" events are the branch-misses
and the branch-instructions event. The acr_mask should be set to 0x3.)

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.098 MB perf.data (2498 samples) ]

 $perf report

Percent       │154:   movl    $0x0,-0x14(%rbp)
              │     ↓ jmp     1af
              │     for (i = j; i < N; i++)
              │15d:   mov     -0x10(%rbp),%eax
              │       mov     %eax,-0x18(%rbp)
              │     ↓ jmp     1a2
              │     if (data[i] >= 128)
              │165:   mov     -0x18(%rbp),%eax
              │       cltq
              │       lea     0x0(,%rax,4),%rdx
              │       mov     -0x8(%rbp),%rax
              │       add     %rdx,%rax
              │       mov     (%rax),%eax
              │    ┌──cmp     $0x7f,%eax
100.00   0.00 │    ├──jle     19e
              │    │sum += data[i];

The 2498 samples are all from the branch-misses events for the Loop 2.

The number of samples and overhead is significantly reduced without
losing any information.

Intel-SIG: commit ec980e4 perf/x86/intel: Support auto counter reload.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lkml.kernel.org/r/20250327195217.2683619-6-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit a5f5e12 upstream.

The below code would always unconditionally clear other status bits like
perf metrics overflow bit once PEBS buffer overflows:

        status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;

This is incorrect. The perf metrics overflow bit should be cleared only
when fixed counter 3 is in the PEBS counter group. Otherwise, a perf
metrics overflow could be missed.

Closes: https://lore.kernel.org/all/20250225110012.GK31462@noisy.programming.kicks-ass.net/
Fixes: 7b2c05a ("perf/x86/intel: Generic support for hardware TopDown metrics")
Intel-SIG: commit a5f5e12 perf/x86/intel: Don't clear perf metrics overflow bit unconditionally.
PMU Clearwater Forest support

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250415104135.318169-1-dapeng1.mi@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 48d66c8 upstream.

From the PMU's perspective, Clearwater Forest is similar to the previous
generation Sierra Forest.

The key differences are the ARCH PEBS feature and the 3 newly added
fixed counters for topdown L1 metrics events.

ARCH PEBS is supported by the following patches. This patch provides
support for the basic perfmon features and the 3 newly added fixed
counters.

Intel-SIG: commit 48d66c8 perf/x86/intel: Add PMU support for Clearwater Forest.
PMU Clearwater Forest support

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250415114428.341182-3-dapeng1.mi@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 25c623f upstream.

CPUID archPerfmonExt (0x23) leaves are supported to enumerate CPU
level's PMU capabilities on non-hybrid processors as well.

This patch adds support for parsing archPerfmonExt leaves on non-hybrid
processors. Architectural PEBS leverages archPerfmonExt sub-leaves 0x4
and 0x5 to enumerate the PEBS capabilities as well. This patch is a
precursor of the subsequent arch-PEBS enabling patches.

Intel-SIG: commit 25c623f perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs.
PMU Clearwater Forest support

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250415114428.341182-4-dapeng1.mi@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit e9988ad upstream.

The PEBS counters snapshotting group also requires a group flag in the
leader. The leader must be an X86 event.

Fixes: e02e9b0 ("perf/x86/intel: Support PEBS counters snapshotting")
Intel-SIG: commit e9988ad perf/x86/intel: Check the X86 leader for pebs_counter_event_group.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250424134718.311934-3-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit efd4485 upstream.

The auto counter reload group also requires a group flag in the leader.
The leader must be an X86 event.

Intel-SIG: commit efd4485 perf/x86/intel: Check the X86 leader for ACR group.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250424134718.311934-4-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
commit 3e830f6 upstream.

The current is_x86_event() has to go through the hybrid_pmus list to
find the matching pmu, then check whether it's an X86 PMU and an X86
event. That's not necessary.

The X86 PMU has a unique type ID on a non-hybrid machine, and a unique
capability type. They are good enough to do the check.

Intel-SIG: commit 3e830f6 perf/x86: Optimize the is_x86_event.
PMU Clearwater Forest support

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250424134718.311934-5-kan.liang@linux.intel.com
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>

Copilot AI left a comment


Pull request overview

Copilot reviewed 30 out of 30 changed files in this pull request and generated no new comments.



@opsiff
Member

opsiff commented Dec 9, 2025

Some fix patches have not been merged:
perf/x86/intel: Fix segfault with PEBS-via-PT with sample_freq
fix: perf/x86: Support counter mask

commit aa5d2ca
Author: Kan Liang kan.liang@linux.intel.com
Date: Mon Dec 16 08:02:52 2024 -0800

perf/x86/intel: Fix bitmask of OCR and FRONTEND events for LNC

fix:perf/x86: Add Lunar Lake and Arrow Lake support

commit 782cffe
Author: Kan Liang kan.liang@linux.intel.com
Date: Wed Feb 19 06:10:05 2025 -0800

perf/x86/intel: Fix event constraints for LNC

fix:perf/x86: Add Lunar Lake and Arrow Lake support
commit 0ba6502
Author: Dapeng Mi dapeng1.mi@linux.intel.com
Date: Tue Oct 28 14:42:14 2025 +0800

perf/x86/intel: Fix KASAN global-out-of-bounds warning

fix:perf/x86/intel: Rename model-specific pebs_latency_data functions

commit 7da9960
Author: Kan Liang kan.liang@linux.intel.com
Date: Thu Apr 24 06:47:18 2025 -0700

perf/x86/intel/ds: Fix counter backwards of non-precise events counters-snapshotting

A counter going backwards may be observed in the PMI handler when
counters-snapshotting some non-precise events in freq mode.

For the non-precise events, it's possible the counters-snapshotting
records a positive value for an overflowed PEBS event. The HW
auto-reload mechanism then resets the counter to 0 immediately, because
pebs_event_reset is cleared in freq mode, which doesn't set
PERF_X86_EVENT_AUTO_RELOAD.
In the PMI handler, 0 will be read rather than the positive value
recorded in the counters-snapshotting record.

The counters-snapshotting case has to be specially handled. Since the
event value has been updated when processing the counters-snapshotting
record, it only needs to set the new period for the counter via
x86_pmu_set_period().

Fixes: e02e9b0374c3 ("perf/x86/intel: Support PEBS counters snapshotting")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250424134718.311934-6-kan.liang@linux.intel.com

commit 43796f3
Author: Dapeng Mi dapeng1.mi@linux.intel.com
Date: Wed Aug 20 10:30:27 2025 +0800

perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error

When running perf_fuzzer on PTL, sometimes the below "unchecked MSR
 access error" is seen when accessing IA32_PMC_x_CFG_B MSRs.

[   55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
[   55.611280] Call Trace:
[   55.611282]  <TASK>
[   55.611284]  ? intel_pmu_config_acr+0x87/0x160
[   55.611289]  intel_pmu_enable_acr+0x6d/0x80
[   55.611291]  intel_pmu_enable_event+0xce/0x460
[   55.611293]  x86_pmu_start+0x78/0xb0
[   55.611297]  x86_pmu_enable+0x218/0x3a0
[   55.611300]  ? x86_pmu_enable+0x121/0x3a0
[   55.611302]  perf_pmu_enable+0x40/0x50
[   55.611307]  ctx_resched+0x19d/0x220
[   55.611309]  __perf_install_in_context+0x284/0x2f0
[   55.611311]  ? __pfx_remote_function+0x10/0x10
[   55.611314]  remote_function+0x52/0x70
[   55.611317]  ? __pfx_remote_function+0x10/0x10
[   55.611319]  generic_exec_single+0x84/0x150
[   55.611323]  smp_call_function_single+0xc5/0x1a0
[   55.611326]  ? __pfx_remote_function+0x10/0x10
[   55.611329]  perf_install_in_context+0xd1/0x1e0
[   55.611331]  ? __pfx___perf_install_in_context+0x10/0x10
[   55.611333]  __do_sys_perf_event_open+0xa76/0x1040
[   55.611336]  __x64_sys_perf_event_open+0x26/0x30
[   55.611337]  x64_sys_call+0x1d8e/0x20c0
[   55.611339]  do_syscall_64+0x4f/0x120
[   55.611343]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

On PTL, GP counters 0 and 1 don't support the auto counter reload
feature, thus a #GP is triggered when trying to write 1 to bit 0 of the
CFG_B MSR.

commit 86aa94c
Author: Dapeng Mi dapeng1.mi@linux.intel.com
Date: Thu May 29 08:02:36 2025 +0000

perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr()

The MSR offset calculations in intel_pmu_config_acr() are buggy.

To calculate fixed counter MSR addresses in intel_pmu_config_acr(),
INTEL_PMC_IDX_FIXED is subtracted from the HW counter index "idx".

Kan Liang and others added 6 commits December 12, 2025 19:04
The released OCR and FRONTEND events utilized more bits on Lunar Lake
p-core. The corresponding mask in the extra_regs has to be extended to
unblock the extra bits.

Add a dedicated intel_lnc_extra_regs.

Fixes: a932aa0 ("perf/x86: Add Lunar Lake and Arrow Lake support")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20241216160252.430858-1-kan.liang@linux.intel.com
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
According to the latest event list, update the event constraint tables
for Lion Cove core.

The general rule (event codes < 0x90 are restricted to counters 0-3)
has been removed. There is no restriction for most of the performance
monitoring events.

Fixes: a932aa0 ("perf/x86: Add Lunar Lake and Arrow Lake support")
Reported-by: Amiri Khalil <amiri.khalil@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20250219141005.2446823-1-kan.liang@linux.intel.com
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
When running "perf mem record" command on CWF, the below KASAN
global-out-of-bounds warning is seen.

  ==================================================================
  BUG: KASAN: global-out-of-bounds in cmt_latency_data+0x176/0x1b0
  Read of size 4 at addr ffffffffb721d000 by task dtlb/9850

  Call Trace:

   kasan_report+0xb8/0xf0
   cmt_latency_data+0x176/0x1b0
   setup_arch_pebs_sample_data+0xf49/0x2560
   intel_pmu_drain_arch_pebs+0x577/0xb00
   handle_pmi_common+0x6c4/0xc80

The issue is caused by the code below in __grt_latency_data(), which
tries to access the x86_hybrid_pmu structure that doesn't exist on a
non-hybrid platform like CWF.

        WARN_ON_ONCE(hybrid_pmu(event->pmu)->pmu_type == hybrid_big)

So add an is_hybrid() check before calling this WARN_ON_ONCE to fix the
global-out-of-bounds access issue.

Fixes: 0902624 ("perf/x86/intel: Rename model-specific pebs_latency_data functions")
Reported-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251028064214.1451968-1-dapeng1.mi@linux.intel.com
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
…rs-snapshotting

A counter going backwards may be observed in the PMI handler when
counters-snapshotting some non-precise events in freq mode.

For the non-precise events, it's possible the counters-snapshotting
records a positive value for an overflowed PEBS event. The HW
auto-reload mechanism then resets the counter to 0 immediately, because
pebs_event_reset is cleared in freq mode, which doesn't set
PERF_X86_EVENT_AUTO_RELOAD.
In the PMI handler, 0 will be read rather than the positive value
recorded in the counters-snapshotting record.

The counters-snapshotting case has to be specially handled. Since the
event value has been updated when processing the counters-snapshotting
record, it only needs to set the new period for the counter via
x86_pmu_set_period().

Fixes: e02e9b0 ("perf/x86/intel: Support PEBS counters snapshotting")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250424134718.311934-6-kan.liang@linux.intel.com
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
When running perf_fuzzer on PTL, the "unchecked MSR access error" below
is sometimes seen when accessing the IA32_PMC_x_CFG_B MSRs.

[   55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
[   55.611280] Call Trace:
[   55.611282]  <TASK>
[   55.611284]  ? intel_pmu_config_acr+0x87/0x160
[   55.611289]  intel_pmu_enable_acr+0x6d/0x80
[   55.611291]  intel_pmu_enable_event+0xce/0x460
[   55.611293]  x86_pmu_start+0x78/0xb0
[   55.611297]  x86_pmu_enable+0x218/0x3a0
[   55.611300]  ? x86_pmu_enable+0x121/0x3a0
[   55.611302]  perf_pmu_enable+0x40/0x50
[   55.611307]  ctx_resched+0x19d/0x220
[   55.611309]  __perf_install_in_context+0x284/0x2f0
[   55.611311]  ? __pfx_remote_function+0x10/0x10
[   55.611314]  remote_function+0x52/0x70
[   55.611317]  ? __pfx_remote_function+0x10/0x10
[   55.611319]  generic_exec_single+0x84/0x150
[   55.611323]  smp_call_function_single+0xc5/0x1a0
[   55.611326]  ? __pfx_remote_function+0x10/0x10
[   55.611329]  perf_install_in_context+0xd1/0x1e0
[   55.611331]  ? __pfx___perf_install_in_context+0x10/0x10
[   55.611333]  __do_sys_perf_event_open+0xa76/0x1040
[   55.611336]  __x64_sys_perf_event_open+0x26/0x30
[   55.611337]  x64_sys_call+0x1d8e/0x20c0
[   55.611339]  do_syscall_64+0x4f/0x120
[   55.611343]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

On PTL, GP counters 0 and 1 don't support the auto counter reload
feature, so writing 1 to bit 0 of the CFG_B MSR, which would enable
auto counter reload on GP counter 0, triggers a #GP.

The root cause is that the check of the auto counter reload (ACR)
counter mask from user space in the intel_pmu_acr_late_setup() helper
is incorrect. It allows an invalid ACR counter mask from user space to
be set into hw.config1 and then written into the CFG_B MSRs, which
triggers the MSR access warning.

E.g., a user may create a perf event with an ACR counter mask
(config2=0xcb) while only 1 event is created, so "cpuc->n_events" is 1.

The correct check condition should be "i + idx >= cpuc->n_events"
instead of "i + idx > cpuc->n_events" (it looks like a typo).
Otherwise, the counter mask is traversed one event too far and an
invalid "cpuc->assign[1]" bit (bit 0) is set into hw.config1, causing
the MSR access error.

Besides, also check whether the events corresponding to the ACR
counter mask are ACR events; if not, filter those bits out of the
counter mask. If an event is not an ACR event, it could be scheduled
onto a HW counter which doesn't support ACR, so it's invalid to
include its counter index in the ACR counter mask.

Furthermore, remove the WARN_ON_ONCE(), since it's easily triggered by
any invalid user-supplied ACR counter mask and its message could
mislead users.

Fixes: ec980e4 ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-3-dapeng1.mi@linux.intel.com
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
…fig_acr()

The MSR offset calculations in intel_pmu_config_acr() are buggy.

To calculate fixed counter MSR addresses in intel_pmu_config_acr(),
the HW counter index "idx" is subtracted by INTEL_PMC_IDX_FIXED.

This causes the ACR mask values of fixed counters to be saved to the
positions of GP counters in acr_cfg_b[], e.g. the ACR counter mask of
fixed counter 0 should be saved to acr_cfg_b[32], but it's saved to
acr_cfg_b[0] instead.

Fix this issue.

[ mingo: Clarified & improved the changelog. ]

Fixes: ec980e4 ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250529080236.2552247-2-dapeng1.mi@linux.intel.com
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
@Avenger-285714
Member Author

Some fix patches have not been merged:

perf/x86/intel: Fix segfault with PEBS-via-PT with sample_freq

fix: perf/x86: Support counter mask

commit aa5d2ca
Author: Kan Liang <kan.liang@linux.intel.com>
Date: Mon Dec 16 08:02:52 2024 -0800

perf/x86/intel: Fix bitmask of OCR and FRONTEND events for LNC

fix: perf/x86: Add Lunar Lake and Arrow Lake support

commit 782cffe
Author: Kan Liang <kan.liang@linux.intel.com>
Date: Wed Feb 19 06:10:05 2025 -0800

perf/x86/intel: Fix event constraints for LNC

fix: perf/x86: Add Lunar Lake and Arrow Lake support

commit 0ba6502
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
Date: Tue Oct 28 14:42:14 2025 +0800

perf/x86/intel: Fix KASAN global-out-of-bounds warning

fix: perf/x86/intel: Rename model-specific pebs_latency_data functions

commit 7da9960
Author: Kan Liang <kan.liang@linux.intel.com>
Date: Thu Apr 24 06:47:18 2025 -0700

perf/x86/intel/ds: Fix counter backwards of non-precise events counters-snapshotting


commit 43796f3
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
Date: Wed Aug 20 10:30:27 2025 +0800

perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error


commit 86aa94c
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
Date: Thu May 29 08:02:36 2025 +0000

perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr()


done

@deepin-ci-robot

deepin pr auto review

Let me analyze this git diff. It is a large change, mainly improving the performance monitoring unit (PMU) on the x86 architecture. I will look at it from the perspectives of logic, code quality, performance, and security.

  1. Logic improvements:
  • A bitmask (cntr_mask64) replaces the counter-count representation (num_counters), which is more flexible
  • New PEBS (precise event-based sampling) and ACR (auto counter reload) feature support is added
  • The event-constraint handling is improved, with support for dynamic constraints
  • PMU handling on hybrid architectures is unified
  2. Code quality improvements:
  • More error checks and boundary-condition handling
  • Improved code structure, with related functionality factored into separate functions
  • More explanatory comments
  • Clearer variable naming
  3. Performance improvements:
  • Optimized counter access, using bit operations instead of loops
  • More efficient PEBS buffer handling
  • A counter-snapshotting feature that reduces unnecessary reads
  4. Security improvements:
  • Checks for invalid configurations
  • Improved pairing checks for resource allocation and release
  • Guards for special cases

Specific suggestions:

  1. For the new PEBS and ACR features, add more documentation on their usage and caveats.

  2. On performance-critical paths, such as the event-handling functions, further optimize memory access patterns.

  3. For hybrid-architecture handling, add more unit tests to ensure correctness across the different architectures.

  4. Add more runtime checks, especially when handling user-supplied configuration parameters.

  5. For the new MSR (Model Specific Register) accesses, add more error handling and recovery mechanisms.

  6. In event-constraint handling, add more debug information to ease diagnosis.

Overall these changes are positive: they improve the flexibility and extensibility of the PMU subsystem as well as the maintainability of the code. Thorough testing is recommended before merging, especially across different hardware configurations.

@opsiff opsiff merged commit ab0f9b7 into deepin-community:linux-6.6.y Dec 15, 2025
11 checks passed
@deepin-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: opsiff


opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request Dec 16, 2025
Intel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/ICZHEB
CVE: NA

--------------------------------

The following upstream commits introduced 2 fields (config1 and
dyn_constraint) in struct hw_perf_event, which breaks kABI.

	ec980e4 ("perf/x86/intel: Support auto counter reload")
	4dfe323 ("perf/x86: Add dynamic constraint")

To fix this kABI breakage, we introduce struct hw_perf_event_ext, and
use one KABI_RESERVE field in struct perf_event as pointer to this
struct hw_perf_event_ext. This is viable because hw_perf_event is
always embedded in struct perf_event, so we can always access
hw_perf_event_ext from perf_event when needed.

We also create a kmem_cache for struct hw_perf_event_ext.

Another kABI change is caused by the following commit:

	0e102ce ("KVM: x86/pmu: Change ambiguous _mask suffix to _rsvd in kvm_pmu")

But the fix is trivial.

Fixes: ec980e4 ("perf/x86/intel: Support auto counter reload")
Fixes: 4dfe323 ("perf/x86: Add dynamic constraint")
Signed-off-by: Jason Zeng <jason.zeng@intel.com>
Link: deepin-community#1356
[Backport: drop arch/x86/include/asm/kvm_host.h for no rename it]
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
lanlanxiyiji pushed a commit that referenced this pull request Dec 16, 2025