Add support for Nvidia Grace #585

Open: wants to merge 8 commits into base: master

@TomTheBear (Member) commented Dec 7, 2023

  • Added units:
    • hwthread-local PMU armv8_pmuv3_0
    • nvidia_scf_pmu_0
    • nvidia_pcie_pmu_0
    • nvidia_nvlink_c2c0_pmu_0
    • nvidia_cnvlink_pmu_0 (only Grace-Hopper)
    • nvidia_nvlink_c2c1_pmu_0 (only Grace-Hopper)

This PR also contains some fixes for the COMPILER setting GCCARM.

Documentation: https://docs.nvidia.com/grace-performance-tuning-guide.pdf
Uncore events are based on the perf data in /sys/devices/nvidia*/events.
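To illustrate where such event definitions come from, here is a minimal sketch that walks the perf sysfs layout and collects each event file's contents. The function name `list_pmu_events` and its parameters are hypothetical and not part of this PR; it only assumes the usual perf convention of one file per event under `<device>/events/`.

```python
from pathlib import Path


def list_pmu_events(sysfs_root="/sys/devices", pattern="nvidia*"):
    """Return {pmu_name: {event_name: config_string}} for matching PMU devices.

    Hypothetical helper: assumes each event is a small text file below
    <sysfs_root>/<pmu>/events/ holding its perf configuration string.
    """
    events = {}
    for pmu_dir in sorted(Path(sysfs_root).glob(pattern)):
        events_dir = pmu_dir / "events"
        if not events_dir.is_dir():
            # Not a PMU device with exported events, skip it.
            continue
        events[pmu_dir.name] = {
            f.name: f.read_text().strip()
            for f in sorted(events_dir.iterdir())
            if f.is_file()
        }
    return events
```

On a Grace system, calling `list_pmu_events()` with the defaults should return one entry per nvidia* PMU device; on other machines it simply returns an empty dict.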

@TomTheBear
Member Author

NVIDIA recommends using CMEM_WR_TOTAL_BYTES instead of CMEM_WR_DATA to measure memory write data.
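As a rough sketch of how such a counter could be turned into a bandwidth figure: the helper below assumes, based only on the event name, that CMEM_WR_TOTAL_BYTES counts bytes. The function name and parameters are hypothetical and not taken from this PR.

```python
def write_bandwidth_mbytes_per_s(bytes_before, bytes_after, elapsed_seconds):
    """Convert two readings of a byte counter into MByte/s (1 MByte = 1e6 bytes).

    Assumption: the counter (e.g. CMEM_WR_TOTAL_BYTES) reports plain bytes.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (bytes_after - bytes_before) / elapsed_seconds / 1e6
```

The two readings would be taken at the start and end of the measured region, with `elapsed_seconds` as the wall-clock time between them.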
@TomTheBear
Member Author

We identified a problem with the current PR on Grace-Grace systems: the memory controller devices SCF* cover only the first socket, so the second socket is not yet supported.

Thanks @JanLJL for finding this.

@TomTheBear
Member Author

The last commit fixes the Uncore device handling in perf_event mode on Grace-Grace systems. Commonly, one Uncore unit covers all sockets, but on Nvidia Grace-Grace there are separate devices per socket.
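A minimal sketch of what "separate devices per socket" means for device discovery: group the PMU device names by their trailing number. It assumes that the numeric suffix of names like nvidia_scf_pmu_0 identifies the socket; `group_pmus_by_socket` is a hypothetical helper and not the code from the commit.

```python
import re
from collections import defaultdict

# Assumption: device names look like nvidia_<unit>_pmu_<socket index>.
_PMU_NAME = re.compile(r"^(?P<unit>nvidia_\w+?_pmu)_(?P<socket>\d+)$")


def group_pmus_by_socket(device_names):
    """Return {socket_index: [unit names]} for nvidia_*_pmu_<n> devices."""
    per_socket = defaultdict(list)
    for name in device_names:
        match = _PMU_NAME.match(name)
        if match:
            # Names that do not follow the pattern (e.g. core PMUs) are ignored.
            per_socket[int(match.group("socket"))].append(match.group("unit"))
    return {sock: sorted(units) for sock, units in sorted(per_socket.items())}
```

With this grouping, an Uncore unit would be opened once per socket instead of once per system.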

Moreover, the Grace-Grace test system reports CPU socket IDs 36 and 2364, which confused the lock setup.
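One generic way to cope with sparse socket IDs like these is to map them to dense indices before sizing any per-socket structure such as a lock array. This is only a sketch of that general technique, not necessarily how the commit resolves it; `dense_socket_index` is a hypothetical name.

```python
def dense_socket_index(socket_ids):
    """Map arbitrary (possibly sparse) socket IDs to indices 0..n-1.

    IDs are sorted so that the mapping is stable across calls.
    """
    return {sid: idx for idx, sid in enumerate(sorted(set(socket_ids)))}
```

For the IDs reported above, `dense_socket_index([36, 2364])` gives `{36: 0, 2364: 1}`, so a two-entry lock array would suffice.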
