Hardware Prefetch for Atom E-Cores #470

Closed
cyring opened this issue Dec 13, 2023 · 9 comments

Comments

@cyring
Owner

cyring commented Dec 13, 2023

Intel whitepaper 357930-001US describes new MSRs dedicated to the Atom sub-architecture within hybrid processors.

Feature        Register
LLC Streamer   0x1320
L2 NLP         0x1321
SELECTION      0x1323

The already known MSR 0x1A4, by contrast, is said to be available on both P-Cores and E-Cores.


I would appreciate it if someone could run rdmsr on the new MSRs, targeting the E-Core CPU numbers.

Next, still on E-Cores, toggle the Enable bit with a wrmsr, then run rdmsr again to confirm the change.

It would also be interesting to check the specs:

  • try a read access of these Atom MSRs on P-Cores
  • verify the write scope: the MSRs should be module scope, so flipping an MSR bit on a single E-Core should be reflected on the others. But which cores form the module?
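
On Linux, the checks requested above can be sketched against the msr driver's character devices, which is the interface rdmsr itself reads from. This is only an illustration, not CoreFreq code: `read_msr` and `probe` are hypothetical helper names, and the sketch assumes the `msr` kernel module is loaded and the script runs as root. A CPU that does not implement a register is expected to fail the read, which is reported here as `None`.

```python
import os
import struct

# Register addresses as listed in the issue (from whitepaper 357930-001US).
ATOM_PREFETCH_MSRS = {"LLC Streamer": 0x1320, "L2 NLP": 0x1321, "SELECTION": 0x1323}


def read_msr(cpu, reg, dev_template="/dev/cpu/{cpu}/msr"):
    """Return the 64-bit value of MSR `reg` on logical CPU `cpu`, or None if unreadable."""
    try:
        fd = os.open(dev_template.format(cpu=cpu), os.O_RDONLY)
    except OSError:
        return None  # device missing, or insufficient privileges
    try:
        # The msr device uses the file offset as the register address.
        raw = os.pread(fd, 8, reg)
    except OSError:
        return None  # the CPU rejected the read (register not implemented)
    finally:
        os.close(fd)
    if len(raw) != 8:
        return None
    return struct.unpack("<Q", raw)[0]


def probe(cpus, dev_template="/dev/cpu/{cpu}/msr"):
    """Read every Atom prefetch MSR on every CPU in `cpus`."""
    return {
        name: {cpu: read_msr(cpu, reg, dev_template) for cpu in cpus}
        for name, reg in ATOM_PREFETCH_MSRS.items()
    }
```

Calling `probe(range(os.cpu_count() or 1))` as root would list, per register, which CPU numbers return a value and which do not.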

Thanks for helping.

@cyring
Owner Author

cyring commented Dec 13, 2023

contributors @gundami @Technologicat @justanerd @BugReporterZ @svmlegacy @kocoman1 @jowa2021 @Betaminos @huajian628 @vitaly-zdanevich @rushvora

Could you please help investigate these registers on a hybrid architecture?

@BugReporterZ

I tried this.

#rdmsr -a 0x1321
CPU 19: 250122000001
CPU 18: 250122000001
CPU 17: 250122000001
CPU 16: 250122000001
rdmsr: CPU 15 cannot read MSR 0x00001321

#rdmsr -a 0x1323
CPU 19: 1f9cc00000000
CPU 18: 1f9cc00000000
CPU 17: 1f9cc00000000
CPU 16: 1f9cc00000000
rdmsr: CPU 15 cannot read MSR 0x00001323

===============================================

#rdmsr -a 0x1320
CPU 19: 100007e041000004
CPU 18: 100007e041000004
CPU 17: 100007e041000004
CPU 16: 100007e041000004
rdmsr: CPU 15 cannot read MSR 0x00001320

#wrmsr -p 19 0x1320 4
#rdmsr -a -x -0 0x1320
CPU 19: 0000000000000004
CPU 18: 0000000000000004
CPU 17: 0000000000000004
CPU 16: 0000000000000004
rdmsr: CPU 15 cannot read MSR 0x00001320

#wrmsr -p 19 0x1320 1152930164351434756
#rdmsr -a -x -0 0x1320
CPU 19: 100007e041000004
CPU 18: 100007e041000004
CPU 17: 100007e041000004
CPU 16: 100007e041000004
rdmsr: CPU 15 cannot read MSR 0x00001320
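
The two writes above can be cross-checked offline. The sketch below parses `rdmsr -a` style output and tests whether every readable CPU reports the same value, which is what a module-scope register should show after a write on a single core. The function names are illustrative only; the sample text is copied from the output above.

```python
import re

# Matches lines such as "CPU 19: 0000000000000004"; error lines are ignored.
LINE = re.compile(r"^CPU (\d+): ([0-9a-fA-F]+)$")


def parse_rdmsr_all(text):
    """Map CPU number -> integer value for each readable CPU in the output."""
    values = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            values[int(match.group(1))] = int(match.group(2), 16)
    return values


def same_on_all(values):
    """True when at least one CPU was read and all readings are identical."""
    return len(values) > 0 and len(set(values.values())) == 1


after_write = """\
CPU 19: 0000000000000004
CPU 18: 0000000000000004
CPU 17: 0000000000000004
CPU 16: 0000000000000004
rdmsr: CPU 15 cannot read MSR 0x00001320
"""

readings = parse_rdmsr_all(after_write)
print(sorted(readings), same_on_all(readings))

# The decimal argument of the second wrmsr, shown in hexadecimal.
print(hex(1152930164351434756))
```

The decimal value passed to the second wrmsr is 0x100007e041000004, the original reading, so that command restores the register. The write was issued on CPU 19 only, yet CPUs 16 to 18 report the same value after each write.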

@cyring
Owner Author

cyring commented Dec 13, 2023

> I tried this.

Excellent.
I can code now.
Thank you

@cyring
Owner Author

cyring commented Dec 15, 2023

  • ROG STRIX Z790 Series BIOS Manual
Hardware Prefetcher
Allows you to enable or disable the MLC streamer prefetcher.
Configuration options: [Disabled] [Enabled]

Adjacent Cache Line Prefetch
Allows you to prefetch adjacent cache lines, reducing the DRAM loading time and improving 
the system performance.
Configuration options: [Disabled] [Enabled]
  • SUPERMICRO C9Z790 USER MANUAL
Hardware Prefetcher
If set to Enabled, the hardware prefetcher will prefetch streams of data and instructions 
from the main memory to the L2 cache to improve CPU performance. The options are 
Disabled and Enabled.

Adjacent Cache Line Prefetch 
Select Enabled for the CPU to prefetch both cache lines for 128 bytes as comprised. Select 
Disabled for the CPU to prefetch both cache lines for 64 bytes. The options are Disabled 
and Enabled.
  • ASRock Intel Z790 Motherboard
    <same as above>

@cyring
Owner Author

cyring commented Dec 16, 2023

  • MS03-6L0
Enable LP [Global]
Enables Logical processor (Software Method to Enable/Disable Logical Processor threads).
Options available: ALL LPs, Single LP. Default setting is ALL LPs.

Hardware Prefetcher
Select whether to enable the speculative prefetch unit of the processor.
Options available: Enable, Disable. Default setting is Enable.

L2 RF0 Prefetch Disable
Options available: Enable, Disable. Default setting is Disable.

Adjacent Cache Prefetch
When enabled, cache lines are fetched in pairs. When disabled, only the required cache line is fetched. 
Options available: Enable, Disable. Default setting is Enable.

DCU Streamer Prefetcher
Enable/Disable DCU streamer prefetcher.
Options available: Enable, Disable. Default setting is Enable.

DCU IP Prefetcher
Enable/Disable DCU IP Prefetcher.
Options available: Enable, Disable. Default setting is Enable.
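
For orientation, the BIOS options quoted above resemble the four prefetcher controls that Intel has publicly documented in the low bits of MSR 0x1A4, where a set bit disables the prefetcher. The sketch below decodes those bits. Treat the mapping as an assumption for any given processor: the layout is the one from Intel's older prefetcher disclosure, the helper name is made up, and hybrid parts define additional bits that are not covered here.

```python
MSR_MISC_FEATURE_CONTROL = 0x1A4

# Bit position -> prefetcher name. A set bit means the prefetcher is DISABLED.
PREFETCH_DISABLE_BITS = {
    0: "L2 hardware prefetcher",
    1: "L2 adjacent cache line prefetcher",
    2: "DCU streamer prefetcher",
    3: "DCU IP prefetcher",
}


def decode_prefetchers(value):
    """Return {prefetcher name: True if enabled} for a raw 0x1A4 value."""
    return {
        name: ((value >> bit) & 1) == 0
        for bit, name in PREFETCH_DISABLE_BITS.items()
    }


# Example: bits 0 and 2 set, so both streamer-type prefetchers are disabled.
for name, enabled in decode_prefetchers(0b0101).items():
    print(f"{name}: {'Enabled' if enabled else 'Disabled'}")
```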

@cyring
Owner Author

cyring commented Dec 16, 2023

The Intel Atom cores are grouped four per module, with private L1 caches for each core.

Each core has its own set of L1 hardware prefetchers.

The L2 prefetch block is shared by all cores in the module.

Trackers are shared between the cores.

DCU – Data Caching Unit is the block that holds the L1 data cache.
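
One way to see which logical CPUs form a module, under the assumption that a module corresponds to the set of cores sharing an L2 cache, is to read the Linux sysfs cache topology. The sketch below is illustrative only: `parse_cpu_list` and `l2_siblings` are hypothetical helper names, and the `index2` directory is assumed to be the L2 cache, which is typical but should be confirmed through the `level` file next to it.

```python
def parse_cpu_list(text):
    """Expand a sysfs CPU list such as "0,2,16-19" into a list of integers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def l2_siblings(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return the logical CPUs that share cache index2 (assumed to be L2) with `cpu`."""
    path = f"{sysfs_root}/cpu{cpu}/cache/index2/shared_cpu_list"
    with open(path) as handle:
        return parse_cpu_list(handle.read())
```

If the assumption holds, calling `l2_siblings` on an E-Core should return the CPU numbers of its module, which can then be compared with the set of CPUs that mirror an MSR write.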

@cyring
Owner Author

cyring commented Dec 16, 2023

+ @BugReporterZ : Hello,

[Screenshot: 2023-12-16-133435_720x425_scrot]

Above is the latest commit, 0a0daea, which adds DCU L1 NLP, a bit of MSR 0x1A4 disclosed in the whitepaper.


Since commit cc5c327, the following have been added for E-Cores:

  • L1 NPP
  • L2 NLP
  • LLC Streamer

[Screenshot: 2023-12-17-092835_720x425_scrot]

Remark: the screenshots above were taken under virtualization.

@cyring
Owner Author

cyring commented Dec 18, 2023

The document "Hardware LLC prefetch feature on 4th Gen Intel® Xeon® Scalable Processor (Codename Sapphire Rapids)" describes an L3 prefetch disable control at bit 42 of MSR 0x6D.

Does it work on Desktop or Mobile processors?
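
Checking or flipping that bit on a value read from the register is a plain bit operation. The sketch below uses hypothetical helper names and works on integers only; whether MSR 0x6D and bit 42 exist on desktop or mobile parts is exactly the open question, so nothing here should be read as confirming it.

```python
MSR_L3_PREFETCH = 0x6D          # register named in the Sapphire Rapids document
L3_PREFETCH_DISABLE_BIT = 42    # bit position named in the same document
MASK_64 = 0xFFFFFFFFFFFFFFFF


def bit_is_set(value, bit):
    """True when `bit` is 1 in the 64-bit register `value`."""
    return ((value >> bit) & 1) == 1


def with_bit(value, bit, enabled):
    """Return `value` with `bit` forced to 1 (enabled=True) or 0 (enabled=False)."""
    mask = 1 << bit
    updated = (value | mask) if enabled else (value & ~mask)
    return updated & MASK_64


current = 0  # placeholder, stands for a value obtained with rdmsr
disabled = with_bit(current, L3_PREFETCH_DISABLE_BIT, True)
print(hex(disabled), bit_is_set(disabled, L3_PREFETCH_DISABLE_BIT))
```

The resulting integer could then be passed to wrmsr, followed by an rdmsr to check whether the write was accepted.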

@cyring
Owner Author

cyring commented Jan 18, 2024

Pre-release in progress
#472
If things are missing, please let me know.

@cyring closed this as completed Jan 18, 2024

2 participants