RAPL for Intel/AMD architectures #57
Comments
Measurement issue with the i7-3770 (IvyBridge): per the specifications, the TDP has to be 77 W. |
Workaround for IVB [Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz] Line 443 in 68db124
Shm->Proc.Power.Unit.Watts = Proc->PowerThermal.Unit.PU > 0 ?
	1.0 / (double) (1 << Proc->PowerThermal.Unit.PU) : 0;
Shm->Proc.Power.Unit.Watts /= (Proc->CPU.Count >> Proc->Features.HTT_Enable); |
Issue closed until other hardware is available for testing. |
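For readers who want to check the arithmetic of the workaround, here is a minimal standalone sketch. The function and parameter names are hypothetical stand-ins for the `Proc->...` fields in the snippet; only the computation is taken from it: the watt unit is 1 / 2^PU, then divided by the number of physical cores (logical CPU count shifted right by the HTT flag).

```c
#include <assert.h>

/*
 * Watts per raw unit, following the IVB workaround:
 *   unit = 1 / 2^PU (or 0 when PU is 0), then unit / physical_cores.
 * All names are hypothetical; they mirror PowerThermal.Unit.PU,
 * CPU.Count and Features.HTT_Enable from the snippet.
 */
static double ivb_watt_unit(unsigned int pu, unsigned int cpu_count,
                            unsigned int htt_enabled)
{
    assert(pu < 32);
    double watts = pu > 0 ? 1.0 / (double)(1u << pu) : 0.0;

    /* 8 logical CPUs with HTT enabled -> 4 physical cores. */
    unsigned int cores = cpu_count >> htt_enabled;
    assert(cores > 0);

    return watts / (double)cores;
}
```

With PU = 3, 8 logical CPUs and HTT enabled, this gives (1/8) / 4 = 0.03125 W per raw unit.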
Remarks
|
Proposal for the
|
# | Freq MHz | VID | Vcore | Energy(J) | Counter(Raw)
---|---|---|---|---|---
#0 | 4015.73 | . | . | Package 131.852859497 | 9876543210
#1 | 4015.67 | . | . | Cores 122.611679077 | .
#2 | 4015.74 | 54 | 1.2125 | Uncore 10.000000000 | 9876543210
#3 | 4015.73 | . | . | Memory 5.000000000 | .

# | Freq MHz | VID | Vcore | Power(W) | Core(W)
---|---|---|---|---|---
#0 | 4015.73 | . | . | Package 131.852859497 | 13.000000000
#1 | 4015.67 | . | . | Cores 122.611679077 | .
#2 | 4015.74 | 54 | 1.2125 | Uncore 10.000000000 | 11.000000000
#3 | 4015.73 | . | . | Memory 5.000000000 | .
Intel
# | Freq MHz | VID | Vcore | Energy(J) | Counter(Raw)
---|---|---|---|---|---
#0 | 4015.73 | . | . | Package 131.852859497 | 876543210987654321
#1 | 4015.67 | . | . | Cores 122.611679077 | 876543210987654321
#2 | 4015.74 | 54 | 1.2125 | Uncore 10.000000000 | 106543210987654321
#3 | 4015.73 | . | . | Memory 5.000000000 | 056543210987654321

# | Freq MHz | VID | Vcore | Power(W) | Core(W)
---|---|---|---|---|---
#0 | 4015.73 | . | . | Package 131.852859497 | .
#1 | 4015.67 | . | . | Cores 122.611679077 | .
#2 | 4015.74 | 54 | 1.2125 | Uncore 10.000000000 | .
#3 | 4015.73 | . | . | Memory 5.000000000 | .
Remarks
- One shortcut to toggle between the Energy(J) and Power(W) layouts
- To my knowledge, only the AMD Zen architecture provides a RAPL counter per physical core,
whereas Intel only exposes the cumulative raw counter value.
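The Energy(J) and Power(W) layouts are two views of the same raw counter. The sketch below shows one way the conversion could be done; it is not taken from the CoreFreq sources. It assumes a 32-bit raw energy counter and an energy unit of 1 / 2^ESU joules per count, and every function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Joules per raw count for an Energy Status Unit (ESU): 1 / 2^ESU. */
static double joules_per_count(unsigned int esu)
{
    assert(esu < 32);
    return 1.0 / (double)(1u << esu);
}

/*
 * Energy consumed between two reads of the raw counter.
 * Assumption: the counter is 32 bits wide, so unsigned subtraction
 * gives the right delta across a single wraparound.
 */
static double energy_joules(uint32_t prev, uint32_t curr, unsigned int esu)
{
    uint32_t delta = curr - prev;
    return (double)delta * joules_per_count(esu);
}

/* Average power over the sampling interval: joules / seconds. */
static double power_watts(double joules, double interval_s)
{
    assert(interval_s > 0.0);
    return joules / interval_s;
}
```

For example, with ESU = 4, a counter going from 0xFFFFFFF0 to 0x00000010 is a delta of 32 counts, that is 2 J; over a 0.5 s interval that is 4 W.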
RAPL in the AMD Zen architecture
Energy consumed
Ryzen 3xxx
Ryzen 2xxx
Topology
Ryzen 2xxx
Ryzen 3xxx
Threadripper 2950X
Remark: Threadripper results below are from an old CoreFreq version
Issue
AMD specifications
Improvements
Questions
|
Core Complex ID
Ryzen 3xxx [SMT ON]
CCX_ID = ( leaf8000001e.EAX.ExtApicId & 0b1000 ) >> 1
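The expression above can be tried in isolation with a small C sketch. The function name is hypothetical, and the input is assumed to be the Extended APIC ID returned in EAX by CPUID leaf 0x8000001E; the mask and shift are copied as written.

```c
#include <stdint.h>

/*
 * Applies the expression above to an Extended APIC ID:
 * isolate bit 3 (mask 0b1000), then shift right by 1.
 */
static uint32_t ccx_id_from_ext_apic_id(uint32_t ext_apic_id)
{
    return (ext_apic_id & 0x8u) >> 1;
}
```

As written, the result is 0 when bit 3 of the Extended APIC ID is clear and 4 when it is set, so it separates two groups of IDs rather than producing a 0/1 index.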
|
After long hours of debugging the 3700X, it appears that the RAPL Package Energy Counter delta remains the same whichever CPU is used to read the MSR. Edit: as specified, the RAPL Energy status has package scope, so it returns the same value whichever core is used for reading. Call for help on Reddit |
|
Here are my Screenshots + Output as requested in #129
|
To everyone reading this issue: look at the Package Power measurements in the screenshots above, taken with all cores stressed:
Both cases use the same algorithm based on the RAPL registers; what can we conclude? |
IMPORTANT NOTE:
|
EDIT: Some additional info... Please read IMPORTANT NOTE section for X370/X470, especially the first. |
What I wrote very much confirmed:
|
As mentioned in the SDM specifications, there is no guarantee that this counter exists for the listed architecture families. Tested on a Skylake i7-6700, this MSR returns a zero value on all cores. |
enum POWER_FORMULAS {
	POWER_FORMULA_NONE = \
		(0b000000000000000000000000 << 8) | FORMULA_SCOPE_NONE,
	POWER_FORMULA_INTEL = \
		(0b000000000000000000000001 << 8) | FORMULA_SCOPE_NONE,
	POWER_FORMULA_INTEL_ATOM = \
		(0b000000000000000000000011 << 8) | FORMULA_SCOPE_NONE,
	POWER_FORMULA_AMD = \
		(0b000000000001000000000000 << 8) | FORMULA_SCOPE_CORE,
	POWER_FORMULA_AMD_17h = \
		(0b000100000001000000000000 << 8) | FORMULA_SCOPE_CORE
};
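One possible reading of this layout is that the low byte carries the scope and the upper bits identify the formula, with derived formulas keeping the bits of their parent (INTEL_ATOM contains the INTEL bit, AMD_17h contains the AMD bit). The sketch below only illustrates that reading. The scope codes, macro names and helper functions are all hypothetical, since the real FORMULA_SCOPE_* values are not shown here.

```c
#include <stdint.h>

/* Hypothetical scope codes: the real FORMULA_SCOPE_* values are not shown. */
#define SCOPE_NONE 0u
#define SCOPE_CORE 1u

/* Same layout as the enum: formula bits shifted left by 8, scope in the low byte. */
#define MAKE_FORMULA(bits, scope) ((uint32_t)(((uint32_t)(bits) << 8) | (scope)))

#define F_INTEL      MAKE_FORMULA(0x000001u, SCOPE_NONE)
#define F_INTEL_ATOM MAKE_FORMULA(0x000003u, SCOPE_NONE)
#define F_AMD        MAKE_FORMULA(0x001000u, SCOPE_CORE)
#define F_AMD_17H    MAKE_FORMULA(0x101000u, SCOPE_CORE)

/* Scope stored in the low 8 bits. */
static uint32_t formula_scope(uint32_t formula) { return formula & 0xFFu; }

/* Formula identifier stored above the scope byte. */
static uint32_t formula_bits(uint32_t formula) { return formula >> 8; }

/* Nonzero when every formula bit of `base` is also set in `formula`. */
static int formula_extends(uint32_t formula, uint32_t base)
{
    uint32_t want = formula_bits(base);
    return want != 0 && (formula_bits(formula) & want) == want;
}
```

Under this reading, `formula_extends(F_AMD_17H, F_AMD)` is true while `formula_extends(F_AMD, F_AMD_17H)` is false, so code written for the generic AMD formula could also match the 17h variant.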
|
@adatum : thank you
|
My understanding of the Zen SMT architecture is that CPU 10 is the logical peer of the core CPU 2, and they have the Vcore in common. |
Yes, that's what I meant to highlight. It makes sense for the two virtual cores (what I meant by "threads") to have the same Vcore since it is the same physical core. I'm not sure if that necessarily has to be the case, but at least it makes sense. |
The Topology is the clue. Looking at the screenshots in the Wiki CPU Support page, both the 2700X and the 3600X have a similar Topology, where the (CCX, CoreID, ThreadID) tuple forms the cluster. |
If I stress 2 to the max, I get pretty much the expected results. So using They're divided into CCX0 and CCX1, as you can see from the screenshot: counting down 4 first (CCX0), then counting down just 2 after that, since by then it has come to CCX1 (for my CPU with 12 threads), and then counting down 4 again, still CCX1, but it goes 4 down again, I assume because of the CPU logic, which again I assume is made that way for efficient heat spreading, later handing over to other cores with the same pattern to maintain performance.

So it makes perfect sense how CoreFreq represents the "spread", like Ryzen Master would if using, say, CPU-Z on Windows with the same number of threads. On an 8c/16t CPU like the 3700X I would assume the same pattern, just different "counting".

From what I've seen in your debug output, it seems CoreFreq can already show the CCXs, and it would be nice if the GUI separated them, for example with a column for CCX{X} or whatever. I doubt AMD will make any significant changes to the "grouping" in 4th gen.

So far it's still the best tool for monitoring, I'd say. Watts per core etc. my NUC doesn't show (and IIRC RM doesn't either, gotta check later; maybe another popular tool at least does it). Ryzen Master of course has the advantage of a very nice GUI (not that I expect that from you!), drag-and-slide OC and whatnot, for RAM stuff as well, but it only changes UEFI settings; it doesn't come with a Windows CPU driver or anything, so anyone can do it. It's just easier for the regular user. But as posted in the chat thread, OC is very limited anyway; performance is great out of the box and the quick settings are easy peasy in UEFI. In the end most will end up with Auto Overclock in Ryzen Master anyway, a Precision Boost Override of 100 MHz, which can be increased there to 200 MHz - kind of more obvious for anyone into OC on Intel as well, in UEFI. Except AMD has put it into both the Tuning and XFR parts of UEFI (one set to freq, another just enabled and all other values "Auto", is what RM does).

Point is, CoreFreq seems basically complete for me on X570. Let's see with the 3950X, but it shouldn't be any different. Same arch, (much) more cores, basically. I really don't see that you have to do more for X570... I've got all the data I need + more of a stress test than Windows can do. Only showing the CCXs... Cosmetics and stuff. Maybe showing Now a power user should probably understand, but well, just a suggestion.

If anyone wanted to make a Ryzen Master desktop GUI equivalent on Linux, they should be able to do so using, say, Qt + CoreFreq as the backbone. At least monitoring will be the same, if not better. |
That was indeed the purpose of all these changes requested by CoreFreq's users:
It took hundreds of lines of source code to refactor this View; please feel free to draw an ASCII proposal of your ideas. After discussion, changes could be made. There are tons of things left to do with the Zen uarch: we are only halfway through the subject. |
True that! It's available in Topology. I'm not very skilled at image editing, but I know GIMP well enough I guess. Anyway I think you'll always have a better idea (like a "rejection"). I mainly just think of a column for CCX, just as in Topology...

I totally understand your hard work and don't ask for anything requiring tons of code. You're already a super FOSS hero! You should know I think that way about you by now. Like heck - no other tool even shows temperatures for Ryzen 3rd Gen, so! Sometimes I even wish you didn't respond so quickly. I mean if you really enjoy the project, go on! But if you're tired, don't be afraid of taking a week offline at least...

Any more requests regarding debugging my NUC or my X570+3600X, you'll get it. The more specific the better. The NUC has that 0 Watts per core, with the all-cores Watts still shown, correctly I assume, BUT then interpreted wrongly by many I guess. I assume you refer to this as "masking zero values for Intel". Well, maybe just show the total Cores Watts then. Highlighted. Like all cores showing the total or "See below" (as a stupid example).

I'm not as skillful as you in this nor as familiar with the vocabulary, so the easier the instructions the better :) I think you more or less know my knowledge. I might sometimes be a sysadmin of hundreds of crucial govt servers. That doesn't mean I read hundreds of pages of CPU specs. We simply don't have to, we have absolutely no reason to. Things work since we always buy compatible HW for Linux and VMs. What I have at home is another case.

My monster laptop - I think you asked me to try some code changes on it - sorry, I haven't been able to find the time for it yet. All my previous posts are from my phone too. I haven't been close to a workstation for weeks. All is SSH. That laptop needs the SATA2 disk moved to SATA1 as it apparently won't boot otherwise, although the HDD is set as first boot device and the installation was successful. I'm tired of live USBs, so it basically has a solid installation that doesn't boot.

Also the arch is so old: is it very important or more of a "challenge/curiosity" from your side? CPU from 2009 you know... It seems to me it can help you with some newer ones, but is it still worth it? I mean for me, not even being where it is, and with bad health? Not to play that card, but yeah.

I might be stupid, but let's say you support only CPUs from 2014+ minimum and get rid of legacy code, and officially support mainly Intel, with AMD basically supported as well, Alpha or Beta depending. I assume you know the difference, but since you add more features it seems Alpha for both 2nd Gen and 3rd Gen. A very stable Alpha though! Still adjusting/adding features, so it's by definition Alpha. But it never caused a crash or anything of the sort. And CoreFreq does deep level sh*t; I assume 1 wrong line could easily freeze the system just loading the module or starting a stress test or whatever. Still, it might be worth considering to stop supporting like ALL 64-bit x86 CPUs out there? How many would use CoreFreq on my monster laptop's CPU, you think? I hate to throw away still good HW, as I guess you do, but most do, or they at least don't expect GitHub projects to work on their 9+ years old HW.

For cosmetics, I think Power Usage has "lost" some of its ease of reading. People won't necessarily look at the bottom strip and see "Package" and "Cores". They're not highlighted nor do they have even |
EDIT: Just added the last paragraph. I think it's kind of important. Power usage should be more present and clearly shown WITH what type of measurement. Like RM uses % of Watts, but at least it clearly says (and shows) it's X % of Watts, if you get it. Easy to see, a top priority measurement shown as a "wheel" AT THE TOP, meaning highlighted; I'm not asking for a wheel, but before it was more clear. Of course I know where to look now, but even I had to "look around".

Remember that new users just following your instructions to compile and run may be "dumb in our book". MANY DON'T EVEN HAVE A GITHUB ACCOUNT, so they'll never report an issue. I assume you haven't added hidden analytics or whatever, I haven't even bothered to check, but probably a lot are testing, running, trying stuff and expecting an official tool experience. Which you actually have, and more, for Intel CPUs regarding changing stuff on demand, and for AMD monitoring (the most important after all). I'd always thought about your Point is, however powerful this software is, keep installation, first use, navigation and the main highlights and switches people look for KISS (Keep It Simple Stupid). I assume you know the acronym.

You can delete this comment after reading the last paragraph added to the comment before.

BTW: I'm sure there's still a lot to do regarding Zen2 on X570 and later, but I think if you perfect Zen2 on X570, it'll work perfectly fine, maybe with some slight mods, for 4th Gen. Just saying it may be worth the while. I'm perfectly happy as it is; I just doubt AMD will change anything drastically for 4th Gen, so if you keep up reading on 3rd Gen, you'll probably have 4th Gen working perfectly. AMD seems to have set a path. I assume any 4000 CPU will just slide into my X570 socket and work. If not, AMD has not kept its promise, and with 3rd Gen it really seems they mean business. If you think 4th Gen will give you more core insight, fine. I'm happy.

But I kind of assume 4th Gen will be an incremental update to 3rd Gen using more or less the same chipset, just more cores etc. Do you think they'll go to, say, PCIe5? It's out as a standard, but nobody needs it - nobody uses PCIe4 to its max at all yet. IDK, it just feels like it'll be yet another "hush hush, surprisingly bash Intel on price vs performance with all manufacturers on board" (kind of incredible how AMD pulled off 3rd Gen basically without any leaks, and having MBs from ALL major manufacturers already lined up and ready to ship at launch just days later). |
TDP
Formula
Where PU is an unsigned integer (default value is 0011b, indicating that the power unit is in 1/8 Watt increments).
Skylake i7-6700
TDP of Skylake i7-6700 = 65 Watts
SandyBridge i7-2710QE
TDP of SandyBridge i7-2710QE = 45 Watts
Next steps
So far, the Power Unit MSR is available for Intel and AMD Zen, but the Power Info MSR is only found on Intel. |
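The TDP computation can be sketched as follows, assuming it follows the Intel SDM definitions: PU is bits 3:0 of the RAPL power unit MSR, the power unit is 1 / 2^PU Watts, and the TDP is the Thermal Spec Power field (bits 14:0 of the package power info MSR) multiplied by that unit. The function names are hypothetical, and the raw values used in the note below are made up to match the 65 W and 45 W results; they are not readings from these processors.

```c
#include <assert.h>
#include <stdint.h>

/* Power Unit (PU): bits 3:0 of the RAPL power unit MSR. */
static unsigned int power_unit_field(uint64_t rapl_power_unit_msr)
{
    return (unsigned int)(rapl_power_unit_msr & 0xFu);
}

/* Thermal Spec Power: bits 14:0 of the package power info MSR. */
static unsigned int thermal_spec_power_field(uint64_t pkg_power_info_msr)
{
    return (unsigned int)(pkg_power_info_msr & 0x7FFFu);
}

/* TDP in Watts = Thermal Spec Power * (1 / 2^PU). */
static double tdp_watts(uint64_t rapl_power_unit_msr,
                        uint64_t pkg_power_info_msr)
{
    unsigned int pu = power_unit_field(rapl_power_unit_msr);
    unsigned int raw = thermal_spec_power_field(pkg_power_info_msr);

    assert(pu < 16);
    return (double)raw / (double)(1u << pu);
}
```

With PU = 3 (1/8 W increments), an illustrative raw field of 520 gives 520 / 8 = 65 W, and a raw field of 360 gives 45 W.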
That's cool! My NUC shows a TDP of 15, although a full stress test gives 6+ Watts. I haven't checked the specs for the CPU but I'm 99 % sure it's 15 Watts (or in that area).

EDIT: Need a screenshot for that? I've just gone through the hassle of a BIOS update, the latest Windows Insider Update, a Radeon update (really just GPU, no chipset), a BIOS reset and adjusting all settings again in case there were new ones. Having used Ryzen Master (no updates to it yet) for PBO, I know very well what it does to UEFI now. Thank God AMD made it at least OS transparent (you can do exactly the same without Windows, you just have to change like 3 things in UEFI for an RM standard 100 MHz PBO; 200 is the max, but it's not recommended and doesn't really give any better results).

I've done Geekbench again, this time v4.4.2; even though GB 5 is out, it uses very different scores and is quite new, and the previous results were from 4.4.1. I will post screenshots, but I can say that, totally idle and from the CLI, Linux crushes Windows again, maybe even more. I've got to check the Blender Benchmark tool though, to see if there's still like a 3-4 minute gain. Heck, it's the only bench tool on Windows with which it's possible to get the CPU Package Watts to max... CoreFreq is just Conic...

There's a new version of CPU-Z and HWMonitor, but CPU-Z at least isn't close to stressing the CPU to the max when set to max threads. HWMonitor seems more precise, but is still waaay too inaccurate compared to the official RM, whereas CoreFreq gives basically the same result as RM. So again CoreFreq is just as good as the official Windows tool for Watts, temperature and Voltage ;-) |
Yes, please, like the IvyBridge screenshot above: show the TDP in the Power window, plus the Voltage view with the Package and Cores watts, with the processor fully stressed using the algorithm Conics - 2 plans.

Indeed, the same issue is encountered with IVB, where the TDP is computed as 130 W but the stressed processor does not draw that much power. As for AMD and pre-SandyBridge processors, I have not found any MSR or PCI register from which to compute the TDP. I would rather not maintain a table of values for those, so they show a zero TDP. |
With "Conic 2 plans" do you mean the "Two parallel planes"?

Well, on AMD the TDP can at least be obtained by simply stressing the CPU with Conic and looking at the Package Watts usage, which corresponds to the TDP (on my CPU at least). I'm actually getting a little above, like 96.x Watts (probably due to PBO, which is an official tiny overclock).

Windows doesn't seem to be able to reach the TDP completely, even using the installed Maybe that's why Linux crushes Windows in benchmarks...? Weird anyway. I didn't do a full test yesterday as the renders take 20+ minutes, but I saw Ryzen Master showing a significantly higher % of 128 Watts (which is how RM shows Power Usage) than when stressing all cores with, say, CPU-Z. Anyway, the Watts % tends to drop slightly in RM as the CPU gets hotter, so I doubt a full render will reach max TDP, although the second render (classroom) is harder on the CPU than the first (bmw). |
Here you go from my NUC (Conic Compute - Two Parallel Planes):

EDIT: Indeed the TDP is 15 Watts according to the Intel datasheet.

As said, for AMD it seems to be enough to stress the CPU using whatever Conic Compute stress test, and the Package Watts will rise to the TDP (perfectly for me, unless you've changed some UEFI setting that touches that). As CoreFreq shows exactly what CPU model you have, it's just a Google search away anyway. AMD has the basics people care about, like TDP, listed for every CPU at the bottom of that CPU's "homepage". I can only assume that when I get a 3950X, which has a TDP of 105 Watts, stressing it with Conic will raise it to that. Well, this holds for at least X570 + 3rd Gen, it seems.

Maybe write "Missing" instead of "0" for TDP on AMD? Since you use the words "Capable" and "Missing". People will understand that of course a TDP exists, but that CoreFreq can't read it. Cheers. |
The general formula for the energy units might be the issue. My formula, based on the Intel specs, is OK for SandyBridge and later architectures. However, the specs also mention an exception for Atom-class processors. Line 638 in e746dff
I presume other architectures, such as the NUC and IVB-EP, may be subject to other exceptions in how the power unit formula is resolved. |
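To illustrate the kind of exception being discussed, here is a minimal sketch. It assumes the general rule is 1 / 2^ESU joules per count, and that the Atom exception is the one the Intel SDM describes for Silvermont-based parts, where the energy unit is 2^ESU microjoules. The enum and function names are hypothetical, and whether this exception applies to the NUC or IVB-EP is not established here.

```c
#include <assert.h>

/* Hypothetical selector for the two energy unit encodings. */
enum energy_unit_kind {
    ENERGY_UNIT_DEFAULT, /* 1 / 2^ESU joules (SandyBridge and later) */
    ENERGY_UNIT_ATOM     /* 2^ESU microjoules (assumed Atom exception) */
};

/* Joules represented by one raw count of the energy counter. */
static double energy_unit_joules(unsigned int esu, enum energy_unit_kind kind)
{
    assert(esu < 32);
    if (kind == ENERGY_UNIT_ATOM)
        return (double)(1u << esu) / 1000000.0;
    return 1.0 / (double)(1u << esu);
}
```

The same ESU value gives very different results under the two rules: ESU = 5 is 1/32 J per count with the general formula but 32 microjoules per count with the Atom one, which is why applying the wrong rule skews every reading by a constant factor.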
Well, the CPU in the NUC is from the third quarter of 2013, so I don't expect full support. I don't think many users with Atom processors do either (if you buy that, then you're probably not a power user; a 4th Gen Raspberry Pi can probably be faster, at least the iGPU). The cheapest NUCs now come with Pentium Gold CPUs, and like mine the second cheapest has an i3.

The lower-than-TDP usage may be just the way Intel has built the board for maximum power saving - just putting in a CPU otherwise sold in laptops - but for the NUCs they maybe don't need as much, and it leaves headroom for peripherals. Remember when I had the peripherals plugged in? With HDMI, the dongle for the wireless keyboard and mouse and a USB stick, the power usage of the CPU was quite a bit higher, several Watts higher IIRC, so Intel has left some headroom there in case you use all the ports, which seems to make the CPU use several Watts more.

In this screenshot it's headless, no desktop, only Ethernet and power. It has 4 USB ports, 1 mini-HDMI, 1 mini-DisplayPort, analog audio out and in (microphone) and an infrared sensor as well (which works perfectly with LIRC BTW, like mapping a generic or common IR remote control). |
Todo
|
Feature is stable |
CoreFreq/corefreqk.c
Line 5216 in a0f440a
CoreFreq/corefreqk.c
Line 5320 in a0f440a
Replace PWR_ACCU_SandyBridge with PWR_ACCU_Skylake