RAPL for Intel/AMD architectures #57

cyring · 2018-06-18T16:08:26Z

RAPL

CoreFreq/corefreqk.c

Line 5216 in a0f440a

static enum hrtimer_restart Cycle_Skylake(struct hrtimer *pTimer)

CoreFreq/corefreqk.c

Line 5320 in a0f440a

static void Start_Skylake(void *arg)

Replace PWR_ACCU_SandyBridge with PWR_ACCU_Skylake

#define PWR_ACCU_Skylake(Pkg, T)                                        \
({                                                                      \
        RDCOUNTER(Pkg->Counter[T].Power.ACCU[PWR_DOMAIN(PKG)],          \
                                                MSR_PKG_ENERGY_STATUS); \
                                                                        \
        RDCOUNTER(Pkg->Counter[T].Power.ACCU[PWR_DOMAIN(CORES)],        \
                                                MSR_PP0_ENERGY_STATUS); \
                                                                        \
        RDCOUNTER(Pkg->Counter[T].Power.ACCU[PWR_DOMAIN(UNCORE)],       \
                                                MSR_PP1_ENERGY_STATUS); \
                                                                        \
        RDCOUNTER(Pkg->Counter[T].Power.ACCU[PWR_DOMAIN(RAM)],          \
                                                MSR_DRAM_ENERGY_STATUS);\
})


cyring · 2018-06-28T08:20:28Z

KBL RAPL and Voltage during high load ...

... during low load

cyring · 2018-07-30T15:59:36Z

Measurement issue with the i7-3770 (IvyBridge): per the specifications, the TDP should be 77 W

cyring · 2018-08-03T15:13:21Z

Skylake: add missing code to measure DRAM power

CoreFreq/corefreqk.c

Line 6072 in 6712f73

Delta_PWR_ACCU(Proc, UNCORE);

			Delta_PWR_ACCU(Proc, RAM);

CoreFreq/corefreqk.c

Line 6090 in 6712f73

Save_PWR_ACCU(Proc, UNCORE);

			Save_PWR_ACCU(Proc, RAM);

cyring · 2018-09-13T10:06:26Z

Workaround for IVB [Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz]

CoreFreq/corefreqd.c

Line 443 in 68db124

switch (Proc->powerFormula) {

	Shm->Proc.Power.Unit.Watts = Proc->PowerThermal.Unit.PU > 0 ?
			1.0 / (double) (1 << Proc->PowerThermal.Unit.PU) : 0;

	Shm->Proc.Power.Unit.Watts /= (Proc->CPU.Count >> Proc->Features.HTT_Enable);

Accurate ?
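For context on the unit computation quoted above, here is a minimal sketch of how the three fields of a RAPL unit register can be decoded. It assumes the layout documented for `MSR_RAPL_POWER_UNIT` (power units in bits 3:0, energy status units in bits 12:8, time units in bits 19:16), where a field value N means 1 / 2^N. The struct and function names are hypothetical and not taken from CoreFreq, and the sketch omits both the zero guard and the per-core division shown in the snippet.

```c
#include <stdint.h>

/* Decoded RAPL units, in watts, joules and seconds. */
struct rapl_units {
    double watts;
    double joules;
    double seconds;
};

/* Hypothetical helper (not a CoreFreq function): split a raw RAPL unit
 * register value into its three fields. A field value N encodes 1 / 2^N. */
static struct rapl_units rapl_decode_units(uint64_t raw)
{
    unsigned int pu  = (unsigned int)(raw & 0xF);          /* bits 3:0   */
    unsigned int esu = (unsigned int)((raw >> 8) & 0x1F);  /* bits 12:8  */
    unsigned int tu  = (unsigned int)((raw >> 16) & 0xF);  /* bits 19:16 */
    struct rapl_units u;

    u.watts   = 1.0 / (double)(1ULL << pu);
    u.joules  = 1.0 / (double)(1ULL << esu);
    u.seconds = 1.0 / (double)(1ULL << tu);
    return u;
}
```

With a raw value of `0xA1003` (PU = 3, ESU = 16, TU = 10) this gives 0.125 W, about 0.000015259 J and about 0.000976562 s, the same figures that appear in the Ryzen unit listings later in this thread.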

cyring · 2018-12-01T17:17:59Z

Issue closed until other hardware is available for testing.

cyring · 2019-07-21T07:08:42Z

Intel

Architecture	Processor	ESU (J)¹	Pkg load (J)²	Cores load (J)²	TDP (W)³
Skylake/S	i5-6600K	0.000061035	84.761535645	75.259704590	91
Skylake/S	i7-6700	0.000061035	55.686950684	44.473632812	65
Haswell/U	i3-4010U	0.000061035	6.958801270	4.341308594	15
Haswell/U	i7-4650U	0.000061035	18.376159668	14.542602539	15
IvyBridge/EP	E5-1607	0.000015259	34.55101032	26.025909424	130
SandyBridge	i7-2710QE	0.000015259	44.356430054	40.928970337	45

AMD Zen

Architecture	Processor	ESU (J)	Pkg load (J)	Cores load (J)	TDP (W)
Pinnacle Ridge	2700X	0.000015259	115.171279907	116.535186768	105

Remarks

¹ RAPL Units Register.
² All cores fully loaded, one-second interval, RAPL Energy architectural counter.
³ Manufacturer specification.
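The energy columns above can be related to the raw counter with a short sketch: the difference between two reads of a 32-bit energy counter is multiplied by the energy unit (ESU), and dividing by the sampling interval gives average power. The function names are hypothetical; the only assumptions are a 32-bit counter and at most one wraparound between reads.

```c
#include <stdint.h>

/* Energy consumed between two reads of a 32-bit RAPL energy counter.
 * Unsigned subtraction handles a single wraparound of the counter. */
static double rapl_energy_joules(uint32_t before, uint32_t after, double esu)
{
    uint32_t delta = after - before;   /* modulo 2^32 */

    return (double)delta * esu;
}

/* Average power over the sampling interval, in watts. */
static double rapl_power_watts(double joules, double interval_s)
{
    return joules / interval_s;
}
```

As a consistency check, a delta of 1388733 counts with an ESU of 1/16384 J (0.000061035) works out to 84.761535645 J, the Pkg load listed for the i5-6600K. The count itself is back-calculated from the table, not a logged value. Over a one-second interval the joule and watt figures are numerically equal.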

cyring · 2019-07-21T09:53:08Z

Proposal for the `Power & Voltage` view

AMD Zen

`#`	Freq MHz	VID	Vcore	Energy(J)	Counter(Raw)
`#0`	`4015.73`	.	.	`Package 131.852859497`	`9876543210`
`#1`	`4015.67`	.	.	`Cores 122.611679077`	.
`#2`	`4015.74`	`54`	`1.2125`	`Uncore 10.000000000`	`9876543210`
`#3`	`4015.73`	.	.	`Memory 5.000000000`	.

`#`	Freq MHz	VID	Vcore	Power(W)	Core(W)
`#0`	`4015.73`	.	.	`Package 131.852859497`	`13.000000000`
`#1`	`4015.67`	.	.	`Cores 122.611679077`	.
`#2`	`4015.74`	`54`	`1.2125`	`Uncore 10.000000000`	`11.000000000`
`#3`	`4015.73`	.	.	`Memory 5.000000000`	.

Intel

`#`	Freq MHz	VID	Vcore	Energy(J)	Counter(Raw)
`#0`	`4015.73`	.	.	`Package 131.852859497`	`876543210987654321`
`#1`	`4015.67`	.	.	`Cores 122.611679077`	`876543210987654321`
`#2`	`4015.74`	`54`	`1.2125`	`Uncore 10.000000000`	`106543210987654321`
`#3`	`4015.73`	.	.	`Memory 5.000000000`	`056543210987654321`

`#`	Freq MHz	VID	Vcore	Power(W)	Core(W)
`#0`	`4015.73`	.	.	`Package 131.852859497`	.
`#1`	`4015.67`	.	.	`Cores 122.611679077`	.
`#2`	`4015.74`	`54`	`1.2125`	`Uncore 10.000000000`	.
`#3`	`4015.73`	.	.	`Memory 5.000000000`	.

Remarks

One shortcut to toggle between the Energy(J) and Power(W) layouts
To my knowledge, only the AMD Zen architecture provides a per-physical-core RAPL counter;
Intel will only show the cumulative raw counter value.

cyring · 2019-07-27T10:51:38Z

RAPL in the AMD Zen architecture

Energy consumed

Ryzen 3xxx

Ryzen 2xxx

Topology

Ryzen 2xxx

Processor                              [AMD Ryzen 7 2700X Eight-Core Processor ]
|- Architecture                                            [Zen+ Pinnacle Ridge]
|- Vendor ID                                                      [AuthenticAMD]
|- Microcode                                                        [ 134251019]
|- Signature                                                            [ 8F_08]
|- Stepping                                                             [     2]
|- Online CPU                                                           [ 16/16]
...
Power & Thermal Monitoring:                                                     
...
|- Units                                                                        
   |- Power                                               watt   [  0.125000000]
   |- Energy                                             joule   [  0.000015259]
   |- Window                                            second   [  0.000976562]

Ryzen 3xxx

Processor                                  [AMD Ryzen 7 3700X 8-Core Processor ]
|- Architecture                                               [Zen/Summit Ridge]
|- Vendor ID                                                      [AuthenticAMD]
|- Microcode                                                        [ 141561873]
|- Signature                                                            [ 8F_71]
|- Stepping                                                             [     0]
|- Online CPU                                                           [ 16/16]
...
Power & Thermal                                                                 
...
|- Units                                                                        
   |- Power                                               watt   [  0.125000000]
   |- Energy                                             joule   [  0.000015259]
   |- Window                                            second   [  0.000976562]

CPU Pkg  Apic  Core Thread  Caches      (w)rite-Back (i)nclusive              
 #   ID   ID    ID     ID  L1-Inst Way  L1-Data Way      L2  Way      L3  Way 
00: BSP     0     0      0      32  8        32  8       512  8     32768  9  
01:   0     2     1      0      32  8        32  8       512  8     32768  9  
02:   0     4     2      0      32  8        32  8       512  8     32768  9  
03:   0     6     3      0      32  8        32  8       512  8     32768  9  
04:   0     8     4      0      32  8        32  8       512  8     32768  9  
05:   0    10     5      0      32  8        32  8       512  8     32768  9  
06:   0    12     6      0      32  8        32  8       512  8     32768  9  
07:   0    14     7      0      32  8        32  8       512  8     32768  9  
08:   0     1     0      1      32  8        32  8       512  8     32768  9  
09:   0     3     1      1      32  8        32  8       512  8     32768  9  
10:   0     5     2      1      32  8        32  8       512  8     32768  9  
11:   0     7     3      1      32  8        32  8       512  8     32768  9  
12:   0     9     4      1      32  8        32  8       512  8     32768  9  
13:   0    11     5      1      32  8        32  8       512  8     32768  9  
14:   0    13     6      1      32  8        32  8       512  8     32768  9  
15:   0    15     7      1      32  8        32  8       512  8     32768  9

Threadripper 2950X

Remark: Threadripper results below are from an old CoreFreq version

Processor                      [AMD Ryzen Threadripper 2950X 16-Core Processor ]
|- Architecture                                                    [Zen+ Colfax]
|- Vendor ID                                                      [AuthenticAMD]
|- Microcode                                                        [ 134251019]
|- Signature                                                            [ 8F_08]
|- Stepping                                                             [     2]
|- Online CPU                                                           [ 32/32]
...
Power & Thermal                                                                 
...
|- Units                                                                        
   |- Power                                               watt   [  0.007812500]
   |- Energy                                             joule   [  0.000000954]
   |- Window                                            second   [  0.000976562]

CPU Pkg  Apic  Core Thread  Caches      (w)rite-Back (i)nclusive              
 #   ID   ID    ID     ID  L1-Inst Way  L1-Data Way      L2  Way      L3  Way 
00: BSP     0     0      0      64  4        32  8       512  8     32768 10  
01:   0     2     1      0      64  4        32  8       512  8     32768 10  
02:   0     4     2      0      64  4        32  8       512  8     32768 10  
03:   0     6     3      0      64  4        32  8       512  8     32768 10  
04:   0     8     4      0      64  4        32  8       512  8     32768 10  
05:   0    10     5      0      64  4        32  8       512  8     32768 10  
06:   0    12     6      0      64  4        32  8       512  8     32768 10  
07:   0    14     7      0      64  4        32  8       512  8     32768 10  
08:   1    16     0      0      64  4        32  8       512  8     32768 10  
09:   1    18     1      0      64  4        32  8       512  8     32768 10  
10:   1    20     2      0      64  4        32  8       512  8     32768 10  
11:   1    22     3      0      64  4        32  8       512  8     32768 10  
12:   1    24     4      0      64  4        32  8       512  8     32768 10  
13:   1    26     5      0      64  4        32  8       512  8     32768 10  
14:   1    28     6      0      64  4        32  8       512  8     32768 10  
15:   1    30     7      0      64  4        32  8       512  8     32768 10  
16:   0     1     0      1      64  4        32  8       512  8     32768 10  
17:   0     3     1      1      64  4        32  8       512  8     32768 10  
18:   0     5     2      1      64  4        32  8       512  8     32768 10  
19:   0     7     3      1      64  4        32  8       512  8     32768 10  
20:   0     9     4      1      64  4        32  8       512  8     32768 10  
21:   0    11     5      1      64  4        32  8       512  8     32768 10  
22:   0    13     6      1      64  4        32  8       512  8     32768 10  
23:   0    15     7      1      64  4        32  8       512  8     32768 10  
24:   1    17     0      1      64  4        32  8       512  8     32768 10  
25:   1    19     1      1      64  4        32  8       512  8     32768 10  
26:   1    21     2      1      64  4        32  8       512  8     32768 10  
27:   1    23     3      1      64  4        32  8       512  8     32768 10  
28:   1    25     4      1      64  4        32  8       512  8     32768 10  
29:   1    27     5      1      64  4        32  8       512  8     32768 10  
30:   1    29     6      1      64  4        32  8       512  8     32768 10  
31:   1    31     7      1      64  4        32  8       512  8     32768 10

Issue

AMD specifications

MSRC001_029B [Package Energy Status] (Core::X86::Msr::PKG_ENERGY_STAT)
Read-only,Volatile. Reset: 0000_0000_0000_0000h. 
_lthree[1:0]; MSRC001_029B
Bits Description
63:32 Reserved. 
31:0 TotalEnergyConsumed.

CCX: Core Complex where more than one core shares L3 resources.

ApicId Enumeration Requirements

Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}.

CPUID 0x8000001e

CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)
Read-only.
If Core::X86::Cpuid::FeatureExtIdEcx[TopologyExtensions] == 0 then CPUID Fn8000001E_E[D,C,B,A]X are reserved.
If (Core::X86::Msr::APIC_BAR[ApicEn] == 0) then Core::X86::Cpuid::ExtApicId[ExtendedApicId] is reserved.
_lthree[1:0]_core[3:0]_thread[1:0]; CPUID_Fn8000001E_EAX
Bits Description
31:0 ExtendedApicId: extended APIC ID. Read-only. See 2.1.12.2.1.3 [ApicId Enumeration Requirements].
Reset: Core::X86::Msr::APIC_BAR[ApicEn] ? Fixed,{00_0000h , Core::X86::Apic::ApicId[ApicId]} :
Fixed,0000_0000h.


CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId)
Read-only.
See Core::X86::Cpuid::ExtApicId.
_lthree[1:0]_core[3:0]_thread[1:0]; CPUID_Fn8000001E_EBX
Bits Description
31:16 Reserved.
15:8 ThreadsPerCore: threads per core. Read-only. Reset: XXh. The number of threads per core is
ThreadsPerCore+1.
7:0 CoreId: core ID. Read-only. Reset: Fixed,XXh.
Description: For Family 17, Model 1, Revision 1 and later:
CoreId = ({2'b0, DieId[1:0], LogicalComplexId[0], LogicalThreadId[2:0]} >> SMT).


CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId)
Read-only.
_lthree[1:0]_core[3:0]_thread[1:0]; CPUID_Fn8000001E_ECX
Bits Description
31:11 Reserved.
10:8 NodesPerProcessor: Node per processor. Read-only. Reset: XXXb.
ValidValues:
Value Description
0h 1 node per processor.
1h 2 nodes per processor.
2h Reserved.
3h 4 nodes per processor.
7h-4h Reserved.
7:0 NodeId: Node ID. Read-only. Reset: Fixed,XXh.
Description: For Family 17, Model 1, Revision 1 and later:
{5'b00000,1'b[SOCKET_ID],2'b[DIE_ID]}.
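The field layout quoted above for CPUID Fn8000001E can be sketched as a small decoder. The struct and function names are hypothetical. The register values are passed in as plain integers, so the sketch does not execute CPUID itself, and the "+1" conversions are only meaningful for the non-reserved encodings listed in the specification.

```c
#include <stdint.h>

/* Fields of CPUID Fn8000001E as described in the excerpt above. */
struct zen_topo_leaf {
    uint32_t     ext_apic_id;          /* EAX[31:0]                 */
    unsigned int core_id;              /* EBX[7:0]                  */
    unsigned int threads_per_core;     /* EBX[15:8] + 1             */
    unsigned int node_id;              /* ECX[7:0]                  */
    unsigned int nodes_per_processor;  /* ECX[10:8] + 1 (0h,1h,3h)  */
};

/* Hypothetical helper: decode the three registers of the leaf. */
static struct zen_topo_leaf zen_decode_leaf(uint32_t eax, uint32_t ebx,
                                            uint32_t ecx)
{
    struct zen_topo_leaf t;

    t.ext_apic_id         = eax;
    t.core_id             = ebx & 0xFF;
    t.threads_per_core    = ((ebx >> 8) & 0xFF) + 1;
    t.node_id             = ecx & 0xFF;
    t.nodes_per_processor = ((ecx >> 8) & 0x7) + 1;
    return t;
}
```

On GCC or Clang the inputs could be obtained with `__get_cpuid(0x8000001e, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`, executed on the CPU of interest; that step is left out here so the decoder stays testable on any machine.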

Improvements

find the CCX identifier into the CPU topology
read the Package energy counter per CCX
sum the per CCX values into a Package value

Questions

What about multi-die processors (TR, Naples, Rome)?

cyring · 2019-07-27T12:21:46Z

Core Complex ID

Ryzen 3xxx [SMT ON]

CPU#	Pkg ID	Apic Ext_ID	Core ID	Thread ID	Apic[3:0]	CCX ID¹	_lthree scope²
00	BSP	0	0	0	0000	0	Y
01	0	2	1	0	0010	0	.
02	0	4	2	0	0100	0	.
03	0	6	3	0	0110	0	.
04	0	8	4	0	1000	1	Y
05	0	10	5	0	1010	1	.
06	0	12	6	0	1100	1	.
07	0	14	7	0	1110	1	.
08	0	1	0	1	0001	0	.
09	0	3	1	1	0011	0	.
10	0	5	2	1	0101	0	.
11	0	7	3	1	0111	0	.
12	0	9	4	1	1001	1	.
13	0	11	5	1	1011	1	.
14	0	13	6	1	1101	1	.
15	0	15	7	1	1111	1	.

¹ ApicId[3] = Logical CCX L3 complex ID

CCX_ID = ( leaf8000001e.EAX.ExtApicId & 0b1000 ) >> 3

² First ID of the CCX instance
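The table above can be reproduced with a short sketch that follows the ApicId Enumeration Requirements quoted earlier: bit 3 moved down to bit 0 gives the CCX ID (0 or 1), and with SMT enabled bits 2:1 give the logical core within the CCX and bit 0 the thread. The struct and function names are hypothetical, not CoreFreq identifiers.

```c
#include <stdint.h>

/* Fields of a Zen APIC ID per the enumeration rules quoted above. */
struct zen_apic_fields {
    unsigned int socket;  /* ApicId[6]                    */
    unsigned int node;    /* ApicId[5:4]                  */
    unsigned int ccx;     /* ApicId[3]                    */
    unsigned int core;    /* logical core within the CCX  */
    unsigned int thread;  /* SMT thread within the core   */
};

/* Hypothetical helper: smt is non-zero when SMT is enabled. */
static struct zen_apic_fields zen_decode_apic_id(uint32_t apic_id, int smt)
{
    struct zen_apic_fields f;

    f.socket = (apic_id >> 6) & 0x1;
    f.node   = (apic_id >> 4) & 0x3;
    f.ccx    = (apic_id >> 3) & 0x1;
    if (smt) {
        /* ApicId[2:0] = {LogicalCoreID[1:0], ThreadId} */
        f.core   = (apic_id >> 1) & 0x3;
        f.thread = apic_id & 0x1;
    } else {
        /* ApicId[2:0] = {1'b0, LogicalCoreID[1:0]} */
        f.core   = apic_id & 0x3;
        f.thread = 0;
    }
    return f;
}
```

For APIC ID 13 with SMT on, this yields CCX 1, core 2 and thread 1; `ccx * 4 + core` is 6, which matches the Core ID column for CPU 14 in the table.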

cyring · 2019-07-27T22:46:00Z

After long hours of debugging on the 3700X, it appears that the RAPL Package Energy Counter delta remains the same whichever CPU is used to read the MSR.

Edit: as specified, the RAPL Energy Status register has package scope, so it returns the same value whichever core is used to read it.
The CCX is not involved in the issue.

Call for help on Reddit

cyring · 2019-08-04T15:57:19Z

To link with issue WIP and Undocumented registers #60 for PL1 RAPL registers

olejon · 2019-08-12T17:01:05Z

Here are my Screenshots + Output as requested in #129

https://www.olejon.net/files/CF-X570-Ryzen-3600X/
Also added the benchmarks from Blender Benchmark & Geekbench, because it shows Linux is already superior here 👍
Both benchmarks run under same conditions (SMT on, Precision Boost Override on, System totally idle with no other programs open and Geekbench run in CLI-mode on Windows as well, which also run the latest and greatest build from the Insider Program and latest chipset drivers)
Note to others: Yes, bought the 3600X... because it was the only one in stock country-wide in online stores in Norway, showing the interest in Ryzen. When released in Taiwan, long queues outside stores on launch day. When launched in the US, long queues outside physical stores country-wide to get the new MBs and CPUs. When was the last time one saw that after an Intel/AMD keynote? Probably some iPhones ago LOL - though those iSheeps and not an AMD/Intel keynote. I'm not stupid - had ordered the 3900X, but didn't come in when stores said they might, so canceled and thought, "Hey, system will probably run much faster anyway, even with the stock cooler (and yes: 30k+ in Geekbench vs 17k+ in pretty Overclocked i5-6600K with proper cooler) so can save those money on a 3900X to buy the 3950X + good cooler ASAP it becomes available!". Better plan IMO

cyring · 2019-08-12T17:17:39Z

To all reading this issue, look in the above screenshots for the Package Power measurements; Cores all stressed :

X370 (Crosshair VI Hero) = 9.28W
X570 (Prime X570 Pro) = 91.76W

Both cases, same algorithm based on RAPL registers; what can we conclude ?

olejon · 2019-08-12T20:18:52Z

That only 9.28W is definitely not right... Intel CPUs you have tested well and CoreFreq supports well, with more or less similar Cores & Threads, when they're all stressed, use way more Watts
My Skylake 4-core i5-6600K has a 91W TDP, so it makes sense that it used ~85W with all cores stressed using Conic Compute
4.4 GHz is the max turbo speed for my 3600X, and has a "Default TDP/TDPROM-06a" of 95W, which makes sense for my result of 91.76W no?
It can manage 4.4 GHz 1 Thread and 4.2 GHz for 2 Threads at a time, but does not seem to get to 4.4 GHz with SMT = ON. Seems to take advantage of the latter, 4.2 GHz with 2 Threads, that with regular use, offloading load from 2 active cores to 2 sleeping cores and so on, managing cooling very well
Of course running a Stress Test is different. All Cores = ~4.0 GHz, as per the specs of the 3600X
On Windows running Blender Benchmark - which is the bench that causes the most load (adding in more like CPU-Z or Geekbench does NOT stress it more) Ryzen Master peaks at 70% of "128W PPT (CPU)"
This calculates to 89.6W (0.70 × 128W), which is very close to the CoreFreq reading
CoreFreq showing a little more I think is because "Conic Compute" stresses the CPU even more than Blender Benchmark, since on Windows the CPU, according to the most reliable tool, Ryzen Master peaks at temp 79.xx-80.xx C (stays just a couple of seconds at 80.xx C and then goes below 80 C for a couple of seconds, then up again and so on, and above 80 C the color of the bar turns orange so easy to see)
CoreFreq manages to get it to at least more than this, using "Conic Compute" IIRC, even maybe 82-84 C IIRC

Conclusion: CoreFreq is right for my 3600X on an X570 MB
So 9.28W can't be right
Also my benchs have shown Linux is much faster than Windows running Blender Benchmark, suggesting maybe higher temps, though didn't monitor temps
Can run Blender Benchmark again and monitor with CoreFreq

IMPORTANT NOTE:

As far as I've read and seen/heard on YouTube videos, AMD straight out said they do NOT recommend Ryzen 3rd Gen on X370 MBs!
According to AMD, if using an older MB, the X470 ones can be used just fine, but one will probably run into problems with a CPU with a TDP above 65W at stock, so they are not recommended for the X version of the 3600. The 3700X has a TDP of 65W, though; the problem is that it costs more and, according to benchmarks, is just wasting money for gamers with a good enough GPU...
AND one MUST remember to tell the seller/retailer to update to the latest BIOS (usually 2 updates, 1st to get it ready for the 2nd) before shipping, as if NOT it will NOT RUN 3rd Gen, and without a 2nd Gen laying around you're stuck without that BIOS update which as far as I've read must be done with a USB stick
So one should stick to a CPU like the 3600 (without the X) with an X470, which is 65W, AND is the CPU of choice recommended by the most respected YouTubers for gamers looking for a cheap update, as good X470 MBs are very cheap now, and the 3600 is cheap
With an X370 you're "on your own" it seems according to what AMD said
But with an X470 one can probably find a cheaper and better performing 2nd Gen Ryzen CPU, and you'll lose PCIe4 support if not using X570 anyways, so why not go for a probably cheaper 2nd Gen? At least they will drop in prices if not already
As an example, with CPU-Z on Windows and its references, the 2700X outperforms the 3600X; like the 3700X, it has 8c/16t compared to 6c/12t. The 1700X and 1800X (8c/16t) are also better than the 3600X. Among Ryzen CPUs I must go down to the 1600 (no X, 6c/12t) to see my 3600X clearly beating it. On the Intel side my 3600X also clearly beats the i7-8700K (same 6c/12t), but of course not the i9-9900K, since it has 8c/16t and costs a lot compared to an 8c/16t Ryzen like the 3700X, which I think the Intel i9 beats since it has Turbo to 5 GHz. The 3800X may beat it, being also 8c/16t like the 3700X, but its Turbo is lower at 4.5 GHz than the i9-9900K's, so I doubt it, although it has more cache and newer tech in general. But don't buy the 3800X: it has only 0.1 GHz more Turbo than the 3700X's 4.4 GHz, with a TDP of 105W compared to the 3700X's 65W. The 3900X beats the i9 of course with 12c/24t, and clearly the 3950X when launched will beat it with 16c/32t.

olejon · 2019-08-12T20:30:34Z

EDIT: Some additional info... Please read IMPORTANT NOTE section for X370/X470, especially the first.

olejon · 2019-08-13T00:04:47Z

What I wrote very much confirmed:

CoreFreq manages to get it to at least more than this, using "Conic Compute" IIRC, even maybe 82-84 C IIRC

Just running Conic Compute for ~1 minute and the CPU reached 85 C
System otherwise totally idle. Not even logged in to desktop. Did this over SSH
Much more than I manage on Windows with any tool/bench
All fans are set to 100 % Duty Cycle above 75 C so not because of that
Must be CPU using more Power. Since Limit according to Ryzen Master in Windows is 95 C, there's no reason it should stay around 80 C using popular tools that maxes out the CPU for a long time
Basically Conic Compute in CoreFreq is better at stressing the CPU and hence is able to go to 91.5W. Great job!
As said on Windows it calculates to 89.6W (see comment above how it was calculated)
Can explain so much better Blender Benchmark score(?)

cyring · 2019-10-16T17:50:37Z

As a reminder: MSR 0x64d for the SOC Power Domain added on the development roadmap.

As mentioned in the SDM specifications, there is no guarantee that this counter exists for the listed architecture families. Tested with a Skylake i7-6700, this MSR returns a zero value on all cores.
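For anyone who wants to repeat that check from user space, here is a minimal sketch that reads one MSR through the Linux `msr` driver (`/dev/cpu/N/msr`), where the file offset passed to `pread()` is the MSR number. It assumes the `msr` module is loaded and the process has sufficient privileges. `read_msr` is a hypothetical helper, not a CoreFreq function; CoreFreq reads its counters from its own kernel module.

```c
#define _XOPEN_SOURCE 700   /* for pread() */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Read one 64-bit MSR on the given CPU through the Linux msr driver.
 * Returns 0 on success, -1 on any failure (device missing, no
 * permission, or the MSR is not implemented on this processor). */
static int read_msr(unsigned int cpu, uint32_t msr, uint64_t *value)
{
    char path[64];
    ssize_t n;
    int fd;

    snprintf(path, sizeof(path), "/dev/cpu/%u/msr", cpu);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The msr driver interprets the file offset as the MSR address. */
    n = pread(fd, value, sizeof(*value), (off_t)msr);
    close(fd);
    return (n == (ssize_t)sizeof(*value)) ? 0 : -1;
}
```

Calling `read_msr(cpu, 0x64d, &v)` for each online CPU and printing `v` would show whether the counter stays at zero, as reported above for the i7-6700.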

cyring · 2019-10-21T08:43:24Z

Below are my tests with version 1.67.6
Need to fix the Power Formula Scope at this line:

CoreFreq/coretypes.h

Line 342 in bb548eb

enum POWER_FORMULAS {

enum POWER_FORMULAS {
	POWER_FORMULA_NONE =						\
	(0b000000000000000000000000 << 8) | FORMULA_SCOPE_NONE,
	POWER_FORMULA_INTEL =						\
	(0b000000000000000000000001 << 8) | FORMULA_SCOPE_NONE,
	POWER_FORMULA_INTEL_ATOM =					\
	(0b000000000000000000000011 << 8) | FORMULA_SCOPE_NONE,
	POWER_FORMULA_AMD =						\
	(0b000000000001000000000000 << 8) | FORMULA_SCOPE_CORE,
	POWER_FORMULA_AMD_17h =						\
	(0b000100000001000000000000 << 8) | FORMULA_SCOPE_CORE
};
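The enum above packs two things into one value: the formula kind in the bits above bit 7 and the scope in the low 8 bits. The sketch below shows that packing and how the two parts can be separated again. All names are hypothetical, and the numeric scope values are placeholders because the real `FORMULA_SCOPE_*` definitions are not shown in this thread.

```c
/* Placeholder scope values: the real FORMULA_SCOPE_* constants are not
 * shown in the thread, so these numbers are assumptions. */
enum sketch_scope {
    SKETCH_SCOPE_NONE = 0,
    SKETCH_SCOPE_CORE = 2
};

/* Pack a formula kind and a scope the same way the enum above does. */
#define SKETCH_FORMULA(kind, scope) \
    (((unsigned int)(kind) << 8) | (unsigned int)(scope))

/* Low 8 bits: the scope. */
#define SKETCH_SCOPE_OF(formula)  ((unsigned int)(formula) & 0xFFu)

/* Remaining upper bits: the formula kind. */
#define SKETCH_KIND_OF(formula)   ((unsigned int)(formula) >> 8)
```

With this layout, changing the scope of one formula only touches the low byte, so the kind comparison elsewhere is unaffected. For example, the AMD kind `0b000000000001000000000000` is `0x1000`, and `SKETCH_KIND_OF(SKETCH_FORMULA(0x1000, SKETCH_SCOPE_CORE))` gives back `0x1000`.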

Rebuild all and test again
Results as expected for Skylake/S:

Voltage per Package
Temperature per Thread
No Power per CPU
Power for whole Package

--Update--

Results for SandyBridge : OK
Results for IvyBridge/EP : OK

Please, let me know about yours -;)

adatum · 2019-10-21T17:33:43Z

cyring · 2019-10-21T18:04:32Z

@adatum : thank you

Cores power looks OK
Single temperature sensor: OK
Voltage per Core or per SMT: I'm not sure what to display ?
If a single CPU is stressed (first a CPU core, next a CPU thread), do we read a discrete Vcore?

adatum · 2019-10-21T18:14:26Z

Looks like the voltages per core are independent, but the threads on the same core (threads 2 & 10 in screenshot) show the same voltage:

cyring · 2019-10-21T23:38:01Z

Looks like the voltages per core are independent, but the threads on the same core (threads 2 & 10 in screenshot) show the same voltage:

My understanding of the Zen SMT architecture is that CPU 10 is the logical peer of the core CPU 2, and they have the Vcore in common.
The view is showing the relation between them.

adatum · 2019-10-22T03:02:49Z

CPU 10 is the logical peer of the core CPU 2, and they have the Vcore in common.

Yes, that's what I meant to highlight. It makes sense for the two virtual cores (what I meant by "threads") to have the same Vcore since it is the same physical core. I'm not sure if that necessarily has to be the case, but at least it makes sense.

cyring · 2019-10-22T06:25:10Z

The Topology is the clue. Just looking at the screenshots inside the Wiki CPU Support, both the 2700X and the 3600X have a similar Topology, where (CCX, CoreID, ThreadID) forms the cluster.
Tests will also be interesting when facing the topology of the dual EPYC

olejon · 2019-10-23T00:32:33Z

If I stress 2 to the max, stress does the job (CoreFreq cannot) almost as good as Conic Compute (to raise temperature) on AMD, and on Intel it's even or even better at stressing the CPU (above Windows tools always).

I get pretty much expected results. So using stress -c 2:

They're divided into CCX0 and CCX1, you can see from screenshot, counting down 4 first (CCX0), then counting down just 2 after that, since then it's come to CCX1 (for my CPU with 12 threads), and then counting down 4 again, still CCX1, but goes 4 down again, I assume for the CPU logic, which again I assume is made that way so it comes down to efficient heat spreading, later handling over to other cores with the same pattern to maintain performance.

So makes perfect sense how CoreFreq represents the "spread", like Ryzen Master would if using say CPU-Z on Windows, choosing same amount of threads.

On a 8c/16t CPU like the 3700X I would assume the same pattern, just different "counting".

From what I've seen in your debug output, it seems CoreFreq can already show CCXs, and it would be nice if the GUI separated them, like showing a column for CCX{X} or whatever. I doubt AMD will make any significant changes there when it comes to "grouping" in 4th gen.

So far still best tool for monitoring I'd say, Watts per core etc my NUC doesn't show (and IIRC not RM either, gotta check later, maybe another popular tool at least does it).

Ryzen Master of course an advantage of very nice GUI (not that I expect that from you!), drag and slide OC and whatnot, for RAM stuff as well, but it changes UEFI settings of course, it doesn't come with a Windows CPU driver or anything, so anyone can do it. Just easier for the regular user. But as posted in the chat thread, OC is very limited anyway, performance is great out of the box and the quick settings are easy peasy in UEFI. In the end most will end up with Auto Overclock in Ryzen Master anyway, a Precision Boost Override of 100 MHz, which can be increased there to 200 MHz - kind of more obvious for anyone into OC on Intel as well, in UEFI. Except AMD has put it into both Tuning and XFR parts of UEFI (one set to freq, another just enabled an all other values "Auto", is what RM does).

Point is, CoreFreq seems basically complete for me on X570. Let's see with 3950X, but shouldn't be any different. Same arch, (much) more cores, basically.

I really don't see you have to do more for X570... Got all the data I need + more stress test than Windows can do. Only showing CCXs'... Cosmetics and stuff. Maybe showing N/A, or maybe simply the same value as the "cores" above for the corresponding below - the "cores" with Watts reading, since on a 12t CPU only the first 6 shows that reading, show same on corresponding core with the same thread (voltage reading). I think you know what I mean, 6c shows as 12c basically but isn't, and there's a corresponding thread with the same reading as the actual core, of course (Hyperthreading yeah).

Now a power user should probably understand, but well, just a suggestion.

If anyone were to want to make a Ryzen Master desktop GUI equivalent on Linux they should be able to do so using say Qt + CoreFreq as backbone. At least monitoring will be the same, if not better.

cyring · 2019-10-23T04:11:00Z

Only showing CCXs'... Cosmetics and stuff. Maybe showing N/A, or maybe simply the same value as the "cores" above for the corresponding below ,,,

That was indeed the purpose of all these changes requested by CoreFreq's users:

provide Temperature along with the Voltage and Power
show the Power per Core for capable processors like the Zen uarch
mask zero values if an Intel processor is in use

It took hundreds of source code lines to refactor this View, please feel free to draw an ascii proposal of your ideas. After debate, changes could be engaged.
FYI, CCX is available in the Topology view

There are tons of remaining things to do with the Zen uarch: we are just at the middle of the subject.
Yesterday, reading the September revision of the specification update left me feeling that there are still many unpublished registers among those errata ...

olejon · 2019-10-23T04:55:49Z

True that! It's available in Topology. I'm not very skilled at image editing, but I know GIMP well enough I guess. Anyway I think you'll always have a better idea (like a "rejection").

I mainly just think a column for CCX just as in Topology...

I totally understand your hard work and don't ask for anything requiring tons of code. You're already a super FOSS hero! You should know I think that way about you by now. Like heck - no other tools even shows temperatures for Ryzen 3rd Gen, so!

Sometimes I even wish you didn't respond so quickly. I mean if you really enjoy the project, go on! But if you're tired, don't be afraid of taking a week offline at least...

Any more requests regarding debugging my NUC or my X570+3600X, you'll get it. More specific the better. NUC has that 0 Watts per core, still all cores Watts shown, correctly I assume, BUT then interpreted false by many I guess. I assume you refer to this as "masking zero values for Intel". Well, maybe just show total Cores Watts then. Highlighted. Like all cores showing total or "See below" (as a stupid example)

I'm not as skillful as you in this nor as understanding of the vocabulary so the easier instructions the better :) I think you more or less know my knowledge. I might sometimes be a sysadmin of hundreds of crucial govt servers. Doesn't mean I read hundreds of pages of CPU specs. We simply don't have to, we have absolutely no reason to. Things work since we always buy compatible HW for Linux and VMs. What I have at home is another case.

My monster laptop - I think you asked me to try some code changes on it - sorry, I haven't been able to find the time for it yet. All my previous posts are from my phone too. Haven't been close to a workstation for weeks. All is SSH. That laptop needs the SATA2 disk moved to SATA1, as it apparently won't boot otherwise, although the HDD is set as first boot device and the installation was successful. Tired of live USBs, so it has a solid installation that basically doesn't boot. Also the arch is so old; is it very important or more of a "challenge/curiosity" from your side? CPU from 2009, you know... It seems to me it can help you with some newer ones, but is it still worth it? I mean for me, not even being where it is, and with bad health? Not to play that card, but yeah.

I might be stupid but let's say you support only CPUs from 2014+ minimum and get rid of legacy code, and officially support mainly Intel, but AMD is basically supported as well, Alpha or Beta depending.

I assume you know the difference, but since you add more features it seems Alpha for both 2nd Gen and 3rd Gen. A very stable Alpha though! Still adjusting/adding features, so it's per definition Alpha. But it never caused a crash or anything of the sort. And CoreFreq does deep level sh*t; I assume 1 wrong line could easily freeze the system just loading the module or starting a stress test or whatever.

Still it might be considerable to stop supporting like ALL 64-bit x86 CPUs out there? How many would use CoreFreq on my monster laptop's CPU you think? I hate to throw away still good HW as I guess you do, but most do, or they at least don't expect GitHub projects to work on their 9+ years old HW.

For cosmetics I think Power Usage has "lost" some of it's easiness of reading. People won't necessarily look at the bottom strip and seeing "Package" and "Cores". They're not highlighted nor do they have even W behind them. Should be the entire Watts. Before that was completely seen immediately, I mean Power Usage in Joules and Watts. Now you gotta know where to seek and to look AND know it's Watts estimate.

olejon · 2019-10-23T05:14:57Z

EDIT: Just added the last paragraph. Think it's kind of important. Power usage should be more prominent and clearly shown WITH the type of measurement. Like RM uses % of Watts but at least clearly says (and shows) it's X % of Watts, if you get it. Easy to see, a top priority measurement shown as a "wheel" AT THE TOP, meaning highlighted. Not asking for a wheel, but before it was more clear.

Of course I know where to look now, but even I had to "look around".

Remember, new users just following your instructions to compile and run may be "dumb in our book". MANY DON'T EVEN HAVE A GITHUB ACCOUNT, so they'll never report an issue. I assume you haven't added hidden analytics or whatever (haven't even bothered to check), but there are probably a lot of people testing, running, trying stuff and expecting an official-tool experience. Which you actually deliver, and more: for Intel CPUs regarding changing stuff on demand, and for AMD monitoring (the most important after all).

I've always thought your README.md needs some updating to show the true power of this software. Good sections, or very understandable links to the Wiki with no "crazy" (for a regular person) CPU topology talk etc. Installation and usage. I can probably commit an update to it.

Point is, however powerful this software is, keep installation, first use, navigation, and the main highlights and switches people look for KISS (Keep It Simple Stupid). I assume you know the acronym.

You can delete this comment after reading the last added paragraph to the comment before.

BTW: I'm sure there's still a lot to do regarding Zen2 on X570 and later, but I think if you perfect Zen2 on X570, it'll work perfectly fine, maybe with some slight mods for 4th Gen. Just saying it may be worth the while. Although I'm perfectly happy, just saying I doubt AMD will change anything drastically for 4th Gen; if you keep reading up on 3rd Gen, you'll probably have 4th Gen working perfectly. AMD seems to have set a path. I assume any 4000 CPU will just slide into my X570 socket and work. If not, AMD has not kept its promise, and with 3rd Gen it really seems they mean business. If you think 4th Gen will give you more core insight, fine. I'm happy. But I kind of assume 4th Gen will be an incremental update to 3rd Gen using more or less the same chipset, just more cores etc. Do you think they'll go to, say, PCIe5? It's out as a standard, but nobody needs it; nobody uses PCIe4 to its max at all yet. IDK, it just feels like it'll be yet another "hush hush, surprisingly bash Intel on price vs performance with all manufacturers on board" (kind of incredible how AMD pulled off 3rd Gen basically without any leaks, and having MBs from ALL major manufacturers already lined up and ready to ship at launch just days later).

cyring · 2019-10-28T14:44:55Z

TDP

Formula

PU = 2 << ( ( <val1> & 0x0f ) - 1 )

TDP = ( ( <val2> & 0x7fff ) / PU )

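Where PU is the unsigned integer divisor derived from the 4-bit power-unit field (the default field value is 0011b, giving PU = 8, i.e. power is expressed in 1/8 Watt increments). <val1> is the value of CPU MSR 0x606 (MSR_RAPL_POWER_UNIT) and <val2> that of MSR 0x614 (MSR_PKG_POWER_INFO).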
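As a minimal sketch of the formula above, the two steps can be written as plain C helpers. The function names are hypothetical (they are not CoreFreq's), and the divisor is written as `1 << n` rather than `2 << (n - 1)`: the two are equal for n >= 1, and the former stays well defined if the field ever reads 0.

```c
#include <stdint.h>

/* Power-unit divisor from MSR 0x606, bits 3:0.
 * (1 << n) equals the thread's 2 << (n - 1) for n >= 1
 * and avoids a negative shift count when n == 0. */
uint32_t rapl_power_divisor(uint64_t power_unit_msr)
{
    return (uint32_t)1 << (power_unit_msr & 0x0f);
}

/* TDP in Watts from MSR 0x614, bits 14:0, scaled by the divisor. */
double rapl_tdp_watts(uint64_t power_unit_msr, uint64_t power_info_msr)
{
    return (double)(power_info_msr & 0x7fff)
         / (double)rapl_power_divisor(power_unit_msr);
}
```

Fed with the register values dumped in this comment, `rapl_tdp_watts(0xa0e03, 0x208)` gives 65 and `rapl_tdp_watts(0xa1003, 0x10024001200168)` gives 45.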

Skylake i7-6700

Read the Power Unit and the Power Info

# rdmsr -acx 0x606
0xa0e03
0xa0e03
0xa0e03
0xa0e03
0xa0e03
0xa0e03
0xa0e03
0xa0e03

# rdmsr -acx 0x614
0x208
0x208
0x208
0x208
0x208
0x208
0x208
0x208

PU = 2 << ( ( 0xa0e03 & 0x0f ) - 1 )
PU = 2 << 2
PU = 8

TDP = ( ( 0x208 & 0x7fff ) / 8 )
TDP = 0x208 / 8
TDP = 65

TDP of Skylake i7-6700 = 65 Watts
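The `rdmsr` reads above can also be done from a program. The sketch below assumes a Linux system with the `msr` kernel module loaded, where each CPU exposes a device node such as `/dev/cpu/0/msr` and the register address is used as the file offset. The function name `read_msr` is hypothetical, and reading the real device requires root.

```c
#define _XOPEN_SOURCE 700   /* for pread() */
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Read one 64-bit MSR through an msr device node, e.g. /dev/cpu/0/msr.
 * The register address (0x606, 0x614, ...) is the offset to read at.
 * Returns 0 on success, -1 on any failure. */
int read_msr(const char *dev_path, uint32_t reg, uint64_t *value)
{
    int fd = open(dev_path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = pread(fd, value, sizeof(*value), (off_t)reg);
    close(fd);

    return (n == (ssize_t)sizeof(*value)) ? 0 : -1;
}
```

A caller would typically run `read_msr("/dev/cpu/0/msr", 0x606, &unit)` and `read_msr("/dev/cpu/0/msr", 0x614, &info)` and then apply the formula. Since every core returned the same value in the dumps above, reading CPU 0 alone is probably enough for the package TDP.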

SandyBridge i7-2710QE

# rdmsr -acx 0x606
0xa1003
0xa1003
0xa1003
0xa1003
0xa1003
0xa1003
0xa1003
0xa1003

# rdmsr -acx 0x614
0x10024001200168
0x10024001200168
0x10024001200168
0x10024001200168
0x10024001200168
0x10024001200168
0x10024001200168
0x10024001200168

PU = 2 << ( ( <val1> & 0x0f ) - 1 ) = 2 << ( ( 0xa1003 & 0x0f ) - 1 ) = 8

TDP = ( ( <val2> & 0x7fff ) / PU ) = ( ( 0x10024001200168 & 0x7fff ) / 8 )
TDP = 45

TDP of SandyBridge i7-2710QE = 45 Watts

Next steps

So far the Power Unit MSR is available for Intel and AMD Zen but the Power Info MSR is only found on Intel

cyring · 2019-11-06T09:20:53Z

Here are the results of the latest feature, which reads the TDP (Intel only).
Please post yours!

olejon · 2019-11-07T04:26:31Z

That's cool! My NUC shows a TDP of 15 Watts, although a full stress test only draws 6+ Watts. Haven't checked the specs for the CPU but I'm 99 % sure it's 15 Watts (or in that area).

EDIT: Need a screenshot for that?

Just went through the hassle of a BIOS update, the latest Windows Insider Update, a Radeon Update (really just GPU, no chipset), then reset the BIOS and adjusted all settings again in case there were new settings.

Having used Ryzen Master (no updates to it yet) for PBO, I know very well what it does to UEFI now. Thank God AMD made it at least OS-transparent (you can do exactly the same without Windows, you just gotta change like 3 things in UEFI for an RM standard 100 MHz PBO, where 200 is the max, which is neither recommended nor really gives any better results).

I've done Geekbench again, this time v 4.4.2. GB 5 is out, but it uses very different scores and is quite new, and the previous results were from 4.4.1. Will post screenshots, but I can say that with both Linux and Windows totally idle and in CLI, Linux crushes Windows again, maybe even more. Gotta check the Blender Benchmark tool though, to see if there's still like a 3-4 minute gain. Heck, it's the only bench tool on Windows with which it's possible to get the CPU Package Watts to the max... With CoreFreq it's just Conic... There's a new version of CPU-Z and HWMonitor, but at least CPU-Z isn't close to stressing the CPU to the max, even when set to max threads. HWMonitor seems more precise, but is still waaay too inaccurate compared to the official RM, whereas CoreFreq gives basically the same result as RM. So again, CoreFreq is just as good as the official Windows tool for Watts, temperature and Voltage ;-)

cyring · 2019-11-07T07:20:59Z

EDIT: Need a screenshot for that?

Yes, please, like the IvyBridge screenshot above: showing the TDP in the Power window, plus the Voltage view with the Package and Cores Watts, with the processor fully stressed using the algorithm Conics - 2 plans

Indeed, I ran into the same issue with IVB, where the TDP is computed as 130 W but the stressed processor does not draw that much power.

As for AMD and pre-SandyBridge processors, I have not found any MSR or PCI register from which to compute the TDP. I'd rather not maintain a table of values for those, so they show a TDP of zero.

olejon · 2019-11-07T16:52:06Z

With "Conic 2 plans" do you mean the "Two parallel planes"?

Well, on AMD the TDP can at least be obtained by simply stressing the CPU with Conic and looking at the Package Watts usage, which corresponds to the TDP (on my CPU at least). Actually I'm getting a little above, like 96.x Watts (probably due to PBO, which is an official tiny overclock).

Windows doesn't seem to be able to reach the TDP completely when stressed to the max, even using the installed Ryzen Performance Power Plan == Performance Governor on Linux == CPU always at highest frequency, C0, at all times. Only Blender Benchmark manages that on Windows.

Maybe that's why Linux crushes Windows in benchmarks...? Weird anyway.

Didn't do a full test yesterday as the renders take 20+ minutes, but I saw Ryzen Master showing a significantly higher % of 128 Watts (which is how RM shows Power Usage) than when stressing all cores with, say, CPU-Z.

Anyway Watts % tends to drop slightly in RM as the CPU gets hotter, so I doubt a full render will reach max TDP, although the second render (classroom) is harder on the CPU than the first (bmw).

olejon · 2019-11-07T18:33:56Z

Here you go from my NUC (Conic Compute - Two Parallel Planes):

EDIT: Indeed the TDP is 15 Watts according to the Intel datasheet.

As said, for AMD it seems you just have to stress the CPU using whatever Conic Compute stress test, and the Package Watts will rise to the TDP (perfectly for me, unless you've changed some UEFI stuff that touches that).

As CoreFreq shows exactly what CPU model you have, it's just a Google search away anyway. AMD has the basics people care about, like TDP, listed for every CPU at the bottom of that CPU's "homepage". I can only assume that when I get a 3950X, which has a TDP of 105 Watts, stressing it with Conic will raise it to that. Well, that holds for at least X570 + 3rd Gen, it seems.

Maybe write "Missing" instead of "0" for TDP on AMD? Since you use the words "Capable" and "Missing". People will understand, that of course a TDP exists, but that CoreFreq can't read it.

Cheers.

cyring · 2019-11-07T19:43:35Z

The general formula for the energy units might be the issue. My formula, based on the Intel specs, is fine for SandyBridge and later architectures. However, the specs also mention an exception for Atom-class processors.
I put those conditions here:

CoreFreq/corefreqd.c

Line 638 in e746dff

switch (Proc->powerFormula) {

I presume other architectures, such as the NUC and IVB-EP, may need other exceptions in the power unit formula.
Unfortunately, I have not found any datasheet that specifies suitable values or an equation to apply.
We could tune the NUC power formula until we reach the TDP, but how can we be sure this approximation holds for the whole architecture...
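To illustrate what such an exception looks like, here is a hedged sketch of the energy-unit selection. It assumes, as I read the Intel SDM, that the Energy Status Units field sits in bits 12:8 of the Power Unit MSR (0x606), that SandyBridge and later use 1 / 2^ESU Joules per count, and that Silvermont-class Atoms use 2^ESU microJoules instead. The enum and function names are hypothetical and are not the ones behind CoreFreq's `powerFormula` switch.

```c
#include <stdint.h>

/* Hypothetical selector; these names are NOT CoreFreq's own. */
enum energy_formula {
    ENERGY_FORMULA_DEFAULT, /* SandyBridge and later: 1 / 2^ESU Joules  */
    ENERGY_FORMULA_ATOM     /* Silvermont-class Atom: 2^ESU microJoules */
};

/* Energy represented by one count of an *_ENERGY_STATUS MSR, in Joules.
 * ESU is taken from bits 12:8 of the Power Unit MSR (0x606). */
double rapl_energy_unit_joules(uint64_t power_unit_msr, enum energy_formula f)
{
    uint32_t esu = (uint32_t)((power_unit_msr >> 8) & 0x1f);
    double scale = (double)((uint64_t)1 << esu);

    switch (f) {
    case ENERGY_FORMULA_ATOM:
        return scale * 1e-6;
    case ENERGY_FORMULA_DEFAULT:
    default:
        return 1.0 / scale;
    }
}

/* Average power over an interval from two counter samples.
 * The energy counters are 32 bits wide, so the unsigned
 * subtraction absorbs a single wrap-around. */
double rapl_watts(uint32_t before, uint32_t after,
                  double unit_joules, double seconds)
{
    return (double)(uint32_t)(after - before) * unit_joules / seconds;
}
```

Any further exception (IVB-EP, NUC) would be one more enum value and one more `case`, but the factor to return there is exactly the unknown discussed above, so none is guessed here.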

olejon · 2019-11-07T20:53:42Z

Well, the CPU in the NUC is from the third quarter of 2013, so I don't expect full support. I don't think many users with Atom processors do either (if you buy that then, well, you're probably not a power user; a 4th Gen Raspberry Pi can probably be faster, at least the iGPU). The cheapest NUCs now come with Pentium Gold CPUs and, like mine, the second cheapest has an i3.

The lower-than-TDP usage may just be the way Intel has built the board for maximum power saving - just putting in a CPU otherwise sold in laptops - but in the NUCs it maybe doesn't need as much, and they leave headroom for peripherals.

Remember when I had the peripherals plugged in? With HDMI, the dongle for the wireless keyboard and mouse, and a USB stick, the power usage on the CPU was quite a bit higher, several Watts higher IIRC. So Intel has left some headroom there in case you use all the ports, which seems to make the CPU use several Watts more.

In this screenshot it's headless, no desktop, only Ethernet and power.

It has 4 USB ports, 1 mini-HDMI, 1 mini-DisplayPort, analog audio out and in (microphone) and an Infrared sensor as well (which works perfectly with LIRC BTW, like mapping a generic or common IR remote control).

cyring · 2019-11-10T11:39:46Z

Todo

Find the RAPL unit factor to apply as an exception to the power and energy formula
Impacted architectures:

IVB-EP and probably SNB-EP
Various Atoms
NUC platforms

cyring · 2020-02-27T18:37:33Z

Feature is stable

cyring added the enhancement label Jun 18, 2018

cyring self-assigned this Jun 18, 2018

cyring changed the title ~~RAPL for Skylake architecture~~ Skylake architecture Jun 19, 2018

cyring changed the title ~~Skylake architecture~~ RAPL for Skylake architecture Jun 19, 2018

cyring added a commit that referenced this issue Jun 19, 2018

Skylake, Kabylake: RAPL-DRAM; Dimm size, ECC, geometry (issues #56 #57)

a1953db

cyring changed the title ~~RAPL for Skylake architecture~~ RAPL for Intel architectures Jul 30, 2018

cyring added a commit that referenced this issue Aug 3, 2018

Code backport for RAPL issue #57 and Uncore overclocking issue #67

2a4af30

cyring closed this as completed Dec 1, 2018

cyring changed the title ~~RAPL for Intel architectures~~ RAPL for Intel/AMD architectures Jul 21, 2019

cyring reopened this Jul 21, 2019

cyring pinned this issue Jul 21, 2019

This was referenced Jul 21, 2019

FIXED. Chat & Screens & Benchs: Various Intel Gen CPUs & X570 + Ryzen 3rd Gen #129

Closed

[SOLVED] No temperature report on Ryzen 2700X #54

Closed

cyring mentioned this issue Jul 21, 2019

corefreq-cli seg faults on Arch Linux + Ryzen 3000 #133

Closed

cyring added a commit that referenced this issue Oct 22, 2019

Changed the scope of the power formulas (issue #57)

fe23ff9

cyring pinned this issue Nov 3, 2019

cyring added a commit that referenced this issue Nov 5, 2019

[Intel:TDP] Computes the Thermal Design Power (issue #57)

962bf3c

cyring closed this as completed Feb 27, 2020

cyring unpinned this issue Feb 27, 2020

This was referenced Dec 28, 2022

Ryzen 9 7950X #378

Closed

Epyc 7V12 tests #388

Closed
