rasdaemon dimm label format / sysfs content missing #52

Open
dgcampea opened this issue Sep 26, 2021 · 0 comments

Labels don't seem to be correctly applied when using rasdaemon.
If I edit /etc/edac/labels.db and add:

Vendor: ASRock
    Model: X99M Killer
        DDR4_A1: 0.0.0;
        DDR4_B1: 0.0.1;
        DDR4_C1: 0.0.2;
        DDR4_D1: 0.0.3;
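
To make the entry above easier to reason about, here is a minimal Python sketch of how such a labels file could be read into a lookup table. It is not the parser used by edac-ctl or ras-mc-ctl (both are Perl scripts); the function name `parse_labels` and the reading of each `a.b.c` triple as three integers are assumptions made only for illustration.

```python
import re

def parse_labels(text):
    """Parse labels text into {(vendor, model): {label: (a, b, c)}}.

    Hypothetical helper: assumes "Vendor:" / "Model:" headers followed by
    "LABEL: a.b.c;" entries, as in the snippet quoted in this report.
    """
    db = {}
    vendor = model = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        m = re.match(r"(?i)vendor\s*:\s*(.+)", line)
        if m:
            vendor, model = m.group(1).strip(), None
            continue
        m = re.match(r"(?i)model\s*:\s*(.+)", line)
        if m:
            model = m.group(1).strip()
            continue
        # Remaining lines hold one or more "LABEL: a.b.c;" entries.
        for entry in line.split(";"):
            label, _, loc = entry.partition(":")
            label, loc = label.strip(), loc.strip()
            if not label or not loc:
                continue
            location = tuple(int(n) for n in loc.split("."))
            db.setdefault((vendor, model), {})[label] = location
    return db
```

With the entry from this report, `parse_labels(...)[("ASRock", "X99M Killer")]["DDR4_B1"]` gives `(0, 0, 1)`. How each tool maps that triple onto sysfs is the open question here.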

edac-ctl seems to detect the correct sysfs dimm directories:

# edac-ctl --print-labels
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0/csrow0/ch0_dimm_label           DDR4_A1              CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
mc0/csrow0/ch1_dimm_label           DDR4_B1              CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
mc0/csrow0/ch2_dimm_label           DDR4_C1              CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
mc0/csrow0/ch3_dimm_label           DDR4_D1              CPU_SrcID#0_Ha#0_Chan#3_DIMM#0

But when I do the same for rasdaemon, adding the same content to /etc/ras/dimm_labels.d/asrock, I get:

# ras-mc-ctl --print-labels
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0 channel 0 slot 0 
              DDR4_A1              CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
                                    DDR4_B1              0:0:1 missing       
                                    DDR4_C1              0:0:2 missing       
                                    DDR4_D1              0:0:3 missing       

System Info

Motherboard: Fatal1ty X99M Killer
CPU: Intel(R) Xeon(R) CPU E5-2620 v4
rasdaemon version: 0.6.4
Kernel: 5.13.19-200.fc34.x86_64
Distribution: Fedora 34

sysfs entries found by searching for dimm:

# find /sys/ -iname '*dimm*'
/sys/devices/system/edac/mc/mc0/dimm3
/sys/devices/system/edac/mc/mc0/dimm3/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm3/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm3/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm3/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm3/dimm_label
/sys/devices/system/edac/mc/mc0/dimm3/dimm_location
/sys/devices/system/edac/mc/mc0/dimm3/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/csrow0/ch2_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch3_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch1_dimm_label
/sys/devices/system/edac/mc/mc0/dimm6
/sys/devices/system/edac/mc/mc0/dimm6/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm6/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm6/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm6/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm6/dimm_label
/sys/devices/system/edac/mc/mc0/dimm6/dimm_location
/sys/devices/system/edac/mc/mc0/dimm6/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/dimm0
/sys/devices/system/edac/mc/mc0/dimm0/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm0/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm0/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm0/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm0/dimm_label
/sys/devices/system/edac/mc/mc0/dimm0/dimm_location
/sys/devices/system/edac/mc/mc0/dimm0/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/dimm9
/sys/devices/system/edac/mc/mc0/dimm9/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm9/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm9/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm9/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm9/dimm_label
/sys/devices/system/edac/mc/mc0/dimm9/dimm_location
/sys/devices/system/edac/mc/mc0/dimm9/dimm_edac_mode
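
The listing above only shows which files exist. To compare what the kernel reports per DIMM with the locations in the labels file, the `dimm_location` and `dimm_label` files could be dumped with a short Python sketch like the one below. The helper name `read_dimms` is hypothetical; the paths are the ones from the `find` output, and nothing is assumed about the text inside the files.

```python
from pathlib import Path

def read_dimms(edac_root="/sys/devices/system/edac/mc"):
    """Return {"mcX/dimmY": {"dimm_location": ..., "dimm_label": ...}}.

    Hypothetical helper: walks the mc*/dimm* directories seen in the
    find output and reads two of the per-DIMM files as plain text.
    """
    result = {}
    for dimm_dir in sorted(Path(edac_root).glob("mc*/dimm*")):
        if not dimm_dir.is_dir():
            continue
        entry = {}
        for name in ("dimm_location", "dimm_label"):
            path = dimm_dir / name
            # Strip the trailing whitespace/newline that sysfs files carry.
            entry[name] = path.read_text().strip() if path.is_file() else None
        result[f"{dimm_dir.parent.name}/{dimm_dir.name}"] = entry
    return result

if __name__ == "__main__":
    for key, info in read_dimms().items():
        print(f"{key}: location={info['dimm_location']!r} label={info['dimm_label']!r}")
```

Running it on the affected machine would show the location string behind each of dimm0, dimm3, dimm6 and dimm9, which can then be checked against the `0.0.N` entries that ras-mc-ctl reports as missing.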

ras-mc-ctl --layout output:

# ras-mc-ctl --layout
Use of uninitialized value $max_pos[3] in modulus (%) at /usr/sbin/ras-mc-ctl line 868.
Use of uninitialized value $d in numeric ge (>=) at /usr/sbin/ras-mc-ctl line 869.
Use of uninitialized value $d in sprintf at /usr/sbin/ras-mc-ctl line 872.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
    +-----------------------------------------------------------------------------------------------------------------------------------------------+
    |                                                                      mc0                                                                      |
    |             channel0              |             channel1              |             channel2              |             channel3              |
    |   slot0   |   slot1   |   slot2   |   slot0   |   slot1   |   slot2   |   slot0   |   slot1   |   slot2   |   slot0   |   slot1   |   slot2   |
----+-----------------------------------------------------------------------------------------------------------------------------------------------+

0: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
----+-----------------------------------------------------------------------------------------------------------------------------------------------+

ras-mc-ctl --error-count output:

# ras-mc-ctl --error-count
Label                         	CE	UE
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0	0	0
CPU_SrcID#0_Ha#0_Chan#3_DIMM#0	0	0
CPU_SrcID#0_Ha#0_Chan#1_DIMM#0	0	0
CPU_SrcID#0_Ha#0_Chan#2_DIMM#0	0	0

dmidecode -t memory output:

# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.

Handle 0x000E, DMI type 16, 23 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: Multi-bit ECC
	Maximum Capacity: 256 GB
	Error Information Handle: Not Provided
	Number Of Devices: 4

Handle 0x0010, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x000E
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 72 bits
	Size: 8 GB
	Form Factor: RIMM
	Set: None
	Locator: DIMM_A1
	Bank Locator: NODE 1
	Type: DDR4
	Type Detail: Synchronous
	Speed: 2133 MT/s
	Manufacturer: Micron
	Serial Number: 1323637C
	Asset Tag: DIMM_A1_AssetTag
	Part Number: 18ASF1G72PZ-2G1B1  
	Rank: 1
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: Unknown
	Maximum Voltage: Unknown
	Configured Voltage: Unknown

Handle 0x0012, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x000E
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 72 bits
	Size: 8 GB
	Form Factor: RIMM
	Set: None
	Locator: DIMM_B1
	Bank Locator: NODE 1
	Type: DDR4
	Type Detail: Synchronous
	Speed: 2133 MT/s
	Manufacturer: Micron
	Serial Number: 13236327
	Asset Tag: DIMM_B1_AssetTag
	Part Number: 18ASF1G72PZ-2G1B1  
	Rank: 1
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: Unknown
	Maximum Voltage: Unknown
	Configured Voltage: Unknown

Handle 0x0014, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x000E
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 72 bits
	Size: 8 GB
	Form Factor: RIMM
	Set: None
	Locator: DIMM_C1
	Bank Locator: NODE 1
	Type: DDR4
	Type Detail: Synchronous
	Speed: 2133 MT/s
	Manufacturer: Micron
	Serial Number: 13236324
	Asset Tag: DIMM_C1_AssetTag
	Part Number: 18ASF1G72PZ-2G1B1  
	Rank: 1
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: Unknown
	Maximum Voltage: Unknown
	Configured Voltage: Unknown

Handle 0x0016, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x000E
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 72 bits
	Size: 8 GB
	Form Factor: RIMM
	Set: None
	Locator: DIMM_D1
	Bank Locator: NODE 1
	Type: DDR4
	Type Detail: Synchronous
	Speed: 2133 MT/s
	Manufacturer: Micron
	Serial Number: 13236332
	Asset Tag: DIMM_D1_AssetTag
	Part Number: 18ASF1G72PZ-2G1B1  
	Rank: 1
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: Unknown
	Maximum Voltage: Unknown
	Configured Voltage: Unknown