This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

Bare-Metal Block Device Failures HP #2390

Open
jordy25519 opened this issue Mar 27, 2018 · 2 comments

Comments

jordy25519 commented Mar 27, 2018

Issue Report

Bug

Block device operations are failing with the following errors in the journal logs:

NMI: PCI system error (SERR) for reason b1 on CPU 0.
Mar 27 20:53:20 localhost kernel: Dazed and confused, but trying to continue
Mar 27 20:53:20 localhost kernel: DMAR: DRHD: handling fault status reg 2
Mar 27 20:53:20 localhost kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffdda000 [fault reason 06] PTE Read access is not set

However, lsblk still lists the disks:

loop0    7:0    0 295.7M  0 loop /usr
sdb      8:16   0 279.4G  0 disk 
sdc      8:32   0 279.4G  0 disk 
`-sdc1   8:33   0 279.4G  0 part 
sdd      8:48   0 558.9G  0 disk 
`-sdd1   8:49   0 558.9G  0 part 
sde      8:64   0 931.5G  0 disk 
`-sde1   8:65   0 931.5G  0 part 

I suspect it may be related to the hardware or the RAID array drivers. I was able to install Ubuntu successfully on this server with all disks in operation.
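
For anyone triaging similar logs, here is a minimal sketch that pulls the PCI device address, fault address, and fault reason out of a DMAR fault line. The regular expression is modelled only on the single line quoted above, and the names `DMAR_FAULT` and `parse_dmar_fault` are hypothetical; other kernel versions may word the message differently.

```python
import re

# Pattern modelled on the one DMAR fault line quoted in this report.
# This is an assumption, not a kernel-defined format.
DMAR_FAULT = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\] Request device \[(?P<device>[0-9a-fA-F:.]+)\] "
    r"fault addr (?P<addr>[0-9a-fA-F]+) "
    r"\[fault reason (?P<reason>\d+)\] (?P<text>.*)"
)


def parse_dmar_fault(line):
    """Return the fields of a DMAR fault line as a dict, or None if it does not match."""
    match = DMAR_FAULT.search(line)
    return match.groupdict() if match else None


sample = (
    "Mar 27 20:53:20 localhost kernel: DMAR: [DMA Read] Request device "
    "[04:00.0] fault addr ffdda000 [fault reason 06] PTE Read access is not set"
)
fault = parse_dmar_fault(sample)
print(fault["device"], fault["addr"], fault["reason"])  # prints: 04:00.0 ffdda000 06
```

The extracted address (04:00.0 here) can then be passed to `lspci -s 04:00.0`, where lspci is available, to see which controller issued the faulting DMA request.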

Container Linux Version

NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1632.3.0
VERSION_ID=1632.3.0
BUILD_ID=2018-02-14-0338
PRETTY_NAME="Container Linux by CoreOS 1632.3.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

Bare Metal, HP ProLiant Server (DL380G6)

Expected Behavior

Block devices are mapped and accessible.

Reproduction Steps

PXE boot a bare-metal HP DL380 G6 server using Matchbox.

mwthink commented Apr 20, 2018

I'm currently experiencing this issue and have been debugging it for a while now. I'm fairly sure I've narrowed it down to the RAID controller firmware. By any chance, are you using a P410 card for your disks?

Also, can you check the output of fdisk -l? For me, the devices are visible in lsblk, but they don't show up in fdisk.
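
A rough way to check the same thing without fdisk, assuming the symptom is a device that the kernel has registered but that cannot actually be read: list the names under /sys/block and try to read the first sector of each matching /dev node. The helper names `list_block_devices` and `can_read` are hypothetical, and this is only a sketch of the comparison, not what lsblk or fdisk do internally. It needs to run as root to open the device nodes.

```python
import os


def list_block_devices(sys_block="/sys/block"):
    """Names the kernel has registered as block devices, or [] if the directory is missing."""
    try:
        return sorted(os.listdir(sys_block))
    except OSError:
        return []


def can_read(path, nbytes=512):
    """True if the first nbytes of path can be opened and read without an OS error."""
    try:
        with open(path, "rb") as handle:
            handle.read(nbytes)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    for name in list_block_devices():
        dev = os.path.join("/dev", name)
        print(dev, "readable" if can_read(dev) else "NOT readable")
```

A disk that is listed but reported as not readable would match the behaviour described here, where the device exists but I/O to it fails.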

jordy25519 (Author) commented

I believe it was a 410i card. As far as I remember fdisk -l showed no disks. I've since left the company so can't be much more help. In the past I've been able to install CoreOS by using random older versions and channels and then updating to the current release if the install succeeded.
