improve bios raid handling on system disk #1151

Closed
phillxnet opened this Issue Feb 5, 2016 · 16 comments

Member

phillxnet commented Feb 5, 2016

When using BIOS RAID, such as Intel Rapid Storage Technology (IRST), on the system disk, the system disk is not identified because the device naming scheme differs, i.e. /dev/md126p3 rather than the expected /dev/sda3. As a result the base md126 device and the swap partition are both listed on the disks page, all with inappropriate flags.

This has been seen in the forum in the following post:-
http://forum.rockstor.com/t/issues-cant-wipe-some-drives-no-smart-diagnostics-disk-serial-numbers-not-legitamate/820/11

I have reproduced the above system disk layout locally:-
[Screenshot: 3 8-11-default-install-irst]

phillxnet commented Feb 5, 2016

I am having a quick look at this issue currently.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 5, 2016

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 5, 2016

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 5, 2016

add lsblk check for type md as well as parted rockstor#1151
scan_disks identifies partitions by their type of "parted"
but md partitions are of type md so add this as an "or"
clause to all parted checks.

phillxnet commented Feb 5, 2016

Not there yet, but the flags are looking better: root is identified and the base root dev is no longer showing.
The outstanding issue is that the swap partition is still showing up, and I'm not sure why.
[Screenshot: intel-raid-partly-accomodated]

Will try and continue with this tomorrow (UK).

phillxnet commented Feb 5, 2016

There may also be an opportunity to use the md device's UUID in lieu of our device serial number, but I will have a look at that once the above issue is sorted.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 6, 2016

add logging and comments and revert type md check rockstor#1151
md type check broke non md systems and didn't work as intended

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 6, 2016

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 6, 2016

don't try and sense smart capability on md devices rockstor#1151
This avoids filling the logs with exceptions from trying to assess
their smart availability and enabled status.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 6, 2016

improve swap device or partition identification rockstor#1151
Previously we relied on a complex fall through to identify
swap partitions. Since we are not interested in swap dev
or partitions then skip any further processing of any device
found to have fstype swap.
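
As a rough illustration of the early skip this commit describes, the sketch below parses `lsblk -P` style `KEY="value"` lines and drops any entry whose FSTYPE is swap before doing anything else with it. The function names and the sample lines are hypothetical and are not taken from the actual scan_disks code.

```python
import re

# Matches KEY="value" pairs as printed by `lsblk -P` (keys may contain ':' or '-').
PAIR_RE = re.compile(r'([\w:-]+)="([^"]*)"')


def parse_lsblk_line(line):
    """Turn one `lsblk -P` output line into a dict of field -> value."""
    return dict(PAIR_RE.findall(line))


def non_swap_devices(lsblk_lines):
    """Return parsed entries, skipping anything with FSTYPE swap."""
    devices = []
    for line in lsblk_lines:
        dmap = parse_lsblk_line(line)
        if not dmap:
            continue  # blank or unparseable line
        if dmap.get("FSTYPE") == "swap":
            continue  # swap devs and partitions need no further processing
        devices.append(dmap)
    return devices


# Illustrative sample only; not captured from a real system.
SAMPLE = [
    'NAME="md126p1" TYPE="md" FSTYPE="ext4"',
    'NAME="md126p2" TYPE="md" FSTYPE="swap"',
    'NAME="md126p3" TYPE="md" FSTYPE="btrfs"',
]
names = [d["NAME"] for d in non_swap_devices(SAMPLE)]
```

Filtering on FSTYPE up front means the later categorisation logic never has to special-case swap, whatever the device naming scheme.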

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 6, 2016

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 6, 2016

add md capability to get_disk_serial rockstor#1151
We can use the built in unique MD_UUID in lieu of our
hardware serial numbers to keep track of md devices.
This avoids having to attribute a SERIAL via udev rules.
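
A minimal sketch of the idea in this commit, assuming the property text comes from something like `udevadm info --name=<dev>` and that md devices expose an `MD_UUID` property there. The helper names, the fallback behaviour and the sample values are hypothetical; the real get_disk_serial may differ.

```python
import re


def md_uuid_from_udev(udev_output):
    """Return the MD_UUID value from udevadm property output, or None."""
    for line in udev_output.splitlines():
        # `udevadm info --name=...` prefixes property lines with "E: ".
        match = re.match(r"(?:E:\s*)?MD_UUID=(\S+)", line.strip())
        if match:
            return match.group(1)
    return None


def serial_for(dev_name, udev_output, hw_serial=None):
    """Prefer MD_UUID for md devices, else use the hardware serial."""
    if re.match(r"md\d+", dev_name):
        return md_uuid_from_udev(udev_output) or hw_serial
    return hw_serial


# Made-up sample text for illustration; the UUID is not from a real array.
SAMPLE_UDEV = """\
E: DEVNAME=/dev/md126
E: MD_LEVEL=raid1
E: MD_UUID=01234567:89abcdef:01234567:89abcdef
"""
md_serial = serial_for("md126", SAMPLE_UDEV)
sd_serial = serial_for("sda", SAMPLE_UDEV, hw_serial="HW-SERIAL-1")
```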

phillxnet commented Feb 6, 2016

Progress so far: the above patches now deal more elegantly with the same Intel Rapid Storage Technology BIOS RAID boot device used as the system disk:-
[Screenshot: intel-raid-accomodated]

I am now planning to test the same patches for compatibility with a multi md device install such as that included in the docs:- http://rockstor.com/docs/mdraid-mirror/boot_drive_howto.html which is an all software raid arrangement with no bios raid component.

phillxnet commented Feb 6, 2016

I hope to continue with this issue tomorrow (UK).

phillxnet commented Feb 7, 2016

When installing according to the docs howto for mirrored OS drives via three md devices (i.e. /boot, / and swap), the current 3.8-11.07 presents the drives as follows:-
[Screenshot: md-howto-3 8-11 07]

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 7, 2016

improve root_disk reporting when root on md rockstor#1151
Update comments accordingly. Tested on identifying /
when directly on an md rather than when in a partition
in an md. Works in both cases now.

phillxnet commented Feb 7, 2016

The following is the result of these patches when used with the non btrfs root md howto install:-
[Screenshot: howto-md-results]
I believe this is as expected due to assumptions of root being btrfs and in a partition. It did however lead to a cleaner root_disk reporting implementation which can now deal with root directly on an md device.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 7, 2016

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 7, 2016

call get_md_members for root on md device rockstor#1151
This way we can populate the MODEL of our root md device
with a string indicating its members. Cheap indicator for now.
Also remove some earlier logging.

phillxnet commented Feb 7, 2016

Remaining problems:-
- Need to address properly returning root=True on a root md126p3 type device.
- Debug and test the new get_md_members function.
- Test with a non-md device that requires get_disk_serial (e.g. an sdcard), as I made a minor change to the re.match test.

Hope to pick this up again tomorrow.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 8, 2016

begin refactoring and further commenting scan_disks rockstor#1151
Required to clarify logic prior to extending capabilities re
device categorization.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 8, 2016

provide default on get_disk_serial test parameter rockstor#1151
This way we don't have to call with None all the time.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 8, 2016

more comments and var re-factoring in scan_disks rockstor#1151
Also added temp logging to check continued function as changes
are made in this issue.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 8, 2016

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 8, 2016

modify scan_disks logic to sense md partitions rockstor#1151
This is required for hosting our system drive on an md partition
such as md126p3 akin to sda3. Note that helpers to
scan_disks have been updated to accommodate this change, and
an earlier swap partition improvement was also necessary.
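
To make the md126p3 versus sda3 naming difference concrete, here is a small sketch of mapping a partition name back to its base device for both schemes. The regular expressions and the function name are assumptions for illustration only; they are not the patterns used in scan_disks.

```python
import re

# md126p3 -> base md126 (md partitions carry a 'p' before the number).
MD_PART_RE = re.compile(r"^(md\d+)p(\d+)$")
# sda3 / vda1 / hdb2 -> base sda / vda / hdb.
SD_PART_RE = re.compile(r"^([shv]d[a-z]+)(\d+)$")


def base_device(dev_name):
    """Return the parent device name for a partition, else None."""
    for pattern in (MD_PART_RE, SD_PART_RE):
        match = pattern.match(dev_name)
        if match:
            return match.group(1)
    return None  # whole device (e.g. md126, sda) or unrecognised name


md_base = base_device("md126p3")
sd_base = base_device("sda3")
```

Treating both naming schemes through one helper lets the root partition be tied to its base device in the same way whether that base is sda or md126.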

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 8, 2016

phillxnet commented Feb 8, 2016

Current state of this patch set with the same initial BIOS RAID install as at the top of this issue:-
[Screenshot: bios-md-working-with-model-info]
Now properly ascribes root=True in info for root partition md126p3.
ToDo:-
- Test get_disk_serial on a non-md device that requires the udev fail over mechanism (e.g. an sdcard).
- Possibly use serial numbers with, or instead of, dev names in raid model info.
- Further testing.
- Remove issue related logging.

Hope to continue this tomorrow.

phillxnet commented Feb 8, 2016

If I leave scope pretty much as is then the resulting pull request from this issue should be available soon.

Member

schakrava commented Feb 9, 2016

I skimmed through your comments and some changes, nice work @phillxnet. thanks! looking forward to testing and merging the upcoming pr. good luck

phillxnet commented Feb 9, 2016

To address the issue of raid member disks having the cog icon and a popup suggesting they be wiped, we could add an extension to the Disk model in:-
src/rockstor/storageadmin/models/disk.py
This extension could be the addition of a role field, such as has just been added to the pool model in:-
9bc9795
and then the js view:-
src/rockstor/storageadmin/static/storageadmin/js/views/disks.js
could be informed by this role to display a different icon, which in turn links to a details page for that device from an md perspective. Likewise, the same role mechanism could be ascribed (by scan_disks) to the resulting md126p3 device to access its md view. The mouse over popup hint could give a brief description: for example, lsblk gives the sda device an FSTYPE="isw_raid_member", which could probably go straight into the disk role when seen.

This additional role field could also accommodate the labelling of devices such as external, temporarily connected USB disks: being unique by their serial, their disk model "role" would return upon re-connection. See proposed feature in issue #1157.

I think this disk model addition is beyond the scope of this "Part 1" of "improve bios raid handling on system disk" issue so I will leave these cogs on raid members dangling for the time being so the disk model discussion can take place.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 9, 2016

include mmcblk* devs in smart exclusion rockstor#1151
This is the dev name presented by a popular sdcard
reader but it doesn't seem to support smart so to
avoid exceptions in logs we auto disable smart on these.
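
A hedged sketch of the exclusion described in this commit and the earlier md one: decide from the device name alone whether it is worth probing for SMART support. The pattern and function name are hypothetical and only cover the two families mentioned in this issue (md and mmcblk).

```python
import re

# Device name families on which SMART probing is skipped (assumed list,
# based only on the md and mmcblk cases mentioned in this issue).
NO_SMART_RE = re.compile(r"^(md\d+|mmcblk\d+)")


def should_probe_smart(dev_name):
    """Return False for devices known not to support SMART."""
    return NO_SMART_RE.match(dev_name) is None


probe_results = {
    name: should_probe_smart(name)
    for name in ("sda", "md126", "md126p3", "mmcblk0", "mmcblk0p1")
}
```

Skipping the probe by name avoids running the SMART tooling against these devices at all, which is what keeps the exceptions out of the logs.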

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 9, 2016

use serial num not dev names in root md model info rockstor#1151
Also split fields by space as will re-size better and we can split
on space later as serials often use "-" separator. Values are
number of members, serials of those members followed by
their member index in [] brackets, then the raid level.
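
The field layout described in this commit message can be sketched as follows. The function name, the input shape (a dict of member index to serial) and the example serials are hypothetical; only the ordering of fields (member count, serial with index in [] brackets, raid level) and the space separator come from the commit message.

```python
def md_model_info(member_serials, raid_level):
    """Build a space-separated summary: count, serial[index]..., level.

    member_serials maps member index -> serial number.
    """
    fields = [str(len(member_serials))]
    for index in sorted(member_serials):
        fields.append("%s[%d]" % (member_serials[index], index))
    fields.append(raid_level)
    return " ".join(fields)


# Made-up serials for illustration only.
info = md_model_info({0: "WD-SERIAL-A", 1: "WD-SERIAL-B"}, "raid1")
# Splitting on space recovers the fields even though serials contain "-".
parts = info.split(" ")
```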

phillxnet commented Feb 9, 2016

Serial retrieval that uses the udev fail over tested as still working after the changes, for virtio and mmcblk devs.
Changed over and tested md model info to reference the serials of devs listed as members by udevadm. This way one can see which drives were members and we avoid the confusion caused by dev names moving. udevadm info name gives current dev names so I believe we could use them, but because we store this info in the Disk model it should not be ephemeral; plus, if a device is down or missing then only its serial will be visible in the disks page.

The following shows the same initial BIOS RAID install but having been updated and the 3 changed files overwritten in place. Disk refresh then indicates disks interpretation.
Note that the external USB disk, previously used on another Rockstor with multiple shares, is correctly labelled as btrfs and importable.
[Screenshot: bios-md-serial-with-import]

See previous comment in this issue for dangling cogs on bios raid members.

phillxnet commented Feb 9, 2016

And the same system having just imported from the usb disk:-
[Screenshot: bios-md-serial-after-import]

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 9, 2016

remove ext4 root exception silence, now btrfs root only again rockstor#1151
If the root mount point is found to be non btrfs then a
NonBTRFSRootException is raised. Here a temporary ext4
exception to the btrfs only root was removed.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 9, 2016

remark out old 'update_disk_state() Called' logging rockstor#1151
This was left over from a previous issue and now tends to
fill the logs when a browser is viewing for example the disks
page.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Feb 9, 2016

remove debug and info logging used in this issue rockstor#1151
Also removed todo associated with recent ext4 root commit.

phillxnet commented Feb 9, 2016

The same patches applied to a non md system with a ready to use vda (ie blank) and a FAT32 single partition sdc flagged as expected:-
[Screenshot: non-md-with-md-patches]

sda      8:0    0    8G  0 disk 
├─sda1   8:1    0  500M  0 part /boot
├─sda2   8:2    0  820M  0 part [SWAP]
└─sda3   8:3    0  6.7G  0 part /mnt2/rockstor_rockstor
sdb      8:16   0   10G  0 disk /mnt2/rock-pool
sdc      8:32   0    2G  0 disk 
└─sdc1   8:33   0    2G  0 part 
sr0     11:0    1 1024M  0 rom  
vda    253:0    0    2G  0 disk 

schakrava added a commit to schakrava/rockstor-core that referenced this issue Feb 9, 2016

schakrava commented Feb 9, 2016

Fixed by #1158

@schakrava schakrava closed this Feb 9, 2016
