
scylla_setup failed to setup RAID on Ubuntu 20.04 #7627

Closed
enaydanov opened this issue Nov 17, 2020 · 8 comments

@enaydanov
Contributor

scylla_setup is unable to set up RAID because Ubuntu 20.04 already uses /dev/md0.

Both checks failed: the existence check for /dev/md0 (4.3) and the one for /sys/block/md0/md/array_state (master).

Tested on AWS and GCE, both with and without NVMe devices.

Installation details

Scylla version (or git commit hash): 4.3.rc1-0.20201115.b2271800a55 and 4.4.dev-0.20201113.0a2adf4555
AWS: ami-0fb4089e765e1ea02 (eu-north-1) [Official Ubuntu 20.04 LTS AMI]
GCE: https://www.googleapis.com/compute/v1/projects/ubuntu-os-cloud/global/images/family/ubuntu-2004-lts

$ sudo /usr/lib/scylla/scylla_setup --nic ens5 --disks /dev/nvme0n1 --setup-nic-and-disks  --swap-directory / 
...
/dev/md0 is already using
RAID setup failed.

#6343
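To make the failure mode concrete, here is a minimal sketch (not the real scylla_setup code) of the two "is this md device taken?" checks the report describes: Scylla 4.3 tested for the device node itself, master for the sysfs array_state file. The function name and the `root` parameter are hypothetical test hooks, not actual script options.

```python
import os

def md_device_in_use(name: str, root: str = '/') -> bool:
    """Illustrative sketch of the two checks described in the report.
    `root` is a hypothetical hook so the probe can be pointed at a
    fake filesystem tree for testing."""
    # 4.3-style check: does the device node exist?
    dev_node = os.path.join(root, 'dev', name)
    # master-style check: does sysfs expose an md array_state file?
    sysfs_state = os.path.join(root, 'sys', 'block', name, 'md', 'array_state')
    return os.path.exists(dev_node) or os.path.exists(sysfs_state)
```

On a stock Ubuntu 20.04 image both paths exist for md0, so either variant of the check reports the device as taken and RAID setup aborts.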

@tzach
Contributor

tzach commented Nov 25, 2020

@enaydanov is this issue specific to Ubuntu 20.04? Does the same test pass on Ubuntu 18.04?

@syuu1228
Contributor

I think we can close this, since we merged 587b909.

@avikivity
Member

@enaydanov please verify.

@enaydanov
Contributor Author

@enaydanov is this issue specific to Ubuntu 20.04? Does the same test pass on Ubuntu 18.04?

Yes, it's related to Ubuntu 20 only.

@enaydanov
Contributor Author

enaydanov commented Dec 1, 2020

Artifacts test passed: https://jenkins.scylladb.com/view/master/job/scylla-master/job/artifacts/job/artifacts-ubuntu2004-test/10/

ScyllaDB version: 4.4.dev-0.20201130.ea9c058be3
ScyllaDB repo: http://downloads.scylladb.com/deb/unstable/unified/master/2020-11-30T15:41:07Z/scylladb-master/scylla.list
Actual gce image used: ubuntu-2004-focal-v20201111
GCE region: us-east1
Instance type: n1-standard-2

@avikivity
Member

Fixed by 587b909.

@amoskong
Contributor

@avikivity @tzach we need to backport this fix if we want Scylla 4.3 to work with Ubuntu 20.04.

avikivity pushed a commit that referenced this issue Jan 3, 2021
If the scylla_raid_setup script is called without the --raiddev
argument, try to use any of the /dev/md[0-9] devices instead of
only /dev/md0. Do it this way because on Ubuntu 20.04, /dev/md0
is already used by the OS.

Closes #7628

(cherry picked from commit 587b909)

Fixes #7627.
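A minimal sketch of the strategy this commit describes: scan /dev/md0 through /dev/md9 and pick the first device node that does not exist yet. The function name and the `root` parameter are hypothetical (the probe root only redirects where the existence check looks, as a test hook); this is not the actual script code.

```python
import os

def pick_free_raid_device(root: str = '/') -> str:
    """Return the first /dev/md[0-9] whose device node is absent,
    mirroring the fallback behaviour described in commit 587b909."""
    for i in range(10):
        name = f'md{i}'
        # Probe under `root` (test hook); the returned path is the
        # canonical /dev location.
        if not os.path.exists(os.path.join(root, 'dev', name)):
            return '/dev/' + name
    raise RuntimeError('no free /dev/md[0-9] device found')
```

On Ubuntu 20.04, where the OS has already claimed /dev/md0, this scheme would select /dev/md1 instead of failing outright.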
@avikivity
Member

Backported to 4.3.

syuu1228 added a commit to syuu1228/scylla that referenced this issue Mar 4, 2021
On the Ubuntu 20.04 AMI, scylla_raid_setup --raiddev /dev/md0 causes
'/dev/md0 is already using' (issue scylladb#7627),
so we merged a patch to find a free mdX (587b909).

However, looking into /proc/mdstat on the AMI, it actually says no
active md device is available:

ubuntu@ip-10-0-0-43:~$ cat /proc/mdstat
Personalities :
unused devices: <none>

We currently decide that mdX is in use when
os.path.exists('/sys/block/mdX/md/array_state') == True, but
according to the kernel documentation, the file may be available
even when the array is stopped:

    clear

        No devices, no size, no level
        Writing is equivalent to STOP_ARRAY ioctl
https://www.kernel.org/doc/html/v4.15/admin-guide/md.html

So we should also check that array_state != 'clear', not just that
array_state exists.

Fixes scylladb#8219
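The refined check from the commit above can be sketched as follows: an md device only counts as busy if its sysfs array_state file exists *and* reports something other than 'clear'. This is an illustrative sketch, not the actual patch; the function name and `root` test hook are hypothetical.

```python
import os

def md_array_active(name: str, root: str = '/') -> bool:
    """True only if the md array's sysfs state exists and is not
    'clear' (a stopped array still exposes the file, per the kernel
    md documentation quoted above)."""
    path = os.path.join(root, 'sys', 'block', name, 'md', 'array_state')
    if not os.path.exists(path):
        return False  # no sysfs entry at all: device is free
    with open(path) as f:
        # 'clear' means no devices, no size, no level: treat as free
        return f.read().strip() != 'clear'
```

With this check, the Ubuntu 20.04 AMI's stopped md0 (array_state 'clear') no longer blocks RAID setup.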
avikivity pushed a commit that referenced this issue Mar 7, 2021
Fixes #8219

Closes #8220
avikivity pushed a commit that referenced this issue Mar 7, 2021
Fixes #8219

Closes #8220

(cherry picked from commit 2d9feaa)
avikivity pushed a commit that referenced this issue Mar 8, 2021
Fixes #8219

Closes #8220

(cherry picked from commit 2d9feaa)
denesb pushed a commit to denesb/scylla that referenced this issue Oct 20, 2021
Fixes scylladb#8219

Closes scylladb#8220

(cherry picked from commit 2d9feaa)
7 participants