md: document how to use software RAID with CoreOS #222

Open
philips opened this Issue Jun 4, 2014 · 50 comments

philips (Member) commented Jun 4, 2014

People may want to run CoreOS on software RAID. Can we do this today? Can we help with the coreos-install script?

/cc @marineam

marineam (Contributor) commented Jun 4, 2014

No, I would rather not do RAID in the current installer script. If we want to start working on a full-featured Linux installer, we should do that in a language that isn't shell.

robszumski (Member) commented Jun 4, 2014

What about software RAID outside of the installation script?

jsierles commented Jun 4, 2014

At least it would be useful to document what needs to be done, if anything, besides building the RAID and running the installer against it. I tried this but couldn't get my machine to boot.

marineam (Contributor) commented Jun 4, 2014

@robszumski software RAID is what we are talking about.

@jsierles sounds like we have bugs to fix, because my intent is to make that work.

nekinie commented Aug 9, 2014

Any news on the software RAID documentation? It would be rather useful.

ghost commented Aug 10, 2014

I would also see this as very useful functionality.

pierreozoux commented Aug 13, 2014

Yes, it would be great :)
@philips I saw this commit. But yeah... Can anybody tell me where to start if I want software RAID? emerge mdadm?

marineam (Contributor) commented Aug 13, 2014

@pierreozoux mdadm is included in the base images, but we haven't played with it at all. Setting up non-root RAID volumes should work just the same as on any other distro: the same ol' mdadm commands for creating and assembling volumes. You may need to enable mdadm.service if you want to assemble volumes on boot via /etc/mdadm.conf, as opposed to using the raid-autodetect partition type and letting the kernel do it. It might be possible to move the root filesystem as long as the raid-autodetect partition type is used, but for that you are almost certainly better off using the multi-device support in btrfs.
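For reference, a minimal sketch of the non-root setup described above, assuming two spare data disks /dev/sdb and /dev/sdc; the device names and the choice of ext4 are illustrative, not from this thread:

# Create and format a RAID1 data volume on two spare disks (not the boot disk).
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
sudo mkfs.ext4 /dev/md0
# Record the array so it can be assembled on boot from /etc/mdadm.conf,
# then enable mdadm.service as mentioned above. Alternatively, use the
# raid-autodetect partition type and let the kernel assemble it.
sudo sh -c 'mdadm --detail --scan >> /etc/mdadm.conf'
sudo systemctl enable mdadm.service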

marineam (Contributor) commented Aug 13, 2014

What certainly won't work right now is installing all of CoreOS on top of software RAID; the update and boot processes both assume the ESP and /usr partitions are plain disk partitions.

dalu commented Aug 28, 2014

I don't know if I understood your last comment.
Are there options/considerations for hard drive / partition replication?
My mechanical drives tend to need to be replaced every 1-2 years (unrelocatable read or write errors).
I'm stuck with mechanical drives right now, and even if I weren't, at least RAID1 is a must.
I'd like to try CoreOS with 2 servers, but this lack of replication raises a big red danger sign.
I mean, what's the plan if the disk CoreOS is installed on fails?
Do you just install it again and do a dd of the non-OS partition(s)? What if you can't just dd the container partition(s)?
SSDs are currently not an option (double the price when leased, 1/4-1/8 the size).
Running KVM with some other OS and then installing CoreOS inside a container doesn't seem to be the way it should be done.
Maybe these questions are better asked somewhere else; I did a Google search for "coreos raid" and this was the top result.

brejoc commented Sep 2, 2014

@marineam Would this constraint of CoreOS also apply to btrfs RAID?

marineam (Contributor) commented Sep 2, 2014

@brejoc multi-device btrfs for the root filesystem should work.

warwickchapman commented Oct 20, 2014

What about migrating after install? E.g. migrating to RAID 1 from an installed /dev/sda (one partition, sda1, for demonstration) should be something like this from a rescue CD or similar:

sfdisk -d /dev/sda | sfdisk /dev/sdb
sfdisk --id /dev/sdb 1 fd
mdadm --zero-superblock /dev/sdb1
mdadm --create /dev/md0 --level 1 --raid-devices=2 missing /dev/sdb1
mkfs.btrfs /dev/md0
mkdir /mnt/source; mount /dev/sda1 /mnt/source
mkdir /mnt/target; mount /dev/md0 /mnt/target
cp -a /mnt/source/* /mnt/target

Thereafter the disk mount configuration needs to be changed, along with the kernel root device in the bootloader, and the bootloader needs to be installed to both disks:

  • modify /mnt/target/etc/fstab to replace /dev/sda1 with /dev/md0 - but this file is non-existent on CoreOS
  • the bootloader since 435 seems to be GRUB, which helps, but I cannot find a grub binary, only config in /usr/boot

Thoughts?

seeekr commented Dec 11, 2014

@warwickchapman just in case you finished your exploration into this topic and came up with a complete solution - or if someone else has - I'd appreciate it if you shared it. I know too little about setting up and messing with RAID / mounts / boot to complete this myself. It's not a hard requirement for my use case, but having RAID would help me use both/all disks in a system. I understand it's also possible to set up a distributed file system like Ceph and let it manage the disks without RAID, and that would work for the use cases I have in mind, but for now I'm happy about any additional complexity I can avoid!

marineam (Contributor) commented Dec 11, 2014

As noted on IRC, for btrfs, if RAID0 or RAID1 is all you need then it is easiest to just add devices to btrfs and rebalance: https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices

As for md RAID, if the partition types are the raid-autodetect type then the RAID volume will be assembled automatically. But you can only put the ROOT filesystem on RAID; we don't currently support putting the other partitions on anything other than plain disk devices.
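As a rough illustration of the "add devices and rebalance" route (this was the era when ROOT was btrfs), assuming /dev/sdb is a spare whole disk and the root filesystem is mounted at /; the device name is an assumption and the commands follow the btrfs wiki linked above:

# Add the new device to the mounted root filesystem and convert data and metadata to RAID1.
sudo btrfs device add /dev/sdb /
sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 /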

seeekr commented Dec 11, 2014

@marineam Perfect -- again thanks for the pointer, that was all I needed! Here's a gist with instructions, a script and helper file, plus some reference links to help people get this done quickly and easily :) I've verified that my instance reboots just fine, but haven't checked beyond that whether I might have messed things up, which could easily be the case given I'm not experienced at messing with the Linux file system!

https://gist.github.com/seeekr/1afa1e5ce3ad6e998367

warwickchapman commented Dec 11, 2014

Thanks, very interesting - I left it at the point I got to and have stuck with OpenVZ for now. Will start testing again.

agend07 commented Dec 23, 2014

Forgive me my ignorance - does it mean that if I add drives using your script from the gist, I don't need to put any mount units in my cloud-config?
Right now I'm testing it on a VirtualBox installation, and it looks like btrfs can see all drives (sudo btrfs fi show) after a restart, with no mount units.

marineam (Contributor) commented Dec 23, 2014

@agend07 when adding devices to a btrfs filesystem they become a required part of that filesystem, so all of them need to be available in order to mount the filesystem in the first place. The discovery of the devices happens automatically, so there isn't any extra configuration.
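A quick way to confirm which devices a multi-device btrfs filesystem now depends on, and which RAID profiles are in use (standard btrfs-progs commands; nothing CoreOS-specific):

sudo btrfs filesystem show /   # lists every device that is part of the filesystem
sudo btrfs filesystem df /     # shows the data/metadata profiles, e.g. RAID1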

seeekr commented Dec 23, 2014

@agend07 I am not that knowledgeable about btrfs (and CoreOS) myself, but as far as I can tell no other changes are necessary, i.e. no additional mount points, and things just keep working after a restart. From the btrfs docs I also get the matching impression that btrfs is a "self-managing" system, for lack of a better term.

agend07 commented Dec 23, 2014

All clear now - thanks. I was just afraid that even though it works after a restart, it could stop working after a system upgrade without something special in cloud-config.
Now I can sleep better.

agend07 commented Dec 23, 2014

I believe the docs are a little misleading on this topic:

https://coreos.com/docs/cluster-management/debugging/btrfs-troubleshooting/#adding-a-new-physical-disk - links to:
https://coreos.com/docs/cluster-management/setup/mounting-storage/
which makes it look like a mount unit in cloud-config is the only way.

I'd probably never have gotten it working without finding this issue.

marineam (Contributor) commented Dec 23, 2014

@agend07 ah, yes, that is misleading; you would either want to mount the device(s) as an independent volume or add them to the ROOT volume, not both. Also, referencing that ephemeral storage documentation in the context of adding devices to ROOT is also bad. You do NOT want to add ephemeral devices to the persistent ROOT, because the persistent volume will become unusable as soon as the ephemeral devices are lost.

@robszumski ^^

robszumski (Member) commented Dec 24, 2014

@agend07 I'm a little unclear on what was misleading; a PR to that doc would be greatly appreciated :)

agend07 commented Dec 24, 2014

@robszumski I'm not a native English speaker, I'm not always sure I understood everything correctly, and I'm probably not the best person to write docs for other people, but here are the steps which worked for me (collected into a single block after this comment):

  • find the new drive's name with 'sudo fdisk -l', let's say it's /dev/sdc
  • create one partition on the drive with 'sudo fdisk /dev/sdc' - then 'n' for a new partition, choose all defaults with enter, then 'p' to see the changes, 'w' to write them to disk and quit fdisk
  • 'sudo mount /dev/disk/by-label/ROOT /mnt'
  • 'sudo btrfs device add /dev/sdc1 /mnt'
  • 'sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt' - with a link to the btrfs-balance(8) man page
  • 'sudo btrfs fi df /mnt' - to see if it worked
  • 'sudo umount /mnt' - clean up

The easiest thing to do would be:

  • remove the link to "mounting storage" from "adding a new physical disk"
  • add a link to seeekr's gist: https://gist.github.com/1afa1e5ce3ad6e998367.git
  • add a comment that if all you need is RAID 0, 1, or 10 plus snapshots (5 and 6 are not stable as far as I understand), you don't need to mess with software RAID or LVM - btrfs has it all and more. Which is basically marineam's comment from above, starting with "As noted on IRC ..."

Actually, another marineam comment, the one from Aug 13 starting with "What certainly won't work right now", says that CoreOS on top of software RAID would not work at all; I'm not sure what the status is today.

I understand making docs that everybody would find helpful is not an easy task. Thanks for your work, and btw - can you speak Polish? Your last name sounds Polish.
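For convenience, agend07's quoted steps, collected into one copy-pasteable block; /dev/sdc is the example device name from the comment and only an illustration:

sudo fdisk -l                                                   # find the new drive, e.g. /dev/sdc
sudo fdisk /dev/sdc                                             # 'n' (accept defaults), 'p', 'w'
sudo mount /dev/disk/by-label/ROOT /mnt
sudo btrfs device add /dev/sdc1 /mnt
sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt
sudo btrfs fi df /mnt                                           # verify the new RAID1 profiles
sudo umount /mnt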

tobia commented Mar 31, 2015

Maybe I'm missing something, but when I install CoreOS using the latest stable release I get a large ext4 filesystem on sda9, not btrfs. Is the information in this thread outdated or beta-only?

agend07 commented Mar 31, 2015

zjeraar commented Apr 7, 2015

Is there a guide for a root filesystem on RAID1 now that we are on ext4?

tobia commented Apr 7, 2015

@agend07 thanks, it wasn't obvious from the rest of the documentation. I didn't understand where all that talk about btrfs was coming from! So let me add my voice to those asking for support for software RAID1 on the root fs. Rotating-disk failure in a server is a very common occurrence. Many leased bare-metal servers come with two identical disks for this very purpose, but not with a HW RAID controller, which can have a monthly fee as large as the server itself.

It makes sense to let users set up the RAID themselves with mdadm, because the configurations are too many for a script to handle. But then the install script, the boot process, and the updater should accept - and keep - the given mdX as the root device.

marineam (Contributor) commented Apr 7, 2015

I haven't tried this in a very long time, but it should be possible, after writing the base disk image, to change the ROOT partition type to raid autodetect, wipe the existing FS, set up an md device on it, and then create a new filesystem, label it ROOT, and create a /usr directory in that filesystem. The rest of the fs should get initialized on boot. There is a major limitation though: we don't have a mechanism for applying updates to USR-A/USR-B across multiple disks or on top of an md device. This means that although you can use RAID for ROOT for performance, volume size, or disaster recovery purposes, it isn't going to help keep a server running in the event of a disk failure.

Given the complexity of doing this by hand right now, and the limitation, I'm not sure how worthwhile it is to do for ROOT. In many cases it will be much easier to place any data you need some durability for on a volume created separately from the CoreOS boot disk; that extra volume could be md, LVM, btrfs, etc.
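A very rough sketch of the manual procedure marineam outlines, assuming a freshly written image on /dev/sda with ROOT on partition 9 and a matching partition /dev/sdb9 on a second disk; the device names, partition numbers, and the sgdisk type code are assumptions and none of this has been verified:

# Mark the ROOT partition as Linux RAID and wipe the existing filesystem.
sudo sgdisk --typecode=9:fd00 /dev/sda
sudo wipefs -a /dev/sda9
# Build the mirror, create a new filesystem labeled ROOT, and pre-create /usr.
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda9 /dev/sdb9
sudo mkfs.ext4 -L ROOT /dev/md0
sudo mount /dev/md0 /mnt
sudo mkdir /mnt/usr
sudo umount /mnt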

ghost commented Apr 7, 2015

I read that btrfs was not stable, and so CoreOS changed to ext4 with overlayfs.

Maybe it's time to have a look at btrfs again. The main guy behind btrfs - Mr. Merlin - is funded by Google, after all.

marineam (Contributor) commented Apr 7, 2015

Chris Mason works for Facebook :)

ghost commented Apr 8, 2015

Marc Merlin is who I am referring to.

zjeraar commented Apr 8, 2015

Thanks @marineam, I was trying to set it up but wondering if and how updates would work. I'm going to save myself the trouble for now.

psi-4ward commented Apr 21, 2015

@marineam I tried the trick with

  • the ROOT volume label
  • the Linux RAID fs-type

It works great: https://gist.github.com/psi-4ward/c1cece78cdf8fa9d64e7

steigr commented Apr 27, 2015

You may use any MD RAID array as long as it's labeled "ROOT" (mkfs.ext4 -L ROOT ...).
Mount /dev/sda1 somewhere and edit coreos/grub/grub.cfg. The variable $linux_append should include the setting "rd.auto=1", and dracut will then assemble MD RAID arrays. I guess when you build an MD RAID1 as described above, the fs-type change has to be done. I use a small SSD for the CoreOS system, minus sda9, plus an MD RAID5 of 4 disks labeled "ROOT", and 'set linux_append="rd.auto=1"' in /dev/sda1/coreos/grub/grub.cfg. Works like a charm :-D

psi-4ward commented Apr 27, 2015

I'm on stable and I don't have to alter any grub config.

But keep in mind, you have to enlarge sda9.

steigr commented Apr 27, 2015

I removed sda9 for the sake of md0. But you are right: it's kind of a hack at the moment and needs to be documented or discouraged.

anpieber commented May 11, 2015

TBH I think it would be very important to add and/or document a solution for software RAID. Independently of how well CoreOS can handle failure (together with fleet, docker, ...), I'm not really keen on "losing a server" just because of a disk failure (which happens "all the time", btw).

baracoder commented Jul 16, 2015

I don't seem to get it to work. I've set up md0 and added the ROOT label, but even if I add rd.auto=1 to grub, it just hangs there on boot. After I type mdadm --assemble --scan in the emergency console, the boot continues. Any idea?

tobkle commented Aug 25, 2015

Cannot get CoreOS running with RAID 1 after spending 2 days. This is essential in my opinion, and I found no valid documentation for it. Quitting CoreOS.

crawford (Member) commented Dec 15, 2015

cmoad commented Jan 19, 2016

I worked on attaching an ephemeral RAID and was able to get it working using the following units in my cloud-config. I tested reboots as well, and the RAID came back and mounted correctly.

This code is specific to creating a software RAID 0 on GCE using two local SSDs with the NVMe interface. Feedback or suggestions would be appreciated.

  - name: format-ephemeral-raid.service
    command: start
    content: |
      [Unit]
      After=dev-nvme0n1.device dev-nvme0n2.device
      Requires=dev-nvme0n1.device dev-nvme0n2.device
      ConditionPathExists=!/dev/md0
      [Service]
      Type=oneshot
      RemainAfterExit=yes
      ExecStart=/bin/bash -c 'mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n[1-2]'
      ExecStart=/bin/bash -c 'mkfs.ext4 -F /dev/md0'
      ExecStartPost=/bin/bash -c 'mkdir -p /etc/mdadm'
      ExecStartPost=/bin/bash -c 'echo "DEVICE /dev/nvme0n1 /dev/nvme0n2" > /etc/mdadm/mdadm.conf'
      ExecStartPost=/bin/bash -c 'mdadm --detail --scan >> /etc/mdadm/mdadm.conf'

  - name: data.mount
    command: start
    content: |
      [Unit]
      Requires=format-ephemeral-raid.service
      After=format-ephemeral-raid.service
      Before=docker.service
      [Mount]
      What=/dev/md0
      Where=/data
      Type=ext4

levipierce commented Feb 5, 2016

I used cmoad's approach on AWS running Kubernetes with CoreOS - worked like a charm!

vmatekole commented Apr 18, 2016

Hi!
Has there been any progress on this front, i.e. developing consistent documentation for software RAID? Would going with hardware RAID be an easier option for now?

robszumski (Member) commented Apr 18, 2016

celevra commented May 1, 2016

Can I use that example to set up a software RAID for the root partition?

crawford (Member) commented Oct 17, 2016

@celevra you can use something like this (I have not actually tested it):

{
  "ignition": { "version": "2.0.0" },
  "storage": {
    "disks": [
      {
        "device": "/dev/vdb",
        "wipeTable": true,
        "partitions": [{
          "label": "root.1",
          "typeGuid": "A19D880F-05FC-4D3B-A006-743F0F84911E"
        }]
      },
      {
        "device": "/dev/vdc",
        "wipeTable": true,
        "partitions": [{
          "label": "root.2",
          "typeGuid": "A19D880F-05FC-4D3B-A006-743F0F84911E"
        }]
      }
    ],
    "raid": [{
      "devices": [
        "/dev/disk/by-partlabel/root.1",
        "/dev/disk/by-partlabel/root.2"
      ],
      "level": "mirror",
      "name": "root"
    }],
    "filesystems": [
      {
        "name": "data",
        "mount": {
          "device": "/dev/disk/by-label/ROOT",
          "format": "ext4",
          "create": { "force": true }
        }
      },
      {
        "name": "root",
        "mount": {
          "device": "/dev/md/root",
          "format": "ext4",
          "create": {
            "options": [ "-L", "ROOT" ],
            "force": true
          }
        }
      }
    ]
  }
}

This will set up a single RAID partition on vdb and vdc, assemble them into an array, and then create the new ROOT filesystem. It also wipes out the default ROOT partition on vda9 (since you don't want two). When you use this, you'll also need rd.auto on the kernel command line to tell the initramfs to automatically assemble the array on every boot.
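Once a machine provisioned this way boots, a quick sanity check that the mirror actually assembled (standard tools, nothing CoreOS-specific; the /dev/md/root path assumes the array keeps the "root" name from the config above):

cat /proc/mdstat                        # the array should show up as an active raid1
lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT   # ROOT should sit on the md device and be mounted at /
sudo mdadm --detail /dev/md/root        # per-member status of the mirror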

celevra commented Oct 18, 2016

Thank you @crawford, I will try that out.

madejackson commented Dec 29, 2016

Dear all,

I'm new to CoreOS and I'd like to test this on my bare-metal server.

I have the following issue: I want to have 2x SSDs in RAID1 for root, as I always do.

I tried crawford's solution and installed CoreOS with the Ignition file, but this resulted in an error where the boot sequence waits on a job with no time limit, which means forever, because that job will never finish. When I kill the server manually and reboot, the drive is not bootable anymore.

When I try to set up a HW RAID, my CoreOS live USB does not recognize the HW RAID and sees the two drives as regular sda and sdb. Additionally, the coreos-install script fails with error return code 32.

BTW: I have another 4 drives attached for storage in the same node.

Does anyone have a solution?
