Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sw-raid: make mdmon start from initrd #13447

Closed
wants to merge 1 commit into from
Closed

Conversation

kklas
Copy link
Contributor

@kklas kklas commented Feb 25, 2016

Things done:
  • Tested via nix.useChroot.
  • Built on platform(s): NixOS amd64
  • Tested compilation of all pkgs that depend on this change.
  • Tested execution of binary products.
  • Fits CONTRIBUTING.md.
Extra

Fixes #13058

After investigating my problem #13058 I found out partitions couldn't be mounted as non read-only because mdmon wasn't started in initrd because the binary wasn't included, so I added it.

The other problem was raids syncing after every boot. This happens because arrays aren't clean on shutdown. I fixed this by adding systemd services for starting/stopping mdmon to ensure proper shutdown. The mdmon@.service is actullay called for by udev rules in https://github.com/nixos/nixpkgs/blob/0e32206a8610bf76bf5e82af4f6dd510a83a3d1d/nixos/modules/tasks/swraid.nix#L12 but is missing.

Referencing this:
https://git.centos.org/blob/rpms!mdadm.git/8c89314bd54c3fdc59cad75f6cc4505a30a8e8fb/SOURCES!mdadm-3.2.6-systemd-various-fixes-for-boot-with-container-arrays.patch
https://github.com/neilbrown/mdadm/blob/master/systemd/mdmon@.service
https://github.com/neilbrown/mdadm/blob/master/systemd/mdadm.shutdown

@mention-bot
Copy link

By analyzing the blame information on this pull request, we identified @edolstra, @wkennington and @nbp to be potential reviewers

@abbradar
Copy link
Member

Sounds good to have in release. cc @edolstra as the original author of swraid.nix. Looks good to me but I don't have hardware to test.

@edolstra
Copy link
Member

Hm, I wonder how is it possible that the swraid tests do succeed most of the time: http://hydra.nixos.org/job/nixos/release-15.09/nixos.tests.installer.swraid.x86_64-linux

Note btw that the initrd kills all processes except those whose argv[0] starts with @, just before handing over to stage-2. The @ also prevents systemd from killing it on shutdown. Does mdmon follow that convention?

@abbradar
Copy link
Member

From the description of mdmon it seems that this is needed for special metadata formats, such as Intel proprietary format (so-called "Intel fake RAID") -- they are implemented in userspace.

@abbradar
Copy link
Member

It seems that mdmon is systemd-aware. BTW @kklas, do we need other services such as mdadm-last-resort, mdadm-grow-continue or mdmonitor? See https://github.com/neilbrown/mdadm/tree/master/systemd

@kklas
Copy link
Contributor Author

kklas commented Feb 25, 2016

Yes, mdmon follows the @ argv[0] convention, that's what --offroot is for. You can read about it here. This man is for older version of mdmon and documentation for this flag was for some reason removed from manuals for newer versions, but it works when started from systemd. If I should remove this flag shutdown will hang for a few seconds as noted here neilbrown/mdadm@8d1d32b

As for the other services, mdadm-last-resort was added in this commit neilbrown/mdadm@169ffac, and it's also called for in the udev rules I mentioned in the original post, so probably it's a good idea to add that but it can work without it. I currently have no way of testing it.
grow-continue neilbrown/mdadm@5e76dce, is to help with monitoring the reshape in the background. Also useful to have but not critical.
mdmonitor neilbrown/mdadm@61c0947 is for starting mdadm to report any events on arrays and send reports to stdout, email and such. Also not critical. Maybe make this has a place in nixos/modules/services/monitoring to be configured by the user in configuration.nix and such? 63-md-raid-arrays.rules tries to start it.

@abbradar
Copy link
Member

From these descriptions it seems to me that we can add mdadm-grow-continue (should be fairly easy to add and useful) -- even if we can't test it now it seems configuration-less and should pose no problem.

mdadm-last-resort looks like it should be ran from initrd, so it won't be so easy -- let's skip on this one for now if everyone doesn't mind. Nevertheless it sounds very important to have it working, or you might be left with an unbootable system when your array is degraded.

mdmonitor sounds very optional -- let's leave it out completely.

@edolstra
Copy link
Member

Could we use the upstream systemd units? Looks like they're not currently installed by the mdadm package.

@abbradar
Copy link
Member

I think not for mdadm-last-resort -- from its look it should be used in initrd (when you have systemd in it). Other ones -- should be no problem. BTW, I've always thought that we can't just use systemd units installed with the package because of additional handling in NixOS (Environment, enabling the unit and so on) -- can we?

@kklas
Copy link
Contributor Author

kklas commented Feb 26, 2016

I added mdadm-grow-continue so we have it here since I'll be offline next week.

@abbradar
Copy link
Member

Your patch is about mdadm-last-resort -- you probably wanted to mention it, not -grow-continue. Notice that I don't think mdadm-last-resort would work -- if I understand correctly array won't activate when degraded until this command is run, but we would still be in initrd when that happens so a systemd service won't do much. On the other hand, mdadm-grow-continue should work if we just add a service (but please notice that I have no good knowledge of how this works and judge only by documentation, commit messages and service descriptions!).

@kklas
Copy link
Contributor Author

kklas commented Feb 26, 2016

Yes, sorry, I mixed them up for a second.
But wouldn't mdmon@.service alse be started from initrd since it originally has Before=initrd-switch-root.target which I removed because we don't use it?

Also theres an oddity I noticed. When I boot up I have in I have two mdmon processes:
├─init.scope
│ ├─ 1 systemd
│ └─141 @nix/store/a5m9zxmdvf0hs6dg9j98d3nhw672ln5c-extra-utils/bin/mdmon md127
...
├─system.slice
│ ├─system-mdmon.slice
│ │ └─mdmon@md127.service
│ │ └─522 @nix/store/pkam1pax1b46n2bzn3axkd3zi158a7fn-mdadm-3.3.4/bin/mdmon --offroot --takeover md127

after five or so minutes the one under init.scope stops, and I'm left with just the one under system.slice. I don't know what to make of it - is this expected behavior or not? Maybe it's connected with not having systemd in initrd?

@abbradar
Copy link
Member

If I understand correctly, it works as follows:

  1. In initrd, udev rule fires that starts first mdmon (from init.scope).
  2. Then, when systemd takes over after initrd, it starts mdmon@ which starts mdmon --takeover. It should kill the old mdmon but it doesn't do that. I suspect that's because mdmon has its full path in its argv rather than just @dmon -- we probably can fix that in udev rule.

I don't know how that Before rule worked, maybe it is expected to work both in initrd (if OS supports it) and in the real system.

@abbradar
Copy link
Member

I think we can merge this -- problem with a leftover mdmon is not serious because it kills itself after some time anyway (see above). Otherwise this looks good and safe -- just needs commit squashing, maybe.

@abbradar
Copy link
Member

I'll merge this in a few days unless there are any issues with the PR.

@domenkozar do you think this is worthy of merging into new 16.03 branch? tl;dr: this should fix (at least make usable when they are not degraded) RAID arrays with non-native metadata, for example Intel fake RAID arrays (I don't know of any others).

@domenkozar
Copy link
Member

Sure

@abbradar
Copy link
Member

abbradar commented Mar 3, 2016

@kklas Could you squash this into one commit?

@kklas
Copy link
Contributor Author

kklas commented Mar 3, 2016 via email

Also add required systemd services for starting/stopping mdmon.
@abbradar
Copy link
Member

abbradar commented Mar 9, 2016

Merged with a small change (mdadmMonitor -> mdadm-monitor, systemd services are usually named like this). Thanks!

abbradar pushed a commit that referenced this pull request Mar 9, 2016
Also add required systemd services for starting/stopping mdmon.

(cherry picked from commit aac666e)

See #13447 (comment) for cherry-pick discussion.
@kklas kklas deleted the imsm-fix branch March 10, 2016 09:39
wkennington pushed a commit to wkennington/nixpkgs that referenced this pull request Mar 14, 2016
Also add required systemd services for starting/stopping mdmon.

Closes NixOS#13447.
abbradar: fixed `mdadmShutdown` service name according to de facto conventions.
adrianpk added a commit to adrianpk/nixpkgs that referenced this pull request May 31, 2024
Also add required systemd services for starting/stopping mdmon.

(cherry picked from commit aac666e)

See NixOS#13447 (comment) for cherry-pick discussion.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

imsm raid 1 cannot mount properly
6 participants