Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

overlay: nuke rhcos-growpart #484

Merged
merged 1 commit into from
Jan 14, 2021

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented Jan 13, 2021

Now that we've moved to Ignition LUKS, we don't need this anymore. It
also doesn't support multipath like coreos-growpart does.

(In fact, ignition-ostree-growfs.service has been running on RHCOS for
a while now because it was renamed from coreos-growpart.service and
moved to the initrd so the disable preset here didn't actually have
any effect.)

Now that we've moved to Ignition LUKS, we don't need this anymore. It
also doesn't support multipath like `coreos-growpart` does.

(In fact, `ignition-ostree-growfs.service` has been running on RHCOS for
a while now because it was renamed from `coreos-growpart.service` and
moved to the initrd so the `disable` preset here didn't actually have
any effect.)
@jlebon
Copy link
Member Author

jlebon commented Jan 13, 2021

/cc @Prashanth684

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 13, 2021
@arithx
Copy link
Contributor

arithx commented Jan 14, 2021

Just to make sure, this service only runs on the first boot correct? If not we'd likely want to keep it in place until we have the ability to do boot image bumps + an upgrade script to modify the filesystems to look more like new Ignition based LUKS.

@Prashanth684
Copy link
Contributor

multipath is a day2 operation, so this should not affect it in that case right? this will only be an issue if we support multipath during first boot ?

@jlebon
Copy link
Member Author

jlebon commented Jan 14, 2021

multipath is a day2 operation, so this should not affect it in that case right?

Thanks a lot for this comment. My understanding was that we were shooting for day1 (i.e. via coreos-installer install --append-karg rd.multipath=default --append-karg root=...), but that wasn't really written anywhere in the RFE AFAICT.

It would definitely simplify a lot if it was day 2, because it falls in line with our model by providing the ability to configure it via Ignition. Obviously the downside here is that there is no redundancy during the bootstrapping phase, though that might be acceptable.

Let me confirm this with PM.

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2021
@jlebon
Copy link
Member Author

jlebon commented Jan 14, 2021

Another thing worth clarifying is that since in OCP4 we always incur a reboot, this could still be day 1 yet only happen on the first reboot. So then, the kargs would be added via MachineConfig.

@jlebon
Copy link
Member Author

jlebon commented Jan 14, 2021

(That said, regardless I think rhcos-growpart is safe to nuke for 4.7.)

Just to make sure, this service only runs on the first boot correct?

Yes, it's first boot only (notice how it's using a stamp file which is a ConditionPathExists=! in the service unit).

@Prashanth684
Copy link
Contributor

multipath is a day2 operation, so this should not affect it in that case right?

Thanks a lot for this comment. My understanding was that we were shooting for day1 (i.e. via coreos-installer install --append-karg rd.multipath=default --append-karg root=...), but that wasn't really written anywhere in the RFE AFAICT.

It would definitely simplify a lot if it was day 2, because it falls in line with our model by providing the ability to configure it via Ignition. Obviously the downside here is that there is no redundancy during the bootstrapping phase, though that might be acceptable.

Let me confirm this with PM.

/hold

I think we (multi-arch mainly) want it to be a day 1 functionality. I did try it but ran into some issues (including this) which i did not spend much time triaging. @nikita-dubrovskii thought there was some race between when the mpath devices were created and ignition. It would be great if you could discuss with PM and scope out a timeline to support multi-path as a day 1 functionality! https://issues.redhat.com/browse/MULTIARCH-401 is also the ticket for multipath support for multi-arch where there is an ask to support day1 functionality, but now it doesn't work.

@Prashanth684
Copy link
Contributor

multipath is a day2 operation, so this should not affect it in that case right?

Thanks a lot for this comment. My understanding was that we were shooting for day1 (i.e. via coreos-installer install --append-karg rd.multipath=default --append-karg root=...), but that wasn't really written anywhere in the RFE AFAICT.
It would definitely simplify a lot if it was day 2, because it falls in line with our model by providing the ability to configure it via Ignition. Obviously the downside here is that there is no redundancy during the bootstrapping phase, though that might be acceptable.
Let me confirm this with PM.
/hold

I think we (multi-arch mainly) want it to be a day 1 functionality. I did try it but ran into some issues (including this) which i did not spend much time triaging. @nikita-dubrovskii thought there was some race between when the mpath devices were created and ignition. It would be great if you could discuss with PM and scope out a timeline to support multi-path as a day 1 functionality! https://issues.redhat.com/browse/MULTIARCH-401 is also the ticket for multipath support for multi-arch where there is an ask to support day1 functionality, but now it doesn't work.

Also as per this comment: https://bugzilla.redhat.com/show_bug.cgi?id=1809534#c15 at that time it looked like we did not want to support this.

@bgilbert
Copy link
Contributor

Won't this break root filesystem resizing on 4.7 clusters with 4.6 bootimages?

@jlebon
Copy link
Member Author

jlebon commented Jan 14, 2021

Won't this break root filesystem resizing on 4.7 clusters with 4.6 bootimages?

Can you expand? If the bootimage is 4.6, then rhcos-growpart.service will still kick in, no?

@bgilbert
Copy link
Contributor

I had misremembered that pivot happens from the initramfs, but in fact it doesn't, and the rhcos-growpart service should run during firstboot. Objection withdrawn. 👍

@jlebon
Copy link
Member Author

jlebon commented Jan 14, 2021

OK, chatted with PM about this: we're fine with supporting multipath only from the first reboot (i.e. the first boot into the target machine-os-content). For 4.7, this will be done via a day-1 MC which adds the needed kargs. Later, we can investigate adding sugar to make that easier.

As for this PR, I do think it is fine to merge, but also it's not immediately required either.
/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2021
@bgilbert
Copy link
Contributor

/approve

jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jan 14, 2021
The model we're moving towards is one where multipath enablement is
configured via Ignition and so only takes effect on the next boot:

openshift/os#484 (comment)

Therefore, we don't need to worry here about support for growing a
multipathed root partition. This simplifies a big chunk of the script.
@cgwalters
Copy link
Member

/lgtm
Hooray for code deletion!

jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jan 14, 2021
The model we're moving towards is one where multipath enablement is
configured via Ignition and so only takes effect on the next boot:

openshift/os#484 (comment)

Therefore, we don't need to worry here about support for growing a
multipathed root partition. This simplifies a big chunk of the script.
@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 14, 2021
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bgilbert, cgwalters, jlebon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jlebon
Copy link
Member Author

jlebon commented Jan 14, 2021

One other thing worth mentioning is that once we have kargs-via-Ignition, then it becomes trivial to support multipath from first boot, at least for setups which are fine with rd.multipath=default. (For custom configs, we'd need to either add sugar to detect that from the initrd, and translate into an appended initrd, or just require users to do that themselves via a systemd unit in the real root).

@miabbott
Copy link
Member

From internal CI:

2021-01-14T21:32:12Z kolet: rhcos-growpart.service did not start
    --- FAIL: basic/RHCOSGrowpart (0.09s)
            cluster.go:84: kolet: Process exited with status 1
    --- PASS: basic/DbusPerms (0.24s)
FAIL, output in tmp/kola
Error: harness: test suite failed
2021-01-14T21:32:15Z cli: harness: test suite failed
Traceback (most recent call last):
  File "/usr/lib/coreos-assembler/cmd-kola", line 83, in <module>
    subprocess.check_call(subargs)
  File "/usr/lib64/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['kola', '--denylist-test', 'coreos.ignition.journald-log', '-p', 'qemu-unpriv', '--output-dir', 'tmp/kola', 'run', '--qemu-nvme=true', 'basic']' returned non-zero exit status 1.

jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Jan 14, 2021
Make it check `ignition-ostree-growfs.service` now just like FCOS. Keep
the tests separate still so that we can check different sizes.

See: openshift/os#484 (comment)
@jlebon
Copy link
Member Author

jlebon commented Jan 14, 2021

jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Jan 14, 2021
Make it check `ignition-ostree-growfs.service` now just like FCOS. Keep
the tests separate still so that we can check different sizes.

See: openshift/os#484 (comment)
openshift-merge-robot pushed a commit to coreos/coreos-assembler that referenced this pull request Jan 14, 2021
Make it check `ignition-ostree-growfs.service` now just like FCOS. Keep
the tests separate still so that we can check different sizes.

See: openshift/os#484 (comment)
jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Jan 20, 2021
The model we're moving towards is one where multipath enablement is
configured via Ignition and so only takes effect on the next boot:

openshift/os#484 (comment)

Therefore, we don't need to worry here about support for growing a
multipathed root partition. This simplifies a big chunk of the script.
@jlebon jlebon deleted the pr/multipath-fixes branch April 23, 2023 19:56
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
The model we're moving towards is one where multipath enablement is
configured via Ignition and so only takes effect on the next boot:

openshift/os#484 (comment)

Therefore, we don't need to worry here about support for growing a
multipathed root partition. This simplifies a big chunk of the script.
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
The model we're moving towards is one where multipath enablement is
configured via Ignition and so only takes effect on the next boot:

openshift/os#484 (comment)

Therefore, we don't need to worry here about support for growing a
multipathed root partition. This simplifies a big chunk of the script.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants