
Bug 1829642: templates: Add a special machine-config-daemon-firstboot-v42.service #1706

Merged

Conversation

@cgwalters
Member

commented Apr 30, 2020

This is aiming to fix:
https://bugzilla.redhat.com/show_bug.cgi?id=1829642
AKA
#1215 (comment)

Basically we have our systemd units dynamically differentiate between
"4.2" and "4.3 or above" by looking at the aleph version.



Background

In the beginning, we had this concept of pivot.service whose whole job was to pull a new machine-os-content and reboot, running Before=kubelet.service.
More: OSUpgrades.

That was simple, and worked well! Life was good. Everyone was relaxing on the beach.

Then, we needed to handle kernel arguments (specifically adding nosmt).

But a core problem here is that we have no mechanism to update bootimages by default - so every change we make that needs to happen on this "firstboot" becomes an upgrade hazard.

Further, we had a strong split between the pivot code (day 0) and what happened in the MCO. So we started baking the MCD into the host; #868 is a notable commit here.

But still today, the MCO needs to handle even very old bootimages (from 4.1) scaling up and joining the cluster.

#891 is a notable commit here; basically the MCS serves Ignition which contains a pair of systemd units:

machine-config-daemon-firstboot.service: Replaces pivot.service - this handles everything the MCO knows how to handle on day 1/2, such as kernel arguments.

machine-config-daemon-host.service: Runs rpm-ostree to perform upgrades. This is not enabled by default, but is started by the MCD (both via the -firstboot.service and on "day 2" when the MCD runs as a daemonset).
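
For illustration, a hypothetical skeleton of the firstboot unit - the directives (including the ExecStart subcommand) are assumptions pieced together from this discussion and the snippets below, not copied from what the MCS actually serves:

  # machine-config-daemon-firstboot.service (sketch)
  [Unit]
  Description=Machine Config Daemon Firstboot
  # Triggered by the encapsulated MachineConfig written by the MCS
  ConditionPathExists=/etc/ignition-machine-config-encapsulated.json
  Before=kubelet.service

  [Service]
  # oneshot so kubelet waits for us (compare the review snippet below)
  Type=oneshot
  ExecStart=/usr/libexec/machine-config-daemon firstboot-complete-machineconfig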

Now wait...trying to re-understand the difference between them, I found #1215 (landed for 4.4).

Before that commit...we basically didn't run the new code AFAICS because the MCS still today writes /etc/pivot/image-pullspec (for compatibility with 4.1 bootimages). See this commit which added the code.

@openshift-ci-robot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files.) label Apr 30, 2020
@cgwalters
Member Author

A few other things we should fix here but let's do them separately, just noting for myself:

  • Before=crio.service at least
  • OnFailure=emergency.target or something? We at least need to not just have things stumble on if the unit fails; we could at least make our units RequiredBy=kubelet.service, perhaps - a rough sketch follows this list.
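
For illustration, a hypothetical sketch of where such hardening could land (an assumption, not something this PR implements):

  [Unit]
  Before=crio.service kubelet.service
  # Drop to emergency.target instead of letting boot stumble on
  OnFailure=emergency.target

  [Install]
  # When enabled, kubelet.service gains a hard Requires= on this unit
  RequiredBy=kubelet.service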

@cgwalters
Member Author

Also...random thought, why did we add the MCD to the host versus pulling it from a container image and just running that via podman? I am not sure it came up...this is dim in my memory now. But it would seem to completely avoid this type of problem.

@ashcrow
Member

commented Apr 30, 2020

So it looks like the issue was a target change that we didn't hold backwards compatibility for?

@ashcrow
Member

commented Apr 30, 2020

Also...random thought, why did we add the MCD to the host versus pulling it from a container image and just running that via podman? I am not sure it came up...this is dim in my memory now. But it would seem to completely avoid this type of problem.

We were pulling it as a container but decided to bake it into the OS in #801. Essentially, at that time pivot was its own binary on the host and the MCD was starting to take over that responsibility.

@cgwalters
Member Author

So it looks like the issue was a target change that we didn't hold backwards compatibility for?

Well...at the time I did a lot of testing with 4.1, and I think we did 4.2 too - the thing is, the code was only partially enabled, partly because I hadn't fully tested enabling it and was worried about issues like this. This change enabled it fully.

@ashcrow
Member

commented Apr 30, 2020

No worries and no blame! Things move fast 😄 Just making sure I understand what may have happened.

[Service]
# Need oneshot to delay kubelet
Type=oneshot
ExecStart=/usr/libexec/machine-config-daemon pivot

Member

This may be dumb, but will this run on 4.1 as well, given this ConditionPathExists? 4.1 has pivot.service in the OS directly, so that's going to run - but will this one as well, or am I making this up?

Member Author

That's a good question. So...I think what's going on here right now is you'll notice that
machine-config-daemon-host.service does:

  # If pivot exists, defer to it.  Note similar code in update.go
  ConditionPathExists=!/usr/lib/systemd/system/pivot.service

I think the idea there was that this firstboot.service will try to do the "other stuff" like kernel arguments and then defer the actual "os update" to pivot.service.

But I'm honestly not sure that really makes sense - we might as well take over the whole thing?

This needs investigation - but we're also doing the same thing that the current code is doing so we can't (shouldn't) be making anything worse, right?

Member

Yep, agreed - I think it makes sense, and it'll be skipped indeed 🤔

@LorbusChris
Member

commented Apr 30, 2020

Also...random thought, why did we add the MCD to the host versus pulling it from a container image and just running that via podman? I am not sure it came up...this is dim in my memory now. But it would seem to completely avoid this type of problem.

I don't know the answer to that, but this is what's done in OKD on FCOS, which doesn't ship with the MCD. How feasible would it be to change this in OCP to pull from the image there?

@LorbusChris
Member

I should've read on a bit 😄
pivot isn't a standalone thing anymore, so my Q above stands

@runcom
Member

commented Apr 30, 2020

I don't know the answer to that, but this is what's done in OKD. How feasible would it be to change this in OCP to pull from the image there?

I can't remember either right now, but sure - that could be something we could try to move to (as OKD is doing, as Christian said 👍).

@cgwalters
Member Author

I think this works! I deployed this PR on an affected cluster, made sure the base AMI was from 4.2, then scaled up the machineset and:

 journalctl -u machine-config-daemon-firstboot-v42
-- Logs begin at Thu 2020-04-30 21:18:14 UTC, end at Thu 2020-04-30 21:23:46 UTC. --
Apr 30 21:18:43 ip-10-0-174-54.us-east-2.compute.internal systemd[1]: Starting Machine Config Daemon Firstboot (4.2 bootimage)...
Apr 30 21:18:46 ip-10-0-174-54 machine-config-daemon[1226]: I0430 21:18:46.335721    1226 rpm-ostree.go:356] Running captured: rpm-ostree status --json
Apr 30 21:18:46 ip-10-0-174-54 machine-config-daemon[1226]: I0430 21:18:46.381157    1226 rpm-ostree.go:152] Previous pivot: quay.io/openshift-release-dev/ocp->
Apr 30 21:18:46 ip-10-0-174-54 machine-config-daemon[1226]: I0430 21:18:46.381869    1226 run.go:16] Running: podman pull -q --authfile /var/lib/kubelet/config>
Apr 30 21:18:50 ip-10-0-174-54 podman[1398]: 2020-04-30 21:18:50.532481004 +0000 UTC m=+1.517709021 system refresh
Apr 30 21:19:18 ip-10-0-174-54 podman[1398]: 2020-04-30 21:19:18.939230615 +0000 UTC m=+29.924458560 image pull  
Apr 30 21:19:18 ip-10-0-174-54 machine-config-daemon[1226]: d64290e4a966531354fd074a06fbb682dd04e8eaf2fcb1d82b05f32d395155bc
Apr 30 21:19:19 ip-10-0-174-54 machine-config-daemon[1226]: I0430 21:19:19.053111    1226 rpm-ostree.go:356] Running captured: podman inspect --type=image quay>
Apr 30 21:19:19 ip-10-0-174-54 machine-config-daemon[1226]: I0430 21:19:19.535025    1226 rpm-ostree.go:356] Running captured: podman create --net=none --name >
Apr 30 21:19:19 ip-10-0-174-54 podman[1534]: 2020-04-30 21:19:19.683739826 +0000 UTC m=+0.120509216 container create 3a938326351a2b67fdafc1549d2fd941c1b32e6965>
Apr 30 21:19:19 ip-10-0-174-54 machine-config-daemon[1226]: I0430 21:19:19.710065    1226 rpm-ostree.go:356] Running captured: podman mount 3a938326351a2b67fda>
Apr 30 21:19:19 ip-10-0-174-54 podman[1543]: 2020-04-30 21:19:19.78647982 +0000 UTC m=+0.047722302 container mount 3a938326351a2b67fdafc1549d2fd941c1b32e696586>
Apr 30 21:19:19 ip-10-0-174-54 machine-config-daemon[1226]: I0430 21:19:19.789531    1226 rpm-ostree.go:234] Pivoting to: 44.81.202004260825-0 (2062bce64e49321>
Apr 30 21:20:15 ip-10-0-174-54 podman[1597]: 2020-04-30 21:20:15.961623097 +0000 UTC m=+0.090985160 container remove 3a938326351a2b67fdafc1549d2fd941c1b32e6965>
Apr 30 21:20:16 ip-10-0-174-54 podman[1608]: 2020-04-30 21:20:16.563122363 +0000 UTC m=+0.563236555 image remove d64290e4a966531354fd074a06fbb682dd04e8eaf2fcb1>
Apr 30 21:20:16 ip-10-0-174-54 machine-config-daemon[1226]: I0430 21:20:16.573172    1226 pivot.go:247] Rebooting due to /run/pivot/reboot-needed
Apr 30 21:20:16 ip-10-0-174-54 systemd[1]: Stopped Machine Config Daemon Firstboot (4.2 bootimage).

All looks good! It joined the cluster.

@wking
Member

commented Apr 30, 2020

/retitle Bug 1829642: templates: Add a special machine-config-daemon-firstboot-v42.service

@openshift-ci-robot changed the title from "templates: Add a special machine-config-daemon-firstboot-v42.service" to "Bug 1829642: templates: Add a special machine-config-daemon-firstboot-v42.service" Apr 30, 2020
@openshift-ci-robot
Contributor

@cgwalters: An error was encountered searching for bug 1829642 on the Bugzilla server at https://bugzilla.redhat.com:

Get https://bugzilla.redhat.com/rest/bug/1829642?api_key=CENSORED: dial tcp: i/o timeout
Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.

In response to this:

Bug 1829642: templates: Add a special machine-config-daemon-firstboot-v42.service

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ashcrow
Member

commented Apr 30, 2020

4.4 backport #1707

@kikisdeliveryservice
Contributor

/bugzilla refresh

@openshift-ci-robot added the bugzilla/severity-high (Referenced Bugzilla bug's severity is high for the branch this PR is targeting.) and bugzilla/valid-bug (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.) labels Apr 30, 2020
@openshift-ci-robot
Contributor

@kikisdeliveryservice: This pull request references Bugzilla bug 1829642, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.5.0) matches configured target release for branch (4.5.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cgwalters
Member Author

Has anyone had a chance to independently verify this yet?

@openshift-ci-robot
Contributor

@cgwalters: This pull request references Bugzilla bug 1829642, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.5.0) matches configured target release for branch (4.5.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1829642: templates: Add a special machine-config-daemon-firstboot-v42.service

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@yuqi-zhang
Contributor

My previous cluster broke before it got to it, so I'm trying again.

@kikisdeliveryservice
Contributor

Hmm, these are weird test errors in e2e-aws; will take a look, but also

/retest

@cgwalters
Member Author

commented Apr 30, 2020

Some tips for testing this out/reproducing. The most "customer-like" reproduction scenario is:

  • Install 4.2
  • Upgrade to 4.4
  • Try to scale up a machineset (join a new node)

If it fails, the machine will be stuck in Provisioning; sshing to the node will show machine-config-daemon-firstboot.service failing (among other services).
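
A hypothetical debugging session on such a stuck node (the address is a placeholder):

  ssh core@<node-ip>
  # List failed units, then pull the firstboot service's log
  systemctl --failed
  journalctl -b -u machine-config-daemon-firstboot.service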

But I am 95% sure it will also reproduce with just:

  • Install 4.4/4.3 (don't need to oc adm upgrade at all)
  • Edit a machineset (could be existing or a new one) to use a 4.2 bootimage. For example in AWS make a new machineset and find the AMI from that file (see the sketch at the end of this comment).
  • Boot a single node in that machineset

I'm testing right now that this reproduces with the latter.
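
For the machineset edit in the second scenario, a hypothetical helper - <name> and <ami-id> are placeholders:

  # Point an AWS machineset's providerSpec at a 4.2-era AMI,
  # then boot a single node from it.
  oc -n openshift-machine-api patch machineset <name> --type=json \
    -p '[{"op": "replace", "path": "/spec/template/spec/providerSpec/value/ami/id", "value": "<ami-id>"}]'
  oc -n openshift-machine-api scale machineset <name> --replicas=1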

@cgwalters
Member Author

commented Apr 30, 2020

I tagged the release image from this PR to registry.svc.ci.openshift.org/rhcos-devel/walters-42-bootimage:latest to help testing.

And just for reference, I find this "tag release image from PR" workflow to be very useful, so here's a handy command to do it. First find the CI namespace from the artifacts, then:
$ oc tag --source=docker registry.svc.ci.openshift.org/ci-op-22vyst17/release:latest rhcos-devel/walters-42-bootimage:latest
We've set up the rhcos-devel namespace to have publicly accessible pulls; you could also use quay.io or whatever.

@sinnykumari
Contributor

I understand this will fix the m-c-d-firstboot-complete issue. But I am not sure it solves issues like the crio one in https://bugzilla.redhat.com/show_bug.cgi?id=1829642#c8 .

It is a bit surprising that when a 4.2 cluster gets upgraded to 4.4 and we scale up, we use a 4.2-based bootimage to boot the node, but it uses the Ignition config served by 4.4. Shouldn't the bootimage and the Ignition config be from the same time (i.e. the Ignition config that was served initially with the 4.2 bootimage)? Correct me if I misunderstood something.

@sinnykumari
Contributor

Also...random thought, why did we add the MCD to the host versus pulling it from a container image and just running that via podman? I am not sure it came up...this is dim in my memory now. But it would seem to completely avoid this type of problem.

Related discussion seems to have happened from #798 (comment) onward, and the related PR is #904. I vaguely remember that at the time we decided to go with supporting kargs on day 1 for 4.3+ clusters, but we didn't foresee that we would hit an upgrade issue like the one this PR is about.

@runcom
Member

commented May 1, 2020

/lgtm

@openshift-ci-robot added the lgtm (Indicates that a PR is ready to be merged.) label May 1, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot merged commit 339d337 into openshift:master May 1, 2020
@openshift-ci-robot
Contributor

@cgwalters: All pull requests linked via external trackers have merged: openshift/machine-config-operator#1706. Bugzilla bug 1829642 has been moved to the MODIFIED state.

In response to this:

Bug 1829642: templates: Add a special machine-config-daemon-firstboot-v42.service

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sinnykumari added a commit to sinnykumari/machine-config-operator that referenced this pull request May 21, 2020
Also, add machine-config-daemon-firstboot-v42.service so that new
nodes through machine-api come up as expected on a 4.2 to 4.3
upgraded cluster

Backport of PRs:
- openshift#1366
- openshift#1706
cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request May 28, 2020
This came up in discussions around here:
openshift#1706 (comment)

There's no reason to start crio either, and it's confusing
to do so because in that bug it made us think there was a
crio problem when there wasn't.

Add `crio-wipe.service` here too because it's part of `crio.service`.
cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request Jun 9, 2020
I haven't verified this yet but I think this is the core fix
for https://bugzilla.redhat.com/show_bug.cgi?id=1842906

Basically the MCS injects both `/etc/pivot/image-pullspec` which
4.1 bootimages need, *and* we get `/etc/ignition-machine-config-encapsulated.json`
which starts `machine-config-daemon-firstboot.service`.

We should only have one running.  I think we have a setup like this:

4.1: Uses `pivot.service` which lives on the host already
4.2: Uses `machine-config-daemon-firstboot-v42.service` from
     openshift#1706
≥4.3: Uses `machine-config-daemon-firstboot.service`

So let's stop starting this by default.