pkg/daemon: implement OS version validation #60

Merged
merged 3 commits into from
Sep 18, 2018

Conversation

jlebon
Member

@jlebon jlebon commented Sep 13, 2018

Teach the MCD to check the current OS version and upgrade the host. This
essentially implements the OSImageURL part of the MachineConfig
spec.

Since we're in a chroot anyway, I just took the easy path of exec'ing
rpm-ostree and pivot directly. Which is good, because right now only
pivot knows how to upgrade RHCOS from an oscontainer image pull spec.

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 13, 2018
@jlebon
Member Author

jlebon commented Sep 13, 2018

I'm going to hold this for now. It works fine, though I haven't reprovisioned my cluster in a while, and the installer and m-c-o move at the speed of light...

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 13, 2018

// just make it a hard error if we somehow don't have any deployments
if len(rosState.Deployments) == 0 {
	return nil, fmt.Errorf("Not currently booted in a deployment")
Contributor

since this is entirely the same error as the one at the end of the function, and the one at the end of the function will be hit if the length is zero and this isn't here, do we need this if statement at all?

Member Author

Oh yeah, I meant to drop this but forgot!
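
The reviewer's point can be illustrated with a minimal sketch. The `Deployment` type, `errNotBooted`, and `bootedDeployment` below are hypothetical stand-ins, not the PR's actual code; the only thing taken from the thread is the idea that a loop over the deployments followed by a final error already covers the empty case.

```go
package main

import (
	"errors"
	"fmt"
)

// Deployment is a hypothetical, trimmed-down view of one rpm-ostree deployment.
type Deployment struct {
	ID     string
	Booted bool
}

// errNotBooted is a hypothetical sentinel mirroring the error text in the diff.
var errNotBooted = errors.New("not currently booted in a deployment")

// bootedDeployment returns the first deployment flagged as booted.
// With an empty slice the loop body never runs and the final return
// produces the same error, so a separate length check adds nothing.
func bootedDeployment(deployments []Deployment) (*Deployment, error) {
	for i := range deployments {
		if deployments[i].Booted {
			return &deployments[i], nil
		}
	}
	return nil, errNotBooted
}

func main() {
	_, err := bootedDeployment(nil)
	fmt.Println("empty list:", err)

	d, err := bootedDeployment([]Deployment{{ID: "a"}, {ID: "b", Booted: true}})
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("booted:", d.ID)
}
```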

@@ -400,6 +403,11 @@ func getFileOwnership(file ignv2_2types.File) (int, int, error) {
	return uid, gid, nil
}

func (dn *Daemon) updateOS(oldConfig, newConfig *mcfgv1.MachineConfig) error {
	glog.Infof("Updating OS to %s", newConfig.Spec.OSImageURL)
	return runPivot(newConfig.Spec.OSImageURL)
Contributor

is pivot idempotent? this is going to get run at times when there is no actual OS update, because everything gets refreshed if anything is different (e.g. if we add a file, we will run pivot with our currently booted OS image URL even though it hasn't changed)

Member Author

Yes, it's idempotent (see https://github.com/ashcrow/pivot/pull/14), but I should add a check here as well so we don't waste time running it.
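
A minimal sketch of the kind of check mentioned here, assuming the old and new OS image URLs can simply be compared as strings. The `pivotFunc` hook and this free-standing `updateOS` signature are hypothetical; the method in the PR takes two `MachineConfig` objects and calls `runPivot` directly.

```go
package main

import "fmt"

// pivotFunc is a hypothetical hook standing in for the PR's runPivot.
type pivotFunc func(osImageURL string) error

// updateOS only invokes pivot when the OS image URL actually changed,
// so unrelated config changes (e.g. a new file) do not trigger a pivot run.
func updateOS(oldURL, newURL string, pivot pivotFunc) error {
	if oldURL == newURL {
		return nil // nothing to do: the OS image is unchanged
	}
	return pivot(newURL)
}

func main() {
	calls := 0
	pivot := func(url string) error {
		calls++
		fmt.Println("pivoting to", url)
		return nil
	}

	// The image references below are made-up placeholders.
	_ = updateOS("registry.example/os@sha256:aaa", "registry.example/os@sha256:aaa", pivot)
	_ = updateOS("registry.example/os@sha256:aaa", "registry.example/os@sha256:bbb", pivot)
	fmt.Println("pivot calls:", calls)
}
```

Because pivot is reported to be idempotent, the comparison is an optimization rather than a correctness requirement.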

@@ -400,6 +403,11 @@ func getFileOwnership(file ignv2_2types.File) (int, int, error) {
	return uid, gid, nil
}

func (dn *Daemon) updateOS(oldConfig, newConfig *mcfgv1.MachineConfig) error {
Contributor

even though this isn't exported, it's probably worth adding a comment describing its functionality to keep it in line with the other functions that perform parts of an update.

)

// Subset of `rpm-ostree status --json`
// https://github.com/projectatomic/rpm-ostree/blob/bce966a9812df141d38e3290f845171ec745aa4e/src/daemon/rpmostreed-deployment-utils.c#L227
Contributor

love this ref
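
As a rough illustration of decoding only a subset of `rpm-ostree status --json`, the sketch below relies on `encoding/json` ignoring keys that have no matching struct field. The struct names, the selection of fields, and the sample payload are assumptions made for this example and are not copied from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpmOstreeState holds only the parts of the status output we care about.
// The JSON keys used here are assumed for illustration.
type rpmOstreeState struct {
	Deployments []rpmOstreeDeployment `json:"deployments"`
}

type rpmOstreeDeployment struct {
	ID       string `json:"id"`
	OSName   string `json:"osname"`
	Checksum string `json:"checksum"`
	Version  string `json:"version"`
	Booted   bool   `json:"booted"`
}

// parseStatus decodes the JSON document; unknown keys are silently skipped.
func parseStatus(data []byte) (*rpmOstreeState, error) {
	var state rpmOstreeState
	if err := json.Unmarshal(data, &state); err != nil {
		return nil, fmt.Errorf("failed to parse rpm-ostree status: %v", err)
	}
	return &state, nil
}

func main() {
	// Made-up sample input, not real rpm-ostree output.
	sample := []byte(`{"deployments":[{"id":"example-0","osname":"rhcos","booted":true,"some-other-key":"ignored"}]}`)
	state, err := parseStatus(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, d := range state.Deployments {
		fmt.Printf("id=%s osname=%s booted=%t\n", d.ID, d.OSName, d.Booted)
	}
}
```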

@abhinavdahiya
Contributor

@jlebon can you run ./hack/update-generated-bindata.sh to sync /manifests into pkg/operator/assets/bindata.go?

 if dn.checkFiles(desiredConfig.Spec.Config.Storage.Files) &&
-	dn.checkUnits(desiredConfig.Spec.Config.Systemd.Units) {
+	dn.checkUnits(desiredConfig.Spec.Config.Systemd.Units) &&
+	isDesiredOS {
Member

nit: My brain does not like how this if statement is structured, but it's not problematic.

		return true, nil
	}

	// error is nil, as we successfully decided that validate is false
	return false, nil
}

func (dn *Daemon) checkOS(osImageURL string) (bool, error) {
Member

nit: It would be helpful to document what the non-error return signifies.
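
One way such a contract could be documented is sketched below. This free-standing `checkOS` takes a hypothetical `booted` callback and uses a hypothetical `errNoBootedOS` sentinel; the PR's version is a method on `Daemon`, and how it obtains the booted OS is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoBootedOS is a hypothetical sentinel used only for this sketch.
var errNoBootedOS = errors.New("cannot determine booted OS")

// checkOS reports whether the booted OS matches the desired osImageURL.
//
//	(true, nil)  the host already runs the desired OS
//	(false, nil) the check itself succeeded, but the host runs something else
//	(false, err) the booted OS could not be determined
func checkOS(osImageURL string, booted func() (string, error)) (bool, error) {
	current, err := booted()
	if err != nil {
		return false, fmt.Errorf("checking booted OS: %w", err)
	}
	return current == osImageURL, nil
}

func main() {
	// Placeholder image references, not real pull specs.
	same := func() (string, error) { return "registry.example/os@sha256:aaa", nil }
	broken := func() (string, error) { return "", errNoBootedOS }

	ok, err := checkOS("registry.example/os@sha256:aaa", same)
	fmt.Println(ok, err)

	ok, err = checkOS("registry.example/os@sha256:bbb", same)
	fmt.Println(ok, err)

	_, err = checkOS("registry.example/os@sha256:aaa", broken)
	fmt.Println(errors.Is(err, errNoBootedOS))
}
```

Keeping "mismatch" and "could not check" as distinct outcomes lets the caller decide between triggering an update and marking the node degraded.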

@jlebon
Member Author

jlebon commented Sep 14, 2018

Alrighty, comments addressed! Tested on top of the latest everything. Dropping hold.
/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 14, 2018
@jlebon
Member Author

jlebon commented Sep 14, 2018

I will note the daemon cannot yet go through with rebooting the node due to some missing role bindings:

I0914 21:07:35.208898    6711 update.go:44] Update completed. Draining the node.
E0914 21:07:35.215929    6711 daemon.go:85] Marking degraded due to: pods is forbidden: User "system:serviceaccount:openshift-machine-config-operator:machine-config-daemon" cannot list pods at the cluster scope: no RBAC policy matched

Though let's keep that separate.

@jlebon
Member Author

jlebon commented Sep 14, 2018

I filed #66 so we remember it.

@jlebon
Member Author

jlebon commented Sep 14, 2018

Oh one more thing. Once this merges, the MCD in a fresh cluster will error out because the osImageURL currently embedded in machine configs is set to ://dummy. So we're essentially at the point where the installer actually needs to use https://github.com/ashcrow/pivot.

If we're not ready for that yet, we could work around this for now by considering an MC with ://dummy a match on nodes which haven't yet been updated.

@abhinavdahiya
Contributor

abhinavdahiya commented Sep 14, 2018

@jlebon everything regarding MachineConfig generation for masters and workers is controlled by this repo.

https://github.com/openshift/machine-config-operator/blob/master/pkg/controller/template/render.go#L208

@abhinavdahiya
Contributor

Also #66 (comment)

Member

@ashcrow ashcrow left a comment

Will defer to a second review by @sdemos for merging.

@jlebon
Member Author

jlebon commented Sep 17, 2018

@jlebon everything regarding MachineConfig generation for masters and workers is controlled by this repo.

It's not that simple though. This is dependent on the installer moving over from using the RHCOS AMI/qcow2 directly to using the OS container image and pivot.

@jlebon
Member Author

jlebon commented Sep 17, 2018

If we're not ready for that yet, we could work around this for now by considering an MC with ://dummy a match on nodes which haven't yet been updated.

OK, I pushed another commit for now which does this! ⬆️
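
The workaround can be sketched roughly as follows. This is not the pushed commit: `osMatches` and `dummyOSImageURL` are hypothetical names, and representing "a node that has not yet been updated" as an empty booted URL is an assumption made purely for illustration.

```go
package main

import "fmt"

// dummyOSImageURL is the placeholder value the thread says is currently
// embedded in machine configs.
const dummyOSImageURL = "://dummy"

// osMatches treats the placeholder as a match on nodes that never pivoted.
// Assumption: booted == "" means the node has not been updated yet.
func osMatches(desired, booted string) bool {
	if desired == dummyOSImageURL && booted == "" {
		return true
	}
	return desired == booted
}

func main() {
	// The non-dummy image references are made-up placeholders.
	fmt.Println(osMatches("://dummy", ""))
	fmt.Println(osMatches("://dummy", "registry.example/os@sha256:aaa"))
	fmt.Println(osMatches("registry.example/os@sha256:aaa", "registry.example/os@sha256:aaa"))
}
```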

@ashcrow
Member

ashcrow commented Sep 17, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 17, 2018
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 17, 2018
@jlebon
Member Author

jlebon commented Sep 17, 2018

Sorry for the race condition there. I just updated that last commit to use backticks to avoid escaping the double quotes.

@ashcrow
Member

ashcrow commented Sep 17, 2018

@jlebon not a problem 👍

@ashcrow
Member

ashcrow commented Sep 17, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 17, 2018
@sdemos
Contributor

sdemos commented Sep 17, 2018

looks like we need a person with approver on the repo at large to /lgtm as well for the changes to the manifest.

@ashcrow
Member

ashcrow commented Sep 17, 2018

Opened #73 as I think it may make sense to let us modify MCD manifest files.

@jlebon
Member Author

jlebon commented Sep 17, 2018

Oh one more thing. Once this merges, the MCD in a fresh cluster will error out because the osImageURL currently embedded in machine configs is set to ://dummy. So we're essentially at the point where the installer actually needs to use https://github.com/ashcrow/pivot.

I opened openshift/installer#267 to raise visibility on this.

@ashcrow
Member

ashcrow commented Sep 18, 2018

/lgtm

@jlebon
Member Author

jlebon commented Sep 18, 2018

Hmm, I think it might be blocking on the fact that we're adding a new pkg/utils?

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 18, 2018
@ashcrow
Member

ashcrow commented Sep 18, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 18, 2018
@abhinavdahiya
Contributor

seems like daemon is the only one calling pkg/utils, and I personally don't like packages that are utils. I would recommend you create one under daemon and name it something like Executor or Runner, etc.
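
A minimal sketch of what a daemon-local runner might look like. The `Runner` interface, `execRunner`, `fakeRunner`, and this `runPivot` signature are all hypothetical, and the assumption that pivot takes the image URL as its only argument is for illustration only.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Runner is a hypothetical abstraction over executing host commands.
type Runner interface {
	Run(name string, args ...string) error
}

// execRunner runs real commands and folds their output into the error.
type execRunner struct{}

func (execRunner) Run(name string, args ...string) error {
	out, err := exec.Command(name, args...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("running %s %v failed: %v: %s", name, args, err, out)
	}
	return nil
}

// fakeRunner records invocations instead of executing anything.
type fakeRunner struct {
	calls [][]string
}

func (f *fakeRunner) Run(name string, args ...string) error {
	f.calls = append(f.calls, append([]string{name}, args...))
	return nil
}

// Compile-time check that execRunner satisfies Runner.
var _ Runner = execRunner{}

// runPivot asks the runner to invoke pivot with the given image URL.
// The command line shape is an assumption, not taken from the PR.
func runPivot(r Runner, osImageURL string) error {
	return r.Run("pivot", osImageURL)
}

func main() {
	f := &fakeRunner{}
	if err := runPivot(f, "registry.example/os@sha256:aaa"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(f.calls)
}
```

Keeping the interface next to its only caller avoids a catch-all utils package and lets daemon tests swap in the fake without touching the host.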

Teach the MCD to check the current OS version and upgrade the host. This
essentially implements the `OSImageURL` part of the `MachineConfig`
spec.

Since we're in a chroot anyway, I just took the easy path of exec'ing
`rpm-ostree` and `pivot` directly. Which is good, because right now only
`pivot` knows how to upgrade RHCOS from an oscontainer image pull spec.

One subtle thing that might be hard to appreciate in the previous patch
is that we're running `pivot` from inside the container, which in turn
runs `podman` to access the contents of the oscontainer. For `podman` to
work right, we need `hostPID`. We also need `hostNetwork` for the
`rpm-ostree` client to talk to the daemon over D-Bus.

Another way to see this: make the MCD feel even more like it's running
directly on the host. (Though ideally, if `rpm-ostree` learns to pull
content from the oscontainer itself, this would get easier).

Until the installer learns to pivot ➰ and we have an accurate starting
`OSImageURL` entry.
@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed lgtm Indicates that a PR is ready to be merged. labels Sep 18, 2018
@jlebon
Member Author

jlebon commented Sep 18, 2018

OK, updated to just keep those helper functions under pkg/daemon. ⬆️

@ashcrow
Member

ashcrow commented Sep 18, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 18, 2018
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashcrow, jlebon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit a91ab52 into openshift:master Sep 18, 2018
osherdp pushed a commit to osherdp/machine-config-operator that referenced this pull request Apr 13, 2021
update readme with details on config fields, behavior, troubleshooting
@jlebon jlebon deleted the pr/rpm-ostree branch May 1, 2023 15:32
This pull request was closed.
6 participants