Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: Ign spec 3 upgrade #1873

Merged
merged 2 commits into from Jul 11, 2020

Conversation

yuqi-zhang
Copy link
Contributor

@yuqi-zhang yuqi-zhang commented Jun 25, 2020

Based on #1703, this is a (messy and hacky) MCO that renders all existing configs to spec 3. Currently tested to work with a default install upgrading to this.

Unit and e2e tests not included

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 25, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 25, 2020
@yuqi-zhang yuqi-zhang requested review from LorbusChris and runcom and removed request for mtrmac and umohnani8 June 25, 2020 18:32
}

// Test: have some situations de-duplicate and see what happens, will merge into main function
func ParseAndConvertConfigWithDeduplication(rawIgn []byte) (ign3types.Config, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will probably merge this into the main function, just trying to think if there's scenarios we won't want to de-duplicate


// TODO jerzhang check if you can have contents and dropins in the same file
unitNameMap := map[string]bool{}
var outUnits []ign2types.Unit
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so units is somewhat wonky since it needs to consider dropins when doing a merge, now consider this scenario:

Unit A:
  Content: AC
  Dropins:
    Dropin: AD
Unit B:
    Content: BC
  Dropins:
    Dropin: BD

Should the final look like:

  Unit rendered:
    Content: BC
  Dropins:
    Dropin: AD
    Dropin: BD

or

Unit rendered:
    Content: BC
  Dropins:
    Dropin: BD

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be in favor of the former, having both drop ins added.

Another case that may also be considered is if there is a unit in one template, and a dropin only in another template.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of a unit in one and a dropin in another, both the existing spec 2 behaviour and the ignition spec 3 merge function keeps both, so I think we should keep that.

And agreed, having both dropins probably makes it more consistent

return err
}
if needUpdate {
_, err = ctrl.client.MachineconfigurationV1().MachineConfigs().Update(context.TODO(), config, metav1.UpdateOptions{})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For non-rendered configs I probably need a better location to update them, there is also the Boostrap path to consider

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that said with this update in place we will no longer be able to rollback, so maybe for user-created configs I should keep it in the version that it was on?

return err

switch typedConfig := ignconfigi.(type) {
case ign3types.Config:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for validation I'm thinking we should keep validating in the config that it was created in

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesnt the ign translator func also validate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right in this case I meant validating the on-disk config based on the rendered-config

If the cluster is still on a spec 2 rendered-config version (which shouldn't happen that often, only should happen if a) templates changed causing a race or b) a rendered-config was created during a middle of an update or c) a previous rendered-config update was failing), I'm thinking we should continue to validate in spec 2, instead of translating up and then validating in spec 3.

Maybe we could get away with translating first, just have to be a bit more careful

@yuqi-zhang
Copy link
Contributor Author

With #1766 rebased this also works on first installs (take a 4.6 installer and override release image with this MCO), which brings up everything in spec 3 automatically.

Next steps:

@LorbusChris
Copy link
Member

@yuqi-zhang I'll leave #1778 open for now, as it might be helpful as a reference. The upgrade path isn't handled there yet, because there is no deduplication of units happening (we'll probably also want this for files).

@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 5, 2020
@yuqi-zhang yuqi-zhang changed the title WIP: Ign spec 3 upgrade *: Ign spec 3 upgrade Jul 5, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 5, 2020
@yuqi-zhang
Copy link
Contributor Author

Updated tests in reference to #1778. Probably need to add a few more but hopefully the existing ones pass now.

Also cleaned up the code a bit. Rebased on #1703 again and tested manually to pass upgrades + install.

@yuqi-zhang
Copy link
Contributor Author

/test e2e-gcp-upgrade

@yuqi-zhang
Copy link
Contributor Author

/test e2e-aws

@yuqi-zhang
Copy link
Contributor Author

/test e2e-aws

flaking on various other tests, but just to make sure...

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 9, 2020
@yuqi-zhang
Copy link
Contributor Author

/test e2e-openstack

@LorbusChris
Copy link
Member

/test e2e-azure
/test e2e-openstack
/test e2e-ovirt
/test e2e-ovn-step-registry
/test e2e-vsphere

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jul 9, 2020
@yuqi-zhang
Copy link
Contributor Author

Squashed commits and added better message

@LorbusChris
Copy link
Member

/test e2e-vsphere
/test e2e-openstack
/test e2e-ovirt
/test e2e-azure

@yuqi-zhang
Copy link
Contributor Author

I don't expect any of those tests to pass except /test e2e-openstack , which would be nice to make sure

azure we're out of quota, and the other ones we haven't passed in awhile.

@runcom
Copy link
Member

runcom commented Jul 9, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 9, 2020
@LorbusChris
Copy link
Member

:shipit:

@kikisdeliveryservice
Copy link
Contributor

/test e2e-ovirt

@kikisdeliveryservice
Copy link
Contributor

/test e2e-gcp-upgrade

@kikisdeliveryservice
Copy link
Contributor

I don't expect any of those tests to pass except /test e2e-openstack , which would be nice to make sure

azure we're out of quota, and the other ones we haven't passed in awhile.

Can you run the e2e-azure test on your own azure cluster? or is quota gone for everything?

@LorbusChris
Copy link
Member

now that #1792 has merged, does this need an update to switch the pointer config to v3 as well?
ref: https://github.com/openshift/machine-config-operator/pull/1792/files#diff-5926e0cec606d949b66f823310790298R142

This is the commit that aims to upgrade the spec version used
in the MCO from ignition spec 2 to spec 3. With this change
in place:

 - New clusters will have all machineconfigs in spec 3, minus
   those created by the installer.

 - Existing clusters will re-render templates into spec 3 upon
   an upgrade, and render all rendered-machineconfigs into
   spec 3 with translation from spec 2. Existing rendered-configs
   and user-created configs will not be translated in-place, but
   rather re-translated anytime rendering happens.

All clusters from this point on will support new MCs with ignition
on spec 2 or spec 3.

Other key points of this PR:

 - A de-duplication function is added, so spec 2 -> spec 3 translation
   can happen. This should not change existing behaviour. A unit test
   is added to show the expected outcome.

 - The FCCT usage is slightly modified to account for duplication.
   Eventually we should use TranslateBytes() from FCCT to get
   verification.

 - Validation of on-disk state will still be parsed in the ignition
   spec version the rendered-config was created in. All other locations
   translate the config from spec 2 to spec 3 (e.g. Reconcilable).

Signed-off-by: Yu Qi Zhang <jerzhang@redhat.com>
This will allow us to manage a secret to scale up nodes with
ignition v2 binary installed.

Signed-off-by: Yu Qi Zhang <jerzhang@redhat.com>
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jul 10, 2020
@yuqi-zhang
Copy link
Contributor Author

Rebased again and modified stub secret generation to v3

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jul 10, 2020

@yuqi-zhang: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-vsphere fc052b2e0d2e4ed4c2a6bce1a96aaa7d47c1ad7f link /test e2e-vsphere
ci/prow/e2e-azure fc052b2e0d2e4ed4c2a6bce1a96aaa7d47c1ad7f link /test e2e-azure
ci/prow/e2e-ovirt fc052b2e0d2e4ed4c2a6bce1a96aaa7d47c1ad7f link /test e2e-ovirt

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@yuqi-zhang
Copy link
Contributor Author

/hold cancel

This is ready to go

Can you run the e2e-azure test on your own azure cluster? or is quota gone for everything?

Unfortunately the CI account and my account is the same namespace I think

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 10, 2020
@yuqi-zhang
Copy link
Contributor Author

/retest

@runcom
Copy link
Member

runcom commented Jul 10, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 10, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: runcom, yuqi-zhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 24e24e8 into openshift:master Jul 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants