Upgrading nodes from Rancher #330

Closed
davidcassany opened this issue Jan 13, 2023 · 12 comments
@davidcassany
Contributor

davidcassany commented Jan 13, 2023

Assuming the upgrade procedure as defined in the docs, there are two ways:

  1. Manually set the image we want to upgrade to (osImage approach)
  2. Select the upgrade image from a channel

For the 1st option there is not much to say: it is a manual process and the administrator has full freedom to upgrade to whatever she/he wants. I believe this option is clearly worth keeping, but should it be supported? IMHO it shouldn't, and some sort of warning should be logged when it is used.

For the 2nd there are details to figure out

  • How to deliver such a channel? I'd manually create a container including a JSON list of all supported versions. The question then is how to create and maintain it; it could easily be a manual job for now.
  • Do we need to set version constraints? Or fully rely on channel consistency? (all versions there are supported and exchangeable) I'd vote to assume all versions in a channel are exchangeable for now.
  • How to notify new upgrades are available?
  • How to inform about changes (aka release notes) in a new image? (security/bugfix/feature release; fixed CVEs, bugs, etc.; added features)
  • Do we support tracking mutable image references? The obvious use case would be pulling images such as rancher/elemental-teal:latest and assuming the elemental operator is capable of upgrading on each new latest release. IMHO we should not support this use case; it tends to be confusing.
  • Together with the above, is there a way to set automatic upgrades? So the cluster upgrades as soon as a greater image in the channel is available. Do we want to support that? I'd say not for now.
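
As a rough illustration of the channel idea in the list above, the sketch below treats a channel as a JSON list of supported images and picks the entries newer than the installed version. The field names (`name`, `version`, `image`), the `registry.example.com` references, and the helper `upgrade_candidates` are all hypothetical; this is not the elemental-operator schema.

```python
import json

# Hypothetical channel content: a JSON list of supported OS images.
CHANNEL_JSON = """
[
  {"name": "teal-v0.1.0", "version": "0.1.0",
   "image": "registry.example.com/elemental-teal:0.1.0-1.1"},
  {"name": "teal-v0.2.0", "version": "0.2.0",
   "image": "registry.example.com/elemental-teal:0.2.0-3.4"},
  {"name": "teal-v0.10.0", "version": "0.10.0",
   "image": "registry.example.com/elemental-teal:0.10.0-2.1"}
]
"""


def parse_version(text):
    """Turn '0.10.0' into (0, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def upgrade_candidates(channel, current_version):
    """Return channel entries newer than current_version, oldest first."""
    current = parse_version(current_version)
    newer = [e for e in channel if parse_version(e["version"]) > current]
    return sorted(newer, key=lambda e: parse_version(e["version"]))


channel = json.loads(CHANNEL_JSON)
for entry in upgrade_candidates(channel, "0.1.0"):
    print(entry["name"], "->", entry["image"])
```

Comparing versions as integer tuples rather than strings keeps `0.10.0` ordered after `0.2.0`, which plain string comparison would get wrong.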

Last but not least, we should discuss the essential tests that make up the upgrade acceptance criteria for each release:

  • Deploy from latest elemental-operator the oldest release of elemental-teal OS (old ISO) and upgrade to the latest one
@davidcassany davidcassany added this to the 2.7.2 milestone Jan 13, 2023
@davidcassany
Contributor Author

@agracey @rancher/elemental any input and thoughts about this topic would be appreciated.

@kkaempf
Contributor

kkaempf commented Jan 13, 2023

If the osImage should be unsupported (I'd agree, but last word is with @agracey ), then it shouldn't be (prominently?) offered in the UI, imho. osImage and channel shouldn't be presented as options on equal level.

@kkaempf
Contributor

kkaempf commented Jan 13, 2023

The channel json must be manually (well, I don't see an easy way to automate it) created since it's us who decide which version to support and which not.

@kkaempf
Contributor

kkaempf commented Jan 13, 2023

Added

  • How to inform about changes...

to the initial list

@kkaempf
Contributor

kkaempf commented Jan 13, 2023

👍 on not supporting :latest release

@kkaempf
Contributor

kkaempf commented Jan 13, 2023

👍 on not supporting automatic upgrades. (Might be revisited based on market requirements).

@kkaempf
Contributor

kkaempf commented Jan 13, 2023

How to notify new upgrades are available?

Timestamp-based? That's already how the updater decides if an image is newer.
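
A minimal sketch of a timestamp-based check, assuming each image carries an ISO 8601 build timestamp. The function names and the timestamp format are assumptions for illustration and do not describe how the actual updater is implemented.

```python
from datetime import datetime


def parse_created(value):
    """Parse an ISO 8601 timestamp such as '2023-01-13T10:00:00+00:00'."""
    return datetime.fromisoformat(value)


def is_newer(candidate_created, installed_created):
    """True if the candidate image was built after the installed one."""
    return parse_created(candidate_created) > parse_created(installed_created)


installed = "2023-01-13T10:00:00+00:00"
candidate = "2023-01-20T08:30:00+00:00"
print(is_newer(candidate, installed))
```

Both timestamps carry an explicit UTC offset, so the comparison stays well defined even if the two builds were stamped in different time zones.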

@agracey

agracey commented Jan 15, 2023

  • Do we need to set version constraints? Or fully rely on channel consistency? (all versions there are supported and exchangeable) I'd vote to assume all versions in a channel are exchangeable for now.

I would expect your assumption to be true for now. I can't imagine having wholly different OSes being distinguished by tags (like some images do with :alpine and :ubuntu)

  • How to notify new upgrades are available?

I would think a job polling for new tags would be sufficient? I don't think this would generate noticeably more traffic than most CI systems already do.

  • How to inform about changes (aka release notes) in a new image? (security/bugfix/feature release; fixed CVEs, bugs, etc.; added features)

I would love to see a pattern where release notes and closed CVEs are listed in the image annotations. This would mean that an admin would be able to see the diff between versions and decide when to upgrade (reducing downtime without increasing risk). I don't know how much work that would be to build though.
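
One way to picture the annotation idea, as a sketch only: if each image listed its cumulative fixed CVEs in an annotation, the difference between two versions is a set difference. The annotation key below is hypothetical (not an existing convention), and the CVE identifiers are made-up placeholders.

```python
# Hypothetical annotation key; not an existing convention.
FIXED_CVES_KEY = "com.example.elemental.fixed-cves"


def fixed_cves(annotations):
    """Read a comma-separated CVE list from an image's annotations."""
    raw = annotations.get(FIXED_CVES_KEY, "")
    return {item.strip() for item in raw.split(",") if item.strip()}


def cves_fixed_by_upgrade(current, candidate):
    """CVEs fixed in the candidate image but not in the current one."""
    return sorted(fixed_cves(candidate) - fixed_cves(current))


# Placeholder identifiers, not real CVEs.
current = {FIXED_CVES_KEY: "CVE-0000-0001"}
candidate = {FIXED_CVES_KEY: "CVE-0000-0001, CVE-0000-0002, CVE-0000-0003"}
print(cves_fixed_by_upgrade(current, candidate))
```

An admin-facing tool could show this list next to each available version to help decide when an upgrade is worth the downtime.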

  • Do we support tracking mutable image references? The obvious use case would be pulling images such as rancher/elemental-teal:latest and assuming the elemental operator is capable of upgrading on each new latest release. IMHO we should not support this use case; it tends to be confusing.

Agreed, listing image tags and correlating the hash and build timestamp is likely enough?

  • Together with the above, is there a way to set automatic upgrades? So the cluster upgrades as soon as a greater image in the channel is available. Do we want to support that? I'd say not for now.

IMO, this could be left to a higher level automation. We just need to make sure the API is stable and fully featured to build against.

@fgiudici
Member

Yep, this is a discussion we really need!

  • How to deliver such a channel? I'd manually create a container including a JSON list of all supported versions. The question then is how to create and maintain it; it could easily be a manual job for now.

So, the idea is to have a JSON file inside a container? 🤔
I would just keep a plain JSON file listing the images at a web URL. Having a container for that looks like extra overhead with no benefit. I wonder if I'm missing something.

  • Do we need to set version constraints? Or fully rely on channel consistency? (all versions there are supported and exchangeable) I'd vote to assume all versions in a channel are exchangeable for now.

👍🏼 totally agree!

  • How to notify new upgrades are available?

I would stick with @agracey's idea: pull the JSON from time to time. The overhead should not be noticeable.
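
The polling approach could look roughly like the sketch below, which fetches the channel and reports only entries not seen before. The fetch function is injected so the example runs without a network; `new_entries`, `fake_fetch`, and the `name` field are hypothetical names chosen for illustration.

```python
def new_entries(fetch_channel, known_names):
    """Fetch the channel once and return entries whose name is not yet known."""
    entries = fetch_channel()
    fresh = [e for e in entries if e["name"] not in known_names]
    known_names.update(e["name"] for e in fresh)
    return fresh


# Stand-in for an HTTP GET of the channel JSON: the second poll sees one more entry.
_responses = iter([
    [{"name": "teal-v0.1.0"}],
    [{"name": "teal-v0.1.0"}, {"name": "teal-v0.2.0"}],
])


def fake_fetch():
    return next(_responses)


known = set()
first = new_entries(fake_fetch, known)
second = new_entries(fake_fetch, known)
print([e["name"] for e in first])
print([e["name"] for e in second])
```

A real job would call `new_entries` on a timer with a function that downloads and parses the JSON, and could feed the returned entries into whatever notification mechanism is chosen later.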

  • How to inform about changes (aka release notes) in a new image? (security/bugfix/feature release; fixed CVEs, bugs, etc.; added features)

🤔 If we use a JSON file, I would just add a reference to the official release notes (which I expect to be mainly for the OS release) directly in the JSON. We can even think about having OS release notes and Elemental ones (to separate OS changes from Elemental-specific ones) and add them only if/when needed.

  • Do we support tracking mutable image references? The obvious use case would be pulling images such as rancher/elemental-teal:latest and assuming the elemental operator is capable of upgrading on each new latest release. IMHO we should not support this use case; it tends to be confusing.

👍🏼 yep, makes sense. Especially since the json will get updated with all the available versions.

  • Together with the above, is there a way to set automatic upgrades? So the cluster upgrades as soon as a greater image in the channel is available. Do we want to support that? I'd say not for now.

I am OK with not having it for now... but at some point this is something we should allow. Like @agracey's idea of leaving it to a higher-level automation tool.

@davidcassany
Contributor Author

So, the idea is to have a JSON file inside a container? 🤔
I would just keep a plain JSON file listing the images at a web URL. Having a container for that looks like extra overhead with no benefit. I wonder if I'm missing something.

Well, the benefit or convenience of using a container is that we already have the infrastructure and processes to actually deliver it: we can use the container registry. I'd also be fine with a web server, but I am clueless about how this should be handled from a maintenance point of view; this goes beyond the regular process of publishing RPM repositories or containers in a registry from OBS. In any case this is a tiny implementation detail. The relevant part is that we go for building and maintaining a list of available images in a JSON format compatible with elemental-operator.

How to notify new upgrades are available?

I would think a job polling for new tags would be sufficient? I don't think this would generate noticeably more traffic than most CI systems already do.

Sure, the polling strategy is already in place. My question is more along the lines of: should there be some logic to raise a notification somewhere (in the UI?) to make the admin aware that new updates are available (imagine an important security fix)? I believe, for now, we can expect the admin to be proactive and manually check for available updates from time to time. But I am convinced we will need some sort of notification mechanism so the admin can react to unexpected important updates (security fixes mostly).

How to inform about changes (aka release notes) in a new image? (security/bugfix/feature release; fixed CVEs, bugs, etc.; added features)

This is a tough topic; I wonder what's being done for the BCI images in that regard. I'll try to contact them to check if someone is already doing such a thing within the company for container images. OBS gives us a *.packages list file containing a full list of all packages. This can be diffed across releases; however, mapping that into actual bug fixes sounds complex. For that matter, KIWI builds produce a *.changes file including the full change log of every single package; if diffed across releases, one can parse bugzilla tickets... I wonder if something like this could be done for Docker builds. It feels like there is a lot to explore in that area.
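
To make the change log idea concrete, here is a small sketch that extracts bug references appearing in a newer change log but not in an older one. It assumes references follow the `bsc#NNN` / `boo#NNN` style used in SUSE change logs; the sample entries and bug numbers are invented, and `new_bug_refs` is a hypothetical helper, not an existing tool.

```python
import re

# Matches references such as bsc#1000001 or boo#1000003.
BUG_REF = re.compile(r"\b(?:bsc|boo)#\d+")


def new_bug_refs(old_changes, new_changes):
    """Return bug references present in new_changes but not in old_changes."""
    old = set(BUG_REF.findall(old_changes))
    return sorted(set(BUG_REF.findall(new_changes)) - old)


# Invented sample change log text for two consecutive releases.
old_changes = "- Fix boot hang (bsc#1000001)\n"
new_changes = old_changes + "- Update to 1.2 (bsc#1000002, boo#1000003)\n"
print(new_bug_refs(old_changes, new_changes))
```

This only recovers ticket identifiers; classifying them as security, bugfix, or feature changes would still need data from the bug tracker.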

@davidcassany
Contributor Author

davidcassany commented Jan 20, 2023

To sum up the discussion/comments, for now (short term), I believe we can state:

  • We use the ManagedOSVersionChannel and ManagedOSVersion CRDs to list and deliver updates. For that, a JSON list of available images is required.
  • We consider any version within the channel to be compatible with any release of the elemental-operator. We could always set up a new channel when incompatibilities appear (e.g. a new SLE base image).
  • We are not addressing update notifications (UI notifications, emails or stuff like that).
  • Each teal release has at least one unique tag in the registry and this is being used within the channel setup (this means including the build number within the channel tags list).
  • Automatic updates are out of scope
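
The tagging rules in this summary (unique tags per release, no mutable references such as `:latest`) could be checked mechanically before a channel list is published. The sketch below is one possible check; the helper names, the `MUTABLE_TAGS` set, and the sample references under `registry.example.com` are hypothetical.

```python
# Tags treated as mutable in this sketch; an assumption, extend as needed.
MUTABLE_TAGS = {"latest"}


def image_tag(ref):
    """Return the tag of an image reference, or None if it has no tag."""
    last = ref.rsplit("/", 1)[-1]   # drop registry host (and port) and path
    last = last.split("@", 1)[0]    # drop a digest suffix, if any
    if ":" not in last:
        return None
    return last.rsplit(":", 1)[1]


def channel_problems(images):
    """List human-readable problems found in a channel's image references."""
    problems = []
    seen = set()
    for ref in images:
        tag = image_tag(ref)
        if tag is None or tag in MUTABLE_TAGS:
            problems.append(f"{ref}: missing or mutable tag")
        elif ref in seen:
            problems.append(f"{ref}: duplicate reference")
        seen.add(ref)
    return problems


images = [
    "rancher/elemental-teal:latest",
    "registry.example.com:5000/teal:1.0-2.3",
    "registry.example.com:5000/teal:1.0-2.3",
    "registry.example.com:5000/teal",
]
for problem in channel_problems(images):
    print(problem)
```

Taking the tag from the last path component avoids mistaking a registry port such as `:5000` for a tag.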

Action items:

@agracey @rancher/elemental if you are fine with it I am willing to close this card and create new ones for each action item.

@davidcassany
Contributor Author

Closing since follow-up issues have been created. They are linked and listed within the comment above.

4 participants