Atomic: OSTree Update

Stef Walter edited this page Sep 29, 2015 · 24 revisions

Cockpit should allow people to update their OSTree based Atomic system.

Notes

  • OSTree has a simple to represent update/rollback model.
  • Requires a reboot after an update/rollback.
  • Can we figure out if an update represents a security update?
    • Or if security updates are available and thus the system needs to be updated/rebooted.
  • Not clear if we want to include the concept of switching OSTree branches/channels in Cockpit.
  • This feature will probably be simple for the time being.
  • This feature doesn't include updating containers, which is pretty undefined at this point.
  • https://coreos.com/assets/images/screenshots/GroupList-HiDPI.png
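The update/rollback model from the notes above can be sketched as a small state machine. This is only an illustrative model, not how OSTree or rpm-ostree is implemented: the `Deployment` and `Host` names are hypothetical, and it assumes at most two deployments are kept (the booted one and one other), with the first entry being the default for the next boot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Deployment:
    version: str  # hypothetical label, for illustration only


@dataclass
class Host:
    # Assumption: deployments[0] is the default for the next boot.
    deployments: List[Deployment]
    booted: Deployment

    def upgrade(self, new: Deployment) -> None:
        """Stage a new tree as default; keep the booted one for rollback."""
        self.deployments = [new, self.booted]

    def rollback(self) -> None:
        """Make the other deployment the default for the next boot."""
        if len(self.deployments) < 2:
            raise RuntimeError("no other deployment to roll back to")
        self.deployments.reverse()

    def needs_reboot(self) -> bool:
        """True when the default deployment is not the one currently running."""
        return self.deployments[0] is not self.booted

    def reboot(self) -> None:
        """Boot into whatever is currently the default deployment."""
        self.booted = self.deployments[0]
```

In this sketch neither an update nor a rollback takes effect until `reboot()` is called, which matches the note that both operations require a reboot.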

Stories

User stories and workflows that will drive the design.

User stories:

Robert is a sysadmin at a small IT company. They have 3 servers: one runs a file server, one runs their build server, and one runs the company website. They run Atomic on all 3. Due to a recent kernel security vulnerability that affects all 3 machines, he needs to update the systems to a newer version.

George runs a startup with two friends of his. They develop a messaging service that has a backend part running on top of CentOS and an app for Android and iPhone. They do all their testing in virtual machines on top of Atomic. They are not too concerned with security updates, but they do need a feature that is only part of the latest Atomic release. They are working against a deadline, so if anything goes wrong, they need a smooth downgrade to an earlier version of Atomic.

Workflows:

Robert:

  • Robert reads about the fatal kernel security hole in the news. Since the web server is public, he's getting nervous.
  • After logging in to the web server via Cockpit, he's alerted that there is a system upgrade available. He verifies that the update indeed contains the kernel security fix he needs.
  • At a time when he knows the website traffic is usually low, he applies the update and reboots the machine.
  • The system boots, and the machine now uses the kernel with the security hole fixed.
  • Robert verifies that all services and containers are running as they should and that nothing broke during the upgrade.

George:

  • George gets notified that a new version of Atomic is available in one of the VMs they develop against.
  • He looks at the change details, sees that it contains a couple of bug fixes and a new version of systemd and thinks "sure, why not?". He notifies his colleagues that he'll quickly reboot that VM. He applies the update and reboots.
  • After the reboot, he logs into the machine again, but realizes that their software doesn't work quite well with the newest Atomic. They have an important deadline coming in a week, so in retrospect it would have made more sense to do the upgrade after the deadline.
  • He therefore chooses to downgrade to the version he was running before the upgrade.
  • He selects the previous update snapshot from Cockpit and reboots the VM.
  • Everything is back to normal.

Wireframe

[Image: Wireframe]

Feedback

Please give feedback on the above!

  • How does Robert know where to get the update from, and where does he get the update from? (Ju Lim)
  • I don't know of any sysadmin who does not back up the system or important config files, and who does not make sure a backup exists before applying an update. (Ju Lim)
  • We can also assume that Robert is performing the update during a scheduled maintenance window and not just at any time, especially since a reboot is needed. (Ju Lim)
    • Fixed (Andreas)
  • After applying the update, I would assume that Robert would also want to verify that the update / patch has been applied, through some level of testing or by ensuring his apps can start up and operate normally. (Ju Lim)
    • Fixed (Andreas)
  • The same concerns for Robert's workflow apply to George's. (Ju Lim)
    • Fixed (Andreas)
  • Do we support multiple trees or only 1 previous version? If there are multiple, we'd want to consider how George can find the previous tree easily among the other trees. (Ju Lim)
    • Good question. I have no idea what's planned there. (Andreas)
  • We need a button to 'check for updates'. There is currently no automatic update scheduling in rpm-ostree. It would also be nice to provide a UI to set up a systemd timer that checks for updates on a regular basis.
    • fixed in the new version of the mockups
  • Since rpm-ostree by default only keeps 2 deployments around (current, previous) we may want to optimize the design for that use case.
    • fixed in the new version of the mockups
  • It might be good to have some sort of indication of when the default deployment is not the booted deployment. Do we want to offer an option to update or rollback without reboot?
  • Future: Expose signatures in the UI somehow?
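The feedback about indicating when the default deployment is not the booted deployment could be checked along these lines. This is a minimal sketch under stated assumptions: the `summarize` helper is hypothetical, and the input shape (a `deployments` list whose first entry is the default for the next boot, with a `booted` flag and a `version` string on each entry) is assumed for illustration rather than taken from a confirmed rpm-ostree interface.

```python
from typing import Any, Dict, List


def summarize(status: Dict[str, Any]) -> Dict[str, Any]:
    """Report the booted and default deployments and whether they differ.

    Assumed input shape (hypothetical):
    {"deployments": [{"version": str, "booted": bool}, ...]}
    where the first entry is the default for the next boot.
    """
    deployments: List[Dict[str, Any]] = status.get("deployments", [])
    default = deployments[0] if deployments else None
    booted = next((d for d in deployments if d.get("booted")), None)
    return {
        "booted": booted["version"] if booted else None,
        "default": default["version"] if default else None,
        # A reboot is pending when the next-boot default is not what is running.
        "reboot_pending": default is not None and default is not booted,
    }


# Example: an update has been applied but the machine has not been rebooted yet.
sample = {
    "deployments": [
        {"version": "new", "booted": False},
        {"version": "old", "booted": True},
    ]
}
print(summarize(sample))
```

With only two deployments kept by default, the non-booted entry is either the pending update or the rollback target, so a UI could label it based on `reboot_pending` and its position in the list.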