Add a Drogue IoT MQTT connector #904

ctron · 2022-12-01T16:04:44Z

This PR adds an MQTT based connector for Drogue IoT.

The goal is to provide remote control and monitoring capabilities for an IoT cloud side backend, like Drogue IoT.

The main changes (aside from adding the code to perform this) are:

Add the capability for the update and ostree actors to broadcast their current state
Allow setting an "update trigger", which defines the logic used to identify a new update (currently either Cincinnati or Drogue)
Keep the existing "update strategy" as this still makes sense, deciding when the update should be applied
The state change from "no update" to "update requested" is now defined by the update trigger. For Cincinnati this works as before; for Drogue it just waits until a remote command triggers the change.

In case updates are disabled, it behaves as before. In case the Drogue agent is configured "readonly" it will only report the state, but changes are triggered through Cincinnati.
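The interplay between the update trigger, the disabled case, and the read-only mode can be sketched roughly as follows. This is only an illustration of the behaviour described above: `TriggerKind`, `Config`, its fields, and both functions are hypothetical names, not the PR's actual types.

```rust
/// Hypothetical: how a new update gets identified.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TriggerKind {
    /// Poll the Cincinnati update graph (existing behaviour).
    Cincinnati,
    /// Wait for a command pushed from the cloud over MQTT.
    Drogue,
}

/// Hypothetical subset of the agent configuration.
#[derive(Debug, Clone, Copy)]
pub struct Config {
    pub updates_enabled: bool,
    pub drogue_enabled: bool,
    pub drogue_read_only: bool,
}

/// Returns the trigger in effect, or `None` when updates are disabled.
/// Assumption: a read-only Drogue agent falls back to Cincinnati.
pub fn effective_trigger(cfg: &Config) -> Option<TriggerKind> {
    if !cfg.updates_enabled {
        return None;
    }
    if cfg.drogue_enabled && !cfg.drogue_read_only {
        Some(TriggerKind::Drogue)
    } else {
        Some(TriggerKind::Cincinnati)
    }
}

/// State is reported over MQTT whenever the Drogue agent is configured,
/// including in read-only mode.
pub fn reports_state(cfg: &Config) -> bool {
    cfg.drogue_enabled
}
```

In this sketch the update strategy is left out on purpose: it still decides when an already requested update gets applied, independent of which trigger requested it.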

Everything is gated by a feature flag (drogue), which is disabled by default.

I also know this PR contains a few changes in the makefile and documentation which helped me test. They are not considered for merging.

Used to report the current state, and schedule updates from the cloud.

lucab · 2022-12-08T09:57:56Z

Thanks for the PR, interesting experiment!

I do agree that there are several small quality-of-life improvements in here that could be split out and easily merged.

On the new core logic, I think this design does not exactly align with the FCOS release engineering flow. Notably, FCOS updates are pull-based / level-triggered, and do not individually target specific nodes or releases. This PR instead tries to sidestep the update graph and make the central update server aware of individual nodes (overall scheduling updates in a push-based / event-triggered way).

While I don't know the whole context for this feature implementation (and I'm not directly working on Zincati anymore) I think that patching Zincati this way is probably not the best way to go.
I would instead suggest looking into some other possible approaches:

moving this MQTT logic to a container listening on localhost and pointing Zincati to it. A local container can implement whatever custom logic is required, and then expose local Cincinnati and Fleetlock endpoints. This kind of containerized logic is overall well-aligned with the goals and typical usages of FCOS.
avoiding Zincati at all. If you are not using any of FCOS releng features (update graph, windowed rollouts, etc.) then you most likely just don't need Zincati. Simply disable it and then directly drive rpm-ostree from a fully custom updater.

One thing that I acknowledge is that Zincati is currently lacking a primitive to externally trigger a tick / refresh the state-machine.
This would be really valuable for event-based flows like yours, where there is external knowledge that an update is very likely already available and Zincati should quickly try to progress toward an UpdateAvailable state. This is a new primitive that should likely be exposed through a DBus method.
Right now a dirty workaround is to always speed up the refresh timings through #219, but it wasn't really meant to handle cases like this so it is quite expensive in this context.
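To make the missing primitive concrete, here is a minimal sketch of a refresh loop wake-up that reacts either to its regular timer or to an external request. It uses a plain standard-library channel as a stand-in for the DBus method suggested above; `Wake` and `wait_for_tick` are hypothetical names and none of this is Zincati's actual code.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Why the refresh loop woke up.
#[derive(Debug, PartialEq, Eq)]
pub enum Wake {
    /// The regular refresh period elapsed.
    Timer,
    /// An external caller asked for an immediate refresh.
    External,
    /// All senders are gone; the loop should stop.
    Shutdown,
}

/// Block until the next tick: either `period` elapses or a request arrives
/// on the channel (standing in for an externally callable method).
pub fn wait_for_tick(requests: &Receiver<()>, period: Duration) -> Wake {
    match requests.recv_timeout(period) {
        Ok(()) => Wake::External,
        Err(RecvTimeoutError::Timeout) => Wake::Timer,
        Err(RecvTimeoutError::Disconnected) => Wake::Shutdown,
    }
}
```

With something along these lines the regular period can stay long, since an event-based flow only pays for a refresh when it has reason to believe an update is available.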

ctron · 2022-12-12T08:43:40Z

The use case comes from a space where one might want to have multiple images, for different devices/gateways. So not all devices receive the same image.

The PR actually has the following changes:

Expose information via MQTT. This allows monitoring the state of OSTree and the updates in a more "realtime" fashion. In case the read-only mode is set, the updates follow the normal flow through Cincinnati.
Allow triggering an update through the MQTT channel. (I will explain this below).
Add the MQTT base code (this could be extracted into a dedicated crate, which would then add an external dependency: pros & cons).

Initially I had the same feeling: it doesn't quite fit. I still started out to add this functionality in order to avoid "not invented here", and leverage the code already in place. From a technology perspective (Rust, Actix) it was a good fit.

During that process it turned out that the change actually isn't that big. Aside from the core MQTT stuff and some internal plumbing, there are two main changes (as mentioned above): adding some monitoring functionality over MQTT and choosing a different trigger for an update.

If you take a closer look at the change of triggering an update, it isn't that much different IMHO. With Cincinnati, you have a client which polls over HTTP to figure out the new target state. With Drogue/MQTT, you do the same, just with MQTT, which reverses the command direction (not pulling, but pushing).

And all the other stuff is still active. Including the fleet lock logic (which also might make sense in combination with Drogue/MQTT).

Pulling this out of Zincati is definitely possible, but would replicate around 80% of the code, if not more. True, one could extract this into a sidecar container, mimicking Cincinnati (fleetlock is still used as before). But that would mean that one would create an artificial upgrade graph, just for triggering an update.
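As an illustration of what such an artificial graph would amount to, the sketch below models the smallest graph a sidecar would have to serve so that a polling client sees exactly one update. The `Node` and `Graph` types and their field names are assumptions made for this example; the real Cincinnati wire format and its metadata are not reproduced here.

```rust
/// One release in a Cincinnati-style graph (field names are assumptions).
pub struct Node {
    pub version: String,
    /// For example an OSTree commit checksum.
    pub payload: String,
}

pub struct Graph {
    pub nodes: Vec<Node>,
    /// Directed edges as (from_index, to_index) pairs into `nodes`.
    pub edges: Vec<(usize, usize)>,
}

/// Build the smallest graph that makes a poller see exactly one update.
pub fn single_update_graph(current: Node, target: Node) -> Graph {
    Graph {
        nodes: vec![current, target],
        edges: vec![(0, 1)],
    }
}

/// Versions reachable in one step from the node with the given version.
pub fn next_versions<'a>(graph: &'a Graph, version: &str) -> Vec<&'a str> {
    let from = match graph.nodes.iter().position(|n| n.version == version) {
        Some(index) => index,
        None => return Vec::new(),
    };
    graph
        .edges
        .iter()
        .filter(|(f, _)| *f == from)
        .map(|(_, t)| graph.nodes[*t].version.as_str())
        .collect()
}
```

The sidecar would have to regenerate such a graph per device every time a remote command arrives, which is the indirection being objected to here.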

I think a cleaner approach would be to make the update trigger a trait too. One implementation is Cincinnati, but there can be others too.
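A rough sketch of what such a trait could look like, with one pull-based and one push-based implementation. All names here (`UpdateTrigger`, `TriggerOutcome`, `CincinnatiTrigger`, `DrogueTrigger`, `check`) are hypothetical, the Cincinnati client is replaced by a caller-supplied closure, and the MQTT side by a standard-library channel.

```rust
use std::sync::mpsc::Receiver;

/// Outcome of asking a trigger whether an update should start.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TriggerOutcome {
    NoUpdate,
    UpdateRequested { target: String },
}

/// Hypothetical abstraction over "how do we learn about a new update".
pub trait UpdateTrigger {
    fn check(&mut self) -> TriggerOutcome;
}

/// Pull-based: `fetch` stands in for a poll against the update graph.
pub struct CincinnatiTrigger<F: FnMut() -> Option<String>> {
    pub fetch: F,
}

impl<F: FnMut() -> Option<String>> UpdateTrigger for CincinnatiTrigger<F> {
    fn check(&mut self) -> TriggerOutcome {
        match (self.fetch)() {
            Some(target) => TriggerOutcome::UpdateRequested { target },
            None => TriggerOutcome::NoUpdate,
        }
    }
}

/// Push-based: commands received over MQTT are queued on a channel.
pub struct DrogueTrigger {
    pub commands: Receiver<String>,
}

impl UpdateTrigger for DrogueTrigger {
    fn check(&mut self) -> TriggerOutcome {
        // An empty or closed queue both mean "nothing requested".
        match self.commands.try_recv() {
            Ok(target) => TriggerOutcome::UpdateRequested { target },
            Err(_) => TriggerOutcome::NoUpdate,
        }
    }
}
```

The state machine would only depend on the trait, so the existing update strategy and the fleet lock logic stay untouched whichever trigger is configured.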

cgwalters · 2023-01-10T14:26:40Z

First, thanks so much for this pull request! There's a lot of neat stuff going on in Drogue (I'm a big Rust fan too).

I have a lot of thoughts and there's a lot going on related to things that touch on this topic.

First, there's a giant shift we have going on to use containers for updates https://fedoraproject.org/wiki/Changes/OstreeNativeContainerStable that touches on the update graph bits coreos/fedora-coreos-tracker#1263

As part of this - it's becoming more emphasized to support injecting custom privileged code that runs directly as part of the host. Today, one could write a privileged container that orchestrated FCOS updates in a custom way (and disables zincati). In fact, we effectively do that in OpenShift because it's the machine config operator there that does updates (and handles draining nodes).

But with layering one can now directly inject custom update agents into the host and hence there's no point in time in which one has an OS without an agent.

This also touches on the RHEL for Edge flow which always involves a custom OS build and hence there's an opportunity to inject custom agents there too.

Ultimately I think I'd like to pare the basic functionality of zincati down into rpm-ostree (and into bootc). What specific APIs we support there is up for discussion, but what I'm thinking right now is that we basically support polling a remote container image and that's it. More complex logic requires a custom agent/driver, which could be a container or an external binary.

cgwalters · 2023-05-18T19:51:28Z

Per discussion, closing for now, but without prejudice. This is just something that can be done external to this project.

ctron added 13 commits December 1, 2022 16:53

feat: Add a Drogue IoT connector

d0f92bd

Used to report the current state, and schedule updates from the cloud.

refactor: fix todo, make compile without drogue

ee6f0d7

refactor: split up drogue configuration, extract MQTT section

ea78149

build: some easier local testing, drop later

10d0fd2

refactor: drop the idea of the "waiting" state

8a180e1

feat: extract ostree information, MQTT improvements

6e6abc4

feat: allow configuring the MQTT client id

61f0d51

feat: create a "read only" (report only) mode

36610fa

feat: use a "trigger", instead of just a boolean

8c1e9fe

refactor: make this a bit more general purpose

e0e7689

docs: describe example config

d6b102f

build: fix conditional compilation issue

2615a2d

ci: add job for checking the drogue feature

d8f28b0

cgwalters mentioned this pull request May 3, 2023

Add a doc for container provisioning and updates coreos/fedora-coreos-docs#540

Open

cgwalters closed this May 18, 2023

cgwalters mentioned this pull request Apr 5, 2024

Fold Zincati features into bootc containers/bootc#459

Closed
