State reconciliation with dependency graphs #2447

milan-zededa · 2022-01-03T08:50:39Z

This commit adds packages depgraph and reconciler under EVE/libs, implementing a dependency graph
and a state reconciliation algorithm, respectively.

It will be used in EVE to generically solve the problem of reconciliation between the actual/current
(configuration) state and the latest intended/desired state. Every time the desired state changes, a certain
sequence of Create(), Modify() and Delete() operations has to be scheduled and executed to update
the current state accordingly. The order of operations often matters as it has to respect all dependencies
that exist between configuration items.

The plan is to first refactor NIM and use dependency graph and reconciler for configuration items that device
networks are built with.
Based on how NIM refactoring goes, this solution could be used for zedrouter next.

For more detailed reasoning and motivation behind depgraph+reconciler, please refer
to the proposal presented on the LF Edge wiki.

Signed-off-by: Milan Lenco milan@zededa.com

milan-zededa · 2022-01-05T10:41:02Z

Moved to libs/, removed dependency on the pillar's logger and added an example (under libs/depgraph/examples) as suggested by @deitch (see communication on the eve-tsc mailing list).

deitch · 2022-01-05T13:37:57Z

This is really great. I'm super happy to see complexity and repetitive code pulled out of eve, simplifying it.

I've got some suggestions, will put these on here later. @milan-zededa has helpfully been explaining things to me.

milan-zededa · 2022-01-05T14:00:06Z

Note that for the start I recommend to review the documentation (Readme file), the APIs (depgraph_api.go) and then maybe have a look at the example (examples/filesync; can be easily built and played with: go build && ./filesync). No need to dive into the implementation (depgraph.go) or unit tests until we agree on the graph semantics and the interface.

deitch · 2022-01-06T13:00:21Z

Before jumping in with some structural recommendations, I need to point out how great this is. Anything that helps eve (and especially things in pillar) act more independently and without having to repeat code and patterns over and over is a good thing. For some reason, you decided to tackle one of the hardest things - state changes - first, so hats off.

Also, thank you (many times over) for writing up a solid README on this, including examples. Too many people miss that, and it makes it harder to grasp. This really really helps.

Naming

My biggest single comment is that, well, this is not a dependency graph, so naming it that confuses people. What you built is something much more powerful: a migration controller or state controller or reconciler or something like that whose input is two dependency graphs: current and target state (either of which might be empty).

As you described it to me offline, the process looks like this:

Create a new “reconciler”
Register a bunch of “handlers”
Create a named desired “state” and tell the "reconciler" the desired state
tell the reconciler to “go to that state"

The reconciler (depgraph here) then uses various handlers to make changes.

If an item existed in the old one but not in the new one, call Delete()
if an item exists in the new one but not in the old one, call Create()
etc

I would recommend naming it exactly what it does (taking your examples from the README and modifying them, leaving out some error handling):

log := MyLogger{}
r := reconciler.New(log)
err := r.RegisterConfigurator(LinuxInterfaceConfigurator{}, "linux-interface")
err := r.RegisterConfigurator(LinuxRouteConfigurator{}, "linux-route")

// THIS is a graph
myNetwork := reconciler.Graph{
     Name:        "My network",
     Description: "Network managed by an agent.",
     Items:       []reconciler.Item{
         LinuxBridge{name: "bd0", uplink: "eth0"},
         LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1"},
         LinuxInterface{name: "eth0"},   // Could be and external item
     },
 }
r.Cluster(myNetwork.Name).Put(myNetwork)
err := r.Sync(ctx)

If I give someone writing state machine code a library called "dependency graph", they probably will say, "what do I need one for, perhaps as a way to represent my dependencies, but who uses this one?".

If I give the same person a state reconciler, they say, "oh, exactly what I need! How do I represent the state and how do I tell it what to call to reconcile? Look, state is a DAG, I get that; handlers are registered, cool! Solved my problems!"

Exact same thing you did, but different names.

Handlers

I like what you did here. Register a handler, give it a key (the string) to enable the reconciler to link Items to the handler, it works. I also like that you call it Configurator. I thought I didn't like it, kept trying to come up with something better, and every time I analyzed it, came back to your name. :-)

The way you have built the handlers, as well, they can be generic or specific. So I can build a configurator that knows how to handle "all Linux bridges", and one that specifically knows how to handle "Linux bridge br3":

brc := LinuxBridgeConfigurator{}
brspecial := LinuxBridgeConfigurator{specialOption: "br3"}
err := r.RegisterConfigurator(brc, "linux-bridge")
err := r.RegisterConfigurator(brspecial, "linux-bridge-br3")
myNetwork := reconciler.Graph{
     Name:        "My network",
     Description: "Network managed by an agent.",
     Items:       []reconciler.Item{
         LinuxBridge{name: "bd0", uplink: "eth0"},
         LinuxBridge{name: "br3", uplink: "eth0"},
     },
 }

Dependencies in the Graph

The "graph" actually isn't a graph right now, it is an array:

myNetwork := reconciler.Graph{
     Name:        "My network",
     Description: "Network managed by an agent.",
     Items:       []reconciler.Item{
         LinuxBridge{name: "bd0", uplink: "eth0"},
         LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1"},
         LinuxInterface{name: "eth0"},   // Could be and external item
     },

Does the LinuxRoute there depend on the LinuxInterface? I do not know. It is possible, as you mentioned privately, that the handler (Configurator) knows that it should depend, but that eliminates our ability to have generic handlers. More importantly, it links too closely the description and the handler. The description is a graph. I agree that a handler should be able to know dependencies imperatively, not just in the graph declaratively, but the power of a complete graph is what you are building.

myNetwork := reconciler.Graph{
     Name:        "My network",
     Description: "Network managed by an agent.",
     Items:       []reconciler.Item{
         LinuxBridge{name: "bd0", uplink: "eth0"},
         LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1", dependencies: ...},
         LinuxInterface{name: "eth0"}, 
     },

There are different ways to represent graph semantics, e.g. such that LinuxRoute and LinuxInterface both are required independently, but also LinuxRoute depends upon LinuxInterface (i.e. Interface is not just a dependency of Route), and that Interface is a dependency of both Bridge and Route (and maybe Route depends on Bridge too).

If you want, I can track down some and put them in here.

Either way, you really want to be able to reflect your entire desired state and dependencies as, well, a graph.

Items to Configurators

Can you describe in the documentation, and with examples, how the reconciler knows which Configurator handles which Item?

err := r.RegisterConfigurator(LinuxInterfaceConfigurator{}, "linux-interface")
err := r.RegisterConfigurator(LinuxRouteConfigurator{}, "linux-route")
myNetwork := reconciler.Graph{
     Name:        "My network",
     Description: "Network managed by an agent.",
     Items:       []reconciler.Item{
         LinuxBridge{name: "bd0", uplink: "eth0"},
         LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1", dependencies: ...},
         LinuxInterface{name: "eth0"}, 
     },

There is nothing obvious here that links it. In Slack, you described that the Item interface has a func Type() that returns a string, that will be used by a configurator. If I understood your correctly (and it fits with the code and codedoc, let's put that in the README.

In the same vein, we should be consistent. If the way an Item says, "here is the way to find the Configurator to handle me" is via Name() (not explicit in the graph), then the way a Configurator identifies probably should be also. Something more like:

brc := LinuxBridgeConfigurator{Type: "linux-bridge"}
brspecial := LinuxBridgeConfigurator{Type: "linux-bridge-br3", specialOption: "br3"}
err := r.RegisterConfigurator(brc, "linux-bridge")
err := r.RegisterConfigurator(brspecial, "linux-bridge-br3")
myNetwork := reconciler.Graph{
     Name:        "My network",
     Description: "Network managed by an agent.",
     Items:       []reconciler.Item{
         NewLinuxBridge({"bd0", uplink: "eth0"}),           // default Type() == "linux-bridge"
         NewLinuxBridge({"br3", uplink: "eth0", type: "linux-bridge-br3}),    // override: Type() == "linux-bridge-br3"
     },
 }

Cluster

I don't fully get what Cluster buys us? I found it confusing, so I probably missed something.

What is wrong with?

log := MyLogger{}
r := reconciler.New(log)
err := r.RegisterConfigurator(LinuxInterfaceConfigurator{}, "linux-interface")
err := r.RegisterConfigurator(LinuxRouteConfigurator{}, "linux-route")

myNetwork := reconciler.Graph{
     Name:        "My network",
     Description: "Network managed by an agent.",
     Items:       []reconciler.Item{
         LinuxBridge{name: "bd0", uplink: "eth0"},
         LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1"},
         LinuxInterface{name: "eth0"},   // Could be and external item
     },
 }
r.Put(myNetwork)
err := r.Sync(ctx)

Is it that we want the reconciler to have a name and description? So why not give it to the reconciler, rather than the Graph or the Cluster? In your above examples, the Name: "My network" and Description: "Network managed by an agent.", really apply to the reconciler, don't they?

That having been said, I could see why each individual graph instances might want to have a name and a description, although eve is unlikely to use those.

Current and Target State

The way this is built, I Put() a state, Sync(); Put() another state, Sync(), etc. It implicitly must know the current state, and relies on it knowing it.

What happens if it lost the in-memory state, or it got messed up, or my code missed it? I could end up calling the wrong handlers with the wrong information.

Could we change Put() to From() and To(), thus explicitly saying, "I want to go from this state to this state, now go call my handlers"? Or maybe even a simpler Put(from, to Graph), where either can be nil? If you want to keep track of your current state, do it outside of the reconciler, whose job is reconciliation, not data management.

Conclusion

This was a very long comment, mainly because I really like what you are doing. I just have some suggestions around naming and interfaces.

milan-zededa · 2022-01-07T11:03:42Z

Thanks @deitch for a detailed review. I will respond section-by-section:

Naming

Now that I think of it, it makes sense to call this component by what is does (state reconciliation), not by how it does it (i.e. by use of graph theory). So Reconciler is definitely a better name. Given how much renaming is needed, I will leave this for later, after we agree on the remaining suggestions. And maybe we will have even more naming ideas by then.
But going forward I will denote depgraph as Reconciler...

Handlers

I'm not sure if we will actually need the ability to have generic vs. specialized Configurators. At least within EVE I cannot think of a use-case right now. But we can certainly have this as a feature that might be helpful in the future.

But also note that it might make sense to think about this the other way. What if Items (the descriptions) should be generic and Configurators always specific? For example, a network route description is pretty much always a tuple (dst, gw, interface, metric) regardless of which network stack is being used. So the same description could be used by a "LinuxRouteConfigurator", "OVSRouteConfigurator", etc. But then how would the reconciler know which configurator to use for a particular route instance? So I don't know about this, probably nonsense, just mentioning a different angle on the problem.

Dependencies in the Graph

So firstly, it probably makes sense to move DependsOn from Configurator to Item interface. This is a simple change and it will allow to decouple Items from Configurators. Also, all the declarative descriptions (which DependsOn is part of) will be then solely inside the Item, while Configurator will encapsulate only the imperative methods Create/Delete... So I like this suggestion and will apply it.

Note that while Item.Type() string should return a name for the item type (like reflect.TypeOf(item), but maybe more human-readable), Item.Name() string should return a unique identifier for an item instance. For example, Type() could return "network interface" and Name() could return "eth0".
But don't think of Type() as a method for telling the reconciler which Configurator to use. That would be another instance of Items being closely coupled with Configurators. We can support different ways of mapping Configurators to Items, but this should be done through the Reconciler API, not by having Configurators or Items telling the reconciler to what they should be mapped.
So currently the Reconciler allows to map Configurators by Item types:

RegisterConfigurator(configurator Configurator, itemType string) error

Maybe it could be called instead:

RegisterConfiguratorByItemType(configurator Configurator, itemType string) error

And then have also a (higher-priority) override for a specialized configurator:

RegisterConfiguratorByItemName(configurator Configurator, itemName string) error

Next to this point:

The "graph" actually isn't a graph right now, it is an array.

So in my current implementation, the input to the Reconciler is a set of items, not graph.
This is because, in essence, what you as a user are telling the Reconciler is that you want to have this set of items currently configured (and nothing else). These items may have some dependencies and that determines what ordering of operations is valid, but the fact the Reconciler is based on graph theory and uses DAG as a data structure is a matter of implementation.

And as for clusters - again, a cluster is not a graph per se. It is a grouping of related items (a subset of items if we are talking about the state of the system as a whole). For example (from EVE), all configuration items created for a specific application can be grouped under a cluster. This is useful mostly for when you are building the intended state. Clusters do not affect the reconciliation algorithm. Typically, when the intended state changes, it does not change completely, only some part of it. Another example from EVE: when user un-deploys an application, you do not want to re-render the whole intended state (inefficient), but rather tell the Reconciler to remove the cluster of items that represent the application. Or when the application definition changes, you only need to re-render and replace the content of that one cluster.
Note that most dependencies will be within clusters, but some dependencies will go across clusters. For example: an application network bridge depends on a device port. While the bridge is application specific (and will go away when application is un-deployed), the device port is a "shared" item, always present in the intended state (so it will not be inside the application's cluster).

Items to Configurators

Already mentioned above. I will improve the documentation to make this topic more clear (once we fully agree on the interface).

Clusters

Already mentioned above. If it is still confusing, I can draw some example of how it will be used in EVE.
(I already have a sketch of clusters that will be used for nim & zedrouter, but it is only on a paper).
As pointed out above, the main purpose of clusters is to simplify and make more efficient the process of building/updating the intended state. Descriptions are only a bonus. If you visualize the current/intended state (using RenderDot() + graphviz), descriptions allow you to better understand the state and the purpose of items inside it.

Current and Target state

I think it might make sense to add From() as a way of updating/replacing the view of the actual state. For EVE this could be useful when the device is malfunctioning and so the view of the actual state can be wrong. But the scenarios of this in-memory state representation getting corrupted or out-of-date will be probably rather rare, therefore I would add From() next to Put() and Del().

In theory, your proposal (replace Put/Del with From) is more robust. However, in practice it will be probably very inefficient to use. As I mentioned, usually only a small portion of the intended state changes, so rebuilding it as a whole will be unnecessarily costly. Therefore it makes sense to cache the last applied state. This cache could be separated from the Reconciler and I will think about how this could be done. The challenge is to not make this tightly-coupled with the Reconciler (because then it doesn't make sense to separate it), but at the same time design the cache API such that the reconciliation algorithm can remain being effective. If it will need to get the cache content and convert it to some internal data structure, then it will be very inefficient. So the cache should definitely be already a DAG with appropriate getters.

But also note that I plan to add support for asynchronous operations, which will make the Reconciler even more powerful and useful tool at disposal. When added, Sync() might return even if some operations are still ongoing. This might complicate the Reconciler <-> state cache separation, but I haven't though about it in a great detail yet.

In any case, I think it will be useful to have a representation of the intended state stored/cached in-memory for EVE (with cache being implemented in one place as a lib for all microservices). Currently, you cannot log into a device and ask EVE about what is supposed to be configured (I mean the low-level OS configuration, not config from the controller) and why (the cluster descriptions are partly the "why-s"). And I think this information would be useful for troubleshooting purposes.

Let me know If I missed any of your questions...

eriknordmark · 2022-01-07T11:47:09Z

probably makes sense to move DependsOn from Configurator to Item interface

Yes, that would make the dependencies visible so one can actually see the tree in the []Item.
But as you do that, would the list in []Item matter? Or the Reconciler will build a tree based on DependsOn?

deitch · 2022-01-07T12:56:44Z

Now that I think of it, it makes sense to call this component by what is does (state reconciliation), not by how it does it (i.e. by use of graph theory). So Reconciler is definitely a better name

Sounds good.

Hmm, if Reconciler has an API with Put(from, to State), should the call to migrate from from to to be Sync()? Or maybe just, Reconcile()?

I'm not sure if we will actually need the ability to have generic vs. specialized Configurators. At least within EVE I cannot think of a use-case right now. But we can certainly have this as a feature that might be helpful in the future.

Then the middle road is: don't bother with it now, but don't block it for the future. Keep the current RegisterConfigurator that uses the Type() output, and you always can add a Name-based one later.

For example, a network route description is pretty much always a tuple (dst, gw, interface, metric) regardless of which network stack is being used. So the same description could be used by a "LinuxRouteConfigurator", "OVSRouteConfigurator", etc. But then how would the reconciler know which configurator to use for a particular route instance? So I don't know about this, probably nonsense, just mentioning a different angle on the problem.

Actually, this is kind of interesting. I don't know that we need to solve it now, but what you are saying is that rather than having an item say, "I should be handled by this configurator", it says, "I can be solved by any configurator that implements X, and when you run me, you determine which one." That would send the responsibility to the registration phase.

My instinct here is that you are onto something really useful - items that should be handle by specific configurators, items that can be handled by any configurator that implements an interface - but that it should wait for a future iteration.

Maybe it could be called instead:
RegisterConfiguratorByItemType(configurator Configurator, itemType string) error
And then have also a (higher-priority) override for a specialized configurator:
RegisterConfiguratorByItemName(configurator Configurator, itemName string) error

The idea of item type makes sense; the idea of item name, though, is challenging. How would the network stack, e.g. have any clue what is coming down in a configuration and what should be called? Isn't the whole purpose of this to receive configuration and apply it? If my code knows, "I handle eth0 this way", then I weaken the benefits of configuration-driven management.

I think I would start simple, with RegisterConfigurator() as you have it. We always can add ByType or ByName in the future, if/when we need it, without breaking what exists.

But don't think of Type() as a method for telling the reconciler which Configurator to use. That would be another instance of Items being closely coupled with Configurators. We can support different ways of mapping Configurators to Items, but this should be done through the Reconciler API, not by having Configurators or Items telling the reconciler to what they should be mapped.

Something is going to call that API - the calling module, e.g. NIM, and going to tell it how to map things. There will be some string (like the Item.Type()), or interface, or something. We should not require the caller to know what the contents of the config are. Only the what it can handle and how to handle it.

In any case, I would stick with Type-mapping for now. With some experience running this, I suspect the new use cases and optimal ways to use them will open up.

So in my current implementation, the input to the Reconciler is a set of items, not graph.
This is because, in essence, what you as a user are telling the Reconciler is that you want to have this set of items currently configured (and nothing else).

Arguably, as set of Items is a graph, just a really simple one. Nothing prevents you from saying, "right now, we start with a the simplest graph, which is a list of items, and therefore no dependencies or ordering, and in the future we will support dependencies".

These items may have some dependencies and that determines what ordering of operations is valid, but the fact the Reconciler is based on graph theory and uses DAG as a data structure is a matter of implementation

This is semantic debate, but I think the fact that it uses a DAG as an input - items with their dependencies (ok, as a true graph, a single invisible root with an array of sub-nodes, each of which may or may not have more dependencies, but that is nitpicking details) - means that the API actually is a graph. But either way, it doesn't change what you are doing.

a cluster is not a graph per se. It is a grouping of related items (a subset of items if we are talking about the state of the system as a whole). For example (from EVE), all configuration items created for a specific application can be grouped under a cluster.

Any subgraph of a graph is, by definition, a graph (albeit a smaller one). However, from Reconciler's perspective, it doesn't care. It can get a graph of everything, or just a graph of a small subset, or even just one item. It receives current and target state, has configurators, is told Sync(), and it goes.

Another example from EVE: when user un-deploys an application, you do not want to re-render the whole intended state (inefficient), but rather tell the Reconciler to remove the cluster of items that represent the application.

I still see this as just a subgraph, which means Reconciler has no idea. It just gets current and future state and configurators and runs.

You could take this one step further, and have a single graph for the entire state of the entire eve, and have the Reconciler run through the whole graph and figure out which subgraphs are changed, and then call configurators just for those (or create temporary new reconcilers for each minimal changed subgraph and register configurators from its own sets and call Sync()).

Descriptions are only a bonus. If you visualize the current/intended state (using RenderDot() + graphviz), descriptions allow you to better understand the state and the purpose of items inside it.

Indeed. When you put that in the README, I instantly saw the value of them. Nicely done.

I think it might make sense to add From() as a way of updating/replacing the view of the actual state. For EVE this could be useful when the device is malfunctioning and so the view of the actual state can be wrong. But the scenarios of this in-memory state representation getting corrupted or out-of-date will be probably rather rare, therefore I would add From() next to Put() and Del().

I get what you are saying. I think that the current state should be stored in the caller, and not in the Reconciler. The reconciler should be narrowly focused on "give me the current state and target state, tell me to reconcile, and watch me go. When I am done, you can throw me away. If you want a new reconciliation, create a new Reconciler and do the same thing again." We could keep a single object around, for efficiency, but no memory caching.

In theory, your proposal (replace Put/Del with From) is more robust. However, in practice it will be probably very inefficient to use. As I mentioned, usually only a small portion of the intended state changes, so rebuilding it as a whole will be unnecessarily costly. Therefore it makes sense to cache the last applied state.

I think I just understood what you are saying. I think you are saying not that the reconciliation will be expensive, but that the absorption will be expensive. i.e. when I give a Reconciler a current state or a future state, it needs to do something with that to make it useful, before it runs reconciliation?

I am somewhat skeptical that will be the case, especially if we use the objects as-is. Once we convert it to something (what is that something?), it will be about the same.

In any case, let's see what that func is, see if it really gets expensive or saves us anything. This may turn out to be a case of premature optimization, or it might not.

But also note that I plan to add support for asynchronous operations,

Oh, nice. I really like this, and it fits with the whole eve philosophy (awful pseudo-code).

r, err := reconciler.New
r.Put(from, to)
c := make(chan error, 1)
err := r.Reconcile(c)

When it is done, it sent the error, or nil, down the channel.

In any case, I think it will be useful to have a representation of the intended state stored/cached in-memory for EVE (with cache being implemented in one place as a lib for all microservices)

I agree. I don't think Reconciler is the place for it; I think that configmanager is.

Currently, you cannot log into a device and ask EVE about what is supposed to be configured (I mean the low-level OS configuration, not config from the controller)

How do you separate the two? Isn't the entire config driven by the controller?

eriknordmark · 2022-01-07T15:43:01Z

libs/depgraph/README.md

+further denoted as *configuration item* or just *item*, can be represented by a single graph node.
+In the graph's API, this is described by the Item interface (see [depgraph_api.go](./depgraph_api.go))
+Dependencies between items are modeled using directed edges. For the start, the graph supports
+dependency with the semantics *"must exist"*. To give an example from networking: if a route depends


'must exist' might not be sufficient, because the vlan link might exist before a change and exist after the change, but if it was deleted and recreated then the routes referencing it will have to be recreated. 'must exist and unchanged' would capture that.

I don't know if it is worth-while to try to distiguish between modifications to an parent which does vs. does not require the children to redo/recreate things, or just assume that if anything changes in a parent then the children will have to at least check in their modify/change action whether they need to do something.

Note that "must exist" is a dependency applied to the entire lifetime of an item, not just prior to Create. So if a route depends on a VLAN link and the VLAN link is being re-created, it means that route will get deleted before the VLAN is, and then the route is created back after the VLAN is recreated. In other words, while the VLAN link is deleted (even if it is just "momentarily"), the route must be deleted as well.
"must exist" here means the route cannot exist without the VLAN.

Also, even if modification is done in-place (not by recreate), it is possible to explicitly request re-create of items that depend on it. This may be needed in some cases.

eriknordmark · 2022-01-07T15:46:17Z

libs/depgraph/README.md

+   however. Configurator will have an option (probably by use of the context argument)
+   to continue running operation inside a separate Go routine, exiting back to depgraph
+   immediately and informing that the state change continues in the background.


FWIW this doesn't fit with how nim currently works where it asks domainmgr to get an interface back, when there is a change to AssignableAdapters in the nim event loop it will re-evaulate the current state and proceed.

It could be that we should rewrite and simplify nim to have a goroutine which does the application of a new DevicePortConfig, with the AssignableAdapter handler kicking such a goroutine. Is that what you have in mind?

Actually for this kind of case I have prepared a different solution - items in the graph marked as "external", representing objects managed by other agents (microservices): https://github.com/milan-zededa/eve/blob/dependency-graph/libs/depgraph/depgraph_api.go#L169-L175
In the NIM depgraph there will two items in the graph for a device port as a whole, say "physio" and "system-adapter", where "physio" will be an external item, added/removed to/from the graph based on AssignableAdapters (or it can be a notification from netlink), and "system-adapter" will depend on "physio" and contain the interface configuration to apply. So Create of "system-adapter" doesn't actually "create" the interface, rather applies the desired interface configuration (like MTU etc.), but only after the interface is brought back from pciback.

Asynchronicity will be used for Create/Modify/Delete operations if they could take too long to complete (e.g. download a blob, untar an archive, copy something big, etc.) and would therefore delay other managed objects that do not even depend on it.

I guess I don't see exactly what that flow would look like hence hard to comment.
As you "execute" the steps and get to an external, then how and who tracks that some event (from the asynch being done) will need to restart the execution at that point and presumably also capture the case when the external asynch fails or times out?

It seems like this executing can be complex. Currently how tied is the execution of the steps separable from the determination of the steps needed to reconcile? If it is possible to just ask the reconciler for the steps and separately execute them (including tracking what remains to be executed), then it might be easier to handle the asynch steps.

But perhaps you've already figured this out. Should we have a call to discuss?

milan-zededa · 2022-01-08T09:20:08Z

I agree. I don't think Reconciler is the place for it; I think that configmanager is

@deitch Where can I find "configmanager"? (or is that just a name that such a component should have?)

deitch · 2022-01-09T10:41:58Z

@milan-zededa sorry that was not an actual component. I blanked at the time on which part of eve-os received the config and processed it, cached it, etc. That happens in zedagent if I recall correctly (which really should get a new name).

milan-zededa · 2022-01-10T09:49:19Z

@deitch
Until I re-design reconciler and make it stateless, could we first agree and approve the interfaces for Configurators and configured items? It will allow me to go ahead with the NIM refactoring and start preparing implementations for objects managed by NIM.
The interfaces are here:
https://github.com/milan-zededa/eve/blob/dependency-graph/libs/depgraph/depgraph_api.go#L143-L181
https://github.com/milan-zededa/eve/blob/dependency-graph/libs/depgraph/depgraph_api.go#L355-L369
One thing I'm considering to change is to rename Item to StatefulObj. The reason is that item represents a stateful object (that can be created, modified and deleted; like interface, route, volume, etc.) and the typename "Item" is rather vague.

deitch · 2022-01-10T10:36:11Z

could we first agree and approve the interfaces for Configurators and configured items?

In principle, it looks pretty good. I have some questions:

can you rename the package to reconciler (or whatever we decide on) from the beginning? Also the name of the thing Reconciler or whatever? I think it will make life easier to change some implementations rather than the entire naming convention
How do dependencies work? If I have 5 objects, and number 2 depends on 3 & 4 and number 4 depends on 5, what does it look like?

One thing I'm considering to change is to rename Item to StatefulObj. The reason is that item represents a stateful object (that can be created, modified and deleted; like interface, route, volume, etc.) and the typename "Item" is rather vague.

I would counsel against that. Besides the ugliness of the name, it represents what it is under the covers. I don't think people think about "stateful" vs "stateless" or anything like that. Consumers just look at it and say, "oh, ok, I need to build a graph representation of my configuration items". I think your first instinct of Item was simple and very correct.

milan-zededa · 2022-01-10T11:15:08Z

can you rename the package to reconciler (or whatever we decide on) from the beginning?

OK

How do dependencies work? If I have 5 objects, and number 2 depends on 3 & 4 and number 4 depends on 5, what does it look like?

Say you have objects Item1, Item2, ... Item5, with names "1", "2", ..., respectively (as returned by Item.Name()).
Then Item.Dependencies would be implemented as follows:

func (item Item1) Dependencies() []depgraph.Dependency {
	return []depgraph.Dependency{}
}

func (item Item2) Dependencies() []depgraph.Dependency {
	return []depgraph.Dependency{}
		&depgraph.ItemIsCreated{
			ItemName:   "3",
			Description: "Describe why it depends on 3...",
		},
		&depgraph.ItemIsCreated{
			ItemName:   "4",
			Description: "Describe why it depends on 4...",
		},
	}
}

func (item Item3) Dependencies() []depgraph.Dependency {
	return []depgraph.Dependency{}
}

func (item Item4) Dependencies() []depgraph.Dependency {
		return []depgraph.Dependency{}
		&depgraph.ItemIsCreated{
			ItemName:   "5",
			Description: "Describe why it depends on 5...",
		},
	}
}

func (item Item5) Dependencies() []depgraph.Dependency {
	return []depgraph.Dependency{}
}

milan-zededa · 2022-01-10T11:48:30Z

@deitch
One more thing: Do you prefer Cluster or SubGraph? SubGraph is self-explanatory, but cluster is probably somewhat nicer name.

deitch · 2022-01-18T19:40:00Z

libs/depgraph/depgraph_api.go

+	// PutPrivateData allows the user to store any data with the graph.
+	// Note that it is also possible to store private data next to each node.
+	// (see Node.PrivateData)
+	PutPrivateData(interface{})


What is this for?

The reconciler needs to have some context alongside the current-state graph to manage async operations. Also it needs to have the ability to resume reconciliation after an async operation(s) finalized. My goal here (and with the Reconciler API) was to have the reconciler stateless, so that you can throw it out even if some async operations are still running. Also the reconciliation is run in the same Go routine as the one from which it was called (apart from async operations). So no Go routines are spawned by Reconciler (only by Configurators maybe for async ops). So when Reconcile() returns, you know that no reconciliation is ongoing, no one else is reading/writing the current/intended state graphs, and you can start next reconciliation (with possibly different intended state and a new instance of reconciler) whenever you want. Using this context stored alongside the current-state graph, the reconciler will be aware of async operations from previous reconciliations and their current status. I find this as less bug-prone even if it maybe diverges from common promise-style engines.

Anyway, say we switch to your suggested API:

ch := r.Reconcile(ctx, currentState, intendedState)

To do this, the reconciler would start the reconciliation inside a Go routine and return immediately.
The Go routine could stream status of executed operations through the channel.

The challenges here are:

What if I want to update the intendedState (I received an update from controller)? Can I update intendedState even if there is a Reconciler Go routine still possibly running and reading the graph? Do I need to support thread-safety in depgraph?

Can I trigger another reconciliation while the previous one is still running? If yes, then the next Reconcile needs to somehow communicate with the Go routine of the previous one and cooperate to avoid race conditions.

I still don't get what the private data is for. Who uses it and how? Isn't that distinct from starting new reconciliation?
Is this private information for configurators, like credentials? Wouldn't that be provided to the configurator?

you can throw it out even if some async operations are still running.

if you "throw away" the reconciler reference, whatever is still running, well, still will be running.

So when Reconcile() returns, you know that no reconciliation is ongoing, no one else is reading/writing the current/intended state graphs

I don't get it. Even with the current design, don't you have some configurators that could be running in the background?

Using this context stored alongside the current-state graph, the reconciler will be aware of async operations from previous reconciliations and their current status

So, private data is a current state from a reconciler? I can sort of do:

start a reconciliation

export the state

stop a reconciliation

start a new reconciliation importing the state

keep going?

With some running in the background, how do you make this reliable? More importantly, is there a use case wherein it is necessary?

So, private data is a current state from a reconciler

Yes

I can sort of do:

start a reconciliation

export the state

stop a reconciliation

start a new reconciliation importing the state

keep going?

With the current API this is simplified for the user to just:

run a reconciliation

do something else, maybe update the intended state

run a reconciliation

keep going

I didn't want to have the reconciliation context separate from the graph to decrease the chance of bugs (like lost context, or used old obsolete context, etc.).
Also the user doesn't have to keep track of previous reconciliations, he does not have to stop them, graphs do not have to be locked, etc.
I think that with the current API it will be simpler, but at this point the discussion is only theoretical. I would like to first use the reconciler in EVE, get some experience and maybe then change the API if it is not very good to work with. Here we are talking about the return value of only a single function (Reconcile()), so making modifications to it in the future should not be so drastic.

I still don't quite get it, but if it is something we don't even know that we will use, then sure, start with it simple, we can gain experience.

deitch · 2022-01-18T19:44:06Z

libs/depgraph/depgraph_api.go

+// Note that Node implements methods ID() and AsGraph(). The latter allows
+// the node to present itself as a single-node read-only graph. Use it
+// only after the node was put into an actual graph with PutNode().
+type Node struct {


Still confused by this. If this is for reconciliation, it probably should live in reconciler package. Is the root of a graph (or subgraph, same thing) an Item or Node? Why do I need a Node to wrap my Item to do things?

I see that now. Will get rid of this Node wrapper and will respond to your other comment about nodes above.

deitch · 2022-01-18T19:46:42Z

libs/reconciler/README.md

+in the depgraph package. To summarize: every configuration primitive (OS config, file,
+process, etc.), further denoted as *configuration item* or just *item*, can be represented
+by a single graph node, while dependencies between items are modeled using directed edges.
+The graph supports dependency with a semantics *"must exist"*.\


Extra \ at the end of the line.

This is forced line-break: https://szymonkrajewski.pl/line-break-in-markdown/

deitch · 2022-01-18T19:52:50Z

libs/reconciler/README.md

+currentState = status.NewCurrentState
+```
+
+With asynchronous operations the usage is still relatively simple.\


Another \ here at the end of the line

This is forced line-break: https://szymonkrajewski.pl/line-break-in-markdown/

deitch · 2022-01-18T19:54:00Z

libs/reconciler/README.md

+Some of these events may result in a change for the intended state and therefore
+require to run a state reconciliation.
+
+Reconciler always only waits for synchronous operations and exists immediately when all


Did you mean "exits"?

So there are two kinds of configurator actions, those that are sync and those that are async?

I find this model quite confusing, probably because it is different than how most promises-based engines I have seen work.

Is there any reason that we cannot do a promise-style? The following is the API we always use:

ch := r.Reconcile(ctx, currentState, intendedState)

It always returns a channel, which we can monitor for when things are done. That would simplify the interface from the outside. Whether the configurator actions are sync or async, doesn't matter to what calls Reconcile(). It always gets a channel and selects on that channel. The returned object can contain an error or errors, success, whatever. The interface is just a promise.

Did you mean "exits"?

Yes, fixed.

Is there any reason that we cannot do a promise-style?

See my comments above. Mainly I didn't want the Reconciler to run in a separate Go routine. The reason is to avoid race-conditions between successive reconciliations as well as between the Reconciler and the user's code. Basically the user decides what runs in a separate Go routine (inside a Configurator) and what runs synchronously from within Reconcile().

deitch · 2022-01-18T19:57:01Z

libs/reconciler/README.md

+//  - currentState is of type depgraph.Graph
+//  - intendedState is of type depgraph.GraphR (subset of depgraph.Graph methods)
+// See libs/depgraph/README.md to learn how to build and update a graph.
+r := reconciler.New(registry)


might want to extend this example a bit, so you can see :

registry := &reconciler.DefaultRegistry{} registry.Register(...) r := reconciler.New(registry)

milan-zededa · 2022-01-27T16:04:32Z

@deitch @eriknordmark
I have added a whole bunch of unit tests to cover all kinds of scenarios. In terms of code coverage, depgraph package has 87.1%, reconciler 85.7%. With that, the PR is complete. Most of the suggestions from the review were reflected in one way or another. Still some changes could be done in the future as we start using these packages and gain experience.

As a next step I would like to actually use this, starting with the NIM microservice. Given how much time I've spent on this, I really hope that it will be have a noticeably positive effect on our code quality, readability, extensibility, etc. But that would come in next PRs.
Anyway, this PR is ready for merge and at this point it should be completely harmless because there are no changes, just new code (under /libs) which is not yet imported by eve/pillar from anywhere.

eriknordmark

I still need to understand how the external/asynch operations will be handled using this, but that will become obvious as it is applied to nim. So moving along with this PR.
Thanks for the documentation and all the tests!

milan-zededa · 2022-01-28T09:06:16Z

@eriknordmark I've added control-flow examples into the readme which should help you (and others) better understand how async operations and external items are handled: https://github.com/milan-zededa/eve/tree/dependency-graph/libs/reconciler#control-flow-examples
Still, if you want we can have a call and I can explain in more detail.

This commit adds packages "depgraph" and "reconciler" under EVE/libs, implementing a dependency graph [1] and a state reconciliation algorithm, respectively. It will be used in EVE to generically solve the problem of reconciliation between the actual/current (configuration) state and the latest intended/desired state. Every time the desired state changes, a certain sequence of Create, Modify and Delete operations has to be scheduled and executed to update the current state accordingly. The order of operations often matters as it has to respect all dependencies that exist between configuration items. The plan is to first refactor NIM and use dependency graph and reconciler for configuration items that device networks are built with. Based on how NIM refactoring goes, this solution could be used for zedrouter next. For more detailed reasoning and motivation behind depgraph+reconciler, please refer to the proposal presented on the LF Edge wiki [2]. [1]: https://en.wikipedia.org/wiki/Dependency_graph [2]: https://wiki.lfedge.org/display/EVE/Dependency+graph+for+configuration+items Signed-off-by: Milan Lenco <milan@zededa.com>

milan-zededa self-assigned this Jan 3, 2022

milan-zededa requested review from eriknordmark and rvs as code owners January 3, 2022 08:50

milan-zededa force-pushed the dependency-graph branch 6 times, most recently from 686cce4 to 9c8b4d8 Compare January 5, 2022 10:34

milan-zededa force-pushed the dependency-graph branch 2 times, most recently from d67e164 to 0fc5929 Compare January 5, 2022 11:25

milan-zededa force-pushed the dependency-graph branch from 0fc5929 to 873b691 Compare January 5, 2022 16:17

eriknordmark reviewed Jan 7, 2022

View reviewed changes

milan-zededa marked this pull request as draft January 10, 2022 08:25

milan-zededa force-pushed the dependency-graph branch 2 times, most recently from 0125633 to 454a78e Compare January 10, 2022 09:04

deitch reviewed Jan 18, 2022

View reviewed changes

milan-zededa force-pushed the dependency-graph branch 11 times, most recently from fbcfeb1 to 53b088e Compare January 25, 2022 10:10

milan-zededa force-pushed the dependency-graph branch 4 times, most recently from 76bb8fa to 26db1e8 Compare January 27, 2022 15:28

milan-zededa marked this pull request as ready for review January 27, 2022 15:50

milan-zededa force-pushed the dependency-graph branch from 26db1e8 to 0ed6379 Compare January 27, 2022 16:47

eriknordmark approved these changes Jan 27, 2022

View reviewed changes

milan-zededa force-pushed the dependency-graph branch from 0ed6379 to 654a3c5 Compare January 28, 2022 09:02

milan-zededa force-pushed the dependency-graph branch from 654a3c5 to 2d630ea Compare January 28, 2022 09:10

eriknordmark merged commit 4394b60 into lf-edge:master Jan 28, 2022

State reconciliation with dependency graphs #2447

State reconciliation with dependency graphs #2447

Conversation

milan-zededa commented Jan 3, 2022 • edited

milan-zededa commented Jan 5, 2022

deitch commented Jan 5, 2022

milan-zededa commented Jan 5, 2022 • edited

deitch commented Jan 6, 2022

Naming

Handlers

Dependencies in the Graph

Items to Configurators

Cluster

Current and Target State

Conclusion

milan-zededa commented Jan 7, 2022

Naming

Handlers

Dependencies in the Graph

Items to Configurators

Clusters

Current and Target state

eriknordmark commented Jan 7, 2022

deitch commented Jan 7, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

milan-zededa Jan 8, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

milan-zededa commented Jan 8, 2022 • edited

deitch commented Jan 9, 2022

milan-zededa commented Jan 10, 2022

deitch commented Jan 10, 2022

milan-zededa commented Jan 10, 2022

milan-zededa commented Jan 10, 2022

Choose a reason for hiding this comment

milan-zededa Jan 19, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

milan-zededa commented Jan 27, 2022 • edited

eriknordmark left a comment

Choose a reason for hiding this comment

milan-zededa commented Jan 28, 2022

milan-zededa commented Jan 3, 2022 •

edited

milan-zededa commented Jan 5, 2022 •

edited

milan-zededa Jan 8, 2022 •

edited

milan-zededa commented Jan 8, 2022 •

edited

milan-zededa Jan 19, 2022 •

edited

milan-zededa commented Jan 27, 2022 •

edited