New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
State reconciliation with dependency graphs #2447
Conversation
686cce4
to
9c8b4d8
Compare
Moved to libs/, removed dependency on the pillar's logger and added an example (under libs/depgraph/examples) as suggested by @deitch (see communication on the eve-tsc mailing list). |
d67e164
to
0fc5929
Compare
This is really great. I'm super happy to see complexity and repetitive code pulled out of eve, simplifying it. I've got some suggestions, will put these on here later. @milan-zededa has helpfully been explaining things to me. |
Note that for the start I recommend to review the documentation (Readme file), the APIs ( |
0fc5929
to
873b691
Compare
Before jumping in with some structural recommendations, I need to point out how great this is. Anything that helps eve (and especially things in pillar) act more independently and without having to repeat code and patterns over and over is a good thing. For some reason, you decided to tackle one of the hardest things - state changes - first, so hats off. Also, thank you (many times over) for writing up a solid README on this, including examples. Too many people miss that, and it makes it harder to grasp. This really really helps. NamingMy biggest single comment is that, well, this is not a dependency graph, so naming it that confuses people. What you built is something much more powerful: a migration controller or state controller or reconciler or something like that whose input is two dependency graphs: current and target state (either of which might be empty). As you described it to me offline, the process looks like this:
The reconciler (depgraph here) then uses various handlers to make changes.
I would recommend naming it exactly what it does (taking your examples from the README and modifying them, leaving out some error handling): log := MyLogger{}
r := reconciler.New(log)
err := r.RegisterConfigurator(LinuxInterfaceConfigurator{}, "linux-interface")
err := r.RegisterConfigurator(LinuxRouteConfigurator{}, "linux-route")
// THIS is a graph
myNetwork := reconciler.Graph{
Name: "My network",
Description: "Network managed by an agent.",
Items: []reconciler.Item{
LinuxBridge{name: "bd0", uplink: "eth0"},
LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1"},
LinuxInterface{name: "eth0"}, // Could be and external item
},
}
r.Cluster(myNetwork.Name).Put(myNetwork)
err := r.Sync(ctx) If I give someone writing state machine code a library called "dependency graph", they probably will say, "what do I need one for, perhaps as a way to represent my dependencies, but who uses this one?". If I give the same person a state reconciler, they say, "oh, exactly what I need! How do I represent the state and how do I tell it what to call to reconcile? Look, state is a DAG, I get that; handlers are registered, cool! Solved my problems!" Exact same thing you did, but different names. HandlersI like what you did here. Register a handler, give it a key (the string) to enable the reconciler to link Items to the handler, it works. I also like that you call it Configurator. I thought I didn't like it, kept trying to come up with something better, and every time I analyzed it, came back to your name. :-) The way you have built the handlers, as well, they can be generic or specific. So I can build a configurator that knows how to handle "all Linux bridges", and one that specifically knows how to handle "Linux bridge br3": brc := LinuxBridgeConfigurator{}
brspecial := LinuxBridgeConfigurator{specialOption: "br3"}
err := r.RegisterConfigurator(brc, "linux-bridge")
err := r.RegisterConfigurator(brspecial, "linux-bridge-br3")
myNetwork := reconciler.Graph{
Name: "My network",
Description: "Network managed by an agent.",
Items: []reconciler.Item{
LinuxBridge{name: "bd0", uplink: "eth0"},
LinuxBridge{name: "br3", uplink: "eth0"},
},
} Dependencies in the GraphThe "graph" actually isn't a graph right now, it is an array: myNetwork := reconciler.Graph{
Name: "My network",
Description: "Network managed by an agent.",
Items: []reconciler.Item{
LinuxBridge{name: "bd0", uplink: "eth0"},
LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1"},
LinuxInterface{name: "eth0"}, // Could be and external item
}, Does the myNetwork := reconciler.Graph{
Name: "My network",
Description: "Network managed by an agent.",
Items: []reconciler.Item{
LinuxBridge{name: "bd0", uplink: "eth0"},
LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1", dependencies: ...},
LinuxInterface{name: "eth0"},
}, There are different ways to represent graph semantics, e.g. such that If you want, I can track down some and put them in here. Either way, you really want to be able to reflect your entire desired state and dependencies as, well, a graph. Items to ConfiguratorsCan you describe in the documentation, and with examples, how the reconciler knows which Configurator handles which Item? err := r.RegisterConfigurator(LinuxInterfaceConfigurator{}, "linux-interface")
err := r.RegisterConfigurator(LinuxRouteConfigurator{}, "linux-route")
myNetwork := reconciler.Graph{
Name: "My network",
Description: "Network managed by an agent.",
Items: []reconciler.Item{
LinuxBridge{name: "bd0", uplink: "eth0"},
LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1", dependencies: ...},
LinuxInterface{name: "eth0"},
}, There is nothing obvious here that links it. In Slack, you described that the In the same vein, we should be consistent. If the way an Item says, "here is the way to find the Configurator to handle me" is via brc := LinuxBridgeConfigurator{Type: "linux-bridge"}
brspecial := LinuxBridgeConfigurator{Type: "linux-bridge-br3", specialOption: "br3"}
err := r.RegisterConfigurator(brc, "linux-bridge")
err := r.RegisterConfigurator(brspecial, "linux-bridge-br3")
myNetwork := reconciler.Graph{
Name: "My network",
Description: "Network managed by an agent.",
Items: []reconciler.Item{
NewLinuxBridge({"bd0", uplink: "eth0"}), // default Type() == "linux-bridge"
NewLinuxBridge({"br3", uplink: "eth0", type: "linux-bridge-br3}), // override: Type() == "linux-bridge-br3"
},
} ClusterI don't fully get what What is wrong with? log := MyLogger{}
r := reconciler.New(log)
err := r.RegisterConfigurator(LinuxInterfaceConfigurator{}, "linux-interface")
err := r.RegisterConfigurator(LinuxRouteConfigurator{}, "linux-route")
myNetwork := reconciler.Graph{
Name: "My network",
Description: "Network managed by an agent.",
Items: []reconciler.Item{
LinuxBridge{name: "bd0", uplink: "eth0"},
LinuxRoute{dst: "192.168.1.0/24", gw: "192.168.1.1"},
LinuxInterface{name: "eth0"}, // Could be and external item
},
}
r.Put(myNetwork)
err := r.Sync(ctx) Is it that we want the reconciler to have a name and description? So why not give it to the reconciler, rather than the Graph or the Cluster? In your above examples, the That having been said, I could see why each individual graph instances might want to have a name and a description, although eve is unlikely to use those. Current and Target StateThe way this is built, I What happens if it lost the in-memory state, or it got messed up, or my code missed it? I could end up calling the wrong handlers with the wrong information. Could we change ConclusionThis was a very long comment, mainly because I really like what you are doing. I just have some suggestions around naming and interfaces. |
Thanks @deitch for a detailed review. I will respond section-by-section: NamingNow that I think of it, it makes sense to call this component by what is does (state reconciliation), not by how it does it (i.e. by use of graph theory). So HandlersI'm not sure if we will actually need the ability to have generic vs. specialized Configurators. At least within EVE I cannot think of a use-case right now. But we can certainly have this as a feature that might be helpful in the future. But also note that it might make sense to think about this the other way. What if Items (the descriptions) should be generic and Configurators always specific? For example, a network route description is pretty much always a tuple (dst, gw, interface, metric) regardless of which network stack is being used. So the same description could be used by a "LinuxRouteConfigurator", "OVSRouteConfigurator", etc. But then how would the reconciler know which configurator to use for a particular route instance? So I don't know about this, probably nonsense, just mentioning a different angle on the problem. Dependencies in the GraphSo firstly, it probably makes sense to move Note that while RegisterConfigurator(configurator Configurator, itemType string) error Maybe it could be called instead: RegisterConfiguratorByItemType(configurator Configurator, itemType string) error And then have also a (higher-priority) override for a specialized configurator: RegisterConfiguratorByItemName(configurator Configurator, itemName string) error Next to this point:
So in my current implementation, the input to the Reconciler is a set of items, not graph. And as for clusters - again, a cluster is not a graph per se. It is a grouping of related items (a subset of items if we are talking about the state of the system as a whole). For example (from EVE), all configuration items created for a specific application can be grouped under a cluster. This is useful mostly for when you are building the intended state. Clusters do not affect the reconciliation algorithm. Typically, when the intended state changes, it does not change completely, only some part of it. Another example from EVE: when user un-deploys an application, you do not want to re-render the whole intended state (inefficient), but rather tell the Reconciler to remove the cluster of items that represent the application. Or when the application definition changes, you only need to re-render and replace the content of that one cluster. Items to ConfiguratorsAlready mentioned above. I will improve the documentation to make this topic more clear (once we fully agree on the interface). ClustersAlready mentioned above. If it is still confusing, I can draw some example of how it will be used in EVE. Current and Target stateI think it might make sense to add In theory, your proposal (replace Put/Del with From) is more robust. However, in practice it will be probably very inefficient to use. As I mentioned, usually only a small portion of the intended state changes, so rebuilding it as a whole will be unnecessarily costly. Therefore it makes sense to cache the last applied state. This cache could be separated from the Reconciler and I will think about how this could be done. The challenge is to not make this tightly-coupled with the Reconciler (because then it doesn't make sense to separate it), but at the same time design the cache API such that the reconciliation algorithm can remain being effective. If it will need to get the cache content and convert it to some internal data structure, then it will be very inefficient. So the cache should definitely be already a DAG with appropriate getters. But also note that I plan to add support for asynchronous operations, which will make the Reconciler even more powerful and useful tool at disposal. When added, In any case, I think it will be useful to have a representation of the intended state stored/cached in-memory for EVE (with cache being implemented in one place as a lib for all microservices). Currently, you cannot log into a device and ask EVE about what is supposed to be configured (I mean the low-level OS configuration, not config from the controller) and why (the cluster descriptions are partly the "why-s"). And I think this information would be useful for troubleshooting purposes. Let me know If I missed any of your questions... |
Yes, that would make the dependencies visible so one can actually see the tree in the []Item. |
Sounds good. Hmm, if Reconciler has an API with
Then the middle road is: don't bother with it now, but don't block it for the future. Keep the current
Actually, this is kind of interesting. I don't know that we need to solve it now, but what you are saying is that rather than having an item say, "I should be handled by this configurator", it says, "I can be solved by any configurator that implements X, and when you run me, you determine which one." That would send the responsibility to the registration phase. My instinct here is that you are onto something really useful - items that should be handle by specific configurators, items that can be handled by any configurator that implements an interface - but that it should wait for a future iteration.
The idea of item type makes sense; the idea of item name, though, is challenging. How would the network stack, e.g. have any clue what is coming down in a configuration and what should be called? Isn't the whole purpose of this to receive configuration and apply it? If my code knows, "I handle eth0 this way", then I weaken the benefits of configuration-driven management. I think I would start simple, with
Something is going to call that API - the calling module, e.g. NIM, and going to tell it how to map things. There will be some string (like the In any case, I would stick with Type-mapping for now. With some experience running this, I suspect the new use cases and optimal ways to use them will open up.
Arguably, as set of Items is a graph, just a really simple one. Nothing prevents you from saying, "right now, we start with a the simplest graph, which is a list of items, and therefore no dependencies or ordering, and in the future we will support dependencies".
This is semantic debate, but I think the fact that it uses a DAG as an input - items with their dependencies (ok, as a true graph, a single invisible root with an array of sub-nodes, each of which may or may not have more dependencies, but that is nitpicking details) - means that the API actually is a graph. But either way, it doesn't change what you are doing.
Any subgraph of a graph is, by definition, a graph (albeit a smaller one). However, from
I still see this as just a subgraph, which means You could take this one step further, and have a single graph for the entire state of the entire eve, and have the Reconciler run through the whole graph and figure out which subgraphs are changed, and then call configurators just for those (or create temporary new reconcilers for each minimal changed subgraph and register configurators from its own sets and call
Indeed. When you put that in the README, I instantly saw the value of them. Nicely done.
I get what you are saying. I think that the current state should be stored in the caller, and not in the
I think I just understood what you are saying. I think you are saying not that the reconciliation will be expensive, but that the absorption will be expensive. i.e. when I give a Reconciler a current state or a future state, it needs to do something with that to make it useful, before it runs reconciliation? I am somewhat skeptical that will be the case, especially if we use the objects as-is. Once we convert it to something (what is that something?), it will be about the same. In any case, let's see what that func is, see if it really gets expensive or saves us anything. This may turn out to be a case of premature optimization, or it might not.
Oh, nice. I really like this, and it fits with the whole eve philosophy (awful pseudo-code). r, err := reconciler.New
r.Put(from, to)
c := make(chan error, 1)
err := r.Reconcile(c) When it is done, it sent the error, or nil, down the channel.
I agree. I don't think
How do you separate the two? Isn't the entire config driven by the controller? |
libs/depgraph/README.md
Outdated
further denoted as *configuration item* or just *item*, can be represented by a single graph node. | ||
In the graph's API, this is described by the Item interface (see [depgraph_api.go](./depgraph_api.go)) | ||
Dependencies between items are modeled using directed edges. For the start, the graph supports | ||
dependency with the semantics *"must exist"*. To give an example from networking: if a route depends |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'must exist' might not be sufficient, because the vlan link might exist before a change and exist after the change, but if it was deleted and recreated then the routes referencing it will have to be recreated. 'must exist and unchanged' would capture that.
I don't know if it is worth-while to try to distiguish between modifications to an parent which does vs. does not require the children to redo/recreate things, or just assume that if anything changes in a parent then the children will have to at least check in their modify/change action whether they need to do something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that "must exist" is a dependency applied to the entire lifetime of an item, not just prior to Create. So if a route depends on a VLAN link and the VLAN link is being re-created, it means that route will get deleted before the VLAN is, and then the route is created back after the VLAN is recreated. In other words, while the VLAN link is deleted (even if it is just "momentarily"), the route must be deleted as well.
"must exist" here means the route cannot exist without the VLAN.
Also, even if modification is done in-place (not by recreate), it is possible to explicitly request re-create of items that depend on it. This may be needed in some cases.
libs/depgraph/README.md
Outdated
however. Configurator will have an option (probably by use of the context argument) | ||
to continue running operation inside a separate Go routine, exiting back to depgraph | ||
immediately and informing that the state change continues in the background. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW this doesn't fit with how nim currently works where it asks domainmgr to get an interface back, when there is a change to AssignableAdapters in the nim event loop it will re-evaulate the current state and proceed.
It could be that we should rewrite and simplify nim to have a goroutine which does the application of a new DevicePortConfig, with the AssignableAdapter handler kicking such a goroutine. Is that what you have in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually for this kind of case I have prepared a different solution - items in the graph marked as "external", representing objects managed by other agents (microservices): https://github.com/milan-zededa/eve/blob/dependency-graph/libs/depgraph/depgraph_api.go#L169-L175
In the NIM depgraph there will two items in the graph for a device port as a whole, say "physio" and "system-adapter", where "physio" will be an external item, added/removed to/from the graph based on AssignableAdapters (or it can be a notification from netlink), and "system-adapter" will depend on "physio" and contain the interface configuration to apply. So Create of "system-adapter" doesn't actually "create" the interface, rather applies the desired interface configuration (like MTU etc.), but only after the interface is brought back from pciback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asynchronicity will be used for Create/Modify/Delete operations if they could take too long to complete (e.g. download a blob, untar an archive, copy something big, etc.) and would therefore delay other managed objects that do not even depend on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I don't see exactly what that flow would look like hence hard to comment.
As you "execute" the steps and get to an external, then how and who tracks that some event (from the asynch being done) will need to restart the execution at that point and presumably also capture the case when the external asynch fails or times out?
It seems like this executing can be complex. Currently how tied is the execution of the steps separable from the determination of the steps needed to reconcile? If it is possible to just ask the reconciler for the steps and separately execute them (including tracking what remains to be executed), then it might be easier to handle the asynch steps.
But perhaps you've already figured this out. Should we have a call to discuss?
@deitch Where can I find "configmanager"? (or is that just a name that such a component should have?) |
@milan-zededa sorry that was not an actual component. I blanked at the time on which part of eve-os received the config and processed it, cached it, etc. That happens in zedagent if I recall correctly (which really should get a new name). |
0125633
to
454a78e
Compare
@deitch |
In principle, it looks pretty good. I have some questions:
I would counsel against that. Besides the ugliness of the name, it represents what it is under the covers. I don't think people think about "stateful" vs "stateless" or anything like that. Consumers just look at it and say, "oh, ok, I need to build a graph representation of my configuration items". I think your first instinct of |
OK
Say you have objects Item1, Item2, ... Item5, with names "1", "2", ..., respectively (as returned by Item.Name()).
|
@deitch |
// PutPrivateData allows the user to store any data with the graph. | ||
// Note that it is also possible to store private data next to each node. | ||
// (see Node.PrivateData) | ||
PutPrivateData(interface{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reconciler needs to have some context alongside the current-state graph to manage async operations. Also it needs to have the ability to resume reconciliation after an async operation(s) finalized. My goal here (and with the Reconciler API) was to have the reconciler stateless, so that you can throw it out even if some async operations are still running. Also the reconciliation is run in the same Go routine as the one from which it was called (apart from async operations). So no Go routines are spawned by Reconciler (only by Configurators maybe for async ops). So when Reconcile()
returns, you know that no reconciliation is ongoing, no one else is reading/writing the current/intended state graphs, and you can start next reconciliation (with possibly different intended state and a new instance of reconciler) whenever you want. Using this context stored alongside the current-state graph, the reconciler will be aware of async operations from previous reconciliations and their current status. I find this as less bug-prone even if it maybe diverges from common promise-style engines.
Anyway, say we switch to your suggested API:
ch := r.Reconcile(ctx, currentState, intendedState)
To do this, the reconciler would start the reconciliation inside a Go routine and return immediately.
The Go routine could stream status of executed operations through the channel.
The challenges here are:
- What if I want to update the intendedState (I received an update from controller)? Can I update
intendedState
even if there is a Reconciler Go routine still possibly running and reading the graph? Do I need to support thread-safety in depgraph? - Can I trigger another reconciliation while the previous one is still running? If yes, then the next Reconcile needs to somehow communicate with the Go routine of the previous one and cooperate to avoid race conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't get what the private data is for. Who uses it and how? Isn't that distinct from starting new reconciliation?
Is this private information for configurators, like credentials? Wouldn't that be provided to the configurator?
you can throw it out even if some async operations are still running.
if you "throw away" the reconciler reference, whatever is still running, well, still will be running.
So when Reconcile() returns, you know that no reconciliation is ongoing, no one else is reading/writing the current/intended state graphs
I don't get it. Even with the current design, don't you have some configurators that could be running in the background?
Using this context stored alongside the current-state graph, the reconciler will be aware of async operations from previous reconciliations and their current status
So, private data is a current state from a reconciler? I can sort of do:
- start a reconciliation
- export the state
- stop a reconciliation
- start a new reconciliation importing the state
- keep going?
With some running in the background, how do you make this reliable? More importantly, is there a use case wherein it is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, private data is a current state from a reconciler
Yes
I can sort of do:
- start a reconciliation
- export the state
- stop a reconciliation
- start a new reconciliation importing the state
- keep going?
With the current API this is simplified for the user to just:
- run a reconciliation
- do something else, maybe update the intended state
- run a reconciliation
- keep going
I didn't want to have the reconciliation context separate from the graph to decrease the chance of bugs (like lost context, or used old obsolete context, etc.).
Also the user doesn't have to keep track of previous reconciliations, he does not have to stop them, graphs do not have to be locked, etc.
I think that with the current API it will be simpler, but at this point the discussion is only theoretical. I would like to first use the reconciler in EVE, get some experience and maybe then change the API if it is not very good to work with. Here we are talking about the return value of only a single function (Reconcile()
), so making modifications to it in the future should not be so drastic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't quite get it, but if it is something we don't even know that we will use, then sure, start with it simple, we can gain experience.
libs/depgraph/depgraph_api.go
Outdated
// Note that Node implements methods ID() and AsGraph(). The latter allows | ||
// the node to present itself as a single-node read-only graph. Use it | ||
// only after the node was put into an actual graph with PutNode(). | ||
type Node struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still confused by this. If this is for reconciliation, it probably should live in reconciler package. Is the root of a graph (or subgraph, same thing) an Item
or Node
? Why do I need a Node
to wrap my Item
to do things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that now. Will get rid of this Node wrapper and will respond to your other comment about nodes above.
in the depgraph package. To summarize: every configuration primitive (OS config, file, | ||
process, etc.), further denoted as *configuration item* or just *item*, can be represented | ||
by a single graph node, while dependencies between items are modeled using directed edges. | ||
The graph supports dependency with a semantics *"must exist"*.\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra \
at the end of the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is forced line-break: https://szymonkrajewski.pl/line-break-in-markdown/
currentState = status.NewCurrentState | ||
``` | ||
|
||
With asynchronous operations the usage is still relatively simple.\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another \
here at the end of the line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is forced line-break: https://szymonkrajewski.pl/line-break-in-markdown/
libs/reconciler/README.md
Outdated
Some of these events may result in a change for the intended state and therefore | ||
require to run a state reconciliation. | ||
|
||
Reconciler always only waits for synchronous operations and exists immediately when all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean "exits"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there are two kinds of configurator actions, those that are sync and those that are async?
I find this model quite confusing, probably because it is different than how most promises-based engines I have seen work.
Is there any reason that we cannot do a promise-style? The following is the API we always use:
ch := r.Reconcile(ctx, currentState, intendedState)
It always returns a channel, which we can monitor for when things are done. That would simplify the interface from the outside. Whether the configurator actions are sync or async, doesn't matter to what calls Reconcile()
. It always gets a channel
and selects on that channel. The returned object can contain an error or errors, success, whatever. The interface is just a promise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean "exits"?
Yes, fixed.
Is there any reason that we cannot do a promise-style?
See my comments above. Mainly I didn't want the Reconciler to run in a separate Go routine. The reason is to avoid race-conditions between successive reconciliations as well as between the Reconciler and the user's code. Basically the user decides what runs in a separate Go routine (inside a Configurator) and what runs synchronously from within Reconcile()
.
// - currentState is of type depgraph.Graph | ||
// - intendedState is of type depgraph.GraphR (subset of depgraph.Graph methods) | ||
// See libs/depgraph/README.md to learn how to build and update a graph. | ||
r := reconciler.New(registry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want to extend this example a bit, so you can see :
registry := &reconciler.DefaultRegistry{}
registry.Register(...)
r := reconciler.New(registry)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
fbcfeb1
to
53b088e
Compare
76bb8fa
to
26db1e8
Compare
@deitch @eriknordmark As a next step I would like to actually use this, starting with the NIM microservice. Given how much time I've spent on this, I really hope that it will be have a noticeably positive effect on our code quality, readability, extensibility, etc. But that would come in next PRs. |
26db1e8
to
0ed6379
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to understand how the external/asynch operations will be handled using this, but that will become obvious as it is applied to nim. So moving along with this PR.
Thanks for the documentation and all the tests!
0ed6379
to
654a3c5
Compare
@eriknordmark I've added control-flow examples into the readme which should help you (and others) better understand how async operations and external items are handled: https://github.com/milan-zededa/eve/tree/dependency-graph/libs/reconciler#control-flow-examples |
This commit adds packages "depgraph" and "reconciler" under EVE/libs, implementing a dependency graph [1] and a state reconciliation algorithm, respectively. It will be used in EVE to generically solve the problem of reconciliation between the actual/current (configuration) state and the latest intended/desired state. Every time the desired state changes, a certain sequence of Create, Modify and Delete operations has to be scheduled and executed to update the current state accordingly. The order of operations often matters as it has to respect all dependencies that exist between configuration items. The plan is to first refactor NIM and use dependency graph and reconciler for configuration items that device networks are built with. Based on how NIM refactoring goes, this solution could be used for zedrouter next. For more detailed reasoning and motivation behind depgraph+reconciler, please refer to the proposal presented on the LF Edge wiki [2]. [1]: https://en.wikipedia.org/wiki/Dependency_graph [2]: https://wiki.lfedge.org/display/EVE/Dependency+graph+for+configuration+items Signed-off-by: Milan Lenco <milan@zededa.com>
654a3c5
to
2d630ea
Compare
This commit adds packages
depgraph
andreconciler
under EVE/libs, implementing a dependency graphand a state reconciliation algorithm, respectively.
It will be used in EVE to generically solve the problem of reconciliation between the actual/current
(configuration) state and the latest intended/desired state. Every time the desired state changes, a certain
sequence of
Create()
,Modify()
andDelete()
operations has to be scheduled and executed to updatethe current state accordingly. The order of operations often matters as it has to respect all dependencies
that exist between configuration items.
The plan is to first refactor NIM and use dependency graph and reconciler for configuration items that device
networks are built with.
Based on how NIM refactoring goes, this solution could be used for zedrouter next.
For more detailed reasoning and motivation behind
depgraph
+reconciler
, please referto the proposal presented on the LF Edge wiki.
Signed-off-by: Milan Lenco milan@zededa.com