Add NetworkDB docs #2238

Merged
merged 1 commit into from
Mar 14, 2019


talex5
Contributor

@talex5 talex5 commented Jul 20, 2018

This documentation addition is based on reading the code in the networkdb directory.

@GordonTheTurtle

Please sign your commits following these rules:
https://github.com/moby/moby/blob/master/CONTRIBUTING.md#sign-your-work
The easiest way to do this is to amend the last commit:

$ git clone -b "networkdb-docs" git@github.com:talex5/libnetwork.git somewhere
$ cd somewhere
$ git commit --amend -s --no-edit
$ git push -f

Amending updates the existing PR. You DO NOT need to open a new one.


@fcrisciani fcrisciani left a comment

Left some comments.
Definitely helpful for new people.

There are two databases used in libnetwork:

- A persistent database that stores the network configuration requested by the user. This is typically the SwarmKit managers' raft store.
- A non-persistent peer-to-peer gossip-based database that keeps track of the current runtime state. This is NetworkDB.

Maybe we can add that it is mainly used to transport data: there is no actual Get being done on the DB itself, and it is mainly used as a pub/sub.

Contributor Author

I've added a bit more about this below ("Nodes look up information using their local networkdb instance. Queries are not sent to remote nodes.").

- For each peer node, the set of networks to which that node is connected.
- For each of the node's currently-in-use networks, a set of named tables of key/value pairs.

Updates are spread throughout the cluster through the gossip protocols, and nodes may have inconsistent views at any given time.

Plus periodic TCP syncs.

Contributor Author

I've reworded this to avoid mentioning gossip here. We explain about gossip and full syncs later.

Note that nodes only keep track of tables for networks to which they belong.
Updates to a network's tables are only shared between nodes that are on that network.
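To illustrate the scoping rule quoted above, here is a minimal sketch of how the recipients of an update could be computed from per-node network membership. It is not taken from the `networkdb` code: the `peersOn` function and the shape of the `membership` map are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// peersOn returns the sorted names of all nodes, other than self, that are
// attached to network nid. membership maps node name -> set of network IDs.
// Both the function and the map layout are hypothetical.
func peersOn(membership map[string]map[string]bool, self, nid string) []string {
	var peers []string
	for node, networks := range membership {
		if node != self && networks[nid] {
			peers = append(peers, node)
		}
	}
	sort.Strings(peers) // map iteration order is random; sort for stable output
	return peers
}

func main() {
	membership := map[string]map[string]bool{
		"node-a": {"net1": true},
		"node-b": {"net1": true, "net2": true},
		"node-c": {"net2": true},
	}
	// Only node-b shares net1 with node-a, so only node-b would hear about
	// changes to net1's tables.
	fmt.Println(peersOn(membership, "node-a", "net1"))
}
```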

NetworkDB does not impose any structure on the tables.

Maybe we can rephrase this to say that NetworkDB is a key/value store, and the only requirement is that the key is a string and the value is a `[]byte`.
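As a rough illustration of that shape (string keys, opaque `[]byte` values, tables grouped per network), here is a small sketch. The `tableStore` type, its methods, and the table name `example_table` are made up for this example and are not the NetworkDB API; the point is only that lookups read the node's local copy and the store never interprets the values.

```go
package main

import "fmt"

// tableStore is a hypothetical stand-in for one node's local view:
// network ID -> table name -> key -> opaque value.
type tableStore map[string]map[string]map[string][]byte

// put stores value under (network, table, key), creating maps as needed.
func (s tableStore) put(network, table, key string, value []byte) {
	if s[network] == nil {
		s[network] = map[string]map[string][]byte{}
	}
	if s[network][table] == nil {
		s[network][table] = map[string][]byte{}
	}
	s[network][table][key] = value
}

// get reads from the local copy only; nothing is sent to other nodes.
func (s tableStore) get(network, table, key string) ([]byte, bool) {
	v, ok := s[network][table][key] // indexing a nil map is safe for reads
	return v, ok
}

func main() {
	s := tableStore{}
	s.put("net1", "example_table", "ep1", []byte("10.0.0.2"))
	v, ok := s.get("net1", "example_table", "ep1")
	fmt.Println(string(v), ok)
}
```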

For example, there are tables for service discovery and load balancing,
and the [overlay](overlay.md) driver uses NetworkDB to store routing information.

All nodes in a libnetwork cluster join the gossip network.

s/network/cluster
Mentioning "network" here can be confused with libnetwork's own concept of a network.


All nodes in a libnetwork cluster join the gossip network.
To do this, they need the IP address and port of at least one other member of the cluster.
In the case of a SwarmKit cluster, for example, each Docker engine will use the IP addresses of the swarm managers as the initial join addresses.

We should also mention today's limitation: there is no feedback loop on the manager list, so if the IPs of the 3 managers change, the list won't be updated.

Contributor Author

It appears that nDB.Join(addrs) can be called at any time to update the list, so maybe this problem is outside of networkDB?

It will also perform a bulk-sync of the network-specific state (the tables) with every other node on the network being joined.
This will allow it to get all the network-specific information quickly.
The tables will mostly be kept up-to-date by UDP gossip messages between the nodes on that network, but
each node in the network will also do a full TCP sync of the tables with another random node on the same network from time to time.

Maybe redundant with line 33? One of them can go.

Contributor Author

I think it's important to show that there are two systems here, even though they are similar. I've added a clarification of this below.
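For the periodic full sync described in the quoted text ("another random node on the same network"), peer selection could look roughly like the sketch below. `pickSyncPeer` is a hypothetical helper written for this example, not a function from `networkdb`, and it says nothing about how the TCP sync itself is carried out.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickSyncPeer returns a random peer from the nodes on a network, never
// choosing self. It returns "" when no other node is on the network.
// Hypothetical helper for illustration only.
func pickSyncPeer(self string, nodesOnNetwork []string) string {
	var candidates []string
	for _, n := range nodesOnNetwork {
		if n != self {
			candidates = append(candidates, n)
		}
	}
	if len(candidates) == 0 {
		return ""
	}
	return candidates[rand.Intn(len(candidates))]
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	// Prints either node-b or node-c; the choice differs between calls.
	fmt.Println(pickSyncPeer("node-a", nodes))
}
```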


When a node wishes to leave a network, it will send a `NetworkEventTypeLeave` via gossip. It will then delete the network's table data.
When a node hears that another node is leaving a network, it deletes all table entries belonging to the leaving node.
Deleting an entry in this case means marking it for deletion for a while (so that the deletion can propagate via gossip too).

And, more importantly, so that the node can protect itself from receiving an old CREATE with a lower version and accepting it.


When a node wishes to leave the cluster, it will send a `NodeEventTypeLeave` message via gossip.
Nodes receiving this will mark the node as "left".
The node will then send a memberlist leave message too.

Meaning it will forward it to others?

Contributor Author

Sorry, that was unclear. I meant that the original node will send a memberlist leave message.

@@ -402,6 +403,7 @@ func (d *delegate) NotifyMsg(buf []byte) {
 	d.nDB.handleMessage(buf, false)
 }
 
+// XXX: should this limit be shared?
 func (d *delegate) GetBroadcasts(overhead, limit int) [][]byte {

Not sure why PR #1446 changed it this way. To me, the original code with the simple return makes more sense...

@@ -492,7 +492,7 @@ func (nDB *NetworkDB) gossip() {
 	nDB.RUnlock()
 
 	if mnode == nil {
-		break
+		break // XXX: shouldn't this be "continue"?

Yep, a continue makes more sense.
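The practical difference can be seen in a simplified loop. This is not the real `gossip()` body: the `node` type, the `sendTo` function, and the `stopOnMissing` flag are invented for this sketch, which only mirrors the pattern of looking up a node and finding nil.

```go
package main

import "fmt"

type node struct{ name string }

// sendTo walks the chosen node names and records which ones would be
// contacted. When a lookup returns nil it either stops the whole loop
// (break) or skips just that name (continue), depending on stopOnMissing.
func sendTo(names []string, known map[string]*node, stopOnMissing bool) []string {
	var sent []string
	for _, name := range names {
		n := known[name]
		if n == nil {
			if stopOnMissing {
				break // every remaining name is silently dropped
			}
			continue // only the missing name is skipped
		}
		sent = append(sent, n.name)
	}
	return sent
}

func main() {
	names := []string{"a", "b", "c"}
	known := map[string]*node{"a": {name: "a"}, "c": {name: "c"}}
	fmt.Println(sendTo(names, known, true))  // break: prints [a]
	fmt.Println(sendTo(names, known, false)) // continue: prints [a c]
}
```

With `break`, one missing node prevents all later nodes in that round from being contacted; with `continue`, only the missing node is skipped.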

This is based on reading the code in the `networkdb` directory.

Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Contributor Author

@talex5 talex5 left a comment

I've updated the doc based on @fcrisciani's feedback. I'll move the other (non-documentation) parts to another issue.



@thaJeztah
Member

ping @fcrisciani PTAL


@fcrisciani fcrisciani left a comment

LGTM

@fcrisciani fcrisciani merged commit ebcade7 into moby:master Mar 14, 2019