Stability: 1 - Experimental
Discover is a distributed master-less node discovery mechanism that enables locating any entity (server, worker, drone, actor) based on node id. It enables point-to-point communications without pre-defined architecture.
npm install discover
npm test
npm run-script localtest
Discover is a distributed master-less node discovery mechanism that enables locating any entity (server, worker, drone, actor) based on node id. It enables point-to-point communications without pre-defined architecture and without a centralized router or centralized messaging.
It is worth highlighting that Discover is only a discovery mechanism. You can find out where a node is located (its hostname and port, for example), but to communicate with it, you need your own means of doing so.
Each Discover instance stores information on numerous nodes. Each instance also functions as a "gateway" of sorts to what lies beyond the local environment. For example, if a local process wants to send a message to a remote process somewhere, Discover enables distributed master-less correlation of that remote process' node id with its physical location so that a point-to-point link can be made (or failure reported if the contact cannot be located).
Discover maintains information about nodes in a structure called a contact. A contact stores the details of a particular node on the network.
A contact is a JavaScript object that consists of contact.id, contact.data, and contact.transport. The id and data are the only properties that are guaranteed not to be changed by Discover.
- id: String (base64) A globally unique Base64 encoded node id.
- data: Any Any data that should be included with this contact when it is retrieved by others on the network. This should be a "serializable" structure (no circular references) so that it can be JSON.stringify()ed.
- transport: Any Any data that the transport mechanism requires for operation. Similarly to data, it should be a "serializable" structure so that it can be JSON.stringify()ed.
Example contact with TCP Transport information:
var contact = {
    id: "Zm9v", // Base64 encoded String representing node id
    data: "foo", // any data (could be {foo: "bar"}, or ["foo", "bar"], etc.)
    transport: {
        host: "foo.bar.com", // or "localhost", "127.0.0.1", etc...
        port: 6742
    }
};
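The "serializable" requirement on data and transport can be checked before a contact is handed to Discover. The following is only a minimal sketch: isSerializable is a hypothetical helper and is not part of the Discover API.

```javascript
// Hypothetical helper (not part of Discover): returns false when
// JSON.stringify() throws, which it does for circular references.
function isSerializable(value) {
  try {
    JSON.stringify(value);
    return true;
  } catch (error) {
    return false;
  }
}

const contact = {
  id: "Zm9v",
  data: { foo: "bar" },
  transport: { host: "localhost", port: 6742 }
};

const circular = { name: "loop" };
circular.self = circular; // circular reference, cannot be serialized

console.log(isSerializable(contact.data));      // true
console.log(isSerializable(contact.transport)); // true
console.log(isSerializable(circular));          // false
```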
As explained below in Technical Origin Details, Discover is intended to implement only PING and FIND-NODE RPCs. This reflects the intent of Discover to be a discovery mechanism and not a data storage/distribution mechanism. It is important to keep that in mind when using contact.data.
contact.data exists to support the discovery mechanism. contact.transport contains the information one Discover transport needs to connect to another Discover transport, which is not very useful when trying to figure out the endpoint address of another node for application level purposes; that endpoint may not correspond at all to what is in contact.transport. The intended use of contact.data is to store the minimal amount of information required for connecting to the node endpoint for the application's purpose.
For example, if we want a DNS-like functionality, we could look for a contact with id of my.secret.dns.com. This could correspond to the following contact:
var contact = {
    id: "bXkuc2VjcmV0LmRucy5jb20=", // Base64 encoding of "my.secret.dns.com"
    data: {
        port: 8080
    },
    transport: {
        host: "10.22.1.37",
        port: 6742
    }
};
This would tell us that we can connect to my.secret.dns.com at IP address 10.22.1.37 and port 8080.
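The lookup id and the application endpoint for this example could be derived roughly as sketched below. The endpointOf helper is hypothetical, and reusing contact.transport.host as the application host is an assumption taken from this example only, not something Discover enforces.

```javascript
// Base64 encode a human readable name to get the node id to look up.
const nodeId = Buffer.from("my.secret.dns.com", "utf8").toString("base64");
console.log(nodeId); // bXkuc2VjcmV0LmRucy5jb20=

// Hypothetical helper: combine the transport host with the application
// port published in contact.data.
function endpointOf(contact) {
  return contact.transport.host + ":" + contact.data.port;
}

const contact = {
  id: nodeId,
  data: { port: 8080 },
  transport: { host: "10.22.1.37", port: 6742 }
};

console.log(endpointOf(contact)); // 10.22.1.37:8080
```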
As another example and to illustrate perhaps less familiar intents, if we want to find an actor "receptionist" in the global actor system, we could look for a contact that looks like this:
var contact = {
id: "tmqjRAfBILbEC6aaHoz3AurtluM=", // Base64 encoded receptionist address
data: {
webkey: "c9bf857b35ed4750ca35c0a4f41e56644df59547",
port: 9999,
publicKey: "mQINBFJhVUwBEADRwsK6hvXoZU/niqZU2k9NXVNA9kAiVBfhUZjJZhT4BUrh1R6PynIBLWmGbQhcId5CVLlLSL/3WszBE5g1QrcA72vdffgHhF845Y5ErqAKwIhu0dEO6iNw/LYVVo0RKMXEIrDJkklv5gijdJfbyIxswxh/iKav4HI9nhFpxZBt8gykONf4wCAZevHA8KEsUFyY6pCjbVTJzIwYcgGNJWbQaowxH1yMo2rxZMG9AeerCr/TsdTyOZXjSPYf4yDarxk6br690OiQnUtFGvFNl0VZstWVB2B7v62icrWXHKAXyLvSUZMGW7GGbfiwjHoj5JVZXe6MgKw6TWiLgW/49docdTfjtlRzPHpvk6VdxFPtwSHuQW7GO9xIXkI6ZopTbkQ8PW1eaqlA/FWz6UwvDxT2bn6YCIxe024U9LJTvBg0n5tyP9Pbqv5UHyiGOQOXzPwGfSFqfdfK8Z9W8WtHpfw4/imh2w8ecB4hmBIjhUujREKDTALHq12t+A/8wnQMyCDA4llWQSmNEnHJtiXwKh98a0H9IjGXFfM+YiFzHCWIScxV/12V1EXlJe8Qu0YwOBmJUAfoKeRHSvQ+lB+h8wlw/yWszUhgDCKuswtr1OF3+ZsEBeM2i4EtFfgobvKUOPoNUZ/T0Nye0Z5Re8uYJXY+domLIjgIRSExmTl8n69ILwARAQABtB1FeGFtcGxlIDxleGFtcGxlQGV4YW1wbGUuY29tPokCPgQTAQIAKAUCUmFVTAIbAwUJAAaXgAYLCQgHAwIGFQgCCQoLBBYCAwECHgECF4AACgkQpJhrsYLKyttAlg/+MXZCyeF6B6qmU/2PXXmIYt6axcEozkcUZ7Mq2CoTs0zQgkzlboUny6auKpZuExPm38NM/KH+Q0nUvYw+UEV3xEPP1iwtBKtP10JY0OyMijqcR6I95KmPIgv5FXQqBKiJuwub168jUHVeHa6IUo4aIBBSvXlXsW46gi9vDKk5a/7AnLoYhmoT4DprofNjrkX6ldjI4W2CGR1xIPkbFMXb6Emu0SWPXb7JjQoNpBxbL5+8jVKOw/p2YGxCnP1P7DSCuOJbNWORUPf6A38nOOUQekU/2uaJFAvTnQO+9JdxJuPTFAq7wlutzYbt4aUK0qsbIWww0IAmUMkzCSob/EZOlM9OZYMeTitua6KYLDBy82ceRqZEn9Ss/nYVFZD2PzyrV4X7eXcM/9vodORmD+BzRr3gq0R1ErUbEDw8nuv02exaRav0xH/ly4D8We4qAGoN0xaJ8PY8+Du/aUPX6hWze/U+lJRnQPwYY1p3AB6fEJVDRWIUNFS3PUvHcq/YicWhNHf7S/aIg36d1laEhhW/EWVpBYpFsLJ2H9RppYft/8b1pOfoM0DXUUVRA8InZTKmpoCTDWXx0XSiSJzO4bPIIc0X2XxSc9zTEWPJulGo84UG99ESvh+TY+K7N506HllSQ15sfod1Hvx015C6Vy6i4HdjJXKU263ysJU5UWA7VyK5Ag0EUmFVTAEQANjuB28jtCCM7qLiaA1JnB118F9IQE7oNJKQZV7Rq/ZKE5ZHk0RnJ4c4uzTNlmrD/KEKLyEbmU9WO7lnpUKYAEJtRczn4j+MCDahM60cuB3IW9k3F7B9EUPDpCanb+D1GH7HVAiEP6ad6bGcqLjui/X1Lu7Xr9qwH5C9AHo1+k4h1CjBLoJ8X3fdRqEvYc/fNsp6qAYhpWSLZVyinh7xoQ7kplXMlOLILftAZ3FyNcxCLb1L2eKPUCTbf8xXKnVqcGnGfHzeYTslMENNA71rrjaKtBW9souBVl9GpZtwBCRwuDm03XzlZm7odzZTokggzqodP6/JQbQBiaJfM3EvG2vhDqiVYiQki+ybwxT/Zq9Rk5Geb31gh8hQQJk6nljxJ5qmxhwEJ81QbdX5RBoJPwl9KtC9IBN8V89HwtDxPX8BM9Z2226PFUKmTZk4K6F614EHdaBL6i9f
af9T2tJygP/unQd67JGYv2X/nDUvot3NwkJRKwE9yy1NxcJWCHR/9pO9biUqCpKbHLLqqaO/UtDdng2kl64n3FTbPar/KmAcspixX8z6uLPn1z8u0SV42zy7YLfBUcnxF4jy49VUKm86Awn10gGKOByPvcD6xFqp/GlLNLVv+GtbMfGy4yYEWwfMoc0yjaEXNj5OPnWcjHcVFgejkq47FrFhtn1eYECLABEBAAGJAiUEGAECAA8FAlJhVUwCGwwFCQAGl4AACgkQpJhrsYLKyttSbQ/+IN8TVh0bcd0wZremWrOcRI19knv2Z8bVp1e6uzbG91/TOqlr6QexxJf7HbM5CCizf3OSYRYzTGc/7QJOPzDyGh8+YTtdOdPOICTLEjnGlqyKKiggNGHr6tsJKdgYh9qL7TaT13ZkX9NnBWzQCim8aqcouUC/2zjrOSsGNA9sk9OVleJ6aQCikQETmPhjqs2sD4vFmyv2dSneMbtd/31L1JHvmrwDZt85gsXrt7I00Gty4fjGw9DG3jGNoA6f4AiAbkf1jlRmAfDlwsNEn44HXNQ712Tmo0Un+q2yq9I6yDPVVBD73qtq8IVy+bDZ8XanI7E//SLpPNdc03v1Laki1s4cn0UQHGc7ZdM8NsofiBZDJphh0/nItdE0QZaJtiO5QTzJyKFZjt2mm47SE4u9HWGcTr98Nqdn8/ZqNfW51p/2VxoriIRQoejBxQB7npM6nBcpnFFQLJhRNrbeAJdgGibsB99I2Z1mRT/NAIC8xFT5ojyPvU2sEy7IFva57gSAaM2IgFEDEBVsfS0otcpByW+oJtonYkmAnGmqY1aMNe9HN58OGns76jb9zL1RcmekIqrBqkBjdxdJEcC/T1MILIRBubjETvW5VgGbbf+CpSBHyMCvB53r0ciW07+dbnv9KohonKAwRYKwEulkbtJSogNhlUfZNgaWYco9YzK2K1Q="
},
transport: {
host: "10.13.211.201",
port: 6742
}
};
This would tell us that we can access the actor using the published webkey at IP address 10.13.211.201 and port 9999, and that we should encrypt our communication using the provided public key.
Uses of contact.data that are not "minimal" in this way can result in poor system behavior.
Discover is implemented using a stripped down version of the Kademlia Distributed Hash Table (DHT). It uses only the PING and FIND-NODE Kademlia protocol RPCs. (It leaves out STORE and FIND-VALUE).
An enhancement (maybe) on top of the Kademlia protocol implementation is the inclusion of optional vector clocks in the discovery mechanism (this is still a work in progress at this point and not exposed in a functioning way). The purpose of the vector clock is to account for rapid change in location of entities to be located. For example, if you rapidly migrate compute workers to different physical servers, vector clocks allow the distributed nodes to select between conflicting location reports by selecting the contact with the corresponding id that also has the largest vector clock value. A better example (and initial use case) of rapidly shifting entities are actors within a distributed actor configuration.
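The selection rule described above can be sketched as follows. Since vector clock support is not yet exposed in a functioning way, this is only an illustration: pickLatest is a hypothetical helper, and treating vectorClock as a plain integer is an assumption based on the Integer type documented for discover.register.

```javascript
// Hypothetical helper: among conflicting reports for the same node id,
// keep the contact with the largest vectorClock (a missing clock counts as 0).
function pickLatest(reports) {
  return reports.reduce(function (latest, candidate) {
    const latestClock = latest.vectorClock || 0;
    const candidateClock = candidate.vectorClock || 0;
    return candidateClock > latestClock ? candidate : latest;
  });
}

// Three location reports for the same id, e.g. after a worker migrated twice.
const reports = [
  { id: "Zm9v", vectorClock: 1, transport: { host: "10.0.0.1", port: 6742 } },
  { id: "Zm9v", vectorClock: 3, transport: { host: "10.0.0.3", port: 6742 } },
  { id: "Zm9v", vectorClock: 2, transport: { host: "10.0.0.2", port: 6742 } }
];

console.log(pickLatest(reports).transport.host); // 10.0.0.3
```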
There are three reasons.
Discover grew out of my experience with building messaging for a Node.js Platform as a Service based on an Actor Model of Computation. I did not like having a centralized messaging service that could bring down the entire platform. Messaging should be decentralized, which led to a Kademlia DHT-based implementation. See: Technical Origin Details.
Every Kademlia DHT implementation I came across in the Node.js community tightly coupled the protocol implementation with the transport implementation.
Lastly, I wanted to learn and commit to intuition the implementation of Kademlia DHT so that I can apply that knowledge in other projects.
Node ids in Discover are represented as base64 encoded Strings. This is because the default generated node ids (SHA-1 hashes) could be unsafe to print. base64 encoding was picked over hex encoding because it takes up less space when printed or serialized in ASCII over the wire.
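The size difference can be seen with a short sketch that uses only the Node.js crypto module. The hashed input mirrors the default id expression listed under discover.register; the variable names are illustrative.

```javascript
const nodeCrypto = require("crypto");

// Hash the same kind of input as the documented default id generator,
// but keep the raw digest so that it can be encoded both ways.
const digest = nodeCrypto
  .createHash("sha1")
  .update("" + new Date().getTime() + process.hrtime()[1])
  .digest();

const asBase64 = digest.toString("base64");
const asHex = digest.toString("hex");

console.log(digest.length);   // 20 (bytes in a SHA-1 digest)
console.log(asBase64.length); // 28
console.log(asHex.length);    // 40
```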
For more detailed documentation including private methods see Discover doc
Public API
- new Discover(options)
- discover.find(nodeId, callback, [announce])
- discover.register(contact)
- discover.unreachable(contact)
- discover.unregister(contact)
options:
- CONCURRENCY_CONSTANT: Integer (Default: 3) Number of concurrent FIND-NODE requests to the network per find request.
- eventTrace: Boolean (Default: false) If set to true, Discover will emit ~trace events for debugging purposes.
- inlineTrace: Boolean (Default: false) If set to true, Discover will log to console ~trace messages for debugging purposes.
- seeds: Array (Default: []) An array of seed contact Objects that the transport understands.
- transport: Object (Default: discover-tcp-transport) An optional initialized and ready to use transport module for sending communications that conforms to the Transport Protocol. If transport is not provided, a new instance of discover-tcp-transport will be created and used with default settings.
Creates a new Discover instance.
The seeds are necessary if joining an existing Discover cluster. Discover will use these seeds to announce itself to other nodes in the cluster. If seeds are not provided, then it is assumed that this is a seed node, and other nodes will include this node's address in their seeds option. It all has to start somewhere.
- nodeId: String (base64) The node id to find, base64 encoded.
- callback: Function The callback to call with the result of searching for nodeId.
- announce: Object (Default: undefined) CAUTION: reserved for internal use. Contact object. If specified, it indicates an announcement to the network, so the network is asked instead of satisfying the request locally, and the sender is the announce contact object.
The callback is called with the result of searching for nodeId. The result will be a contact containing contact.id, contact.data, and contact.transport of the node. If an error occurs, only error will be provided.
discover.find('bm9kZS5pZC50aGF0LmltLmxvb2tpbmcuZm9y', function (error, contact) {
    if (error) return console.error(error);
    console.dir(contact);
});
- contact: Object Contact object to register.
  - id: String (base64) (Default: crypto.createHash('sha1').update('' + new Date().getTime() + process.hrtime()[1]).digest('base64')) The contact id, base64 encoded; will be created if not present.
  - data: Any Data to be included with the contact; it is guaranteed to be returned to anyone querying for this contact by id.
  - transport: Any Any data that the transport mechanism requires for operation.
  - vectorClock: Integer (Default: 0) Vector clock to pair with node id.
- Return: Object Contact that was registered, with id and vectorClock generated if necessary.
Registers a new node on the network with contact.id. Returns a contact:
discover.register({
    id: 'Zm9v', // base64 encoded String representing nodeId
    data: 'foo',
    transport: {
        host: "foo.bar.com", // or "localhost", "127.0.0.1", etc...
        port: 6742
    },
    vectorClock: 0 // vector clock paired with the nodeId
});
NOTE: Current implementation creates a new k-bucket for every registered node id. It is important to remember that a k-bucket could store up to k*lg(n) contacts, where lg is log base 2, n is the number of registered node ids on the network, and k is the size of each k-bucket (by default 20). For 1 billion registered nodes on the network, each k-bucket could store around 20 * lg (1,000,000,000) = ~ 598 contacts. This isn't bad, until you have 1 million local entities for a total of 598,000,000 contacts plus k-bucket overhead, which starts to put real pressure on Node.js/V8 memory limit.
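The estimate in the note can be reproduced with a few lines of arithmetic. The variable names below are illustrative and the figures are the ones quoted in the note.

```javascript
const k = 20;                 // default k-bucket size
const registeredNodes = 1e9;  // node ids registered on the whole network
const localEntities = 1e6;    // node ids registered on one Discover instance

// Upper bound on contacts per k-bucket: k * lg(n), with lg being log base 2.
const perBucket = Math.round(k * Math.log2(registeredNodes));

// One k-bucket per locally registered node id.
const total = perBucket * localEntities;

console.log(perBucket); // 598
console.log(total);     // 598000000
```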
- contact: Object Contact object to report unreachable.
  - id: String (base64) The previously registered contact id, base64 encoded.
  - vectorClock: Integer (Default: 0) Vector clock of contact to report unreachable.
Reports the contact as unreachable in case Discover is storing outdated information. This can happen because Discover is a local cache of the global state of the network. If a change occurs, it may not immediately propagate to the local Discover instance.
To get the latest information for a contact that turned out to be unreachable, report it and search for it again, as in the following example:
discover.find("Zm9v", function (error, contact) {
    // got contact
    // attempt to connect ... and fail :(
    discover.unreachable(contact);
    discover.find(contact.id, function (error, contact) {
        // new contact will be found in the network
        // or an error if it cannot be found
    });
});
- contact: Object Contact object to unregister.
  - id: String (base64) The previously registered contact id, base64 encoded.
  - vectorClock: Integer (Default: 0) Vector clock of contact to unregister.
Unregisters previously registered contact (identified by contact.id and contact.vectorClock) from the network.
Modules implementing the transport mechanism for Discover shall conform to the following interface. A transport is a JavaScript object.
Transport implementations shall ensure that contact.id and contact.data will be immutable and will pass through the transportation system without modification (contact objects are passed through the transportation system during FIND-NODE and PING requests).
Transport has full dominion over the contact.transport property.
Transport implementations shall allow registering and interacting with event listeners as provided by the events.EventEmitter interface.
For a reference implementation, see discover-tcp-transport.
NOTE: Unreachability of nodes depends on the transport. For example, transports like TLS transport could use invalid certificate criteria for reporting unreachable nodes.
WARNING: Using TCP transport is meant primarily for development in a development environment. TCP transport exists because it is a low hanging fruit. It is most likely that it should be replaced with DTLS transport in production (maybe TLS if DTLS is not viable). There may also be a use-case for using UDP transport if communicating nodes are on a VPN/VPC. Only if UDP on a VPN/VPC seems not viable, should TCP transport be considered.
Transport Interface Specification
- transport.findNode(contact, nodeId, sender)
- transport.ping(contact, sender)
- Event 'findNode'
- Event 'node'
- Event 'ping'
- Event 'reached'
- Event 'unreachable'
- contact: Object The node to contact with request to find nodeId.
  - id: String (base64) Base64 encoded contact node id.
  - transport: Any Any data that the transport mechanism requires for operation.
- nodeId: String (base64) Base64 encoded string representation of the node id to find.
- sender: Object The sender of this request.
  - id: String (base64) Base64 encoded sender id.
  - data: Any Sender data.
  - transport: Any Any data that the transport mechanism requires for operation.
Issues a FIND-NODE request to the contact. Response, timeout, errors, or otherwise shall be communicated by emitting a node event.
- contact: Object Contact to ping.
  - id: String (base64) Base64 encoded contact node id.
  - transport: Any Any data that the transport mechanism requires for operation.
- sender: Object The sender of this request.
  - id: String (base64) Base64 encoded sender id.
  - data: Any Sender data.
  - transport: Any Any data that the transport mechanism requires for operation.
Issues a PING request to the contact. The transport will emit unreachable event if the contact is unreachable, or reached event otherwise.
- nodeId: String (base64) Base64 encoded string representation of the node id to find.
- sender: Object The contact making the request.
  - id: String (base64) Base64 encoded sender id.
  - data: Any Sender data.
  - transport: Any Any data that the transport mechanism requires for operation.
- callback: Function The callback to call with the result of processing the FIND-NODE request.
  - error: Error An error, if any.
  - response: Object or Array The response to FIND-NODE request.
Emitted when another node issues a FIND-NODE request to this node.
transport.on('findNode', function (nodeId, sender, callback) {
    // this node knows the node with nodeId or is itself node with nodeId
    var error = null;
    return callback(error, contactWithNodeId);
});
A single contactWithNodeId shall be returned with the information identifying the contact corresponding to requested nodeId.
transport.on('findNode', function (nodeId, sender, callback) {
    // nodeId is unknown to this node, so it returns an array of nodes closer to it
    var error = null;
    return callback(error, closestContacts);
});
An Array of closestContacts shall be returned if the nodeId is unknown to this node.
If an error occurs and a request cannot be fulfilled, an error should be passed to the callback.
transport.on('findNode', function (nodeId, sender, callback) {
    // some error happened
    return callback(new Error("oh no!"));
});
- error: Error An error, if one occurred.
- contact: Object The node that FIND-NODE request was sent to.
- nodeId: String The original node id requested to be found.
- response: Object or Array The response from the queried contact.
If error occurs, the transport encountered an error when issuing the findNode request to the contact. contact and nodeId will also be provided in case of an error. response is to be undefined if an error occurs.
response will be an Array if the contact does not contain the nodeId requested. In this case response will be a contact list of nodes closer to the nodeId that the queried node is aware of. The usual step is to next query the returned contacts with the FIND-NODE request.
response will be an Object if the contact knows of the nodeId. In other words, the node has been found, and response is a contact object.
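The three outcomes of a node event can be told apart as sketched below. This is an illustration only: interpretNodeResponse and the status strings are hypothetical and do not describe how Discover itself processes the event. The argument order follows the parameter list of the node event.

```javascript
// Hypothetical helper: classify the arguments of a 'node' event.
function interpretNodeResponse(error, contact, nodeId, response) {
  if (error) {
    // the transport failed to query `contact`; response is undefined
    return { status: "error", nodeId: nodeId, failed: contact };
  }
  if (Array.isArray(response)) {
    // `contact` does not know nodeId: response lists closer contacts to query next
    return { status: "continue", nodeId: nodeId, next: response };
  }
  // `contact` knows nodeId: response is the contact that was searched for
  return { status: "found", nodeId: nodeId, found: response };
}

const queried = { id: "YQ==", transport: { host: "10.0.0.1", port: 6742 } };
const closer = { id: "Yg==", transport: { host: "10.0.0.2", port: 6742 } };
const target = { id: "Zm9v", data: "foo", transport: { host: "10.0.0.3", port: 6742 } };

console.log(interpretNodeResponse(null, queried, "Zm9v", [closer]).status); // continue
console.log(interpretNodeResponse(null, queried, "Zm9v", target).status);   // found
console.log(interpretNodeResponse(new Error("timeout"), queried, "Zm9v").status); // error
```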
- nodeId: String (base64) Base64 encoded string representation of the node id being pinged.
- sender: Object The contact making the request.
  - id: String (base64) Base64 encoded sender node id.
  - data: Any Sender node data.
  - transport: Any Any data that the transport mechanism requires for operation.
- callback: Function The callback to call with the result of processing the PING request.
  - error: Error An error, if any.
  - response: Object or Array The response to PING request, if any.
Emitted when another node issues a PING request to this node.
transport.on('ping', function (nodeId, sender, callback) {
    // ... verify that we have the exact node specified by nodeId
    return callback(null, contact);
});
In the above example contact is an Object representing the answer to ping query.
If the exact node specified by nodeId does not exist, an error shall be returned as shown below:
transport.on('ping', function (nodeId, sender, callback) {
    // ...we don't have the nodeId specified
    return callback(true);
});
- contact: Object The contact that was reached when pinged.
  - id: String (base64) Base64 encoded contact node id.
  - data: Any Data included with the contact.
  - transport: Any Any data that the transport mechanism requires for operation.
Emitted when a previously pinged contact is deemed reachable by the transport.
- contact: Object The contact that was unreachable when pinged.
  - id: String (base64) Base64 encoded contact node id.
  - transport: Any Any data that the transport mechanism requires for operation.
Emitted when a previously pinged contact is deemed unreachable by the transport.
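To make the interface above concrete, here is a minimal in-memory sketch of an object that has the two methods and emits the listed events. It is not discover-tcp-transport and has not been run against Discover itself: MemoryTransport, the shared registry keyed by "host:port", and the addresses used are all assumptions made for illustration.

```javascript
const EventEmitter = require("events").EventEmitter;

// Illustrative in-memory transport. Peers are looked up in a shared
// registry object instead of being contacted over a network.
class MemoryTransport extends EventEmitter {
  constructor(registry, address) {
    super();
    this.registry = registry;
    registry[address] = this; // make this transport reachable by others
  }

  // Find the remote transport described by contact.transport, if any.
  _peer(contact) {
    return this.registry[contact.transport.host + ":" + contact.transport.port];
  }

  findNode(contact, nodeId, sender) {
    const peer = this._peer(contact);
    if (!peer) {
      return this.emit("node", new Error("unreachable"), contact, nodeId);
    }
    // The remote side answers through the callback of its 'findNode' listener.
    peer.emit("findNode", nodeId, sender, (error, response) => {
      this.emit("node", error, contact, nodeId, error ? undefined : response);
    });
  }

  ping(contact, sender) {
    const peer = this._peer(contact);
    if (!peer) {
      return this.emit("unreachable", contact);
    }
    peer.emit("ping", contact.id, sender, (error) => {
      this.emit(error ? "unreachable" : "reached", contact);
    });
  }
}

const registry = {};
const a = new MemoryTransport(registry, "a:1");
const b = new MemoryTransport(registry, "b:2");
const contactA = { id: "YQ==", data: "a", transport: { host: "a", port: 1 } };
const contactB = { id: "Yg==", data: "b", transport: { host: "b", port: 2 } };
const missing = { id: "Yw==", transport: { host: "c", port: 3 } };

// b answers pings addressed to it.
b.on("ping", (nodeId, sender, callback) => callback(null, contactB));

const events = [];
a.on("reached", (contact) => events.push("reached " + contact.id));
a.on("unreachable", (contact) => events.push("unreachable " + contact.id));

a.ping(contactB, contactA); // b is registered, so 'reached' is emitted
a.ping(missing, contactA);  // nothing is registered at c:3, so 'unreachable'

console.log(events.join(", ")); // reached Yg==, unreachable Yw==
```

Because everything here is synchronous and in-process, a sketch like this is mainly useful for exercising listener code in tests; a real transport would also have to handle timeouts and serialization.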
This is roughly in order of current priority:
- Interface Specification: The interface points between discover, transport, and k-bucket are still experimental but are quickly converging on what they need to be in order to support the functionality.
- Implementation Correctness: Gain confidence that the protocol functions as expected. This should involve running a lot of nodes and measuring information distribution latency and accuracy.
- TLS Transport (separate module) or it might make sense to change the TCP Transport into Net Transport and include within both TCP and TLS.
- UDP Transport (separate module)
- DTLS Transport (separate module)
- Performance: Make it fast and small.
  - discover.kBuckets: It should be a data structure with O(log n) operations.
- Storage Refactoring: There emerged (obvious in retrospect) a "storage" abstraction during the implementation of discover that is higher level than a k-bucket but that still seems to be worth extracting.
  - 24 Sep 2013: Despite a storage abstraction, it is not straightforward to separate out due to the 'ping' interaction between k-bucket and transport. KBucket storage implementation would have to pass some sort of token to Discover in order to remove an old contact from the correct KBucket (a closer KBucket could be registered while pinging is happening), but this exposes internal implementation, the hiding of which was the point of abstracting a storage mechanism. It is also a very KBucket specific mechanism that I have difficulty generalizing to a common "storage" interface. Additionally, I am hard pressed to see Discover working well with non-k-bucket storage. Thus, storage refactoring is no longer a priority.
This is a non-exclusive list of some of the highlights to keep in mind and maybe implement if opportunity presents itself.
Throughout Discover, the transport, and the k-bucket implementations, the vocabulary is inconsistent (in particular the usage of "contact", "node", "network", and "DHT"). Once the implementation settles and it becomes obvious what belongs where, it will be helpful to have a common, unifying way to refer to everything.
Currently, discover.unregister(contact) deletes all "closest" contact information that was gathered within the k-bucket corresponding to the contact. This throws away DHT information stored there.
An elaboration would be to distribute known contacts to other k-buckets when a contact is unregistered.
The implementation has been sourced from: