Implement Social Discovery to add found DIs to the Gestalt Graph #316

Closed
emmacasolin opened this issue Jan 10, 2022 · 20 comments · Fixed by #320
Labels: design (Requires design), development (Standard development), r&d:polykey:core activity 3 (Peer to Peer Federated Hierarchy)

Comments

@emmacasolin
Contributor

emmacasolin commented Jan 10, 2022

Specification

Currently, most of the Gestalts CLI commands do not work in isolation as they expect nodes and identities to have already been set in the Gestalt Graph. The commands will fail with an ErrorGestaltsGraphNodeIdMissing or ErrorGestaltsGraphIdentityIdMissing error, or will return no data. This is because identitiesInfoGetConnected, which is supposed to add found connected identities to the gestalt graph, is currently incomplete and is inaccessible to the CLI. (outdated)

When choosing to trust a node/identity (i.e. to send notifications/vault share), this should add the node/identity to our gestalt graph. This can be done inside the handlers for these commands. Once done, this should trigger discovery on the added node/identity.

We don't want to have to await this discovery, so the Discovery domain needs to be converted from CreateDestroy to CreateDestroyStartStop so that it can have underlying state that it can crawl in the background.
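A minimal sketch of the start/stop idea, assuming a much simpler shape than the real Discovery domain. Every name here (`DiscoverySketch`, `queueDiscovery`, `processNext`, the `LinkLookup` callback) is hypothetical and only illustrates holding a pending queue as state and crawling outward from queued vertices; in practice `processNext` would be driven by a background loop rather than called by hand.

```typescript
// Hypothetical names throughout; this is not Polykey's actual Discovery API.
type VertexId = string;

// Stand-in for reading the claims of a node/identity: returns the linked vertices.
type LinkLookup = (vertex: VertexId) => VertexId[];

class DiscoverySketch {
  private lookupLinks: LinkLookup;
  private running = false;
  private pending: VertexId[] = [];
  private visited = new Set<VertexId>();

  constructor(lookupLinks: LinkLookup) {
    this.lookupLinks = lookupLinks;
  }

  start(): void {
    this.running = true;
  }

  stop(): void {
    // The pending queue is kept, so a later start() resumes the crawl.
    this.running = false;
  }

  // Returns immediately; the caller never waits for the crawl itself.
  queueDiscovery(vertex: VertexId): void {
    if (!this.visited.has(vertex) && this.pending.indexOf(vertex) === -1) {
      this.pending.push(vertex);
    }
  }

  // One crawl step: visit the next queued vertex and queue everything it links to.
  processNext(): VertexId | undefined {
    if (!this.running) return undefined;
    const vertex = this.pending.shift();
    if (vertex === undefined) return undefined;
    this.visited.add(vertex);
    for (const linked of this.lookupLinks(vertex)) {
      this.queueDiscovery(linked);
    }
    return vertex;
  }

  discovered(): VertexId[] {
    return Array.from(this.visited);
  }
}

// Usage: a made-up set of claims linking two nodes through one identity.
const claimLinks: Record<VertexId, VertexId[]> = {
  'node-A': ['identity-alice'],
  'identity-alice': ['node-A', 'node-B'],
  'node-B': ['identity-alice'],
};
const crawler = new DiscoverySketch((vertex) => claimLinks[vertex] ?? []);
crawler.start();
crawler.queueDiscovery('node-A');
while (crawler.processNext() !== undefined) {}
// crawler.discovered() → ['node-A', 'identity-alice', 'node-B']
```

Queueing only `node-A` is enough to reach the whole gestalt, because each visited vertex queues its own links.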

Additional context

Tasks

  1. Expose identitiesInfoGetConnected to the CLI
  2. Implement functionality for adding trusted nodes/identities to the gestalt graph
  3. Add spider-crawling to the Discovery domain
  4. Modify/extend existing tests to match this behaviour
@emmacasolin added the development (Standard development) and design (Requires design) labels Jan 10, 2022
@emmacasolin
Contributor Author

From this comment (https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/144#note_501235581) it looks like the identitiesInfoGetConnected GRPC handler (which isn't currently used by any CLI commands) might be intended to be used for social discovery, and thus for growing the gestalt graph. Right now all it does is return data about connected digital identities. If these identities were added to the gestalt graph, we would have a way of growing it. However, we would also need to scrape these found identities for node claims, since a gestalt consisting of just an identity is pretty useless to us (we need nodes in order to set permissions for a gestalt).

@CMCDragonkai
Member

CMCDragonkai commented Jan 20, 2022

Once you have figured out what needs to be changed, it's better to break down this issue from a general "review issue" to focused sub issues on things that need to be changed. We can do a meeting to review the details.

@emmacasolin
Contributor Author

Looks like my hunch about identitiesInfoGetConnected was right - https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/144#note_501361799

So the process of growing the gestalt graph is as follows:

  1. When first bootstrapping, your own node should be added to the gestalt graph
  2. After bootstrapping, you HAVE to augment one of your digital identities. This is the only way to be able to eventually discover other gestalts. This augmented DI should also be added to the gestalt graph
  3. Anytime you claim another node, this node should be added to the gestalt graph
  4. Provided you've done step 2 already, you can now call identitiesInfoGetConnected to search the provider of your DI for other identities. These found identities should be set into the gestalt graph and Discovery.discoverGestaltByIdentity() should be called on them in order to discover their gestalts
  5. The discovery functions can also be called at any time to flesh out the links between nodes and identities in the gestalt graph

@emmacasolin
Contributor Author

Possibility for the gestalt graph to be edited manually: https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/195#note_600203842 - not sure if this functionality is still intended to exist

@emmacasolin
Contributor Author

emmacasolin commented Jan 21, 2022

I was actually incorrect in my original understanding of the discoverGestaltBy* functions - I had originally assumed that discovery did NOT set nodes/identities into the gestalt graph and instead only created links between them in the graph, however this is incorrect. Discovery does in fact set discovered nodes and identities into the gestalt graph and so this is a way of growing the gestalt graph.

The caveat to this is that the setting of nodes and identities only happens when a link is found, so you cannot set a new node/identity with no links into the gestalt graph through discovery.

@emmacasolin
Contributor Author

Discovery is meant to be for discovering other gestalts, not your own. In this case, we can assume that the Discovery functions won't be used for updating our own gestalt in the gestalt graph. In this case, whenever we make a node or identity claim, the claimed node/identity needs to be put into our gestalt graph manually (along with the relevant links). This can be done using linkNodeAndNode() and linkNodeAndIdentity(), since these methods also set the nodes/identities into the gestalt graph. This could be added to the grpc handlers for identitiesClaim and nodesClaim.
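To illustrate why calling the link methods is enough when making a claim, here is a toy in-memory graph in which linking also sets both vertices. This is a sketch only: `GestaltGraphSketch`, its string vertex keys, and the simplified argument lists are assumptions for illustration and do not match the real GestaltGraph signatures.

```typescript
// Hypothetical toy graph; the real GestaltGraph has a different shape and signatures.
class GestaltGraphSketch {
  // Vertex key -> keys of the vertices it is linked to.
  private adjacency = new Map<string, Set<string>>();

  private setVertex(key: string): string {
    if (!this.adjacency.has(key)) this.adjacency.set(key, new Set<string>());
    return key;
  }

  private link(keyA: string, keyB: string): void {
    this.adjacency.get(keyA)!.add(keyB);
    this.adjacency.get(keyB)!.add(keyA);
  }

  setNode(nodeId: string): string {
    return this.setVertex(`node:${nodeId}`);
  }

  setIdentity(providerId: string, identityId: string): string {
    return this.setVertex(`identity:${providerId}/${identityId}`);
  }

  // Linking sets both vertices first, so no separate set call is needed.
  linkNodeAndNode(nodeIdA: string, nodeIdB: string): void {
    this.link(this.setNode(nodeIdA), this.setNode(nodeIdB));
  }

  linkNodeAndIdentity(nodeId: string, providerId: string, identityId: string): void {
    this.link(this.setNode(nodeId), this.setIdentity(providerId, identityId));
  }

  hasVertex(key: string): boolean {
    return this.adjacency.has(key);
  }

  linkedTo(key: string): string[] {
    const linked = this.adjacency.get(key);
    return linked === undefined ? [] : Array.from(linked);
  }
}

// Usage: what the claim handlers might do after a successful claim.
const ownGestalt = new GestaltGraphSketch();
ownGestalt.linkNodeAndIdentity('nodeA', 'github.com', 'alice'); // after an identity claim
ownGestalt.linkNodeAndNode('nodeA', 'nodeB'); // after a node claim
// ownGestalt.linkedTo('node:nodeA') → ['identity:github.com/alice', 'node:nodeB']
```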

@emmacasolin
Contributor Author

Right now, all identitiesInfoGetConnected() (the grpc handler) does is return connected identity data (the definition of this depends on the provider and doesn't matter since Provider is an abstract class). From my understanding after reading through the initial Gestalt Graph discussion, identitiesInfoGetConnected() should also be calling GestaltGraph.setIdentity() and Discovery.discoverGestaltByIdentity() for each of the found connected identities. The gestalt graph is a small domain so it seems like it shouldn't be too much trouble to inject that into identitiesInfoGetConnected(), however discovery is much larger so it's probably best not to. Either we can assume that the user will call discovery themselves on the found identities, or we could potentially use the new event emitter to call discovery every time a new connected identity is found?
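The event-based option could look roughly like the sketch below, in which the handler only knows about the graph and a notifier, and discovery subscribes separately. All names (`IdentityFoundNotifier`, `handleInfoGetConnected`, the `Set` standing in for the gestalt graph, the array standing in for a discovery queue) are hypothetical; this is not the project's event emitter or handler code.

```typescript
// Hypothetical sketch: decoupling the handler from discovery via a listener.
type ConnectedIdentity = { providerId: string; identityId: string };
type IdentityFoundListener = (found: ConnectedIdentity) => void;

class IdentityFoundNotifier {
  private listeners: IdentityFoundListener[] = [];

  onIdentityFound(listener: IdentityFoundListener): void {
    this.listeners.push(listener);
  }

  emitIdentityFound(found: ConnectedIdentity): void {
    for (const listener of this.listeners) listener(found);
  }
}

// Stand-in for the handler body: set each identity into the graph, then announce it.
// The handler has no dependency on discovery at all.
function handleInfoGetConnected(
  connected: ConnectedIdentity[],
  graphIdentities: Set<string>,
  notifier: IdentityFoundNotifier,
): ConnectedIdentity[] {
  for (const found of connected) {
    graphIdentities.add(`${found.providerId}/${found.identityId}`);
    notifier.emitIdentityFound(found);
  }
  return connected;
}

// Usage: discovery registers interest once, then reacts to every found identity.
const graphIdentities = new Set<string>();
const toDiscover: string[] = [];
const notifier = new IdentityFoundNotifier();
notifier.onIdentityFound((found) => {
  toDiscover.push(`${found.providerId}/${found.identityId}`);
});
handleInfoGetConnected(
  [
    { providerId: 'github.com', identityId: 'alice' },
    { providerId: 'github.com', identityId: 'bob' },
  ],
  graphIdentities,
  notifier,
);
// toDiscover → ['github.com/alice', 'github.com/bob']
```

The trade-off is that the handler stays small, at the cost of discovery being triggered implicitly rather than by an explicit call.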

@emmacasolin
Contributor Author

We also need our node to be added to the gestalt graph when bootstrapping, which is easy enough to add to bootstrapState. However, wouldn't this node need to be updated whenever we change the root keypair? I think this relates to #317.

Not only would the node itself need to be updated in the gestalt graph, but all of the other nodes/identities in its gestalt, since they would now be linked to a node with a different ID. https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/195#note_630611063 - this comment, in particular the line "This is by design, node id changes induces a need to recreate claims. Which is why key changes imply a change of identity", makes me wonder if key pair changes should trigger links in the gestalt graph to be removed? This could probably be a separate issue though.

@CMCDragonkai
Member

> Discovery is meant to be for discovering other gestalts, not your own. In this case, we can assume that the Discovery functions won't be used for updating our own gestalt in the gestalt graph. In this case, whenever we make a node or identity claim, the claimed node/identity needs to be put into our gestalt graph manually (along with the relevant links). This can be done using linkNodeAndNode() and linkNodeAndIdentity(), since these methods also set the nodes/identities into the gestalt graph. This could be added to the grpc handlers for identitiesClaim and nodesClaim.

I believe discovery was meant to be a social discovery (not DHT discovery which is supposed to occur automatically in kademlia/nodegraph). And your own gestalt doesn't need to be discovered simply because it's already linked in your sigchain.

However, the node graph may need to be populated when your gestalt updates.

This is where the gossip protocol is meant to come in #190, but I haven't fully specced out how that's supposed to work.

@emmacasolin
Contributor Author

emmacasolin commented Jan 21, 2022

> I believe discovery was meant to be a social discovery (not DHT discovery which is supposed to occur automatically in kademlia/nodegraph). And your own gestalt doesn't need to be discovered simply because it's already linked in your sigchain.
>
> However, the node graph may need to be populated when your gestalt updates.
>
> This is where the gossip protocol is meant to come in #190, but I haven't fully specced out how that's supposed to work.

So does this mean that your own gestalt doesn't need to be represented in your gestalt graph, only other people's gestalts? Since this information is stored on your sigchain? This would simplify things a lot considering the gestalt graph is currently not updated/modified by any other domains (besides discovery). The gestalt graph also currently doesn't update or communicate with any other domains besides the ACL. The gestalt graph at the moment is completely cut off from nodes/identities/sigchain and there's no direct communication between them.

From the original gestalt graph discussion it seems that your own gestalt was expected to be represented in your gestalt graph, but I don't think there's a reason that it needs to be?

@CMCDragonkai
Member

No, your own gestalt would still be in the gestalt graph.

Identity related manipulation should mutate the gestalt graph.

@CMCDragonkai
Member

The gestalt graph would be consulted when making identity decisions and trust decisions, and it would be shown to the end user. This is hard to visualise until you see the GUI. The CLI is quite barebones in its visualisation.

@emmacasolin
Contributor Author

So the final checklist of what needs to be done to get the relevant tests working as expected is:

  1. Create a CLI command for identitiesInfoGetConnected
  2. Inject GestaltGraph into identitiesInfoGetConnected and call setIdentity() on each connected identity
  3. Either inject Discovery into identitiesInfoGetConnected and call discoverGestaltByIdentity() on each connected identity OR set up an event for discovering an identity that triggers discoverGestaltByIdentity() whenever setIdentity() is called OR expect the user to manually run the identities discover command on the identities that are returned by identitiesInfoGetConnected

Additionally (as a separate issue most likely), in order to have your own gestalt represented in your gestalt graph we need to:

  1. Add newly bootstrapped nodes into the gestalt graph inside bootstrapState()
  2. Modify node claiming to add the claimed node to the gestalt graph for both nodes (could potentially just inject GestaltGraph into nodesClaim to do this for the node that calls nodes claim ... by calling linkNodeAndNode(), but it's likely more complicated for the other node)
  3. Modify identity claiming to add the claimed identity to the gestalt graph (can just inject GestaltGraph into identitiesClaim and call linkNodeAndIdentity())
  4. We also need to consider expected behaviour when the root keypair is changed and when claims otherwise expire/become invalid and what we expect to happen on the gestalt graph. This could be something we use the event bus for.

So there are really two sub issues here:

  1. Fixing and integrating the identitiesInfoGetConnected rpc handler into the CLI
  2. Adding the ability for the gestalt graph to contain your own gestalt

The first issue is the only one blocking #311, the second issue is more complex and can also likely be done alongside #317 (mostly point 4).

@emmacasolin
Contributor Author

I think this issue can probably be renamed and cover the identitiesInfoGetConnected issues since those relate more closely to test splitting (which is what this issue was created from). I can then create a separate issue for the "gestalt graph containing own gestalt" issues, which can be addressed later. If I were to create a new branch for fixing identitiesInfoGetConnected would that branch off from master or from the test splitting branch @CMCDragonkai?

@CMCDragonkai
Member

If you're just making changes to the src, branch from master first. You can always rebase later.

@CMCDragonkai
Member

The overall purpose of the gestalt graph is to create and maintain a graph of trust, so you know that you're sharing a vault with a gestalt and not a single node. All the other pieces come together to enable this. The GG is also eventually consistent and completely decentralised. Every node in a gestalt should share the gestalt graph DB via the gossip protocol, but other nodes may have a completely different gestalt graph simply because their field of view is different.

@CMCDragonkai
Member

I'll circle back to this once nodes and vaults are working and we can review in detail.

This is a major part of the decentralised trust arm of PK.

@CMCDragonkai
Member

Btw, a lot of gestalt graph theory was written into the polykey-design issues in gitlab, and was meant to be fully realised in our gestalts wiki article. You might want to refer back to the polykey-design repo too.

@emmacasolin changed the title from "Review Gestalt Graph CLI Integration" to "Extending and exposing the identitiesInfoGetConnected RPC Handler" Jan 21, 2022
@emmacasolin changed the title from "Extending and exposing the identitiesInfoGetConnected RPC Handler" to "Implement Social Discovery to add found DIs to the Gestalt Graph" Jan 23, 2022
@CMCDragonkai
Member

CMCDragonkai commented Jan 24, 2022

I want to point out a few related issues to this:

During our discussion about discovery and how it should be a background asynchronous spidering process, I was thinking of making use of some async abstractions to implement this:

Unlike the EventBus, things can trigger discovery by queueing up discovery tasks, but this is all done asynchronously, so they don't wait for the discovery to finish. The discovery system allows eventual consistency of the gestalt graph.

@emmacasolin
Contributor Author

Everything from this issue has been resolved in #320, except for adding nodes/identities to the Gestalt Graph during vault sharing (since this has to wait until the vaults work being done in #266 is completed).

Now we have the following features in order to implement social discovery:

  • To search for connected identities, run the command identities search - under the hood this calls either identitiesInfoGet for finding information about specific identities or identitiesInfoConnectedGet to get information about all connected identities across one or more providers
    • The result of this command will be a list of identities with all of the available information on each identity found
  • Once you have found an identity that you want to trust, you can set the notify permission for it by running the command identities trust [identity] (this also works for nodes) - under the hood this calls gestaltsGestaltTrustByIdentity (or gestaltsGestaltTrustByNode), which sets the identity/node into the gestalt graph and sets the permission for it. This command also queues the identity/node for discovery in order to find the rest of the linked gestalt.
  • Discovery is done automatically and unassisted in the background at all times (while there are identities/nodes queued for discovery), however, you can also manually choose to discover a gestalt by running the command identities discover [identity/node], which will queue the identity/node for discovery and eventually add it to the gestalt graph

With the combination of these features, the gestalt graph will be maintained with the current state of all of the gestalts you trust.
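As a rough illustration of the trust step described above, the sketch below models its three side effects: setting the vertex into the graph, granting the notify permission, and queueing the vertex for discovery. `TrustSketch` and its fields are hypothetical stand-ins and are not the actual gestaltsGestaltTrustByIdentity handler.

```typescript
// Hypothetical model of what trusting an identity/node does; not the real handler.
class TrustSketch {
  readonly graph = new Set<string>(); // vertices set into the gestalt graph
  readonly permissions = new Map<string, Set<string>>(); // vertex -> granted permissions
  readonly discoveryQueue: string[] = []; // vertices waiting for background discovery

  trust(vertex: string): void {
    // 1. Set the identity/node into the gestalt graph.
    this.graph.add(vertex);
    // 2. Grant the notify permission.
    const granted = this.permissions.get(vertex) ?? new Set<string>();
    granted.add('notify');
    this.permissions.set(vertex, granted);
    // 3. Queue it for discovery; the caller does not wait for the result.
    if (this.discoveryQueue.indexOf(vertex) === -1) {
      this.discoveryQueue.push(vertex);
    }
  }
}

// Usage: trusting the same identity twice leaves a single queue entry.
const trustState = new TrustSketch();
trustState.trust('github.com/alice');
trustState.trust('github.com/alice');
// trustState.discoveryQueue → ['github.com/alice']
```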

@teebirdy added the r&d:polykey:core activity 3 (Peer to Peer Federated Hierarchy) label Jul 24, 2022