Add graphql pub syncing #11

Merged
merged 8 commits into from Aug 5, 2020

sgwilym
Collaborator

@sgwilym sgwilym commented Aug 4, 2020

This adds a new exported function for syncing documents to an IStorage from a GraphQL endpoint, which means that earthstar-graphql servers can now act as pubs!

The syncGraphql function also supports filters, so peers can ask earthstar-graphql pubs for only the subset of documents they're interested in, i.e. efficient sync / partial replication.

This new way of syncing has also been added as an argument to the sync mutation:

mutation SyncMutation {
  sync(workspace: "+comics.123", pubUrl: "https://graphql.pub", format: GRAPHQL) {
     # ... etc
  }
}
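
For anyone calling this from code, here is a minimal sketch of building the request body for that mutation. `buildSyncRequest` is a hypothetical helper, not part of earthstar-graphql, and `__typename` stands in for the selection set, which is elided above. How the body is POSTed is left out.

```typescript
// Hypothetical helper: builds the JSON body for a GraphQL-over-HTTP POST.
// The field and argument names come from the mutation above; the selection
// set is a placeholder because the original elides it.
type SyncFormat = "GRAPHQL"; // the only format value shown in this PR

interface GraphqlRequestBody {
  query: string;
}

function buildSyncRequest(
  workspace: string,
  pubUrl: string,
  format: SyncFormat
): GraphqlRequestBody {
  const args = [
    // JSON.stringify quotes and escapes the string arguments.
    `workspace: ${JSON.stringify(workspace)}`,
    `pubUrl: ${JSON.stringify(pubUrl)}`,
    // Enum values are written without quotes in GraphQL.
    `format: ${format}`,
  ].join(", ");
  return { query: `mutation SyncMutation { sync(${args}) { __typename } }` };
}

const syncRequestBody = buildSyncRequest("+comics.123", "https://graphql.pub", "GRAPHQL");
```

The resulting object can be serialized with `JSON.stringify` and sent as the body of a POST to the GraphQL endpoint.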

Closes #5, Closes #6

@sgwilym sgwilym added this to To do in v3.0.0 via automation Aug 4, 2020
@sgwilym sgwilym moved this from To do to In Progress in v3.0.0 Aug 4, 2020
@cinnamon-bun
Member

Awesome!

Should we call these sync queries or sync filters?

There are 2 kinds of them, incoming and outgoing. I haven't implemented them anywhere yet though.

Peer1 --> Peer1's outgoing filter ---- - - - ----> Peer2's incoming filter --> Peer2

Peer1 <-- Peer1's incoming filter <--- - - - ----- Peer2's outgoing filter <-- Peer2

The outgoing filter is what you are willing to share. Maybe you want to upload /wiki/* but not /photos/* to a certain pub.

The incoming filter is what you want.

Documents have to pass through both filters to be successfully sync'd (they have to be offered and wanted).
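
A rough sketch of the "offered and wanted" rule, assuming a filter is just a predicate over a document path. Real Earthstar queries are richer than this; `PathFilter`, `pathPrefix` and `pathsToSync` are made-up names for illustration.

```typescript
// Assumption: a sync filter is modelled as a predicate over a document path.
type PathFilter = (path: string) => boolean;

// Hypothetical helper: accept paths that start with the given prefix.
const pathPrefix = (prefix: string): PathFilter => (path) => path.indexOf(prefix) === 0;

// A document is transferred only if the sender offers it AND the receiver wants it.
function pathsToSync(paths: string[], outgoing: PathFilter, incoming: PathFilter): string[] {
  return paths.filter((p) => outgoing(p) && incoming(p));
}

const peer1Paths = ["/wiki/home", "/photos/cat.jpg", "/chat/hello"];

// Peer1 only offers /wiki/* to this pub.
const peer1Outgoing = pathPrefix("/wiki/");

// Peer2 wants /wiki/* and /photos/*.
const peer2Incoming: PathFilter = (path) =>
  pathPrefix("/wiki/")(path) || pathPrefix("/photos/")(path);

// /photos/cat.jpg is wanted but not offered; /chat/hello is neither.
const transferred = pathsToSync(peer1Paths, peer1Outgoing, peer2Incoming);
```

Only `/wiki/home` ends up in `transferred`, since it is the one path that passes both filters.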

@cinnamon-bun
Member

More details in this issue, including that you can stack multiple queries to combine them.

Also, eventually, for efficiency: Peer1 can ask Peer2 for its incoming filter and apply it before transmitting, to reduce the amount of transmitted data.

Peer2 still needs to apply it again, because it doesn't trust Peer1 to do it right.

One-way sync example. Double this for 2-way sync.

PEER1                                            network       PEER2
===========================================                    ========================
Peer1 --> Peer1.outgoing --> Peer2.incoming --- - - - - -----> Peer2.incoming --> Peer2
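
The one-way flow in the diagram could look roughly like this. It is only a sketch: filters are again assumed to be path predicates, and `selectForSend`, `receive` and `RemotePeer` are hypothetical names.

```typescript
type PathCheck = (path: string) => boolean;

interface RemotePeer {
  incoming: PathCheck; // what this peer is willing to accept
  stored: string[];    // paths the peer has accepted so far
}

// Sender side: apply our outgoing filter, then the receiver's advertised
// incoming filter, so unwanted documents never cross the network.
function selectForSend(paths: string[], outgoing: PathCheck, advertisedIncoming: PathCheck): string[] {
  return paths.filter((p) => outgoing(p) && advertisedIncoming(p));
}

// Receiver side: re-apply the incoming filter, because the sender is not trusted.
function receive(peer: RemotePeer, sent: string[]): string[] {
  const accepted = sent.filter((p) => peer.incoming(p));
  peer.stored = peer.stored.concat(accepted);
  return accepted;
}

const isWiki: PathCheck = (p) => p.indexOf("/wiki/") === 0;
const anything: PathCheck = () => true;
const localPaths = ["/wiki/a", "/photos/b"];

// A polite sender pre-filters, so only /wiki/a is transmitted.
const peerTwo: RemotePeer = { incoming: isWiki, stored: [] };
const politeBatch = selectForSend(localPaths, anything, peerTwo.incoming);
const acceptedFromPolite = receive(peerTwo, politeBatch);

// A careless sender skips the pre-filter; the receiver still drops /photos/b.
const acceptedFromCareless = receive({ incoming: isWiki, stored: [] }, localPaths);
```

Both senders leave the receiver with the same documents. The pre-filter only saves bandwidth; the receiver's own check is what enforces the filter.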

@sgwilym sgwilym changed the base branch from document-filters to master August 5, 2020 14:04
@sgwilym
Collaborator Author

sgwilym commented Aug 5, 2020

Thanks to your comments and fresh eyes this morning, I came back at this with quite a few changes.

Here’s what syncing looks like between two GraphQL peers with these new additions:

  1. Peer A sends a GraphQL query to Peer B
    • Peer A queries for documents Peer B holds that conform to its sync filters
    • Peer A also queries Peer B’s sync filters
  2. Peer A compiles a list of documents conforming to Peer B’s sync filters
  3. Peer A sends these documents to Peer B using an ingestDocuments mutation
  4. Peer B ingests the documents from Peer A
    • Peer B currently trusts Peer A to have filtered the documents correctly.
  5. Peer A ingests the documents received from step 1
    • Peer A currently trusts Peer B to have filtered the documents correctly.
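
To make the five steps concrete, here is an in-memory simulation with no network or GraphQL involved. Everything in it is an assumption for illustration: `Doc`, `Peer` and `simulateSync` are made-up names, the local `ingestDocuments` function only stands in for the mutation, and it ignores authors, signatures and timestamps.

```typescript
interface Doc { path: string; content: string; }
type DocFilter = (doc: Doc) => boolean;

interface Peer {
  docs: Doc[];           // documents this peer holds
  syncFilter: DocFilter; // which documents this peer wants
}

// Stand-in for the ingestDocuments mutation: store each document,
// replacing any existing document at the same path.
function ingestDocuments(peer: Peer, incomingDocs: Doc[]): void {
  incomingDocs.forEach((doc) => {
    peer.docs = peer.docs.filter((d) => d.path !== doc.path).concat([doc]);
  });
}

function simulateSync(a: Peer, b: Peer): void {
  // Step 1: A asks B for documents matching A's filters, plus B's filters.
  const fromB = b.docs.filter(a.syncFilter);
  const filterOfB = b.syncFilter;
  // Step 2: A compiles its own documents that conform to B's filters.
  const forB = a.docs.filter(filterOfB);
  // Steps 3 and 4: A sends them and B ingests, trusting A's filtering.
  ingestDocuments(b, forB);
  // Step 5: A ingests what it received in step 1, trusting B's filtering.
  ingestDocuments(a, fromB);
}

const wantsWiki: DocFilter = (doc) => doc.path.indexOf("/wiki/") === 0;

const peerA: Peer = {
  docs: [{ path: "/wiki/a", content: "from A" }, { path: "/photos/a", content: "private" }],
  syncFilter: wantsWiki,
};
const peerB: Peer = {
  docs: [{ path: "/wiki/b", content: "from B" }, { path: "/chat/b", content: "chatter" }],
  syncFilter: wantsWiki,
};

simulateSync(peerA, peerB);

const pathsOnA = peerA.docs.map((d) => d.path).sort();
const pathsOnB = peerB.docs.map((d) => d.path).sort();
```

After the sync, A has gained `/wiki/b` but not `/chat/b`, and B has gained `/wiki/a` but not `/photos/a`.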

Some things I noticed while implementing this:

  1. In this implementation sync filters are a global setting that applies to all workspaces, which is probably not the intent.
  2. Filtering incoming documents seemed difficult with access only to an array of Document, particularly for versionsByAuthors, so I punted on it.

What if a workspace’s sync filters were stored somewhere in its IStorage? Then filtering could happen during document ingestion, and it would be easier to keep track of which workspace has which sync filters.
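
A very rough sketch of that idea, assuming one incoming filter per workspace that is checked at ingestion time. `FilteredWorkspaceStore` is a hypothetical wrapper, not the real IStorage interface, and the second workspace address is invented for the example.

```typescript
interface WsDoc { path: string; content: string; }
type IngestFilter = (doc: WsDoc) => boolean;

// Hypothetical storage wrapper that keeps a workspace's incoming filter
// next to its documents and applies it on every ingest.
class FilteredWorkspaceStore {
  readonly workspace: string;
  private readonly incomingFilter: IngestFilter;
  private docsByPath: Record<string, WsDoc> = {};

  constructor(workspace: string, incomingFilter: IngestFilter) {
    this.workspace = workspace;
    this.incomingFilter = incomingFilter;
  }

  // Returns true if the document was stored, false if the filter rejected it.
  ingest(doc: WsDoc): boolean {
    if (!this.incomingFilter(doc)) return false;
    this.docsByPath[doc.path] = doc;
    return true;
  }

  paths(): string[] {
    return Object.keys(this.docsByPath).sort();
  }
}

// Each workspace carries its own filter instead of sharing a global one.
const comicsStore = new FilteredWorkspaceStore("+comics.123", (d) => d.path.indexOf("/wiki/") === 0);
const notesStore = new FilteredWorkspaceStore("+notes.456", (d) => d.path.indexOf("/photos/") === 0);

const wikiDoc: WsDoc = { path: "/wiki/a", content: "hello" };
const comicsTookWiki = comicsStore.ingest(wikiDoc); // accepted by the /wiki/ filter
const notesTookWiki = notesStore.ingest(wikiDoc);   // rejected by the /photos/ filter
```

Keeping the filter on the store means the caller doing the sync does not need to know which workspace wants what.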

I also changed the sync mutation’s name to syncWithPub to make it clearer that this mutation would cause earthstar-graphql to initiate a sync rather than be on the receiving end of one (in which case it would receive an ingestDocuments mutation).

This conceptual awkwardness is a side-effect of this lib being able to power a single client and a pub. 😬 I don’t think this’ll be the last time, either.

The two different context creation functions have been replaced by a single one. I think the schema context shape will become easier to manage if IStorage is ever made to store multiple workspaces.

@sgwilym sgwilym merged commit f9d6ac4 into master Aug 5, 2020
v3.0.0 automation moved this from In Progress to Done Aug 5, 2020
@sgwilym sgwilym deleted the sync-graphql branch August 5, 2020 15:18
Development

Successfully merging this pull request may close these issues.

Add filters for document fields
Add syncGraphQL export
2 participants