This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Clarifying what Graphsync means to @jbenet and @whyrusleeping #96

Closed
vmx opened this issue Feb 19, 2019 · 6 comments
Labels
kind/support A question or request for support

Comments

@vmx
Member

vmx commented Feb 19, 2019

I thought I finally had a good understanding of what Graphsync means to @jbenet and @whyrusleeping. But I don't. Hence I'm opening this issue to see if we can clarify things for me once more. Below is my current view.

The vision

Graphsync is about synchronising remote DAGs. It's a way to make sure you have the data you need locally available. In order to get there, you send requests to other peers which will then reply with the missing bits. Once fully implemented, it works across a peer-to-peer network.

The short-cut for now

In order to get there, we've simplified things down to:

  • Instead of working across a full peer-to-peer network, the networking is restricted to two peers talking to each other.
  • Instead of being able to request very specialised sub-sets of the DAG we restrict it to simple, broader selectors.
  • Instead of being able to upgrade/downgrade/change an already running request, you can only cancel a request.

Put the other way round, we want to build the non-simplified thing in the future.

What we are building now

  • A wire protocol that supports the full system, but we only leverage/implement the parts needed for the short-cut.
  • Simpler selectors that are useful for the use-cases we have right now.
  • No state on the server, once a traversal is started, it either finishes, or gets cancelled. It's never changed while running.

What we are not building

  • A smaller system that is useful on its own, before the full system is ready.
  • A wire protocol that serves the short-cut version better and could be extended later on (e.g. to optimise things).
  • A system that can be used as building block for larger ones (e.g. a system that could be used to implement Graphsync proposal (a/c)).
@vmx vmx added the kind/support A question or request for support label Feb 19, 2019
@whyrusleeping
Contributor

graphsync is basically just an upgraded bitswap, with the ability to ask for selectors instead of just single hashes. At the same time, we want to include a few improvements to the protocol, to allow for extensions and better logic to be written around it (the 'I don't have that' error message in particular would have been really helpful for bitswap).

Ideally, graphsync is a drop-in replacement for bitswap, and could be used everywhere bitswap is used. In addition, it would be able to ask for more complicated things, if the caller requests it.
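
To make the "bitswap plus selectors" idea concrete, here is a minimal sketch, not the actual wire format. The `Request`, `Response` and `Status` types, the in-memory `STORE`, and the toy `"all"` selector are all hypothetical names invented for this illustration. A request without a selector behaves like a single-hash bitswap want, a request with a selector walks the DAG, and a missing root produces an explicit "I don't have that" status instead of silence.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Status(Enum):
    OK = "ok"
    NOT_FOUND = "not_found"  # the explicit "I don't have that" reply


@dataclass
class Request:
    root: str                       # hash of the block to start from
    selector: Optional[str] = None  # None means a bitswap-style single-block want


@dataclass
class Response:
    status: Status
    blocks: List[str]


# Hypothetical block store for the example: hash -> (payload, child hashes).
STORE: Dict[str, Tuple[str, List[str]]] = {
    "root": ("r", ["a", "b"]),
    "a": ("A", []),
    "b": ("B", []),
}


def handle(req: Request) -> Response:
    """Answer one request from the local store only."""
    if req.root not in STORE:
        return Response(Status.NOT_FOUND, [])
    if req.selector is None:
        # Plain bitswap behaviour: exactly the block that was asked for.
        return Response(Status.OK, [req.root])
    if req.selector == "all":
        # Toy selector (an assumption of this sketch): depth-first walk
        # over everything reachable from the root.
        order: List[str] = []
        stack = [req.root]
        while stack:
            h = stack.pop()
            if h in STORE:
                order.append(h)
                stack.extend(reversed(STORE[h][1]))
        return Response(Status.OK, order)
    raise ValueError(f"unsupported selector: {req.selector}")
```

With this shape, supporting more selectors later only adds branches on the responder; the request and response types stay the same.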

@vmx
Member Author

vmx commented Feb 19, 2019

Thanks @whyrusleeping for the reply. Although I had heard the idea of Graphsync being "a better Bitswap" floating around, hearing it articulated so clearly is news to me. I find the current proposal is reaching for a much bigger thing that doesn't have much in common with Bitswap. Though that might be down to my lack of a deep understanding of Bitswap.

@vmx
Member Author

vmx commented Feb 19, 2019

@whyrusleeping It would also be great to hear how much you agree/disagree on the individual points I'm making in the issue. Things are still too vague for me. I'd like to get a more concrete point of view on Graphsync (which is the reason why I created this issue).

@whyrusleeping
Contributor

@vmx I would encourage you to read through and understand what bitswap does. Graphsync as described in the proposal Juan and I wrote up isn't that big of a change from bitswap. The entire name graphsync was weird to me, as for me it's always been just "making bitswap support selectors". I'm down with making it a separate thing for 'reasons', but I just want to point out it's really not that hard. The hardest part about the change would be making bitswap react well to being sent back error messages.

To respond to each of your points:

* Instead of working across a full peer-to-peer network, the networking is restricted to two peers talking to each other.

A peer-to-peer network is a bunch of peers communicating with each other individually. I'm not sure what the misunderstanding here is. To be clear, bitswap and graphsync are both protocols that allow a peer to request data from another peer.

* Instead of being able to request very specialised sub-sets of the DAG we restrict it to simple, broader selectors.

We should be able to support any selectors, but in the short term, we don't have to implement all of them.

* Instead of being able to upgrade/downgrade/change an already running request, you can only cancel a request.

This seems like something that doesn't affect the protocol itself, just the implementation. We can move towards changing live requests later without necessarily changing the protocol.

* A wire protocol that supports the full system, but we only leverage/implement the parts needed for the short-cut.

Yes

* Simpler selectors that are useful for the use-cases we have right now.

Yes

* No state on the server, once a traversal is started, it either finishes, or gets cancelled. It's never changed while running.

I guess. The alternative I think you're saying we are not going to do is "Keep requests for clients that we can't currently fulfil around in case we receive that data in the future" (which is how bitswap works).
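
The difference between the two behaviours can be sketched as follows. This is an illustration only: `StatelessServer` and `WantlistServer` are hypothetical classes made up for this example, not types from bitswap or graphsync. The first answers from what it has and forgets the request; the second keeps unfulfilled requests around and serves them once the data arrives.

```python
from typing import Dict, List, Optional, Tuple


class StatelessServer:
    """Answers from the local store and remembers nothing about the request."""

    def __init__(self, store: Dict[str, bytes]) -> None:
        self.store = store

    def request(self, peer: str, key: str) -> Optional[bytes]:
        # None stands in for an "I don't have that" reply.
        return self.store.get(key)


class WantlistServer:
    """Keeps unfulfilled requests around in case the data shows up later."""

    def __init__(self, store: Dict[str, bytes]) -> None:
        self.store = store
        self.pending: Dict[str, List[str]] = {}  # key -> peers still waiting

    def request(self, peer: str, key: str) -> Optional[bytes]:
        if key in self.store:
            return self.store[key]
        self.pending.setdefault(key, []).append(peer)
        return None

    def add_block(self, key: str, data: bytes) -> List[Tuple[str, bytes]]:
        """Store a new block and return the deliveries it unblocks."""
        self.store[key] = data
        return [(peer, data) for peer in self.pending.pop(key, [])]
```

In the stateless variant a peer that gets nothing back has to ask again (or ask someone else); in the wantlist variant the server owes the peer a later delivery, which is the per-request state being avoided here.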

* A smaller system that is useful on its own, before the full system is ready.

I think that's what we are building... if we're building less than all the selectors, and not implementing certain parts of the client behavior, then it's a smaller (still useful) system.

* A wire protocol that serves the short-cut version better and could be extended later on (e.g. to optimise things).

Yeah, the wire protocol shouldn't have to change, just upgrading clients to support new selectors and such.

* A system that can be used as building block for larger ones (e.g. a system that could be used to implement Graphsync proposal (a/c)).

I'm not exactly sure what you're implying with this. Graphsync is definitely designed to be part of a bigger system. Like IPFS, or Filecoin.

@vmx
Member Author

vmx commented Feb 20, 2019

Thanks @whyrusleeping for taking the time for the detailed replies. This is really helpful to me.

A peer-to-peer network is a bunch of peers communicating with each other individually. I'm not sure what the misunderstanding here is. To be clear, bitswap and graphsync are both protocols that allow a peer to request data from another peer.

What I meant is that Bitswap is more of a "give me things from all the peers I know" rather than the point-to-point connection that Graphsync is at the moment.

This seems like something that doesn't affect the protocol itself, just the implementation. We can move towards changing live requests later without necessarily changing the protocol.

You would need to send a message saying "I want this request to do x instead of y". If you do it over the same request ID (which is what you would need to do if you don't want to add new messages to the protocol), how would you know that the responses coming in are now for the new selector and not the old one?
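
The ambiguity can be shown with a small simulation. It is only a sketch under assumptions: `Frame` and `attribute` are hypothetical names, and the client heuristic (everything received after the update was sent belongs to the new selector) is one guess a client could make when responses carry nothing but the request ID.

```python
from collections import namedtuple
from typing import List, Tuple

# A response frame that carries only the request ID and a block.
Frame = namedtuple("Frame", ["request_id", "block"])


def attribute(frames: List[Frame], update_sent_at: int) -> List[Tuple[str, str]]:
    """Client-side guess: frames received at or after index `update_sent_at`
    (the moment the update was sent) are assumed to answer the new selector."""
    return [
        ("new" if i >= update_sent_at else "old", frame.block)
        for i, frame in enumerate(frames)
    ]


# Scenario: the server had already sent x1 and x2 for the old selector.
# The client sends its update right after receiving x1, while x2 is still
# in flight. Only y1 was produced for the new selector.
frames = [Frame(7, "x1"), Frame(7, "x2"), Frame(7, "y1")]
guess = attribute(frames, update_sent_at=1)
# guess labels x2 as "new", although it belongs to the old selector.
```

All three frames look identical on the wire apart from the block, so the client has no reliable way to tell that x2 is stale.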

I guess. The alternative I think you're saying we are not going to do is "Keep requests for clients that we can't currently fulfil around in case we receive that data in the future" (which is how bitswap works).

Correct. It's a good point though. But for the first version we don't do anything like "if you don't have it yourself, ask some other peer" (which is also related to the point above where I mention the point-to-point connection), so getting a block that a peer currently doesn't have, but could forward, would be pure coincidence.

What we are not building

With all the points in this section I'm raising my concerns about the complexity of the wire protocol, in particular the "multiple responses for multiple requests" in the same message. You would need to code a de-muxer yourself instead of being able to just use libp2p-mplex. Hence I don't consider it a small building block, but a small piece of a bigger system.
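
As a rough idea of what such a hand-written de-muxer involves, here is a minimal sketch. The `ResponsePart` type and the `demux` function are hypothetical and only assume that each part of a message is tagged with the request ID it answers; the real message layout is not modelled.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class ResponsePart(NamedTuple):
    request_id: int  # which outstanding request this part answers
    block: bytes


def demux(message: List[ResponsePart]) -> Dict[int, List[bytes]]:
    """Split one message that interleaves several responses into
    per-request block lists, preserving the order within each request."""
    per_request: Dict[int, List[bytes]] = defaultdict(list)
    for part in message:
        per_request[part.request_id].append(part.block)
    return dict(per_request)
```

For example, `demux([ResponsePart(1, b"a"), ResponsePart(2, b"b"), ResponsePart(1, b"c")])` groups the blocks as `{1: [b"a", b"c"], 2: [b"b"]}`. With one stream per request this bookkeeping would be left to the stream multiplexer.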

@vmx
Member Author

vmx commented Feb 25, 2019

I have a clearer understanding now, hence closing this issue.

@vmx vmx closed this as completed Feb 25, 2019