
Optimized selector traversal over graphsync #253

Open
hannahhoward opened this issue Oct 23, 2021 · 2 comments
Labels
• effort/weeks: Estimated to take multiple weeks
• exp/intermediate: Prior experience is likely helpful
• need/analysis: Needs further analysis before proceeding
• need/maintainer-input: Needs input from the current maintainer(s)
• P2: Medium: Good to have, but can wait until someone steps up
• status/blocked: Unable to be worked further until needs are met

Comments


hannahhoward commented Oct 23, 2021

What

What we'd ultimately like to do is prevent duplicate selector traversals of the same block for certain types of IPLD graph traversals. While the logic to detect "duplicates" is quite complicated and will evolve over time (our very first implementation special-cases just the traverse-all selector), we need a way to communicate about it over the network.

Rather than try to negotiate how to perform optimizations between peers, we're going to let each side optimize according to its OWN best interest and tell the other what it did. Responders will almost always turn this on because it minimizes their processing time. For client requests, we'll probably make it configurable when you initialize graphsync, or perhaps at the request level as well. As we evolve our logic for detecting duplicate traversals, this lets us upgrade only one side and still gracefully finish a request.
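
As a concrete illustration of the traverse-all special case, here's a minimal Go sketch of duplicate detection: all a responder needs is a set of CIDs it has already walked, because an exhaustive selector fully explores a sub-DAG on first visit. Every name here is hypothetical; this is not go-graphsync's actual implementation.

package main

import (
    "fmt"

    "github.com/ipfs/go-cid"
)

// dupTracker sketches the simplest duplicate-detection logic: under an
// exhaustive ("traverse-all") selector, any link we've already walked can
// be skipped, since the sub-DAG beneath it was fully explored once.
type dupTracker struct {
    seen map[cid.Cid]struct{}
}

func newDupTracker() *dupTracker {
    return &dupTracker{seen: make(map[cid.Cid]struct{})}
}

// shouldTraverse records the link and reports whether it needs traversal.
// Non-exhaustive selectors need more logic, which is exactly why the
// detection rules are expected to evolve over time.
func (t *dupTracker) shouldTraverse(c cid.Cid) bool {
    if _, ok := t.seen[c]; ok {
        return false
    }
    t.seen[c] = struct{}{}
    return true
}

func main() {
    t := newDupTracker()
    c, _ := cid.Decode("bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi")
    fmt.Println(t.shouldTraverse(c)) // true: first visit
    fmt.Println(t.shouldTraverse(c)) // false: duplicate, skip
}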

How

Currently, the schema for metadata about links traversed is as follows, encoded in the extension `graphsync/response-metadata`:

type LinkMetadata struct {
  link Cid
  blockPresent Bool
}

type ResponseMetadata [LinkMetadata]

This ticket proposes a `graphsync/metadata/1.1` extension with this format:

type BlockStatus enum {
  | Present  ("0")
  | Absent   ("1")
  | DuplicateTraversal ("2")
} representation int

type LinkMetadata struct {
  link Cid
  blockStatus BlockStatus
}

type ResponseMetadata [LinkMetadata]
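
Since the enum uses representation int, each member travels on the wire as the integer given in quotes. A hedged Go mirror of the proposed types (names are illustrative, not an actual go-graphsync API):

package metadata

import "github.com/ipfs/go-cid"

// BlockStatus mirrors the proposed IPLD enum; with representation int,
// each member is encoded as the integer from the schema above.
type BlockStatus int64

const (
    BlockStatusPresent            BlockStatus = 0 // responder has and sends the block
    BlockStatusAbsent             BlockStatus = 1 // responder does not have the block
    BlockStatusDuplicateTraversal BlockStatus = 2 // responder skipped it as a duplicate
)

// LinkMetadata pairs a traversed link with its status, matching the
// proposed graphsync/metadata/1.1 schema.
type LinkMetadata struct {
    Link        cid.Cid
    BlockStatus BlockStatus
}

// ResponseMetadata is the list carried in the extension payload.
type ResponseMetadata []LinkMetadata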

When a responder sends DuplicateTraversal, it is telling the requestor that its own traversal skipped this link, even though it has the block, because it believes the subtree to be a duplicate. The requestor can expect no blocks or links to be sent for this part of the tree, whether or not the responder is telling the truth about it being a duplicate.

When a requestor receives this message, how it proceeds is up to its local configuration around duplicate traversals, and whether the traversal is indeed a duplicate according to its local detection logic. If it does proceed down the path, it can be assured that until it returns from that part of the traversal, all block loads can be satisfied locally from previously received data. If a block is not present locally, the requestor should not wait for a remote block.
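
As a hedged sketch of that requestor behavior: block loads inside a sub-DAG marked DuplicateTraversal should be served from local storage and fail fast rather than wait on the network. The store interface and names here are hypothetical.

package requestor

import (
    "bytes"
    "errors"
    "io"

    "github.com/ipfs/go-cid"
)

// localStore stands in for whatever local blockstore the requestor uses.
type localStore interface {
    Get(c cid.Cid) ([]byte, bool)
}

var errNotLocal = errors.New("block not available locally; not waiting for remote")

// loadInDuplicateRegion serves link loads inside a region the responder
// flagged as DuplicateTraversal: the responder will send nothing here, so
// a missing local block is a hard stop, not a wait.
func loadInDuplicateRegion(store localStore, c cid.Cid) (io.Reader, error) {
    data, ok := store.Get(c)
    if !ok {
        return nil, errNotLocal
    }
    return bytes.NewReader(data), nil
}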

Backwards Compatibility

It's safe to assume many requestors will support only the legacy metadata format for some time, but responders may still want to limit duplicate traversals for requestors supporting the new format.

With this in mind, I think we had better add a requestor-side extension, `graphsync/response-metadata-version`:

type Version int

Requestors that send the extension with a value of 1.1 are telling the responder ahead of time that they support metadata 1.1.
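
As a hedged sketch, the requestor would attach that extension to its request before sending. The CBOR encoder and the integer mapping for "1.1" are placeholders, since the proposal's `type Version int` doesn't pin down how "1.1" maps to an int.

package main

import (
    "fmt"

    "github.com/fxamacker/cbor/v2" // example encoder; any CBOR library works
)

// Extension name from the proposal above.
const extResponseMetadataVersion = "graphsync/response-metadata-version"

func main() {
    // Placeholder mapping: 11 stands in for "1.1" purely for
    // illustration, since `type Version int` leaves the mapping open.
    version := 11
    payload, err := cbor.Marshal(version)
    if err != nil {
        panic(err)
    }
    // A requestor would attach {Name: extResponseMetadataVersion,
    // Data: payload} to its graphsync request; the exact request API
    // is go-graphsync's and is omitted here.
    fmt.Printf("extension %q payload: %x\n", extResponseMetadataVersion, payload)
}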

Alternatively, we know we don't want metadata to be an extension at all but rather part of the core protocol, so maybe we should fold this into graphsync 1.1 encoded as CBOR (#88, ipld/specs#354, ipld/specs#355).

This is a meta-issue that will lead to implementation tickets once I have sign-off from @warpfork and others.

@hannahhoward added the need/triage, effort/days (Estimated to take multiple days, but less than a week), exp/intermediate, need/analysis, need/maintainer-input, and status/blocked labels and removed the need/triage label on Oct 23, 2021
hannahhoward commented Oct 23, 2021

Note re labels -- I believe this will be intermediate once an implementation path is well defined, but for now it's blocked pending approval of the overall strategy

@hannahhoward added the effort/weeks, P1 (High: Likely tackled by core team if no one steps up), and P2 labels and removed the effort/days and P1 labels on Oct 23, 2021

rvagg commented Oct 26, 2021

This approach seems fine to me, although "DuplicateTraversal" may be too specific, unless you're considering adding further specific reasons to the enum. Some examples (pending work on selectors upstream):

  • Exhaustive selector where we don't want to touch duplicate blocks - "DuplicateTraversal" makes sense for this
  • We've seen this block 6 times already, but this time I've decided that it doesn't need to be traversed again (this example suggests that traversals can decide to switch modes when they go from "selector is not exhaustive" to "all selection below me is exhaustive"). "DuplicateTraversal" is probably still fine for this, but it might not send a specific enough signal.
  • Exceeded traversal budget on this branch, but let's not consider it a failure and agree that the DAG stops there because we've imposed that limit (on one or both sides).

These could all be separate reasons, and maybe that's OK and we're anticipating expanding the enum? But if we just want to send an "I have good reasons not to send you this block and it's not a failure" signal, then the name should probably be changed.
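
If the generic route wins, a purely illustrative sketch in the same schema language, renaming the catch-all member (the name NotSent is just an example):

type BlockStatus enum {
  | Present ("0")
  | Absent  ("1")
  | NotSent ("2")  # generic: intentionally skipped, not a failure
} representation int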
