Prioritise FastRetrieval candidates from indexer #39

rvagg · 2023-01-17T10:23:33Z

This has been discussed in the context of autoretrieve for some time now. I think originally these were all set to true, but it now appears that Boost has this wired up so that the keep-unsealed-copy config = FastRetrieval metadata.

How:

The Metadata field from the indexer results is base64 (padded) encoded varint-prefixed dag-cbor which can be decoded via https://github.com/ipni/index-provider/blob/19931fd5c692e5efd38093c1764cf0a3ca464af4/metadata/graphsync_filecoinv1.go#L66
FastRetrieval = true entries should get priority in the queryCompare function for the prioritywaitqueue when we have multiple queries returned and waiting for attempts.

Questions:

Should prioritisation be even higher than this? If candidate A is FastRetrieval=false and B is FastRetrieval=true and A returns its query first, should we hold off on attempting a retrieval and give B a chance to come back first?
Perhaps we should shorten the first-byte timeout for FastRetrieval=false candidates under the assumption that they may be attempting an unseal?
If we have >X candidates and some high percentage of them are FastRetrieval=true, perhaps we should filter out the others and only attempt the ones we assume have unsealed copies? e.g. 10 candidates for a CID, 7 of them are FastRetrieval=true, don't even bother attempting the other 3?
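To make the last question concrete, here is one way the filter could be sketched. Everything here is hypothetical: the `candidate` type, the `filterLikelyUnsealed` name, and both thresholds (`minCandidates` standing in for X, `minFastPercent` for the "high percentage"). Integer arithmetic is used for the percentage to avoid floating-point edge cases.

```go
package main

// candidate is a hypothetical stand-in for an indexer result.
type candidate struct {
	Peer          string
	FastRetrieval bool
}

// filterLikelyUnsealed keeps only FastRetrieval=true candidates, but only when
// there are more than minCandidates in total and at least minFastPercent of
// them advertise FastRetrieval. Otherwise it returns the input unchanged.
func filterLikelyUnsealed(cands []candidate, minCandidates, minFastPercent int) []candidate {
	if len(cands) <= minCandidates {
		return cands
	}
	fast := make([]candidate, 0, len(cands))
	for _, c := range cands {
		if c.FastRetrieval {
			fast = append(fast, c)
		}
	}
	// Compare len(fast)/len(cands) against minFastPercent/100 without floats.
	if len(fast)*100 < minFastPercent*len(cands) {
		return cands
	}
	return fast
}
```

With 10 candidates, 7 of them FastRetrieval=true, `minCandidates` of 5 and `minFastPercent` of 70, this would return only the 7 and never attempt the other 3.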


hannahhoward · 2023-01-20T15:01:10Z

Let's talk through this in more detail. As we start thinking about how we get the data the fastest, here are some thoughts:

I think we need to find a way to skip the query phase. My proposal is as follows:
- simply propose free deals to each returned peer, one by one -- they'll respond if they don't accept free
- prioritize the order by:
  - first, peers we already have an existing libp2p connection to (since establishing a new one is a ~0.5s penalty)
  - second, peers who advertise "fast retrieval" and/or "verified data" (since these are most likely to return free retrievals)
  - eventually, by tracking information about peers and prioritizing the most efficient ones (probabilistically)
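The first two ordering rules could be sketched roughly as follows. The `peerCandidate` type, its field names, and `orderForProposal` are hypothetical. Treating "fast retrieval" and "verified data" as a single either-or signal is one reading of "and/or", not a settled design. The third rule (tracking peer efficiency over time) is left out.

```go
package main

import "sort"

// peerCandidate is a hypothetical view of a peer returned by the indexer.
type peerCandidate struct {
	ID            string
	Connected     bool // we already hold a libp2p connection to this peer
	FastRetrieval bool // advertised in indexer metadata
	VerifiedDeal  bool // advertised in indexer metadata
}

// likelyFree is an assumed heuristic: either flag suggests a free retrieval.
func likelyFree(p peerCandidate) bool {
	return p.FastRetrieval || p.VerifiedDeal
}

// orderForProposal sorts peers in place: connected peers first, then peers
// likely to serve free retrievals. The sort is stable, so peers that tie on
// both rules keep the order the indexer returned them in.
func orderForProposal(peers []peerCandidate) {
	sort.SliceStable(peers, func(i, j int) bool {
		a, b := peers[i], peers[j]
		if a.Connected != b.Connected {
			return a.Connected
		}
		if likelyFree(a) != likelyFree(b) {
			return likelyFree(a)
		}
		return false
	})
}
```

Free deals would then be proposed to the peers in the resulting order, one by one.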

rvagg mentioned this issue Jan 17, 2023

flexibility in protocols returned from indexer #35

Closed

hannahhoward mentioned this issue Mar 10, 2023

Removing the GraphSync Query phase #147

Closed

hannahhoward closed this as completed Apr 4, 2023
