Skip to content

Commit

Permalink
feat(iroh): downloader (#1420)
Browse files Browse the repository at this point in the history
## Description

Adds the `Downloader` as specified in #1334 plus some backchannel convos
Features include:
- Support collections
- Add delays to downloads
- Add download retries with an incremental backoff 
- Keeping peers for a bit longer than necessary in hopes they will be
useful again
- Having the concept of intents and deduplicating downloads efforts
- Cancelling download intents
- Limiting how many concurrent requests are done in total
- Limiting how many concurrent requests are done per peer
- Limiting the number of open connections in total
- Basic error management in the form of deciding whether a peer should
be dropped, the request should be dropped, or if the request should be
retried

## Notes & open questions
### TODOs
- A remaining TODO in the code is whether something special should be
done when dropping quic connections
- Should downloads have a timeout?
- ~<sup>I know I've said this a hundred times with a hundred different
things but would love to test this as well under stress scenarios and a
large number of peers. don't hate me</sup>~
In reality after abstracting away all the IO most scenarios can be
simulated easily. What would remain for a _much_ later time when the
need and opportunity for real case testing scenario arises is to tune
the concurrency parameters

### Future work
#### Downloading Ranges
There was the requirement of downloading a Hash, a range of a Hash, a
collection and (not mentioned but potentially implied) ranges of
collections. There is no support for ranges right now because of the
great duplication of the `get` code in order to take advantage of proper
errors added in #1362. In principle, adding ranges should be really
easy. This is because it's an extension of the `DownloadKind` and would
simply need calculating the missing ranges not based on the difference
between what we have and the whole range, but the given range. I would
prefer to find a way to deduplicate the get code before doing this
extension.
Also, as far as I can tell, there is no need for this yet.
#### Prioritizing candidates per role: `Provider` and `Candidate`
A nice extension, as discussed at some point, is to differentiate
candidates we know have the data, from those that _might_ have the data.
This has added benefit that when a peer is available to perform another
download under the concurrency limits, a hash we know they have could be
downloaded right away instead of waiting for the delay. At this point
making this doesn't make sense because we will likely attempt a download
before the peer has retrieved the data themselves. To implement this, we
would need to add the notifications of fully downloaded hashes as
available into gossip first.
#### Leveraging the info from gossip
When declaring that a hash `X` should be downloaded, it's also an option
to query gossip for peers that are subscribed to the topic to which `X`
belongs to use them as candidates. This could be done connecting the
`ProviderMap` to `gossip`. For now I don't see the need to do this.

### Open questions about Future work
- In line with the described work from above, the registry only allows
querying for peer candidates to a hash since that's as good as it gets
in terms of what we know from a remote right now. It's not clear to me
if we would want to change this to have better availability information
with #1413 in progress.
- More future work: downloading a large data set/blob from multiple
peers would most likely require us to do a three step process:
  1. understanding the data layout/size.
  2. splitting the download.
  3. actually performing the separate downloads.
Generally curious how this will end. My question here is whether we
should do this for every download, or just on data that we expect to be
big. Is there any way to obtain such hint without relying on a query
every single time?


## Change checklist

- [x] Self-review.
- [x] Documentation updates if relevant.
- [x] Tests if relevant.
  • Loading branch information
divagant-martian committed Sep 13, 2023
1 parent 6397d46 commit c217283
Show file tree
Hide file tree
Showing 19 changed files with 2,082 additions and 456 deletions.
93 changes: 47 additions & 46 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions iroh-bytes/src/protocol/range_spec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ use smallvec::{smallvec, SmallVec};
///
/// This is a SmallVec so we can avoid allocations for the very common case of a single
/// chunk range.
#[derive(Deserialize, Serialize, PartialEq, Eq, Clone)]
#[derive(Deserialize, Serialize, PartialEq, Eq, Clone, Hash)]
#[repr(transparent)]
pub struct RangeSpec(SmallVec<[u64; 2]>);

Expand Down Expand Up @@ -152,7 +152,7 @@ impl fmt::Debug for RangeSpec {
///
/// This is a smallvec so that we can avoid allocations in the common case of a single child
/// range.
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone)]
#[derive(Deserialize, Serialize, Debug, PartialEq, Eq, Clone, Hash)]
#[repr(transparent)]
pub struct RangeSpecSeq(SmallVec<[(u64, RangeSpec); 2]>);

Expand Down
2 changes: 1 addition & 1 deletion iroh-gossip/src/net.rs
Original file line number Diff line number Diff line change
Expand Up @@ -374,7 +374,7 @@ impl Actor {
let peer_data = postcard::to_stdvec(&info)?;
self.handle_in_event(InEvent::UpdatePeerData(peer_data.into()), Instant::now()).await?;
}
(peer_id, res) = self.dialer.next() => {
(peer_id, res) = self.dialer.next_conn() => {
match res {
Ok(conn) => {
debug!(?me, peer = ?peer_id, "dial successfull");
Expand Down
Loading

0 comments on commit c217283

Please sign in to comment.