Replication may be pushing too many feeds into the connection #39
Comments
pfrazee added the discussion label on Jan 4, 2017
pfrazee (Member) commented Jan 4, 2017
@maxogden I think this might relate to your remarks earlier about the archiver-server being passive. The current archiver-bot does set passive to true when it replicates. There's definitely a scaling issue there.
But, if passive is true, then the public peer won't ask other public peers for anything. If I'm understanding this correctly, we'll need some kind of middle ground: an algorithm for asking for updates with proper throttling.
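A middle ground like that might look something like the sketch below: batch the update requests and pause between batches. This is purely illustrative; `UpdateThrottle`, `requestUpdate`, and the feed keys are hypothetical names, not hypercore APIs.

```javascript
// Hypothetical throttle for "ask for updates" requests. None of these
// names come from hypercore -- this is just one way the throttling
// algorithm described above could work.
class UpdateThrottle {
  constructor ({ batchSize = 50, intervalMs = 1000 } = {}) {
    this.batchSize = batchSize
    this.intervalMs = intervalMs
    this.queue = []
  }

  enqueue (feedKey) {
    this.queue.push(feedKey)
  }

  // Drain the queue in batches, pausing between batches so a newly
  // connected peer is never asked about every stored feed at once.
  async drain (requestUpdate) {
    while (this.queue.length > 0) {
      const batch = this.queue.splice(0, this.batchSize)
      await Promise.all(batch.map(requestUpdate))
      if (this.queue.length > 0) {
        await new Promise(resolve => setTimeout(resolve, this.intervalMs))
      }
    }
  }
}
```

The batch size and interval would presumably be tuned per deployment; the point is only that the "ask" side gets a bound per unit of time instead of firing everything on connect.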
maxogden commented Jan 4, 2017
Clarifying question on that code (hard for me to understand due to vague method/variable names): is this the line that 'adds' a feed to a connection? https://github.com/mafintosh/hypercore-archiver/blob/dd34d62253d56604c94d8785e5e39b83816fb30f/index.js#L194 So the issue is that the archiver will call .replicate many times over one connection?
Why is it doing that in the first place? Can't we just call .replicate() only for the hypercore that the connection is asking for?
pfrazee (Member) commented Jan 4, 2017
> Why is it doing that in the first place? Can't we just only call .replicate() for the hypercore that the connection is asking for?
As I understand it, you need to call feed.replicate() for every feed you want to sync.
I believe the issue is that we only have two modes: 1) ask to sync every feed we have stored locally, or 2) don't ask to sync anything and let the peer make the feed.replicate() calls.
The latter is passive mode. If two passive-mode peers connect, no transfer will occur. That's the problem you remarked on earlier.
However, non-passive mode will have a scaling problem at some point: you'll ask to sync too many feeds for the connection.
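The two modes could be sketched roughly like this. This is placeholder code, not the real hypercore-archiver API; `addFeed`, `lookupFeed`, and the `'feed-requested'` event are made-up names for illustration:

```javascript
// Mode 1: eagerly ask to sync every stored feed over the connection.
// This floods the connection once the archive stores many feeds.
function replicateAll (storedFeeds, stream, addFeed) {
  for (const feed of storedFeeds) addFeed(feed, stream)
}

// Mode 2 (passive): add nothing up front; only respond when the remote
// peer asks for a specific feed. If both peers do this, no transfer
// ever starts -- the deadlock described above.
function replicatePassive (stream, lookupFeed, addFeed) {
  stream.on('feed-requested', key => {
    const feed = lookupFeed(key)
    if (feed) addFeed(feed, stream)
  })
}
```

The missing middle ground would be a third mode that asks about a bounded subset per connection rather than all-or-nothing.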
maxogden commented Jan 4, 2017
What if we just used one connection per .replicate()?
pfrazee (Member) commented Jan 4, 2017
No, that wouldn't solve the problem. Basically, the problem is that hyperclouds are interested in too many hypercores. A peer will show up and the hypercloud will ask, "do you have anything new for 10 million cores?" Too thirsty.
We do want the hypercloud to ask about some of their cores. Just not all of them, every time.
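"Ask about some, not all" could be as simple as capping how many cores a new connection asks about, e.g. sampling a bounded subset per peer. The names here are entirely illustrative:

```javascript
// Pick at most `limit` core keys to ask a newly connected peer about,
// instead of announcing every stored core. Illustrative sketch only;
// a real policy might prioritize recently-active or stale cores.
function sampleCores (allKeys, limit) {
  const pool = allKeys.slice() // copy so the input isn't mutated
  const picked = []
  while (picked.length < limit && pool.length > 0) {
    const i = Math.floor(Math.random() * pool.length)
    picked.push(pool.splice(i, 1)[0])
  }
  return picked
}
```

With rotation across connections, every core still gets asked about eventually, but no single connection is flooded.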
joehand (Collaborator) commented Jan 4, 2017
> I'm guessing this means that hypercloud will, at minimum, announce all currently stored archives at the time of connect. That can't scale. Shouldn't the hypercloud sit and wait for requests, passively?
It's important to note that announcing is separate from opening the feed. In archiver-server, there is a random timeout to avoid flooding all those announcements, but it's still likely a problem.
Both are issues: 1) having many feeds open, and 2) announcing too many things at once.
> pfrazee: jhand: to clarify, there are two places where a flood could happen. The one you linked to is announcing on the discovery network. The other one, which max and I are discussing, is announcing feeds once a connection is established between peers
Ah!
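The random-timeout approach mentioned above might be sketched like this: spread the discovery announcements over a window rather than firing them all on connect. `announce` is a placeholder, not a real archiver-server function:

```javascript
// Spread discovery announcements randomly across `windowMs` so they
// don't all fire at once. Placeholder names; not the archiver-server
// API -- just the shape of the idea described in the comment above.
function scheduleAnnouncements (keys, announce, windowMs = 10000) {
  for (const key of keys) {
    const delay = Math.floor(Math.random() * windowMs)
    setTimeout(() => announce(key), delay)
  }
}
```

This smooths the burst but, as noted, doesn't reduce the total volume, which is why it's "still likely a problem" at scale.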
(Max and I clarified our points in IRC)
pfrazee commented Jan 4, 2017
If you look in hypercore-archiver, the replication code adds all stored feeds to the connection. (Its current usage, in archiver-server, does not set passive to false.)
I'm guessing this means that hypercloud will, at minimum, announce all currently stored archives at the time of connect. That can't scale. Shouldn't the hypercloud sit and wait for requests, passively?