
Cleaner web seed implementation #650

Open
feross opened this issue Mar 1, 2016 · 13 comments

@feross (Member) commented Mar 1, 2016

Ideally, web seeds would be handled directly as a special case instead of treating the web seed as another peer and interpreting wire protocol requests as HTTP requests. That was a clever hack to get this shipped quickly, but we can do better.

We can find pieces that have no blocks reserved and give those entire pieces to the web seed. The request size can even span multiple pieces.
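The piece-selection idea above could look something like this sketch (the `pieces` array, `reservations` set, and piece length are hypothetical stand-ins, not the actual torrent internals):

```javascript
// Sketch: find a run of consecutive pieces with no blocks reserved by peers,
// then cover the whole run with a single HTTP Range request.
const PIECE_LENGTH = 256 * 1024 // example piece size (an assumption)

function findFreeRun (pieces, reservations, maxPieces) {
  for (let start = 0; start < pieces.length; start++) {
    if (pieces[start].done || reservations.has(start)) continue
    let end = start
    while (
      end + 1 < pieces.length &&
      end + 1 - start < maxPieces &&
      !pieces[end + 1].done &&
      !reservations.has(end + 1)
    ) end++
    return { start, end }
  }
  return null
}

// One Range header can then span the entire run of pieces:
function rangeForRun (run) {
  const from = run.start * PIECE_LENGTH
  const to = (run.end + 1) * PIECE_LENGTH - 1
  return `bytes=${from}-${to}`
}
```

Because the range is computed from whole pieces, the response can be verified piece by piece as it streams in, and the reservation can be released if the request fails.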

Once this is done, we should revert this change to torrent-piece, which amounts to a hack: webtorrent/torrent-piece#1

@grunjol (Contributor) commented Mar 3, 2016

Last week, while testing Diego's changes to web seed handling (nice to see they are merged now!), I found two issues:

  1. As you mention, the request size sent to the server is currently the size of a piece, so depending on the piece size in the torrent definition, the web server gets hammered.

  2. If the web server has some kind of connection limit (like nginx's limit_conn addr), then once you complete the first pieces, the server starts sending 50x errors and the web peer is discarded. The special case should take the number of connections to the web seed into account.
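For reference, the kind of nginx per-IP limit described in point 2 looks like this (the zone name and the limit of 4 are illustrative values, not a recommendation):

```nginx
http {
    # Track connections per client IP in a 10 MB shared zone
    limit_conn_zone $binary_remote_addr zone=addr:10m;

    server {
        location /seed/ {
            # Reject more than 4 concurrent connections from one IP
            # (excess connections get a 503 by default)
            limit_conn addr 4;
        }
    }
}
```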

@DiegoRBaquero (Member) commented Mar 3, 2016

Another issue I found yesterday and shared on Gitter: when we have 10+ connections, each requesting 128 KB, and your internet connection is slow, one or more requests will time out, causing the web seed to be removed.

@grunjol (Contributor) commented Mar 3, 2016

Exactly. I was trying to understand why it opens 10+ connections... I tried to mitigate it with a choke(), but then all the requests get removed from the wire and the web seed gets removed from the client...

I think the solution is not to handle web seeds as peers in the swarm. Maybe go back to last year's approach and start a parallel 'HTTP download service' that reserves pieces and returns them if something goes wrong with the request.

@grunjol (Contributor) commented Mar 9, 2016

@DiegoRBaquero: the fix you sent reduces the number of requests when the web seed is slow, but it still calculates the number of connections with getPipelineLength, based on speed / piece size. If the web seed is FAST (like a CDN), the number of connections grows quickly, and by the third request you hit the per-IP limit again.

Instead I tried:

if (isWebSeed) maxOutstandingRequests = Math.min(maxOutstandingRequests, MAX_CONCURRENT_WEBSEED_CONNECTIONS) // A web seed will handle its real max requests

...and it works like a charm.

In my quick test I set MAX_CONCURRENT_WEBSEED_CONNECTIONS = 2, but it could be passed as a parameter in the torrent opts.
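Combining the cap with the suggested torrent option could look like this sketch (the function name, `isWebSeed` flag, and `maxWebSeedConns` option are placeholders following the comment above, not the library's actual API):

```javascript
// Sketch: cap the pipeline length for web seeds while leaving
// regular peers governed by the speed-based calculation.
const MAX_CONCURRENT_WEBSEED_CONNECTIONS = 2

function pipelineLength (isWebSeed, maxOutstandingRequests, opts = {}) {
  if (!isWebSeed) return maxOutstandingRequests
  // Allow the cap to be overridden via torrent opts
  const cap = opts.maxWebSeedConns || MAX_CONCURRENT_WEBSEED_CONNECTIONS
  return Math.min(maxOutstandingRequests, cap)
}
```

With this shape, a slow web seed keeps its small speed-based pipeline, while a fast CDN is clamped before it trips per-IP limits.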

@DiegoRBaquero (Member) commented Mar 9, 2016

@grunjol The idea of the PR is to set the max requests based on the current speed, not to impose a "reasonable" HTTP max-requests limit (which I'd put at 16-32 concurrent).

If the web seed is fast, as you say, then more requests can be made. If it's slow, there will always be a minimum of 2.

I don't think any web server limits concurrent requests per IP to fewer than 8. And I haven't seen any errors related to a connection limit, only requests timing out because of bandwidth capacity, which again is what my PR addresses.

@grunjol (Contributor) commented Mar 9, 2016

@DiegoRBaquero Yes, I understand your PR, and it limits the number of connections in the slow-bandwidth case.

I'm just saying that other server limits can be hit, such as maximum simultaneous connections (and yes, commercial and free streamers like Wowza or Nimble have per-IP bandwidth and connection limits), and those can be handled with a small modification. In those cases, the server returns HTTP 500 errors and the connection is discarded too.

I will send another PR to handle that case (the upper-limit case).

@DiegoRBaquero (Member) commented Mar 9, 2016

@grunjol Yeah, I think we should allow an optional value in the client for the web seed's max requests. Looking forward to the PR.

@grunjol (Contributor) commented Mar 9, 2016

I just sent it.

You can test the behavior in your test environment. I think hard-limiting the connections prevents the server from being hammered (think of it at scale: the number of server connections is limited), and it will also help in your slow-connection case.

@feross (Member, Author) commented Mar 9, 2016

a "reasonable" http max requests limit (which I'd say is in 16-32 concurrent)

Most browsers enforce no more than 6 concurrent connections to the same server, and this has been the case for a very long time. See: https://www.browserscope.org/?category=network

In an ideal world, we would make a single large request to fetch a large contiguous range (ideally around 2-5 pieces) and, once that is complete, make another request. There's no need for lots of concurrent connections.

@DiegoRBaquero (Member) commented Mar 10, 2016

Yes, but we are not targeting end-user requests. Web servers don't impose any limit by default, so unless one is specified (as grunjol suggested and PR'd), I don't see why we should hard-limit the number of max requests.

I like the idea of a sockets web seed. Thoughts?

@gpetrov (Contributor) commented Nov 4, 2016

Has this been improved yet? We don't see any info when web seeds get stuck.

@gpetrov (Contributor) commented Nov 6, 2016

@DiegoRBaquero Actually, we observed that when using a web seed, each piece is requested 16 times from the web seed server! Is this a bug, or is it related to the number of simultaneous connections to the web seed server? Maybe it is also connected to the choke behavior we see afterwards.
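The "16 times" figure is consistent with the wire protocol's standard 16 KiB block size: if the web seed translates each block request into its own HTTP request (as the original issue describes), a piece is fetched in pieceLength / blockLength separate requests. A quick check (the 256 KiB piece size is an assumption about this particular torrent; 16 KiB is the conventional block size clients request):

```javascript
// When every 16 KiB wire-protocol block becomes one HTTP request,
// a piece generates pieceLength / BLOCK_LENGTH requests.
const BLOCK_LENGTH = 16 * 1024   // conventional BitTorrent block size
const pieceLength = 256 * 1024   // assumed piece size for this torrent

const requestsPerPiece = pieceLength / BLOCK_LENGTH
console.log(requestsPerPiece)    // 16
```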

@stale (bot) commented May 3, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label May 3, 2018
@feross feross added accepted and removed stale labels May 3, 2018