Implement bandwidth limiting #3065

Open
whyrusleeping opened this issue Aug 9, 2016 · 57 comments
Labels: exp/expert (Having worked on the specific codebase is important), help wanted (Seeking public contribution on this issue), status/deferred (Conscious decision to pause or backlog), topic/libp2p (Topic libp2p)

Comments

@whyrusleeping
Member

whyrusleeping commented Aug 9, 2016

We need to place limits on the bandwidth ipfs uses. We can do this a few different ways (or a combination thereof):

  • per peer limiting on each actual connection object (see the sketch after this list)
    • pros:
      • low coordination cost (no shared objects between connections)
      • should have lower impact on performance than blindly rate limiting the whole process
    • cons:
      • no flow control between protocols, dht could drown out bitswap traffic
  • per subnet limiting
    • pros:
      • avoids rate-limiting LAN/localhost connections.
    • cons:
      • it's not always possible to tell what's "local" (e.g., with IPv6).
  • per protocol limiting on each stream
    • pros:
      • should have the lowest impact on system performance of the three options
      • each protocol gets its own slice of the pie and doesn't impact the others
    • cons:
      • increased coordination required, need to reference the same limits across multiple streams
      • still makes it difficult to precisely limit the overall bandwidth usage.
  • global limiting using a single rate limiter over all connections
    • pros:
      • will successfully limit the amount of bandwidth ipfs uses.
    • cons:
      • ipfs will be quite slow when rate limited in this way
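
To make the first option concrete, here is a minimal sketch (not actual go-ipfs/libp2p code; the package, names, and numbers are illustrative) of wrapping each peer's net.Conn in a token-bucket limiter using golang.org/x/time/rate:

```go
package bwlimit

import (
	"context"
	"net"

	"golang.org/x/time/rate"
)

// rateLimitedConn wraps a net.Conn and throttles reads and writes with
// independent token buckets (one per direction).
type rateLimitedConn struct {
	net.Conn
	readLim  *rate.Limiter
	writeLim *rate.Limiter
}

func (c *rateLimitedConn) Read(p []byte) (int, error) {
	n, err := c.Conn.Read(p)
	if n > 0 {
		// Charge the bytes we actually received; this blocks when the
		// peer is over budget, applying back-pressure to the sender.
		_ = c.readLim.WaitN(context.Background(), n)
	}
	return n, err
}

func (c *rateLimitedConn) Write(p []byte) (int, error) {
	// Reserve the bytes before sending so we never exceed the budget.
	if err := c.writeLim.WaitN(context.Background(), len(p)); err != nil {
		return 0, err
	}
	return c.Conn.Write(p)
}

// LimitConn applies a per-peer cap in bytes per second to each direction.
// The burst is one second's worth of budget, so callers should keep their
// read/write buffer sizes below that.
func LimitConn(conn net.Conn, bytesPerSec int) net.Conn {
	return &rateLimitedConn{
		Conn:     conn,
		readLim:  rate.NewLimiter(rate.Limit(bytesPerSec), bytesPerSec),
		writeLim: rate.NewLimiter(rate.Limit(bytesPerSec), bytesPerSec),
	}
}
```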

Related Issues:

@whyrusleeping added the help wanted, topic/libp2p, and exp/expert labels Aug 9, 2016
@whyrusleeping added this to the Resource Constraints milestone Aug 9, 2016
@slothbag

slothbag commented Aug 10, 2016

Here are two more related issues :)
#920
#1482

@k0d3g3ar

k0d3g3ar commented Sep 10, 2017

This is critical if you want mass adoption. No one is going to risk their own local Internet connection bandwidth unless they can control it. That means using a 3rd-party bandwidth limiter in front of IPFS, which is just more complexity that isn't necessary.

@fiatjaf

fiatjaf commented Sep 10, 2017

Perhaps use alternative C by default with low limits, but switch to A (or to no limit at all) when IPFS is in an "active" state. The "active" state would be when the user is actively downloading, adding, or pinning something (and for some time after that), or when they are using IPFS from some management GUI or JS app.

@EibrielInv

EibrielInv commented Sep 12, 2017

I was thinking of implementing (but never did) a script that alternates every ~120 seconds between "offline" and "online" mode. It could also read the number of connections and restart the client when it passes some threshold.
Something like this (a rough sketch follows the list):

  • Start client "online"
  • Wait 120 seconds
  • Kill client
  • Start client "offline"
  • Wait 120 seconds
  • Kill client
  • [Repeat]
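
A rough Go sketch of that alternating loop, assuming the ipfs binary is on PATH and that killing/restarting the daemon this way is acceptable; the 120-second phases are the ones described above:

```go
package main

import (
	"context"
	"log"
	"os/exec"
	"time"
)

// runFor starts "ipfs" with the given arguments and kills it after d.
func runFor(args []string, d time.Duration) {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()

	cmd := exec.CommandContext(ctx, "ipfs", args...)
	// CommandContext kills the process when the timeout expires.
	if err := cmd.Run(); err != nil {
		log.Printf("ipfs %v exited: %v", args, err)
	}
}

func main() {
	for {
		runFor([]string{"daemon"}, 120*time.Second)              // "online" phase
		runFor([]string{"daemon", "--offline"}, 120*time.Second) // "offline" phase
	}
}
```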

@voidzero

voidzero commented Jan 8, 2018

> global limiting using a single rate limiter over all connections
> cons:
> ipfs will be quite slow when rate limited in this way

Global limiting has my vote. And I'm not sure if this con is true in all cases: bandwidth of course already has a hard limit (the limit of the connection). So if I already have a max of 20mbit down / 2mbit upload, and I limit ipfs to half of this, that is still a decent amount of bandwidth, isn't it?

@guybrush

guybrush commented Mar 12, 2018

I think it would be best to do global limitation and then also limit per protocol relative to the global limit. For example let globalLimitUp = 1mbit/sec, globalLimitDown = 2mbit/sec and then every protocol gets its share of the available bandwidth depending on how important it is for ipfs to function properly.

Maybe I misunderstand the problem though; I just came here because I noticed the high use of bandwidth.

700 peers and 3.5 Mbps, both numbers climbing with no end in sight? I am on win10 and ipfs@0.4.13, running the daemon with ipfs daemon --routing=dhtclient.

@Stebalien
Member

Stebalien commented Mar 12, 2018

@guybrush FYI, you can limit the bandwidth usage by turning off the DHT server on your node by passing the --routing=dhtclient flag to your daemon.

@hitchhiker

hitchhiker commented Apr 23, 2018

This is essential; checking back on this. Without limiting, it's hard for us to package this in projects: we can't expect end users to accept such a heavy bandwidth requirement.

@whyrusleeping
Member Author

whyrusleeping commented Apr 23, 2018

Please just add an emoji to the issue itself to add your support. Comments in this thread should be reserved for discussion around the implementation of the feature itself.

@jefft0
Contributor

jefft0 commented Jun 5, 2018

I've been running an IPFS daemon for years without problems. But with the latest builds in the past couple weeks, I have a lot of delays in trying to load web pages or even ssh into another server. It's now at the point where I have to shut down the IPFS daemon to do some tasks. My stats are below. The bandwidth doesn't look so bad, so why does my network suddenly seem clogged?

$ for p in /ipfs/bitswap/1.1.0 /ipfs/dht /ipfs/bitswap /ipfs/bitswap/1.0.0 /ipfs/kad/1.0.0 ; do echo ipfs stats bw --proto $p && ipfs stats bw --proto $p && echo "---" ; done
ipfs stats bw --proto /ipfs/bitswap/1.1.0
Bandwidth
TotalIn: 1.1 MB
TotalOut: 6.1 kB
RateIn: 1.9 kB/s
RateOut: 0 B/s

ipfs stats bw --proto /ipfs/dht
Bandwidth
TotalIn: 41 kB
TotalOut: 3.2 kB
RateIn: 483 B/s
RateOut: 1 B/s

ipfs stats bw --proto /ipfs/bitswap
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s

ipfs stats bw --proto /ipfs/bitswap/1.0.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s

ipfs stats bw --proto /ipfs/kad/1.0.0
Bandwidth
TotalIn: 21 MB
TotalOut: 1.6 MB
RateIn: 164 kB/s
RateOut: 8.9 kB/s

@whyrusleeping
Member Author

whyrusleeping commented Jun 5, 2018

@jefft0 that's odd... those stats seem relatively normal. Are you seeing any odd CPU activity? What sort of bandwidth utilization does your OS report for ipfs? Also, how many connections does your node normally have?

Another question is, since you mentioned noticing this on recent builds, does running an older version of ipfs fix the problem?

@whyrusleeping
Member Author

whyrusleeping commented Jun 5, 2018

Also, cc @mgoelzer and @bigs, despite this being on the go-ipfs repo, this is definitely a libp2p issue. Worth getting on the roadmap for sure.

@jefft0
Contributor

jefft0 commented Jun 6, 2018

I solved the problem by restarting my Internet router, restarting the computer, wiping the IPFS build directory and rebuilding the current version (but keeping my current ~/.ipfs folder). I know this wasn't very methodical, but I was desperate. Next time I have bandwidth problems I'll try to figure out which one of these causes the problem.

@whyrusleeping
Member Author

whyrusleeping commented Jun 6, 2018

@jefft0 interesting. That's actually more helpful information than you could have provided, thanks

@whyrusleeping
Member Author

whyrusleeping commented Jun 6, 2018

Also, just so everyone watching this thread is aware, we have implemented a connection manager that limits the total number of connected peers. This can be configured in your ipfs config under Swarm.ConnMgr; see the config docs for more details.
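
For reference, a Swarm.ConnMgr section in the ipfs config file looks roughly like this (the numbers here are illustrative, not recommendations; check the config docs for the current defaults):

```json
{
  "Swarm": {
    "ConnMgr": {
      "Type": "basic",
      "LowWater": 100,
      "HighWater": 200,
      "GracePeriod": "20s"
    }
  }
}
```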

@bigs
Contributor

bigs commented Jun 6, 2018

Definitely a fan of the per-protocol limiting. Perhaps this could be handled with a weighting system? Assign weights to protocols and then set global settings (i.e. throttle after this amount of transfer per duration, halt all transfers after this limit within the duration).
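
As a rough illustration of that weighting idea (purely a sketch under assumed protocol IDs and weights, not an existing libp2p API), per-protocol limiters could be derived from a single global cap like this:

```go
package bwlimit

import "golang.org/x/time/rate"

// protoWeights expresses how important each protocol is; the weights are
// illustrative, not tuned values.
var protoWeights = map[string]float64{
	"/ipfs/bitswap/1.1.0": 0.6,
	"/ipfs/kad/1.0.0":     0.3,
	"/ipfs/id/1.0.0":      0.1,
}

// perProtocolLimiters splits a global bytes-per-second budget into one
// token-bucket limiter per protocol, proportional to its weight.
func perProtocolLimiters(globalBps float64) map[string]*rate.Limiter {
	var total float64
	for _, w := range protoWeights {
		total += w
	}
	out := make(map[string]*rate.Limiter, len(protoWeights))
	for proto, w := range protoWeights {
		share := globalBps * w / total
		// Burst of one second's share keeps short spikes cheap.
		out[proto] = rate.NewLimiter(rate.Limit(share), int(share))
	}
	return out
}
```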

@leshokunin

leshokunin commented Aug 3, 2018

Very cool to see progress! How's the bandwidth cap (e.g. 50 kB/s) coming along? It'd be super useful for our desktop client :)

@douglasmsi

douglasmsi commented Aug 24, 2018

Is there any news on this topic?

@Stebalien
Member

Stebalien commented Aug 24, 2018

Not at the moment. The current recommended approach is to limit bandwidth in the OS.

@dcflachs

dcflachs commented Oct 24, 2019

@Bluebie Based on this it looks like µTP transports are already in the works down at the libp2p layer.

@dbaarda

dbaarda commented Oct 25, 2019

Note that the widely used packet-loss-based congestion control for TCP is known as CUBIC. There is now a latency-based TCP congestion control algorithm, BBR (released/used by Google), that is widely available in Linux and starting to be enabled by default on some distros.

BBR is MUCH better than CUBIC. I believe it's also used inside QUIC, which uses UDP. The really nice thing about it is that unlike CUBIC, which ends up slowly tending towards keeping all buffers along the path full, it quickly converges on keeping the buffers almost empty, which not only avoids packet-loss hiccups almost entirely but also minimizes latency.

The main argument for static limits is that they are easy for end users to understand, and some users really do want to bandwidth-limit a background service because they have bandwidth quotas they don't want to exhaust. So congestion is only one part of the reason; quotas are the other.

Perhaps a mixture of congestion/latency based control settings for transient spikes and overall "upto 10GB/month" quota settings would be best.

@Bluebie

Bluebie commented Oct 27, 2019

It also looks like browser vendors are aiming to implement a proposal called WebRTC RMCAT which adds delay based congestion control among other things to WebRTC traffic, so it looks like we should be able to have good congestion control without global rate limits across all platforms in the future.

I really like the "upto 10GB/month" quota proposal. I think that would be really useful. For example, I'd love to run an IPFS node on my Linode server and enable relay to help support users who are badly stuck behind a NAT, but it's important for me that it doesn't come at an extra cost, so being able to donate, say, 800GB of that sort of support to the ecosystem over a month would be great to take the risk out of doing something like that. Maybe an implementation of that idea could be smart about prioritising generous services: limit participation in relay work first, then, when it gets even closer to the limit, start to get more aggressive in cutting off other services, so that things like DHT work can be prioritised and relaying doesn't end up eating the quota quickly and effectively disconnecting the node from the DHT.

@lordcirth

lordcirth commented Oct 27, 2019

With a simple hard limit, I would be concerned that we'd get spikes early in the month and then nothing later. There would definitely need to be more smarts involved than that. Perhaps a daily limiter could work better?

@Bluebie

Bluebie commented Oct 27, 2019

On the UI side I feel it would be good to be able to define it as one day, one week, or one month because that’s how people tend to think about data quotas.

On the implementation side, the first thing I'd want to try is this (a rough sketch follows the list):

  1. On startup, choose a random number between 0 and 59 inclusive.
  2. Set up a counter of how many bytes have been used for relay work.
  3. Set a timer so that every hour, at the randomly chosen minute, the counter value is appended to the end of an array and reset to 0. If the array contains more than 24*7 entries, delete the oldest entry.
  4. Set up another timer, executing every, let's say, five minutes, which sums all the numbers in the array and checks whether the usage over the past week of hours is under 1 Week Quota * 0.99; if it is, enable relay. If relay is already enabled, check whether the current usage over the past week of hours is over 1 Week Quota, and if it is, disable relay.
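
A compact Go sketch of those steps (the names and structure are hypothetical; the hourly rollover would be driven by the timers described above):

```go
package bwlimit

// hourlyQuota keeps roughly one week of hourly relay-byte counts and
// decides whether relaying should currently be enabled.
type hourlyQuota struct {
	weekQuota float64   // bytes allowed per rolling week
	current   float64   // bytes counted in the current hour
	hours     []float64 // up to 24*7 past hourly counts
}

// Record adds bytes used for relay work in the current hour.
func (q *hourlyQuota) Record(n float64) { q.current += n }

// Rollover is called once an hour (at the random minute offset): it pushes
// the current count into the history and trims the history to one week.
func (q *hourlyQuota) Rollover() {
	q.hours = append(q.hours, q.current)
	q.current = 0
	if len(q.hours) > 24*7 {
		q.hours = q.hours[1:]
	}
}

// RelayEnabled sums the past week and applies the 0.99 hysteresis band:
// re-enable below 99% of quota, disable at or above the full quota.
func (q *hourlyQuota) RelayEnabled(currentlyEnabled bool) bool {
	total := q.current
	for _, h := range q.hours {
		total += h
	}
	if currentlyEnabled {
		return total < q.weekQuota
	}
	return total < q.weekQuota*0.99
}
```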

Then the nodes should gently alternate relay on and off as needed to keep it close to the quota. Choosing the random minute offset at startup would help ensure the whole network doesn't have more relays at the start of an hour and fewer near the end. The nice thing about it is that it would average the availability over time, in little five-minute chunks whenever they're in budget, and hopefully not add too much work to the node in executing timer functions.

We could reduce how often timers run by only executing the quota check in response to actual relay requests, and maybe just turning the relay back on during the once-an-hour quota log turnover in step 3. So once a relay request comes in that exceeds the quota, it shuts off, and re-enables once more quota has become available, in a random amount of time that's no more than an hour from that moment.

@DanielMazurkiewicz

DanielMazurkiewicz commented Oct 27, 2019

> On the UI side I feel it would be good to be able to define it as one day, one week, or one month because that's how people tend to think about data quotas.

Periodic quota limit auto-renewal should be optional, in my opinion. I would be happier if I could define profiles with connection speeds and data quotas that would show up to choose from after I crossed the limits of the current profile.

Predefined profiles could be shipped as a solution for users who are not familiar with the technicalities.


As for me, it would be nice to have profiles consisting of:

Limiting options per kind of traffic:

  1. Ecosystem
  2. Data

Limiting options

  1. Connection speed (e.g. kbps)
  2. Data quota (e.g. MiB)
  3. Factor of received data (e.g. 2x)

@dbaarda

dbaarda commented Oct 27, 2019

> On the implementation side, the first thing I'd want to try is this:

Don't do it that way. You want to use a control-systems approach, which will actually be simpler and work better. Use a low-pass filter of the transmit rate or, effectively the same thing, an exponentially decaying traffic count, like this:

  c = (dc + c) * T / (T + dt)

Where:

  dt is the time since c was last updated.
  dc is the amount of traffic sent since the last time c was updated.
  T is the time over which you are averaging.
  c is effectively the amount of traffic sent in the past T time.

This is cheap enough (in compute and storage) that you can calculate it on every transmit/lookup, or you could do it periodically. It is important to keep dt small compared to T for it to be accurate. It's not a 100% accurate measure of the traffic in the last T time, but it's close enough and behaves better than moving windows when used as an input to control something.
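
A minimal Go sketch of that decaying count, assuming traffic is reported to it as byte deltas (the names are illustrative):

```go
package bwlimit

import "time"

// decayingCounter tracks roughly "bytes sent in the past T" using the
// exponentially decaying count described above: c = (dc + c) * T / (T + dt).
type decayingCounter struct {
	T    time.Duration // averaging window
	c    float64       // decayed byte count
	last time.Time     // when c was last updated
}

// Add records dc bytes of traffic, decays the running count, and returns it.
func (d *decayingCounter) Add(dc float64, now time.Time) float64 {
	if d.last.IsZero() {
		d.last = now
	}
	dt := now.Sub(d.last)
	d.c = (dc + d.c) * d.T.Seconds() / (d.T.Seconds() + dt.Seconds())
	d.last = now
	return d.c
}
```

A quota controller could then read the decayed count periodically and toggle (or proportionally scale) relaying against the weekly budget, as discussed below.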

Toggling enable relay on/off is a pretty rough control mechanism, so you'll need a gap/on-off/bang-bang controller. You would need to tune the "gap" based on the rate of change of c (which depends on T and your available bandwidth) and on what rate of enable-relay toggling IPFS can handle.

However, proportional control is usually better. If you already have a proportional controller of transmit bandwidth for congestion control, a much better idea would be to integrate the quota signal into it, so that transmit bandwidth is controlled by a single proportional control system that takes into account both congestion and quota signals.

@calroc

calroc commented May 10, 2020

I've been told that this sort of thing should be done at the OS level. Maybe a blog post or two on how to do that (or how to figure out how to do that) would be useful?

E.g. if you were to spin up a VM on digital ocean to pin some content, how to ensure you're not going to get a surprise on the bill from excess bandwidth?

@Stebalien
Member

Stebalien commented May 13, 2020

How to do this really depends on your situation, and Google will likely give you a better solution for a specific situation than a general-purpose "here's how to restrict bandwidth" blog post.

@davidak

davidak commented May 13, 2020

BitTorrent clients usually have a feature to limit bandwidth, so users might expect it from this software too.

When general solutions are available, please link to them in the documentation. A user might not be a professional in this field and just want to spin up an "IPFS node" on some hosting provider, like @calroc said.

@calroc

calroc commented May 13, 2020

@Stebalien consider this scenario:

A user searches for information on "IPFS bandwidth limiting" and finds this ticket. It's closed, so they hit End to see what the resolution was, and there's a link in the closing comment to a brief article describing the specific situations you might find yourself in, with links to solutions (on third-party sites or wherever such information is to be found).

Or, they find this ticket, open, three-and-a-half years old, and you telling people to RTFM.

See, it's not a matter of addressing the ticket: the question is do you want this ticket to represent IPFS project policy on this issue?

FWIW, the answer I'm looking for is already in here. (To wit: use trickle. I did. IPFS+Trickle sat on a DO droplet and worked fine for so long I forgot it was running. One day last week I clicked on an old IPFS Cloudflare gateway URL and was surprised my content was still accessible; that's how I was reminded! lol)

So just "bless" trickle with an official mention in the docs? And, while you're at it, mention that bandwidth usage might be something the hobbyist user might want to be aware of? It would suck for an unexpected bill to be a part of someone's maiden voyage with IPFS, eh?

Last but not least, how do the IPFS devs and community deal with this in general? Do y'all just know what to do without thinking about it? Or run in datacenters with phat pipes? If you have ways and techniques I'd love to hear about them, on the other hand if this isn't really a problem for you I'd love to know what I'm doing wrong?

@Stebalien
Member

Stebalien commented May 13, 2020

@lordcirth

lordcirth commented May 14, 2020

Last time I tried using Trickle with IPFS, it only limited the main thread, and all the other threads, which carried most of the network traffic, were unlimited. Is there a flag to get around that?

@calroc

calroc commented May 15, 2020

@Stebalien cheers! I really want to use and promote IPFS and I sincerely believe this would help.

@lordcirth it was over a year ago that I last tried it. Something may have changed in the meantime, but back then, IIRC, Trickle did limit IPFS overall, not just the main thread.

@constantins2001

constantins2001 commented May 16, 2020

I also would like to use IPFS in a P2P CDN, but as I'm unable to provide users with bandwidth limitation settings and this issue hasn't really progressed in years, I think IPFS isn't a fit (sadly).

@Clay-Ferguson

Clay-Ferguson commented Aug 6, 2020

I read all the above comments, but I'm still unsure what the final disposition of this issue was.

Here's my docker compose definition for IPFS, in case anyone familiar with docker has any input, or suggestions, and also in case it helps others:

ipfs:
    container_name: ipfs
    environment:
        routing: "dhtclient"
        IPFS_PROFILE: "server"
        IPFS_PATH: "/data/ipfs"
    volumes:
        - '${ipfs_staging}:/export'
        - '${ipfs_data}:/data/ipfs'
    ports:
        - "4001:4001"
        - "8080:8080"
        - "5001:5001"
    networks:
        - net-prod
    image: ipfs/go-ipfs:release

I'm shooting for the minimal viable low-bandwidth use case configuration with no swarms, just a single instance of everything. The above config seems to work just fine, but I'm unsure if it's using the least possible bandwidth, or not.

@Stebalien
Member

Stebalien commented Aug 10, 2020

Enabling the "lowpower" profile (ipfs config profile apply lowpower) should help. That will disable background data reproviding, set really low connection manager limits, and put your node into dhtclient mode.

@bAndie91

bAndie91 commented Sep 28, 2020

Regarding bandwidth limitation, have you considered limiting it externally (OS-level)?
It would need ipfs to mark connections according to what kind of traffic they carry (DHT, bitswap, meta/data transfer, etc.), so traffic could be controlled by e.g. tc under Linux. It would limit bandwidth adaptively, so unlike trickle it uses spare bandwidth.

see this idea here: https://discuss.ipfs.io/t/limiting-bandwidth-at-os-level/9102

@lordcirth

lordcirth commented Sep 28, 2020

> Regarding bandwidth limitation, have you considered limiting it externally (OS-level)?
> It would need ipfs to mark connections according to what kind of traffic they carry (DHT, bitswap, meta/data transfer, etc.), so traffic could be controlled by e.g. tc under Linux. It would limit bandwidth adaptively, so unlike trickle it uses spare bandwidth.
>
> see this idea here: https://discuss.ipfs.io/t/limiting-bandwidth-at-os-level/9102

Some people just want to prevent slowing their connection, but others want to avoid hitting their bandwidth caps. So preferably we'd want support for both adaptive and capped bandwidth. It would be better UX to allow configuring bandwidth in IPFS, rather than 3 different sets of instructions for OS firewalls.

@bAndie91

bAndie91 commented Sep 28, 2020

@lordcirth
Actually, offloading traffic control to an external component can end up as bandwidth capping, prioritization, or anything else, depending on the external logic. My concern with provisioned bandwidth is that it does not use idle capacity but still exerts pressure when the bandwidth limit is reached.
Although these two solutions can co-exist: basic users would use the embedded bandwidth-capping settings, and professional operators would set up their firewall with the help of ipfs packets marked according to ipfs traffic types.

@ciprianiacobescu

ciprianiacobescu commented Dec 31, 2020

Good things are done by design (by professionals like the IPFS devs). IPFS is too awesome and useful to remain just a nice thing that experts/hackers use.
End users expect things to work under real-life conditions. Web3 will be a reality when each user has their own ipfs node, be it a phone, tablet, notebook...

Just to mention, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0175360#sec008 may offer some pointers on how to deal with the problems of discovery, querying and routing.

@christroutner

christroutner commented Apr 10, 2021

I'm reviving this topic, as it seems to be a show stopper with regard to UX. I captured this bandwidth issue in a YouTube video, showing exactly what the issue is and why it's so detrimental to front end UX:

Solutions I've tried:

  • Limit the connections by setting LowWater at 20 and HighWater at 40.
  • Using the 'low-power' profile
  • Tried to block the ipfs.io nodes that seem to be the source of the bandwidth, but was not successful.

Seems to me there are two possible solutions:

  • Create a blacklist filter to block nodes that push too much bandwidth.
  • Create a per-peer or overall bandwidth limit setting as part of the IPFS node software.

If anyone else has a proposed solution to this problem, I'm keen to try it.

I've also cross-posted this response on this IPFS discussion board thread.

@MysticRyuujin

MysticRyuujin commented Apr 19, 2021

I'm just here for the comments, on this 5 year old issue....

@aschmahmann
Contributor

aschmahmann commented Apr 19, 2021

@christroutner it looks like the issues you are running into are occurring with js-ipfs not go-ipfs. I'll put some thoughts on what you may want to look into with your js-ipfs nodes in the forum post.

@calroc

calroc commented May 6, 2021
