Resource Constraints + Limits #1482

jbenet · 2015-07-15T19:28:06Z

jbenet · 2015-07-15T19:28:29Z

The most pressing are:

node.repo.storage_max
node.network_bandwidth_max

jbenet · 2015-07-15T19:29:53Z

@rht would this be an issue you could work on? it's needed sooner than later. particularly node.repo.storage_max (+ running GC if we get close to it) and node.network_bandwidth_max.

@whyrusleeping your help will be needed no matter who implements this.

whyrusleeping · 2015-07-15T19:36:13Z

@jbenet yeap. My concern is that before we even think about configurable limits and such, we need to determine how the system behaves when you are out of a certain resource, whether thats open connections, disk space, or memory. Once we determine how a limit will be manifest in the application, we can start setting those limits.

jbenet · 2015-07-15T19:55:15Z

We already know how some of those would behave, for example, disk. Trigger gc after a threshold, and stop accepting blocks after the limit.

—
Sent from Mailbox

On Wed, Jul 15, 2015 at 12:36 PM, Jeromy Johnson notifications@github.com
wrote:

@jbenet yeap. My concern is that before we even think about configurable limits and such, we need to determine how the system behaves when you are out of a certain resource, whether thats open connections, disk space, or memory. Once we determine how a limit will be manifest in the application, we can start setting those limits.

Reply to this email directly or view it on GitHub:
#1482 (comment)

whyrusleeping · 2015-07-15T20:12:21Z

okay, when we stop accepting blocks, how does that affect the user? Do we just start returning 'error disk full' up the stack everywhere? (probably)

jbenet · 2015-07-16T00:02:49Z

yeah, it's a write error. same would happen if the OS's disk got full.

On Wed, Jul 15, 2015 at 1:12 PM, Jeromy Johnson notifications@github.com
wrote:

okay, when we stop accepting blocks, how does that affect the user? Do we
just start returning 'error disk full' up the stack everywhere? (probably)

—
Reply to this email directly or view it on GitHub
#1482 (comment).

davidar · 2015-09-14T04:14:07Z

👍 the daemon keeps consuming my meager ADSL upload bandwidth

jbenet · 2015-09-14T05:12:01Z

These are a big deal, we should get back on these.

slothbag · 2015-11-08T06:37:24Z

My VPS runs out of RAM pretty quickly with IPFS consuming 80% of it (this is not adding, just idling).. other daemons start to shut down due to out of memory.

Granted my VPS has only 128 or 256mb (cant remember which), but still, I would think its possible to seed some content with minimal resources.

jbenet · 2015-11-10T07:03:24Z

agreed. we should start adding memory constraints as tests for long running nodes to ipfs

rht · 2015-11-24T03:29:16Z

Update here:

Datastore.StorageMax, Datastore.StorageGCWatermark has been implemented. However, it'd say it would consume much less resource to simply calculate / keep track of number of hashes stored in datastore.
For network bandwidth, I haven't found a battle-scarred rate limiting lib to use (there are plenty, but haven't reviewed them), but meanwhile I propose that unit-less constraint can be implemented with golang.org/x/net/netutil, to limit the number of simultaneous connection to the http api/gateway.
swarm bandwidth has been indirectly constrained through the fd limit https://github.com/ipfs/go-ipfs/blob/20b06a4cbce8884f5b194da6e98cb11f2c77f166/p2p/net/swarm/swarm_dial.go#L44 -- if this fdconstraint doesn't exist, does limiting the number of swarms indirectly limits number of fd dials, @whyrusleeping ? If so, it is more intuitive to just limit the swarms, and expose this to config.
memory. I don't need to run the ipfsnode long enough to require double C-c to kill it (is this an evidence of zombie goroutines?). More systematic mem leak reports would open path here.

jbenet · 2015-11-30T08:56:26Z

Thanks for update @rht

Re limits, i think people will mostly want to set hard BW caps in explicit KB/s.

SCBuergel · 2015-12-15T18:21:42Z

What other things are we interested in limiting?

I just randomly found this discussion while trying to limit the overall output traffic (per day / month). I think limiting output traffic could be an interesting thing (especially with respect to file coin one day) as egress traffic is typically limited in cloud settings like AWS or Azure. There I am fine with temporary spikes of high bandwidth as long as my output traffic stays within some bounds per unit of time. Setting a limit per hour / day / month might make sense to prevent from blowing a months volume in a day / hour.

PlanetPlan · 2016-01-17T20:04:07Z

Hi, thanks very much for IPFS.

I did not carefully read the above, so some of the following may be duplicates. This is all long-term things to think about, nothing that is a headache for me right now. The following are some usage models that may suggest features for controlling resources:

My normal network connection is slow by many standards. When I am using the network interactively, I'd like IPFS to avoid/reduce background traffic, though stills serve my foreground file requests at full bandwidth. When I am idle (not interactive), I'd like IPFS to ramp up network usage so my system can be a friendly member of the caching/serving community.
A similar comment applies to IPFS disk bandwidth and CPU usage: back off when I am interactive, use freely when I am idle.
I want actual files to be cached someplace other than ~/.ipfs so they are not part of my backup state.
On a laptop, I have some network connections that are pay-per-byte. I'd like to leave IPFS enabled so I can use it, but I'd like to be an "unfriendly" member of the community because network traffic costs are quite high. Conversely, when I am on a fast/cheap network, I'd like to build up "credit" so I get good service when I am on a high-price network and being "momentarily unfriendly".
A similar comment applies to removable media: I have limited built-in storage on a laptop and so often plug in a removable drive when relatively "stationary". It would be useful to have both a "for sure" area for IPFS on the built-in drive plus an "optional" area on removable drives.

clownfeces · 2016-07-02T10:40:50Z

For vpn users, being able to limit the maximum number of connections is a very important feature, since many vpns automatically disconnect you if you have to many open connections (it's probably some sort of protection to fight spammers and ddosers). IPFS by default creates hundreds of connections, so its barely usable, unless you don't care if you regularly get disconnected.

davidak · 2016-08-06T17:06:45Z

I want to report some resource usage stats:

I have an ipfs node version 0.4.2 running on a VM with 1 core and 1 GB RAM. No files added or pinned!

It uses 465 MB RAM just to keep connections to 214 peers open. (are that all running nodes?)

Kubuxu · 2016-08-07T19:27:38Z

It means that it is directly in connection with 214 peers, those are live nodes in the network, we might want to start limiting that. Deluge (torrent client) by default allows for 200 connections and only 50 active at the time, but it uses utp which we were unable to do successfully due to utp lib for Go hanging.

@davidak is that netdata collector for IPFS? Looks nice, have you published it somewhere?

davidak · 2016-08-07T21:07:05Z

@Kubuxu the IPFS netdata plugin just got merged some minutes ago ;)

netdata/netdata#761

fiatjaf · 2016-08-08T22:54:22Z

What bothers me is the network usage:

Makes even ssh'ing to my VPS horribly slow.

slothbag · 2016-08-08T23:24:44Z

I've had some luck using linux "tc" command to throttle IPFS down to about 10KB/s outbound.. this has the side-effect of dropping incoming down to about 15-20KB/s

I can see IPFS is using 100% of its allocated 10KB/s all day every day, but at least I can calculate how much bandwidth that is per month to ensure I don't go over my quotas.

And a nice bonus is it significantly reduces memory usage, which is now hovering around 50-100Mb.

jbenet · 2016-08-08T23:40:47Z

@slothbag does it work in that condition?
On Mon, Aug 8, 2016 at 19:24 slothbag notifications@github.com wrote:

I've had some luck using linux "tc" command to throttle IPFS down to about
10KB/s outbound.. this has the side-effect of dropping incoming down to
about 15-20KB/s

I can see IPFS is using 100% of its allocated 10KB/s all day every day,
but at least I can calculate how much bandwidth that is per month and
ensure I don't go over my quotas.

And a nice bonus is it significantly reduces memory usage, which is now
hovering around 50-100Mb.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1482 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAIcoVvIbJott0tMhgPRIq8P8Kv9pdA3ks5qd7q9gaJpZM4FZWAT
.

pataquets · 2017-08-28T20:12:36Z

Where applicable, different bw limits for pinned items would be a nice feature to have. Users might be more inclined to providing bandwidth for files they find important enough to pin.

ajbouh · 2017-08-28T20:47:13Z

For my own use case this would be quite valuable. Not all workers I add to the network should serve all files equally. Files they create should be served with much higher priority than files they need and mirror.

…

On Aug 28, 2017 1:12 PM, "Alfonso Montero" ***@***.***> wrote: Where applicable, different bw limits for pinned items would be a nice feature to have. Users might be more inclined to providing bandwidth for files they find important enough to pin. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#1482 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAAcnUNsThKVpfxBeOBajwvOi7IY7Guuks5scx8_gaJpZM4FZWAT> .

dokterbob · 2017-09-13T08:03:54Z

+1 for node.memlimit

Although @jbenet suggests we can have this done on a higher level, a long-running actively used IPFS daemon will currently eat all memory available on a system which basically means that, without memory constraints it will not be stable.

Obviously, the memory footprint (#3318) could be reduced but given that the project moves forward very fast feature wise, there will be new kinds of memory waste popping up.

haasn · 2017-11-01T02:47:42Z

ipfs for me has several hundreds of open connections, which triggers a number of warning mechanisms including TCP resets/s (many dozens) and makes it look like a network scan.

Connecting to this many peers seems insane for a p2p network. Being able to limit this would be a high priority for me.

whyrusleeping · 2017-11-01T02:50:27Z

This is resolved in the next release, try out the release candidate for 0.4.12

…

On Tue, Oct 31, 2017, 9:47 PM Niklas Haas ***@***.***> wrote: ipfs for me has several hundreds of open connections, which triggers a number of warning mechanisms including TCP resets/s (many dozens) and makes it look like a network scan. Connecting to this many peers seems insane for a p2p network. Being able to limit this would be a high priority for me. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1482 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABL4HHqGd5sX9DGVcO8-4sWzZ6S6pqnPks5sx9vQgaJpZM4FZWAT> .

gwpl · 2018-01-17T12:58:18Z

I need also limit for maximum open files! (causes: #4589 )

KrzysiekJ · 2018-01-23T14:42:13Z

@whyrusleeping: go-ipfs v0.4.13 still maintains several hundreds of open connections.

whyrusleeping · 2018-01-23T18:07:18Z

@KrzysiekJ Yeah, DHTs need to maintain a decent number of open connections for proper functioning. You can tweak it lower in your configuration file, Look for Swarm.ConnMgr

EternityForest · 2018-02-05T07:06:01Z

Does the DHT actually need to maintain large numbers of connections to work? It seems like you need to know the locations of a good number of DHT peers, but why actually connect to them?

Can't we just keep a list of a few thousand peers, and figure out if they're still up if/when they're needed?

Connectionless DHT queries should only take 1 UDP round trip per hop if you don't use a handshake or encryption, and it's not like you can't monitor someone pretty easily as is(Connect to them, and watch their wantlist broadcasts).

Congestion doesn't seem like it should be that much of an issue, especially if you limit retries, If they aren't there after 3 or 4 attempts, you just assume they aren't online anymore and try a different path.

An advantage of connectionless is that you can potentially store the last known IP of millions of nodes, meaning most of the network can be within 2 or 3 hops.

That has the issue of concentrating traffic on a few nodes for popular content, but I suspect there's ways of managing that.

Stebalien · 2018-02-07T02:32:31Z

Does the DHT actually need to maintain large numbers of connections to work? It seems like you need to know the locations of a good number of DHT peers, but why actually connect to them?

Correct. Unfortunately, we don't have any working UDP based protocols at the moment anyways. However, we're working on supporting QUIC. While this wouldn't be a connection-less protocol, connections won't take up file descriptors and we can save memory/bandwidth by "suspending" unused connections (remember the connection's session information but otherwise go silent).

In the future, we'd like a real packet transport system but we aren't there yet. The tricky part will be getting the abstractions right will take a bit of work because we try to make all parts of the IPFS/libp2p stack pluggable.

Connectionless DHT queries should only take 1 UDP round trip per hop if you don't use a handshake or encryption, and it's not like you can't monitor someone pretty easily as is(Connect to them, and watch their wantlist broadcasts).

The encryption isn't just about monitoring, it also prevents middle boxes from being "smart". However, as we generally don't care about replay or perfect forward secrecy for DHT messages, we may be able to encrypt these requests without creating a connection (although that gets expensive if we send more than one message). Again, the tricky part will be getting the abstractions correct (and, in this case, not creating a security footgun).

An advantage of connectionless is that you can potentially store the last known IP of millions of nodes, meaning most of the network can be within 2 or 3 hops.

Unfortunately, IPFS nodes tend to go offline/online all the time. Having connections open helps us keep track of which ones are online. However, the solution here is to just not have flaky nodes act as DHT nodes.

andrewchambers · 2018-04-09T01:19:59Z

FWIW: Many operating systems provide facilities for limiting all of those things e.g. consider using linux containers and separate disk partitions. It is then up to ipfs to just handle error conditions returned by the OS properly.

dokterbob · 2018-04-09T10:20:15Z

Your suggestion is strongly against the long standing common practice of Unix daemon design, where daemons should manage their own footprints and only in error conditions should the OS interfere. For example, most forking servers allow the amount of processes to be limited (i.e. Apache, PHP-FPM, Postfix, etc.). Many also allow limits to the memory used (i.e. Elasticsearch, MySQL). In addition, for disk caches it's normal to have hard and soft limits set. Most system administrators consider a daemon that, when unrestrained, just eats up all the resources in a system, to be badly designed. Only recently have these types of behaviours become sonewhat tolerated, but really only amongst users of stuff like Docker. Mind you, many operating systems do not support such newfangled tools and whether or not it is actually safe to rely on will have to be proven (consider the large amount of security issues in the early Xen days). andrewchambers <notifications@github.com> schreef op 9 april 2018 02:20:09 GMT+01:00:

…

FWIW: Many operating systems provide facilities for limiting all of those things e.g. consider using linux containers and separate disk partitions. It is then up to ipfs to just handle error conditions returned by the OS properly.

-- Verstuurd vanaf mijn Android apparaat met K-9 Mail. Excuseer mijn beknoptheid.

Macil · 2018-04-09T16:24:16Z

If you make the OS / docker limit the memory that ipfs uses, then will ipfs be careful to use less than that amount? If not, ipfs might just keep charging headfirst into the limit and get regularly killed/restarted by the system.

dokterbob · 2018-04-09T16:25:35Z

That’s the exact behaviour I’ve been observing for a system with high load.

Kubuxu · 2018-04-10T00:49:28Z

We would hard limit the amount of used memory if Golang allowed for it but it does not.
This means we can only chase bugs and try to fix them to limit the memory usage.

Macil · 2018-04-10T20:19:14Z

I don't want limits in order to limit the impact of bugs; I'm worried about limiting the amount of memory that ipfs uses under arbitrarily high load. I want to do things like set ipfs to refuse or queue new connections if it's processing too many right now, etc.

Kubuxu · 2018-04-11T01:49:14Z

@agentme this isn't a problem right now. Currently, AFAIK, most memory issues are due to bugs.

CocoonCrash · 2018-06-05T17:48:24Z

Bugs happen and no one should only rely on the fact no problems will happen once known ones are corrected. I think most limitations as mentioned by @jbenet are necessary as is a seatbelt while driving.

Golang can't have ressource consumption limits set, but "breathing sleep" of some milliseconds can get coded for an end user not to "loose control" of its device for example. And/or the number of effective TCP connections/used bandwidth could also be limited as those are part of the software design.

My personal understanding about this is that one of the numerous goals of IPFS is efficiency, so consuming a lot of ressources (cpu, memory, bandwich) on edges while in idle mode is not an option as it could be seen as an "uncontrolled" software. Would you want a computer knowing that if connected to internet it couldn't get used as it's ensuring everything is working well? Remember me antivirus running on Windows years ago.

I'm far from an IPFS/Libp2p expert, but maybe each node could implement a pub/sub like scheme to open only one connection to listen for heartbeats sent from other nodes referencing it. And when a node's heartbeat is missing for too long it could trigger the DHT routing table to be renewed the regular TCP way. That would be a compromise between UDP/TCP as discussed by @loadletter and @whyrusleeping earlier.

This could also be used to optimise/adapt routing as it could offer a pseudo-latency or workload/availability shared monitoring between nodes, even if I think libp2p already implement many close things as of nodes auto discovering on a common network or that IPFS intends to work even if part of the network get splited in subnetworks etc...

I really hope this will get improved as I think it currently is an adoption barrier. IPFS is a really great and promising thing, and I really thank every designer/contributor for all the work done, but I also would really love seeing it spreading to the whole universe ;)

theduke · 2018-10-05T12:20:33Z

As a note of reference, I had problems with ipfs-daemon consistently killing my WiFi connection after a few minutes. I had to disconnect and reconnect manually. (OS: Arch Linux + NetworkManager).

After limiting the maximum connections to 300 (with Swarm.ConnMgr.HighWater).

It works fine now, but this is really bad for the average user where they might just not understand why their internet is suddenly so slow or not working correctly.

The default setup should be very conservative with resources used.

priom · 2021-07-13T17:57:16Z

Any new update on this?

guseggert · 2022-03-04T16:06:24Z

Libp2p has recently added a "resource manager" which we are working to integrate with go-ipfs, we are planning to release it in v0.13 (there is a chance it could be delayed to v0.14).

More info: https://github.com/libp2p/go-libp2p-resource-manager

jbenet mentioned this issue Jul 15, 2015

Repo Size Constraints #972

Open

rht self-assigned this Jul 16, 2015

jbenet mentioned this issue Jul 27, 2015

Sprint Q ipfs/team-mgmt#23

Closed

43 tasks

jbenet mentioned this issue Aug 3, 2015

Sprint R - July 27 ipfs/team-mgmt#24

Closed

35 tasks

jbenet mentioned this issue Oct 8, 2015

Need statistics on what content is being uploaded #920

Open

rht mentioned this issue Nov 28, 2015

introduce low memory flag #2012

Closed

rht mentioned this issue Jan 14, 2016

Miniature solarnet lab: resource constraints, testings, and discoveries ipfs/infra#145

Closed

hackergrrl mentioned this issue Apr 13, 2016

When I request some data, its starts to come, but stops after a while, bandwidth is still used ipfs-inactive/faq#108

Closed

clownfeces unassigned rht Jul 2, 2016

lidel mentioned this issue Oct 5, 2018

Tweak default config of IPFS node ipfs/ipfs-desktop#665

Closed

3 tasks

olizilla mentioned this issue Feb 18, 2019

Add a "desktop" friendly init profile #4989

Open

momack2 added this to Inbox in ipfs/go-ipfs May 9, 2019

Stebalien removed this from the Resource Constraints milestone Apr 29, 2020

Stebalien mentioned this issue May 9, 2020

IPFS with cluster-follower run out of memory and got killed by OOM-Job #7292

Closed

mrusme mentioned this issue Dec 29, 2021

Limit number of connections mrusme/superhighway84#14

Closed

lidel mentioned this issue Apr 5, 2022

feat: opt-in Swarm.ResourceMgr (go-libp2p v0.18) #8680

Merged

6 tasks

guseggert closed this as completed in #8680 Apr 8, 2022

chkno referenced this issue in NixOS/nixpkgs Jun 24, 2023

Add IPFS warning

124d8cc

Resource Constraints + Limits #1482

Resource Constraints + Limits #1482

Comments

jbenet commented Jul 15, 2015 • edited by ghost

Possible Limits

What other things are we interested in limiting?

jbenet commented Jul 15, 2015

jbenet commented Jul 15, 2015

whyrusleeping commented Jul 15, 2015

jbenet commented Jul 15, 2015

whyrusleeping commented Jul 15, 2015

jbenet commented Jul 16, 2015

davidar commented Sep 14, 2015

jbenet commented Sep 14, 2015

slothbag commented Nov 8, 2015

jbenet commented Nov 10, 2015

rht commented Nov 24, 2015

jbenet commented Nov 30, 2015

SCBuergel commented Dec 15, 2015

PlanetPlan commented Jan 17, 2016

clownfeces commented Jul 2, 2016

davidak commented Aug 6, 2016

Kubuxu commented Aug 7, 2016 • edited

davidak commented Aug 7, 2016

fiatjaf commented Aug 8, 2016

slothbag commented Aug 8, 2016 • edited

jbenet commented Aug 8, 2016

pataquets commented Aug 28, 2017

ajbouh commented Aug 28, 2017 via email

dokterbob commented Sep 13, 2017

haasn commented Nov 1, 2017

whyrusleeping commented Nov 1, 2017 via email

gwpl commented Jan 17, 2018 • edited

KrzysiekJ commented Jan 23, 2018

whyrusleeping commented Jan 23, 2018

EternityForest commented Feb 5, 2018

Stebalien commented Feb 7, 2018 • edited

andrewchambers commented Apr 9, 2018

dokterbob commented Apr 9, 2018 via email

Macil commented Apr 9, 2018

dokterbob commented Apr 9, 2018 via email

Kubuxu commented Apr 10, 2018

Macil commented Apr 10, 2018 • edited

Kubuxu commented Apr 11, 2018

CocoonCrash commented Jun 5, 2018

theduke commented Oct 5, 2018

priom commented Jul 13, 2021

guseggert commented Mar 4, 2022

jbenet commented Jul 15, 2015 •

edited by ghost

Kubuxu commented Aug 7, 2016 •

edited

slothbag commented Aug 8, 2016 •

edited

gwpl commented Jan 17, 2018 •

edited

Stebalien commented Feb 7, 2018 •

edited

Macil commented Apr 10, 2018 •

edited