OSX has a limit of 256 open files #1485

Closed

avahowell opened this issue Oct 27, 2016 · 21 comments

Comments

@avahowell
Contributor

This has caused crashing issues for Mac users

@DavidVorick
Member

Is this a problem with siad or just the UI?

@lukechampine
Member

It has implications for our API design. If the design encourages keeping many calls open concurrently (e.g. when downloading many files), then it's a failing of our API design, not of the UI.

@DavidVorick
Member

Ok. I will probably not leave this issue open for very long, because there's no action item associated with it. However, we should definitely support non-blocking download calls in the API.
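For illustration, a minimal Go sketch of what a non-blocking download endpoint could look like; the route name and handler here are hypothetical, not the actual siad API. Returning 202 immediately means the client never holds one HTTP connection (and thus one file descriptor) open per file; progress would be polled from a separate status endpoint.

    package main

    import (
        "fmt"
        "log"
        "net/http"
    )

    // startDownload queues the download and returns immediately, so the
    // client does not hold an HTTP connection open for each file.
    // Hypothetical endpoint, not the actual siad API.
    func startDownload(w http.ResponseWriter, r *http.Request) {
        siapath := r.FormValue("siapath")
        go func() {
            // ... perform the actual download here, recording progress
            // somewhere a status endpoint can report it ...
            _ = siapath
        }()
        w.WriteHeader(http.StatusAccepted) // 202: queued, work continues
        fmt.Fprintf(w, "download of %s queued\n", siapath)
    }

    func main() {
        http.HandleFunc("/renter/downloadasync", startDownload)
        log.Fatal(http.ListenAndServe("localhost:9980", nil))
    }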

@lukechampine
Member

Agreed. That feature should dovetail nicely with our merged upload/download loop.

@daniel-lucio

Same issue on Linux: siad opens more than 1024 file handles.

@DavidVorick
Member

@daniel-lucio how long were you running Sia? Did you perform a specific action to hit 1024?

Trying to figure out the best way to keep this from happening in the future.

@lukechampine would the gateway have one open per peer?

@lukechampine
Member

Yes, I believe the current file handle sources are:

  • 1 for the API listener
  • 1 for each peer connection
  • 1 for each database
  • 1 for each host/renter connection
  • 1 for each file being uploaded/downloaded (renter side)
  • 1 (?) for each file being uploaded/downloaded (host side)

However, we don't need to worry about the limit when it comes to peer connections or host operations. In the rare case that you want a huge number of peers, you can manually raise the limit. And for hosts, it's expected that they will be more advanced users, so it's sufficient to document that hosting a large amount of data may require raising the limit. (Of course, minimizing the number of open file handles is still a good idea!)
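As a reference point for "manually raise the limit": on Unix-like systems a process can query and raise its own soft descriptor limit up to the hard limit. A minimal Go sketch (raising the hard limit itself requires elevated privileges, and OSX may additionally cap the value at OPEN_MAX):

    package main

    import (
        "fmt"
        "syscall"
    )

    func main() {
        var lim syscall.Rlimit
        // Query the current soft/hard limits on open file descriptors.
        if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
            panic(err)
        }
        fmt.Printf("soft=%d hard=%d\n", lim.Cur, lim.Max)

        // Raise the soft limit to the hard limit.
        lim.Cur = lim.Max
        if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
            panic(err)
        }
    }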

@DavidVorick
Member

I'm struggling to see how that would add up to 1024, though I could see it getting pretty close to 256 if you have the max number of peers. Perhaps we should drop the default maximum peer count to 40.

@daniel-lucio

@DavidVorick it depends on your OS. On Linux, I use the ulimit command to raise this limit. I don't know whether OSX has an equivalent.

@avahowell
Contributor Author

I could see it adding up to 1024 if you include all the other file descriptors in use on the system, not just Sia.

@DavidVorick
Member

I had the impression that it was 256 files per process, not per machine?

@RNabel
Contributor

RNabel commented Mar 14, 2017

Same problem on Fedora 24 with a soft limit of 1024 and hard limit of 4096 open files.

siad continues to work though.

@RNabel
Contributor

RNabel commented Mar 15, 2017

Just investigated this a little closer, and it seems like siad is opening a ton of new sockets.

Reproduce

  1. Start unlocking the wallet
  2. Find the process ID of siad (ps aux | grep siad or otherwise)
  3. Run ls -l /proc/<siad pid>/fd | grep socket | wc -l

The count is about 150 to start with and then increases steadily by about 100 every few minutes. At block height 40k there are about 600 open sockets.
After the wallet is unlocked the open socket count drops to 15.
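For anyone repeating step 3 programmatically, a minimal Go sketch that takes the same count, assuming a Linux /proc filesystem; substitute siad's pid for self to inspect siad from another process:

    package main

    import (
        "fmt"
        "os"
        "path/filepath"
        "strings"
    )

    func main() {
        const fdDir = "/proc/self/fd" // use /proc/<siad pid>/fd for siad
        entries, err := os.ReadDir(fdDir)
        if err != nil {
            panic(err)
        }
        sockets := 0
        for _, e := range entries {
            // Each fd is a symlink; sockets point at "socket:[inode]".
            target, err := os.Readlink(filepath.Join(fdDir, e.Name()))
            if err == nil && strings.HasPrefix(target, "socket:") {
                sockets++
            }
        }
        fmt.Println("open sockets:", sockets)
    }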


@lukechampine
Member

That is very interesting. I don't think the wallet itself is opening sockets, but once the wallet is unlocked, the renter can start trying to form/renew contracts, and each of those attempts requires a socket. Perhaps we are leaking those somehow.
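If that's the case, the fix is the usual one: guarantee that every dialed connection is closed on all return paths and carries a deadline so a stalled peer can't pin the socket. A minimal sketch of the pattern, with illustrative names rather than Sia's actual contract code:

    package main

    import (
        "net"
        "time"
    )

    // negotiateContract dials a host and releases the socket on every
    // return path. Illustrative only; not Sia's actual negotiation code.
    func negotiateContract(addr string) error {
        conn, err := net.DialTimeout("tcp", addr, 30*time.Second)
        if err != nil {
            return err
        }
        defer conn.Close() // runs on success, error, and panic paths alike

        // A deadline prevents a stalled peer from leaking the fd forever.
        if err := conn.SetDeadline(time.Now().Add(5 * time.Minute)); err != nil {
            return err
        }

        // ... perform the form/renew protocol over conn ...
        return nil
    }

    func main() {
        _ = negotiateContract("host.example.com:9982")
    }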

@DavidVorick
Member

It seems more likely to me that bolt is not cleaning up its file handles very well; I'm guessing it opens a separate socket for each call to db.Read and eventually cleans them all up.

@lukechampine
Member

AFAIK bolt doesn't use sockets, just mmap. We could (sort of) confirm this by examining socket use after deleting something like hostdb.json and restarting. If bolt is responsible, then resyncing any module should cause a bunch of sockets to be opened.
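For context, a single bolt database maps to one open file handle for the life of the process, with reads served from the mmap; a minimal sketch against the github.com/boltdb/bolt API (the filename is illustrative):

    package main

    import (
        "log"

        "github.com/boltdb/bolt"
    )

    func main() {
        // bolt.Open opens and mmaps one file; the process holds a single
        // fd for it no matter how many transactions run.
        db, err := bolt.Open("consensus.db", 0600, nil)
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // Reads are served from the mmap'd region; no new handles (and
        // certainly no sockets) are opened per View call.
        err = db.View(func(tx *bolt.Tx) error {
            return nil
        })
        if err != nil {
            log.Fatal(err)
        }
    }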

@RNabel, any chance you can use a tool like ss to find out more about the socket connections? Ideally we'd be able to capture some of the data sent. That would tell us what code is responsible for opening the sockets.

@RNabel
Contributor

RNabel commented Mar 15, 2017

@lukechampine Sure thing. I'll play around with this over the weekend and report back.

@RNabel
Contributor

RNabel commented Mar 19, 2017

Alright, I ran through the above steps again (this time with siad v1.1.2) and could no longer replicate this behavior. Could it have been fixed by the latest update?

@DavidVorick
Member

Were you using v1.1.0 before? It's possible that most of those sockets were created by the HostDB, which got a pretty big upgrade in v1.1.1.

I'm going to leave this open because I'm not convinced that the problem has been resolved, but if nobody can replicate it as of the most recent master branch, I guess we did something to fix it accidentally.

In general, we have been working on improving our resource management.

@RNabel
Contributor

RNabel commented Mar 22, 2017

I'm pretty sure that I was using v1.1.1. In any case, I tried again and can't reproduce the error.

@lukechampine
Member

Closing for now, feel free to reopen if this issue presents itself again.
