runInteractiveProcess: pipe: resource exhausted error. #1979

Closed
wpoosanguansit opened this Issue Mar 31, 2016 · 8 comments


@wpoosanguansit

I am using Stack on OS X El Capitan with resolver lts-5.10. I ran

stack build --file-watch

and it ran a few times before giving these errors:

user error (could not create file system event stream)

and

/usr/local/bin/ghc-pkg: streamingProcess: runInteractiveProcess: pipe: resource exhausted (Too many open files)

after running a few successful sessions. I am not totally sure whether the default limit is too low or file-watch just does not clean up things as it should. In any case, I decided to fix it by increasing the ulimit, following the steps in this blog:

http://blog.dekstroza.io/ulimit-shenanigans-on-osx-el-capitan/

That got things back to a working state, but again, I am not totally sure this is the right solution. Thanks for your help.

@harendra-kumar
Collaborator

The OSX hfsevents package requires 2 fds per watch. It forks a C pthread to handle a callback registered with the OSX watch API. This pthread communicates with a Haskell thread via a Unix pipe, which consumes 2 fds.
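As a minimal illustration of the cost (this is not hfsevents code, just a demonstration of the underlying mechanism), each Unix pipe allocates a read fd and a write fd, so N pipe-backed watches consume 2N descriptors:

```haskell
import Control.Monad (replicateM)
import System.Posix.IO (createPipe, closeFd)

main :: IO ()
main = do
  -- Simulate 100 watches, each backed by one pipe (a read/write fd pair).
  pipes <- replicateM 100 createPipe
  putStrLn $ "fds consumed: " ++ show (2 * length pipes)
  -- Clean up both ends of every pipe.
  mapM_ (\(r, w) -> closeFd r >> closeFd w) pipes
```

With the common OS X default soft limit of 256 fds, a few hundred watched directories is enough to hit the wall.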

On the other hand, the Linux hinotify package uses just 1 fd, which is created via inotify_init and then passed to inotify_add_watch. But even with 1 fd per dir we can run out of them with a sufficiently large project.

Not sure if hfsevents can be designed to do away with the pipe for communicating from C to Haskell. That would be a better solution. If somehow we can get rid of both the fds altogether that will be awesome.

On our end we can issue a warning to the user when needed. We know how many dirs we are watching and we can find out the ulimit on the system. If there is not enough room we can warn the user and ask her to raise the ulimit.
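A rough sketch of that check (all names here are hypothetical, not actual stack code):

```haskell
-- Hypothetical fd-budget check. perWatch is the fd cost per watched dir:
-- 2 on OSX via hfsevents, 1 on Linux via hinotify, per the discussion above.
fdsNeeded :: Int -> Int -> Int
fdsNeeded perWatch nDirs = perWatch * nDirs

-- True when adding the watches would push the process past its fd limit,
-- in which case we would warn the user to raise the ulimit.
shouldWarn :: Int -> Int -> Int -> Int -> Bool
shouldWarn fdLimit fdsInUse perWatch nDirs =
  fdsInUse + fdsNeeded perWatch nDirs > fdLimit

main :: IO ()
main = print (shouldWarn 256 10 2 200)  -- 256: a common OSX default limit
```

The real implementation would query the actual limit (e.g. via getrlimit) rather than hard-coding it.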

@mgsloan mgsloan added this to the P3: Optional milestone Mar 31, 2016
@harendra-kumar harendra-kumar referenced this issue in luite/hfsevents Mar 31, 2016
Open

Running out of file descriptors #11

@luite
luite commented Apr 1, 2016

what's the reason for so many watches? you only need one to monitor a whole tree with hfsevents. does fsnotify add one for each subdirectory, or are there really that many roots?

@harendra-kumar
Collaborator

We watch all the parent dirs of files involved in the build. I am not sure if there is a specific motivation for watching the immediate dirs of the files. @mgsloan or @snoyberg can perhaps answer that. My guess is that it's a natural first design, not optimized because there was no need. Maybe we can improve it by finding common ancestors of the dirs, as long as the ancestor is within the package's top-level dir. That way we may have to filter more spurious events, but that may not be a problem.
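The ancestor idea could look something like this (a hypothetical sketch, not actual stack code): drop any dir that already has a proper ancestor in the watch set, since a recursive watch on the ancestor covers it.

```haskell
import Data.List (isPrefixOf, nub)
import System.FilePath (splitDirectories)

-- Keep only dirs with no proper ancestor elsewhere in the set;
-- one (recursive) watch on the ancestor then covers the rest.
consolidate :: [FilePath] -> [FilePath]
consolidate dirs = [ d | d <- ds, not (any (`coversStrictly` d) ds) ]
  where
    ds = nub dirs
    a `coversStrictly` b =
      let sa = splitDirectories a
          sb = splitDirectories b
      in sa /= sb && sa `isPrefixOf` sb

main :: IO ()
main = print (consolidate ["/p/src", "/p/src/A", "/p/test"])
```

A fuller version would also stop at the package's top-level dir, as suggested above, to bound how many spurious events have to be filtered.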

Though in my opinion we should still fix the hfsevents package, if we can, so that it is robust and scalable irrespective of the application design.

@luite
luite commented Apr 1, 2016

Oh, I agree that hfsevents should use fewer fds with the default limit as low as it is; I was just wondering if there is something else in the Stack approach (or in its dependencies) that is not optimal. Something that might lead to a quicker fix here.

@harendra-kumar
Collaborator

Yeah, a change in the stack code will also be useful for the Linux case where I do not see a way to reduce the fds. Linux inotify seems to require one fd per watch anyway.

@angerman
Contributor
angerman commented Apr 4, 2016

Just wanted to note that I ran into this today with stack-build and lts-5.11 trying to build the aws package.

[...]
x509-system-1.6.3: using precompiled package
x509-validation-1.6.3: using precompiled package
aeson-0.9.0.1: using precompiled package
case-insensitive-1.2.0.6: using precompiled package
resourcet-1.1.7.3: using precompiled package
tls-1.3.4: using precompiled package
Progress: 62/71
/path/to/aws/.stack-work/install/x86_64-osx/lts-5.11/7.10.3/flag-cache/aeson-0.9.0.1-ec0a97fff0f75ff54e071d75d0b2f35b: openBinaryFile: resource exhausted (Too many open files)

/usr/local/bin/ghc-pkg: streamingProcess: runInteractiveProcess: pipe: resource exhausted (Too many open files)

/usr/local/bin/ghc-pkg: streamingProcess: runInteractiveProcess: pipe: resource exhausted (Too many open files)

/usr/local/bin/ghc-pkg: streamingProcess: runInteractiveProcess: pipe: resource exhausted (Too many open files)

Running stack build again resolved the issue, but this was somewhat unexpected.

@mgsloan
Collaborator
mgsloan commented Apr 4, 2016

@angerman I think that's coming from stack invoking ghc-pkg. The errors get output due to an invocation of createProcess_ in streaming-commons. This led me to the theory that maybe pipes are getting created and not cleaned up. However, I do not get these resource exhausted errors when doing

{-# LANGUAGE ScopedTypeVariables #-}

import Control.Monad
import Data.ByteString (ByteString)
import Data.Conduit
import Data.Conduit.Process () -- imported for its orphan instances
import Data.Streaming.Process

main :: IO ()
main = replicateM_ (1000 * 1000) $
    -- Repeatedly spawn a trivial process; if pipes were leaking here,
    -- this loop would exhaust file descriptors quickly.
    withCheckedProcess
        (proc "true" [])
        (\ClosedStream (_ :: Source IO ByteString) (_ :: Source IO ByteString) -> return ())

So, I think this is indeed the same issue of having too many file watch fds. I think you just ran out of the resource in a different spot than usual.

@borsboom
Contributor

I've just run into this as well. I think it may be related to using a large number of precompiled packages from another snapshot (I see those messages in @angerman's comment as well). In this case, I built stack using the lts-7 resolver, then switched it to the nightly-2016-09-15 resolver and tried to build it again. It first used 107 precompiled packages from the lts-7 snapshot before it started building a package, and immediately after the copy/register step of the first package it built, I got the same resource exhausted (Too many open files) message seen above. I then ran the build again and this time it worked fine. Then I did rm -rf ~/.stack/snapshots/x86_64-osx/nightly-2016-09-15/ and tried again, with exactly the same resource exhausted failure.

In my case, file-watch is not involved at all, so I suspect there is a file descriptor leak when stack copies a precompiled package into a new snapshot.

@snoyberg snoyberg self-assigned this Sep 19, 2016
@snoyberg snoyberg referenced this issue in snoyberg/conduit Sep 19, 2016
Closed

File descriptor leak with processes #280

@borsboom borsboom closed this in 3fd72d2 Sep 19, 2016
@borsboom borsboom removed the in progress label Sep 19, 2016