Running entire suite of 'grunt functional' results in 'EMFILE, too many open files' error #148
I ran into the same error on OS X when writing tests.

/usr/local/lib/node_modules/appium/node_modules/winston/lib/winston/exception.js:29
cwd: process.cwd(),
^
Error: EMFILE, too many open files
at Object.exception.getProcessInfo (/usr/local/lib/node_modules/appium/node_modules/winston/lib/winston/exception.js:29:26)
at Object.exception.getAllInfo (/usr/local/lib/node_modules/appium/node_modules/winston/lib/winston/exception.js:17:24)
at Logger._uncaughtException (/usr/local/lib/node_modules/appium/node_modules/winston/lib/winston/logger.js:600:24)
at process.EventEmitter.emit (events.js:126:20)
Hi guys, try running this in your shell:

And then run the test suite again to see if that helps! (Don't forget to pull and reset your npm dependencies...)
Appium hasn't crashed yet after running that command.
I had to bump my ulimit to higher than
I have tracked this down to half-open sockets on our side (in particular /tmp/instruments). If you run lsof -p on the node.js process, you can see that hundreds are left around. In particular, allowHalfOpen is turned on when the sockets are listened to, but conn.end() is never called to close them up after the data is sent back. I'm working on a patch for this, and I'll put in a pull request if it passes all of the tests.
Brilliant! I had some theories about instruments but didn't think about half-open sockets. When I wrote that code I wondered what the side effects might be :-) Your patch will be greatly appreciated.
It just crashed again. I'm also looking forward to the fix.
Taking a little longer to completely remove it..... soon |
Well, it's definitely a bit more pernicious than I thought it was, and it's actually a larger problem, since the sockets will sit around for as long as the server runs. It's starting to look like there is a long-term reference being held somewhere and it isn't being closed. So far, the only place where the connection appears to be held is in the cmd object in the command queue.
This is taking quite a bit longer than originally expected. The crux of the matter is some pretty major leakage. It appears to be leaking IOS and Instruments objects until the entire process exits (both by inspection of the open files with lsof and by using nodetime and profiling to look for leaks). There are a lot of closures in here and it wouldn't surprise me if we had one somewhere that is retaining itself or otherwise creating a circular reference. I'm going to continue trying to see if I can track this down.
Thanks @gaige! Let me know if you need any insight into how things are architected. Sounds like you've got things pretty well mapped out by now though. |
At this point, I've definitely tracked down a few leaks and broken some cycles, which substantially reduces the number of leaked pipes. However, there is still basically one fd leak per instantiation of Instruments, despite the fact that all of the Instruments objects and processes are now appropriately disposed of. At this point, all Socket classes are clearing, so I'm now suspecting one of two things:
I have seen comments on other threads about leaked file descriptors with unix domain sockets when trying to interface with mysql and postgres (making me think it might be a general problem), but no indication of any specific bug found or work-around. I'm going to toy with this a little longer and then put up the fixes that I made for this, since they're at least reducing the problem quite a bit.
Thanks so much for your diligence on this issue, Gaige - I really appreciate it.
HAH! I do believe I've finally tracked this puppy down.... more news in a minute, but so far I'm running functional and I've accumulated only the one open socket required to run it. |
awesome @gaige!
This may be a bug in the node.js handling of domain sockets in general, because I've seen a bunch of other posts in various places (other projects, stack exchange) about people having stray domain sockets lying around. However, if we make sure that the server created by the startServer() in the Instruments class is actually closed (forcing it to stop listening), then the socket goes away.

Interestingly, I had already stopped the leak of the Instruments itself, and thus the server should have been destroyed, but there is some indication that destroying the server created with a domain socket does not successfully remove the socket for listening and might leave it open. I'm going to do some more leak checking to make sure there isn't anything else showing up, but so far, the full set of functional tests indicates no fd leakage at least. I'm re-running tests and then will put in my pull request.
#190 is now up for your reviewing pleasure. Please let me know if I've missed something. Otherwise, this should help the stability in low-mem and low-fd situations.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
The flick gesture test passes consistently when run via this specific command:
$ mocha -t 90000 -R spec test/functional/uicatalog/flick.js
flick gesture
✓ should work via webdriver method (4194ms)
✓ should work via mobile only method (4179ms)
✓ should work via mobile only method with percentage (4186ms)
...
8 tests complete (2 minutes)
However, when running the same tests as part of the larger suite of 'grunt functional' tests, I am consistently blocked by this file handle error:
$ grunt functional
Running "functional" task
active
✓ should return active element (3319ms)
...
flick gesture
✓ should work via webdriver method (4235ms)
◦ should work via mobile only method:
/Users/mlai/dev/appium/node_modules/winston/lib/winston/exception.js:29
cwd: process.cwd(),
^
Error: EMFILE, too many open files
at Object.exception.getProcessInfo (/Users/mlai/dev/appium/node_modules/winston/lib/winston/exception.js:29:26)
at Object.exception.getAllInfo (/Users/mlai/dev/appium/node_modules/winston/lib/winston/exception.js:17:24)
at Logger._uncaughtException (/Users/mlai/dev/appium/node_modules/winston/lib/winston/logger.js:600:24)
at process.EventEmitter.emit (events.js:126:20)
Thanks to jlipps for confirming this issue as a problem with appium or one of its dependencies failing to release Node's file handles.