open file descriptors leak (somehow related to lucene schema-index) #1799
Comments
#1437 might refer to the same issue.
Hi Thomas, can you give us a hand to narrow this down? Is it reproducible? If so, can you create the smallest possible set of operations that causes the problem?
Hi Alistair,
Reopening: I found by accident an easy way to reproduce it. @apcj can you verify this problem on your machine?
@tbaum I am experiencing the same problem and can reproduce it using the neo4j-fd-leak test case you created. I am running embedded Neo4j on Mac OS X 10.9.2 with the latest JDK 7 and Neo4j 2.0.2 (I tried 2.0.1 as well and experienced the same problem). Using Neo4j 1.9.* I did not encounter this issue; I only discovered it when attempting to upgrade to 2.*. lsof shows the problem as noted in the first comment. Any workarounds? This is obviously a show stopper for us until we can get it resolved. If it helps, here is the stack trace I get when running neo4j-fd-leak:
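No complete workaround is given at this point in the thread. A common stopgap for descriptor exhaustion (it only delays the failure, it does not fix the leak) is raising the shell's soft open-file limit, up to the hard limit, before launching the embedded Neo4j JVM. A minimal sketch; the limits printed are whatever your system reports, not recommendations:

```shell
# Stopgap, not a fix: raise the soft open-file limit toward the hard
# limit so the leak takes longer to exhaust descriptors.
SOFT=$(ulimit -Sn)
HARD=$(ulimit -Hn)
echo "soft=$SOFT hard=$HARD"
# Raising past the hard limit fails, so only attempt when there is room.
if [ "$HARD" = "unlimited" ] || [ "$SOFT" -lt "$HARD" ] 2>/dev/null; then
    ulimit -Sn "$HARD" 2>/dev/null \
        && echo "soft limit now $(ulimit -Sn)" \
        || echo "could not raise soft limit"
fi
```

Any JVM started from this shell afterwards inherits the raised limit.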
It looks like this is a problem specifically when you have two or more MERGE clauses in the same transaction (just breaking one query apart into several is not enough) and they don't match existing nodes, so nodes have to be created. So the test can be simplified to that. Interesting and annoying. Well, it's being looked at. Also, it turns out OS X really does not like it when you use up all of its file descriptors :P
Thanks for writing that test program, @tbaum; it was very helpful. The fix is in the pipeline, so I'm closing the issue.
Was this fix integrated into 2.1.2?
@danielcalencar Yes.
I'm still seeing this in 2.1.4. My transaction does:
This happens if I do it all as a single transaction, and also if I do it in 3 separate transactions. I see this on Mac OS 10.9.4. I haven't attempted it on Linux; I'm working on a repro there. Any suggestions?
I'm seeing this issue in 2.1.4 as well, but haven't had the time to track down what is going wrong. I'm creating 3-6 new nodes with ~10 relationships per operation, with 10 operations per transaction. It spits the dummy at around ~20k operations. OS X with the file handle limit set to 1,000,000. I've tried this both processing a batch sequentially and multi-threaded, but the result is the same. messages.log does not have much more than the slightly edited stack trace below:
Hey guys, sorry about that; it looks like the fix in 2.1.4 didn't resolve all code paths triggering this. We're working on reproducing locally. Any additional details y'all can share about the type of load or the operations you're running to trigger this, beyond what's already in this ticket, are very welcome.
~500k nodes, in transacted batches of ~100 nodes. It crashes when the file count from the command below reaches ~10k, but that's OS dependent.
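The exact counting command referred to above wasn't captured in this thread. One way to watch a process's open-descriptor count is sketched below, assuming the Linux /proc layout; on OS X, `lsof -p <pid> | wc -l` gives a comparable number:

```shell
# Count the open file descriptors of a given process (Linux):
# /proc/<pid>/fd contains one entry per open descriptor.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Example: the current shell's own descriptor count. For the leak,
# point this at the Neo4j JVM's pid and sample it periodically.
count_fds "$$"
```

Sampling this in a loop while the load runs makes the leak visible as a count that rises steadily and never falls back.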
@monowai can you please provide some info about the indexes in your setup? How many indexes exist? Are any of them touched during the bulk insert? Thanks in advance!
Approximately 5 different classes of node are in play. Each has at least one unique index. One of the nodes has 3. |
@monowai I'm not able to reproduce the problem :(
This works fine for 100 transactions in a row, though occasionally I see peaks of file descriptor usage up to 6k.
My setup:
Maybe the volume you are executing is insufficient to see the file resource leak on the hardware you have. The thing to note is that the file handle is never released - hence it exhausts the limit. Feel free to reach me by direct email and I'll share the email I sent PhilipR.
#3302 greatly reduces the number of files required for Lucene indexes. General file descriptor usage should be more stable with this fix.
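One way to see the effect of such a change is to count the files backing the schema indexes on disk. A rough sketch; `/tmp/db3` is the store path used in this issue, and `schema/index` as the location of the Lucene schema index files is an assumption about the 2.x on-disk layout — adjust for your store:

```shell
# Count the files backing the schema indexes; each of these can hold
# one or more open descriptors while the index is in use.
DB=/tmp/db3
find "$DB/schema/index" -type f 2>/dev/null | wc -l
```

Comparing this count before and after a bulk insert gives a rough sense of how many descriptors the indexes alone could pin.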
During load (a mixture of Cypher + Java API calls) the number of open file handles increases drastically and finally stops the process (limit: 10k).
Running the same code on a Linux-64 box works; the open file handle count does not exceed 1k.
The application creates a database at '/tmp/db3'.
messages.log: https://gist.github.com/tbaum/e746a3cf83ba4ddbc806