Massive insertion of edges puts OrientDB on pause for a long time #4144
I see the flush gets a lock by using: `Lock groupLock = lockManager.acquireExclusiveLock(entry.getKey());` where lockManager is the ONewLockManager.
Looking at the stack trace, it seems that all the threads are just waiting on the flush, so it doesn't look like a deadlock is happening. By the way @lvca, the ONewLockManager is safe in the case of page locking because one thread never locks more than one page at a time.
The question is: why is everybody waiting for the flush if the flush takes one lock per page?
Ok, digging more into the code, I see that after each operation at the cluster level the page is released. On page release, a PeriodicFlushTask may be submitted. So with a massive insertion against 16 clusters you could have many PeriodicFlushTask instances working in parallel. Everything is blocked because the cluster operation waits for PeriodicFlushTask completion in order to release the lock.
I'm wondering whether we could release the lock on the cluster before releasing the lock on the page.
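If I read that right, the difference between the two orderings is roughly this (a toy model using plain ReentrantLock; names and structure are illustrative only, not the actual OrientDB internals):

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the two lock orderings being discussed; not real OrientDB code.
public class LockOrderingSketch {
    private final Lock clusterLock = new ReentrantLock();
    private final Lock pageLock = new ReentrantLock();

    // Current ordering: the flush triggered by the page release runs while
    // the cluster lock is still held, so every other writer on this cluster
    // waits for the flush to finish.
    void insertCurrentOrdering() {
        clusterLock.lock();
        try {
            pageLock.lock();
            try {
                // ... write the record to the page ...
            } finally {
                pageLock.unlock();
                maybeRunPeriodicFlush(); // still inside clusterLock
            }
        } finally {
            clusterLock.unlock();
        }
    }

    // Proposed ordering: release the cluster lock first, so a flush triggered
    // by the page release no longer blocks other writers on the cluster.
    void insertProposedOrdering() {
        clusterLock.lock();
        pageLock.lock();
        try {
            // ... write the record to the page ...
        } finally {
            clusterLock.unlock(); // cluster released before the page
            pageLock.unlock();
            maybeRunPeriodicFlush(); // other writers can proceed meanwhile
        }
    }

    private void maybeRunPeriodicFlush() {
        try {
            Thread.sleep(100); // stand-in for a slow disk-cache flush
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```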
Hi guys, I've been asked to give an exhaustive description of how I handle the operations with OrientDB, hoping it will be more helpful for you :) So, as Luca said, I'm using version 2.1-rc2, accessing the DB via plocal and using default Blueprints methods. The script is divided into many threads, each using a separate OrientGraphNoTx instance provided by the given OrientGraphFactory.
Each thread then proceeds to create a large set of edges taken from a shared queue. For each edge, the thread has to look up the starting and ending vertices (they are equipped with a unique hash index). Upon launching the script, it soon starts to pause periodically at runtime; each pause coincides with the moment a new edge is created. The longer the script keeps running, the more the pauses grow in duration and frequency. Additional options I use while launching the script are:
I tried removing the second option to look for improvements, but the behavior remained basically the same. Regarding the graph structure, it comprises two main classes, one for the vertices and one for the edges. The vertex class is replicated in 64 copies, each equipped with a separate unique hash index. Furthermore, each copy comprises 6 clusters (in order to grant concurrent access to the threads) and is set to an oversize factor of 5. The edge class is neither replicated nor oversized, but it comprises 12 clusters. Hope it helps, Valerio
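P.S. In code, each worker does roughly the following. This is a minimal sketch against the Blueprints API; the class names MyVertex/MyEdge, the hash field, and the queue element shape are hypothetical stand-ins for the real 64 replicated classes:

```java
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.impls.orient.OrientGraphFactory;
import com.tinkerpop.blueprints.impls.orient.OrientGraphNoTx;

import java.util.concurrent.BlockingQueue;

// Minimal sketch of one worker thread; names are hypothetical.
public class EdgeWorker implements Runnable {
    private final OrientGraphFactory factory;
    private final BlockingQueue<String[]> edgeQueue; // [fromHash, toHash]

    EdgeWorker(OrientGraphFactory factory, BlockingQueue<String[]> edgeQueue) {
        this.factory = factory;
        this.edgeQueue = edgeQueue;
    }

    @Override
    public void run() {
        OrientGraphNoTx graph = factory.getNoTx(); // one non-tx graph per thread
        try {
            String[] edge;
            while ((edge = edgeQueue.poll()) != null) {
                // Look up both endpoints through the unique hash index
                // (assumes both vertices exist already).
                Vertex from = graph.getVertices("MyVertex.hash", edge[0]).iterator().next();
                Vertex to   = graph.getVertices("MyVertex.hash", edge[1]).iterator().next();
                graph.addEdge(null, from, to, "MyEdge");
            }
        } finally {
            graph.shutdown();
        }
    }
}
```

The factory is created once and shared, e.g. `new OrientGraphFactory("plocal:/path/to/db")` (path illustrative), and each thread asks it for its own non-transactional graph.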
Hi Valerio, if you still hit the situation where the insertion of edges freezes, could you send me a series of thread dumps as you did before?
I think @Laa meant: |
Sure thing Andrey, I'll restart the process tomorrow and will provide you with additional data in case of more freezes. And yeah, the issue does not appear during vertex insertion: the script runs smoothly till the end. It seems to be a problem limited to edges.
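For reference, a series of dumps can also be captured from inside the loader with the standard java.lang.management API; a minimal sketch follows (running it as a loop in the loader process is an assumption on my part; `jstack <pid>` from a terminal gives the same information):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Minimal sketch: print a full thread dump every 10 seconds.
public class DumpLoop {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        while (true) {
            for (ThreadInfo info : bean.dumpAllThreads(true, true)) {
                System.out.print(info); // includes stack trace and held locks
            }
            System.out.println("---- end of dump ----");
            Thread.sleep(10_000);
        }
    }
}
```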
@l4nz10 looking forward to your feedback
Hey @Laa, very sorry for the wait, yesterday was a busy day.
I started the new insertion with the option you provided, but now the following exception occurs:
I see. Did you start the insertion from scratch or on an already created database?
From scratch. These exceptions appear only during the edge insertion phase. |
I see. Could you try `-Dsbtreebonsai.freeSpaceReuseTrigger=0`? Does it work?
One thing: I generate the graph structure in a separate script; should I use the same option on that section too? Could that be the reason?
You should use the same option there too, and to make the script even safer it is better to use the `-Dsbtreebonsai.freeSpaceReuseTrigger=0` option.
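If editing both launch commands is inconvenient, the same setting can be applied programmatically; a minimal sketch (this assumes the property is set before any OrientDB class is loaded, since the configuration is read at class initialization):

```java
// Equivalent to passing -Dsbtreebonsai.freeSpaceReuseTrigger=0 on the
// command line; must run before OrientDB is initialized.
public class Bootstrap {
    public static void main(String[] args) {
        System.setProperty("sbtreebonsai.freeSpaceReuseTrigger", "0");
        // ... open the OrientGraphFactory and start the load here ...
    }
}
```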
Ok, then I'll rebuild the graph and give it another shot; I'll let you know.
Just tried with the new option; the problem persists. Any idea on the cause?
Could you send me the code? I will reproduce it on my side. Is it thrown at the beginning of the load test, or do I have to wait for a while?
Here they are: Nodes, Edges
Let me know when you've got them.
Running the test shows that it is not a problem of just edges. Inserting vertices, after 82M entries it starts to make long pauses:
That's interesting. Vertex insertion has always had good overall performance (even with the pauses, apparently), so I didn't run tests on it. Good to know.
Hi,
Hi @l4nz10, |
@l4nz10 when could you try your app running OrientDB from the "new_write_cache" branch?
Ok, I'll give it a try this weekend and will let you know as soon as I have some data.
I just ran the tests and the issue indeed seems to have been fixed. The edge insertion went smoothly till the end, with an average insertion speed of 19000 edges per second. As for the vertices, performance is about the same as before, with an average insertion speed of 70000 vertices per second (with a dedicated hash index). I plan to run an additional test with an even larger dataset, just to make sure the problem is gone for good. I'm looking forward to your further improvements. Thanks for your help! Valerio EDIT:
I managed to insert about 160 million vertices in 2255 seconds, and 76 million edges in 4010 seconds. So the entire process took about 1 hour and 45 minutes overall. |
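(That works out to roughly 160,000,000 / 2255 ≈ 71,000 vertices/s and 76,000,000 / 4010 ≈ 19,000 edges/s, consistent with the averages above, and 2255 + 4010 = 6265 s ≈ 1 h 45 min in total.)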
Hi @l4nz10 |
Ok I'll test again. I'll report back if something goes wrong. |
Thanks |
Valerio is doing a massive import (plocal, no WAL, disk cache flush disabled, version 2.1-rc2). Vertices have no problem. With edges, after a while OrientDB freezes and no new edges are imported for a long time, even 120 seconds. And then it restarts... This is a thread dump taken during such a freeze. Any clue why flushing blocks operations?