Skip to content

DB building suspended with message 'Max cache limit is reached (3000 vs. 3001), sync flush is performed' #3412

@adithyank

Description

@adithyank

OrientDB Version : 2.0 RC-2
OS : CentOS 6.4
JRE : Oracle 1.7.0_55

In our server application we use only one Thread for adding records in OreintDB and multiple Threads will be reading the records. This is how we have designed

We started building our inventory database. After 20% of building, following message got printed in nohup.out continuously for say 110 times and then overall operation halted. The thread dump stayed in the same place when we took the thread dump multiple times.

2015-01-17 00:06:15:187 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]
2015-01-17 00:06:25:390 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]
2015-01-17 00:06:36:070 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]
2015-01-17 00:07:00:294 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]
2015-01-17 00:07:11:683 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]
2015-01-17 00:07:23:506 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]
2015-01-17 00:07:32:049 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]
2015-01-17 00:07:43:620 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]
2015-01-17 00:07:55:173 INFO  Max cache limit is reached (3000 vs. 3001), sync flush is performed. [ODiskWriteAheadLog$LogSegment]

The thread dump of the writing thread is given below

"InventoryDataPersister" prio=10 tid=0x00007f9cfc14c800 nid=0x4165 waiting on condition [0x00007f9b9affb000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00007fa1df3a39a8> (a java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
        at java.util.concurrent.FutureTask.get(FutureTask.java:187)
        at com.orientechnologies.orient.core.index.hashindex.local.cache.OWOWCache.addAllocatedSpace(OWOWCache.java:193)
        at com.orientechnologies.orient.core.index.hashindex.local.cache.OWOWCache.cacheFileContent(OWOWCache.java:914)
        at com.orientechnologies.orient.core.index.hashindex.local.cache.OWOWCache.load(OWOWCache.java:425)
        at com.orientechnologies.orient.core.index.hashindex.local.cache.OReadWriteDiskCache.updateCache(OReadWriteDiskCache.java:676)
        at com.orientechnologies.orient.core.index.hashindex.local.cache.OReadWriteDiskCache.doLoad(OReadWriteDiskCache.java:302)
        at com.orientechnologies.orient.core.index.hashindex.local.cache.OReadWriteDiskCache.load(OReadWriteDiskCache.java:270)
        at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.addEntry(OPaginatedCluster.java:1338)
        at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createRecord(OPaginatedCluster.java:387)
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doCreateRecord(OAbstractPaginatedStorage.java:1582)
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commitEntry(OAbstractPaginatedStorage.java:2064)
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commit(OAbstractPaginatedStorage.java:907)
        at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:483)
        at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:147)
        at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2369)
        at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2339)
        at in.co.nmsworks.documenttdb.db.DocDB.commit(DocDB.java:594)
        at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.RcaInvWriter.addAll(RcaInvWriter.java:164)
        at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.store.RcaInvWriter.addAll(RcaInvWriter.java:117)
        at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.dao.InventoryDAO.save(InventoryDAO.java:106)
        at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.queue.InventoryDataPersister.saveDAOData(InventoryDataPersister.java:356)
        at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.queue.InventoryDataPersister.processAddition(InventoryDataPersister.java:331)
        at in.co.nmsworks.cygnet.telecom.fault.rcasia.inventory.queue.InventoryDataPersister.run(InventoryDataPersister.java:72)

   Locked ownable synchronizers:
        - <0x00007fa1c07dfda0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1c10871f8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1cb056038> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded819f8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded81d68> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded81fa8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded82270> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded82a40> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded82cc0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded82dd0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded82f90> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded83188> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded83380> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded83578> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded83770> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded83ff0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded85460> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded88ad8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded8d130> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded912a8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded91a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1ded95148> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1deda09c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1dedaa518> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1dedaf918> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1dedb0e00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1dedb1040> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1dedc90d8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1dee2d3b0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1dee34288> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        - <0x00007fa1dee344c8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)

In this stage, we checked the size of the orientdb directory, it was 163 GB. Usually, the size of the DB will be 60GB after the inventory building is 100% over. Now, even with 20% of inventory building, the size of the DB is 163 GB. The database directory had around 700 *.wal files totaling to 153 GB. This is another unusual behavior we observed. The free disk space was 50GB at this stage.

Then we did shutdown our application gracefully using kill (not kill -9 ). After that, we got message in nohup that orient shutdown complete.

Then when we restarted the application, all indices started rebuilding.

So, to avoid long time in index rebuilding, we had to delete the database directory and starting the inventory building again from starting. Since, it is a testing environment, we could do this. But, this may occur again as we did not do any other change.

Another Input:
We explicit set below two variables

OGlobalConfiguration.STORAGE_LOCK_TIMEOUT.setValue(5 * 60 * 1000);
OGlobalConfiguration.MVRBTREE_TIMEOUT.setValue(5 * 60 * 1000);

My Questions

  1. Why the *.wal files occupy 153 DB when the inventory building is just 20% over.
  2. Even after proper shutdown of Orient, why the indices started rebuilding?

Metadata

Metadata

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions