[Enhancement] Reduce hold writeLock time for DatabaseTransactionMgr to improve stability of stream load #17379

@caiconghui

Search before asking

  • I had searched in the issues and found no similar issues.

Description

We are seeing many errors like the following:
"message":"errCode = 2, detailMessage = get tableList write lock timeout, tableList=(Table [id=16422760, name=xxxxxx, type=OLAP])

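Searching the thread dump shows many threads parked on the same ReentrantReadWriteLock, which is held by the "txnCleaner" thread: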

1816:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
1857:	- <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2144:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2306:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2332:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2417:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2443:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2493:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2558:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2584:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2756:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2782:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2832:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2858:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2884:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2910:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2984:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
3979:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
11039:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
20422:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
20448:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
20474:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
20500:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
20646:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
20722:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
20748:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
20870:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21016:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21042:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21164:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21190:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21264:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21290:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21364:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21510:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21560:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21586:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21636:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21758:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21784:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
21957:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22031:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22057:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22107:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22157:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22183:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22257:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22283:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22333:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22383:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22457:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22507:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22581:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22607:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22633:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22659:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22709:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22735:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22833:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22859:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
22909:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23103:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23129:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23251:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23325:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23351:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23428:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23454:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23480:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23530:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23606:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23656:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23684:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23734:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23856:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23906:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23932:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
23982:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
24032:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
24106:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
24132:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
24209:	- parking to wait for  <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
 1833 "txnCleaner" #77 daemon prio=5 os_prio=0 cpu=224451476.74ms elapsed=6735363.14s tid=0x00007f176c0162e0 nid=0xe43a runnable  [0x00007f17a08cb000]
 1834    java.lang.Thread.State: RUNNABLE
 1835     at java.lang.StackStreamFactory$AbstractStackWalker.callStackWalk(java.base@11.0.2/Native Method)
 1836     at java.lang.StackStreamFactory$AbstractStackWalker.beginStackWalk(java.base@11.0.2/StackStreamFactory.java:370)
 1837     at java.lang.StackStreamFactory$AbstractStackWalker.walk(java.base@11.0.2/StackStreamFactory.java:243)
 1838     at java.lang.StackWalker.walk(java.base@11.0.2/StackWalker.java:498)
 1839     at org.apache.logging.log4j.util.StackLocator.calcLocation(StackLocator.java:97)
 1840     at org.apache.logging.log4j.util.StackLocatorUtil.calcLocation(StackLocatorUtil.java:121)
 1841     at org.apache.logging.log4j.spi.AbstractLogger.getLocation(AbstractLogger.java:2216)
 1842     at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2159)
 1843     at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2142)
 1844     at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2017)
 1845     at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1983)
 1846     at org.apache.logging.log4j.spi.AbstractLogger.info(AbstractLogger.java:1320)
 1847     at org.apache.doris.transaction.DatabaseTransactionMgr.clearTransactionState(DatabaseTransactionMgr.java:1342)
 1848     at org.apache.doris.transaction.DatabaseTransactionMgr.unprotectedRemoveExpiredTxns(DatabaseTransactionMgr.java:1324)
 1849     at org.apache.doris.transaction.DatabaseTransactionMgr.removeExpiredTxns(DatabaseTransactionMgr.java:1301)
 1850     at org.apache.doris.transaction.DatabaseTransactionMgr.removeExpiredAndTimeoutTxns(DatabaseTransactionMgr.java:1625)
 1851     at org.apache.doris.transaction.GlobalTransactionMgr.removeExpiredAndTimeoutTxns(GlobalTransactionMgr.java:378)
 1852     at org.apache.doris.catalog.Catalog$2.runAfterCatalogReady(Catalog.java:2231)
 1853     at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58)
 1854     at org.apache.doris.common.util.Daemon.run(Daemon.java:116)
 1855 
 1856    Locked ownable synchronizers:
 1857     - <0x0000100e80a07a58> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)

The info-level log call in DatabaseTransactionMgr.clearTransactionState takes too much time, and it runs while the write lock is held.
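To illustrate the symptom, here is a minimal sketch of a timed write-lock attempt failing while another thread holds the same fair ReentrantReadWriteLock. The class and method names are hypothetical and are not taken from the Doris code base. The sketch uses a single lock, so it only shows the general mechanism, not the exact chain of locks behind the "get tableList write lock timeout" error.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class WriteLockTimeoutSketch {
    // Fair lock, matching the FairSync seen in the thread dump.
    static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock(true);

    /** Returns true if a second thread obtained the write lock within timeoutMs. */
    static boolean tryWriteFromOtherThread(long timeoutMs) throws InterruptedException {
        boolean[] acquired = {false};
        Thread loader = new Thread(() -> {
            try {
                if (LOCK.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    acquired[0] = true;
                    LOCK.writeLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        loader.start();
        loader.join(); // join() makes the write to acquired[0] visible here
        return acquired[0];
    }

    public static void main(String[] args) throws InterruptedException {
        // The main thread stands in for a cleaner thread that holds the write lock.
        LOCK.writeLock().lock();
        boolean whileHeld;
        try {
            whileHeld = tryWriteFromOtherThread(100);
        } finally {
            LOCK.writeLock().unlock();
        }
        boolean afterRelease = tryWriteFromOtherThread(100);
        System.out.println("acquired while held: " + whileHeld);       // false
        System.out.println("acquired after release: " + afterRelease); // true
    }
}
```

The longer the holder stays inside its critical section, the more timed attempts like this one expire, which is why shortening the time spent under the write lock matters.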

Solution

Change the log level of the clear-transaction message from info to debug.
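As a rough sketch of the idea, the example below keeps the critical section free of logging and emits one guarded, low-level message after the write lock is released. This goes one step beyond the proposed change, which only lowers the log level. It is not the actual DatabaseTransactionMgr code: the class, fields, and methods are hypothetical, and java.util.logging (with Level.FINE in place of debug) stands in for Log4j so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ExpiredTxnCleanerSketch {
    private static final Logger LOG =
            Logger.getLogger(ExpiredTxnCleanerSketch.class.getName());

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    // Hypothetical state: transaction id -> finish time in milliseconds.
    private final Map<Long, Long> finishTimeByTxnId = new TreeMap<>();

    public void addFinishedTxn(long txnId, long finishTimeMs) {
        lock.writeLock().lock();
        try {
            finishTimeByTxnId.put(txnId, finishTimeMs);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Removes transactions finished before cutoffMs and returns their ids. */
    public List<Long> removeExpiredTxns(long cutoffMs) {
        List<Long> removed = new ArrayList<>();
        lock.writeLock().lock();
        try {
            // Only mutate state here; no logging while the write lock is held.
            finishTimeByTxnId.entrySet().removeIf(e -> {
                if (e.getValue() < cutoffMs) {
                    removed.add(e.getKey());
                    return true;
                }
                return false;
            });
        } finally {
            lock.writeLock().unlock();
        }
        // One guarded, low-level message after the lock is released.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("removed expired txns: " + removed);
        }
        return removed;
    }

    public int size() {
        lock.readLock().lock();
        try {
            return finishTimeByTxnId.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ExpiredTxnCleanerSketch cleaner = new ExpiredTxnCleanerSketch();
        cleaner.addFinishedTxn(1L, 100L);
        cleaner.addFinishedTxn(2L, 200L);
        cleaner.addFinishedTxn(3L, 300L);
        System.out.println(cleaner.removeExpiredTxns(250L)); // [1, 2]
        System.out.println(cleaner.size());                  // 1
    }
}
```

Collecting the removed ids and logging them once keeps the time under the lock proportional to the map update alone, whatever the logging configuration costs per message.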

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
