Let the DataPartitionTable be automatically cleanable #14737
OneSizeFitsQuorum merged 3 commits into master
    for (int retry = 0; retry < 120; retry++) {
      boolean partitionTableAutoCleaned = true;
      TDataPartitionTableResp resp = client.getDataPartitionTable(req);
      if (TSStatusCode.SUCCESS_STATUS.getStatusCode() == resp.getStatus().getCode()) {
        Map<String, Map<TSeriesPartitionSlot, Map<TTimePartitionSlot, List<TConsensusGroupId>>>>
            dataPartitionTable = resp.getDataPartitionTable();
        for (Map.Entry<
                String,
                Map<TSeriesPartitionSlot, Map<TTimePartitionSlot, List<TConsensusGroupId>>>>
            e1 : dataPartitionTable.entrySet()) {
          for (Map.Entry<TSeriesPartitionSlot, Map<TTimePartitionSlot, List<TConsensusGroupId>>>
              e2 : e1.getValue().entrySet()) {
            if (e2.getValue().size() != 1) {
              // The PartitionTable of each database should only contain 1 time partition slot
              partitionTableAutoCleaned = false;
              break;
            }
          }
          if (!partitionTableAutoCleaned) {
            break;
          }
        }
      }
      if (partitionTableAutoCleaned) {
        return;
      }
      TimeUnit.SECONDS.sleep(1);
- Consider using Awaitility to replace the outer for-loop and sleep.
- (As ChatGPT suggested :) Please check whether these two inner loops can be simplified, such as:
partitionTableAutoCleaned = resp.getDataPartitionTable().entrySet().stream()
.flatMap(e1 -> e1.getValue().entrySet().stream())
.allMatch(e2 -> e2.getValue().size() == 1);
An awesome suggestion! The corresponding test code is simplified significantly!
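For readers following the thread, here is a minimal, self-contained sketch of the two ideas discussed above. It is not the code that was merged: the nested map uses plain `String`/`Integer`/`Long` keys as stand-ins for the Thrift types (`TSeriesPartitionSlot`, `TTimePartitionSlot`, `TConsensusGroupId`), and `pollUntil` is a hypothetical stdlib-only helper standing in for what Awaitility would provide in the real test.

```java
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

class PartitionTableCheck {

  // database -> series slot -> time slot -> region group ids (simplified stand-in types).
  // Returns true when every series slot maps to exactly one time partition slot.
  static boolean hasSingleTimeSlotPerSeries(
      Map<String, Map<Integer, Map<Long, List<Integer>>>> table) {
    return table.values().stream()
        .flatMap(seriesMap -> seriesMap.values().stream())
        .allMatch(timeSlotMap -> timeSlotMap.size() == 1);
  }

  // Hypothetical polling helper: re-checks the condition up to maxRetries times,
  // sleeping between attempts. Returns false on timeout or interruption.
  static boolean pollUntil(BooleanSupplier condition, int maxRetries, long sleepMillis) {
    for (int retry = 0; retry < maxRetries; retry++) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
        return false;
      }
    }
    return false;
  }
}
```

Note that `allMatch` returns true for an empty stream, so an empty partition table also passes this check; a test that needs at least one partition to exist would have to assert that separately.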
iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/concurrent/ThreadName.java
    for (String database : databases) {
      long subTreeMaxTTL = getTTLManager().getDatabaseMaxTTL(database);
      databaseTTLMap.put(
          database, Math.max(subTreeMaxTTL, databaseTTLMap.getOrDefault(database, -1L)));
Add a check like "isDatabaseExisted(database) && 0 < ttl && ttl < Long.MAX_VALUE" here to remove overhead?
BTW, if none of the databases has a TTL, we can run this check locally, find that nothing needs to be cleaned up, and skip the consensus write entirely.
Thanks for pinpointing this logic enhancement. The check is now available in PartitionTableAutoCleaner.
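The reviewer's filter can be sketched as follows. This is only an illustration of the suggested condition, not the code in `PartitionTableAutoCleaner`: the class and method names are hypothetical, and the `isDatabaseExisted(database)` part is left out because it depends on ConfigNode state that is not shown in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TtlCandidateFilter {

  // A TTL counts as finite only when it is positive and not the "infinite" sentinel.
  static boolean hasFiniteTtl(long ttl) {
    return 0 < ttl && ttl < Long.MAX_VALUE;
  }

  // Keeps only the databases whose max TTL is finite; all others can never expire.
  static Map<String, Long> cleanableDatabases(Map<String, Long> databaseMaxTtl) {
    Map<String, Long> result = new LinkedHashMap<>();
    for (Map.Entry<String, Long> entry : databaseMaxTtl.entrySet()) {
      if (hasFiniteTtl(entry.getValue())) {
        result.put(entry.getKey(), entry.getValue());
      }
    }
    return result;
  }
}
```

If `cleanableDatabases` returns an empty map, the caller can return early and skip the consensus write, which is the saving the reviewer points out.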
    this.regionMaintainer =
        IoTDBThreadPoolFactory.newSingleThreadScheduledExecutor(
            ThreadName.CONFIG_NODE_REGION_MAINTAINER.getName());
    this.partitionCleaner =
Maybe try to reuse the procedure periodic tasks.
Sure. I now employ the PartitionTableAutoCleaner rather than creating an extra thread pool.
...src/test/java/org/apache/iotdb/confignode/it/partition/IoTDBPartitionTableAutoCleanTest.java
liyuheng55555 left a comment
Have you considered concurrency issues, such as setTTL and Cleaner running simultaneously?
(I think one solution could be to clean up partitions only after they’ve exceeded the TTL by one hour, but you might have solved this in another way :)
Thanks for raising this critical issue. To address your concern, please refer to the [...]. Here, the removing condition is [...]
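The removal condition the author refers to is not visible in this excerpt, so it is not reproduced here. The sketch below only illustrates the reviewer's alternative idea: treat a partition as removable once it has exceeded its TTL by an extra safety margin, so that a concurrent `setTTL` has time to take effect. All names are hypothetical, and times are assumed to be in the same unit (for example milliseconds).

```java
class ExpiryWithMargin {

  // Returns true only when the partition's end time is older than ttl + margin.
  // A non-positive TTL or Long.MAX_VALUE is treated as "never expires".
  static boolean isSafelyExpired(long partitionEndTime, long now, long ttl, long margin) {
    if (ttl <= 0 || ttl == Long.MAX_VALUE) {
      return false;
    }
    long age = now - partitionEndTime;
    // Compare via subtraction to avoid overflow when ttl is very large.
    return age > ttl && age - ttl >= margin;
  }
}
```

With a one-hour margin (3,600,000 ms), a partition whose data expired 50 minutes ago would be kept, while one that expired two hours ago would be eligible for cleanup.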
liyuheng55555 left a comment
Good work! Thank you for addressing my feedback patiently!
* seems finished
* Use periodic procedure 4 partition table cleaner
* Update ThreadName.java

…pache#14737) (apache#14759)
* Let the DataPartitionTable be automatically cleanable (apache#14737)
* seems finished
* Use periodic procedure 4 partition table cleaner
* Update ThreadName.java
* Update PartitionTableAutoCleaner.java



A data partition no longer needs to exist once all of its data have expired under the pre-configured TTL. Hence, this PR adds a thread that automatically cleans these expired data partitions, keeping the cache size of the DataPartitionTable acceptable. Specifically, the main updates include:
Incidentally, the corresponding IT is available at "IoTDBPartitionTableAutoCleanTest".
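To make the cleaning idea concrete, here is a rough sketch of pruning expired time partition slots for a single series slot. It is an assumption-laden illustration rather than the PR's implementation: slots are modelled as a `TreeMap` keyed by slot start time, each slot is assumed to cover `[start, start + interval)`, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class ExpiredSlotPruner {

  // Removes every time slot whose whole range [start, start + interval) lies at or
  // before (now - ttl). Returns the number of removed slots.
  static int pruneExpired(
      TreeMap<Long, List<Integer>> slots, long interval, long now, long ttl) {
    if (ttl <= 0 || ttl == Long.MAX_VALUE) {
      return 0; // no finite TTL: nothing can expire
    }
    // A slot is fully expired when start + interval <= now - ttl.
    long latestExpiredStart = now - ttl - interval;
    NavigableMap<Long, List<Integer>> expired = slots.headMap(latestExpiredStart, true);
    int removed = expired.size();
    expired.clear(); // headMap is a live view, so this removes from the backing map
    return removed;
  }
}
```

Because `headMap` returns a view backed by the original map, clearing it deletes the expired entries in place without copying the remaining slots.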