[IOTDB-6209] Pipe: Solving the topological order of the progress index in the historical data collection phase #11478
Merged
SteveYurongSu merged 12 commits into apache:master on Nov 9, 2023
.../test/java/org/apache/iotdb/db/storageengine/dataregion/TsFileResourceProgressIndexTest.java
...ommons/src/main/java/org/apache/iotdb/commons/consensus/index/impl/MinimumProgressIndex.java
...-core/node-commons/src/main/java/org/apache/iotdb/commons/consensus/index/ProgressIndex.java
...a/org/apache/iotdb/db/pipe/extractor/historical/PipeHistoricalDataRegionTsFileExtractor.java
...a/org/apache/iotdb/db/pipe/extractor/historical/PipeHistoricalDataRegionTsFileExtractor.java
...a/org/apache/iotdb/db/pipe/extractor/historical/PipeHistoricalDataRegionTsFileExtractor.java
...-core/node-commons/src/main/java/org/apache/iotdb/commons/consensus/index/ProgressIndex.java
…a doc in ProgressIndex
Member
Please pull and merge master.
...ommons/src/main/java/org/apache/iotdb/commons/consensus/index/impl/MinimumProgressIndex.java
…ect & make a singleton TotalOrderSumTuple in MinimumProgressIndex
…inimumProgressIndex
SteveYurongSu (Member) approved these changes on Nov 9, 2023 and left a comment:
Make Pipe Great Again!!!
HTHou pushed a commit that referenced this pull request on Dec 20, 2023
…x in the historical data collection phase (#11478)
* Problem: During historical data collection, the pipe currently sends all sequential data first and then all out-of-order data. This is incorrect, because the progress index of some out-of-order files may be smaller than that of some sequential files. The pipe then records a wrong progress index as its progress information, and some out-of-order files are never sent.
* Solution: After collecting all the historical data, solve for the topological order according to the progress index contained in each TsFile. Use the topological order of the files as the order in which the historical data is collected.
Problem:
During historical data collection, the pipe currently sends all sequential data first and then all out-of-order data. This is incorrect: the progress index of some out-of-order files may be smaller than that of some sequential files. As a result, the pipe records a wrong progress index as its progress information, and some out-of-order files are never sent.
Solution:
After all the historical data has been collected, solve for a topological order of the TsFiles according to the progress index each one contains. Use that topological order as the order in which the historical data is collected.
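The ordering idea can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `HistoricalOrderSketch`, `FileInfo`, `isAfter`, `sum` and `topologicalOrder` are hypothetical names, and the progress index is assumed to be a vector of non-negative per-node counters. Under that assumption, a file that is "after" another always has a strictly larger component sum, so sorting the merged file list by that sum yields one valid topological order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HistoricalOrderSketch {

  /** Hypothetical stand-in for a TsFile: a name plus a progress vector (node id -> index). */
  static final class FileInfo {
    final String name;
    final Map<String, Long> progress; // assumed non-negative; a missing node counts as 0

    FileInfo(String name, Map<String, Long> progress) {
      this.name = name;
      this.progress = progress;
    }

    /** Partial order: this is after other if it is >= on every node and > on at least one. */
    boolean isAfter(FileInfo other) {
      Set<String> nodes = new HashSet<>(progress.keySet());
      nodes.addAll(other.progress.keySet());
      boolean strictlyGreater = false;
      for (String node : nodes) {
        long mine = progress.getOrDefault(node, 0L);
        long theirs = other.progress.getOrDefault(node, 0L);
        if (mine < theirs) {
          return false;
        }
        if (mine > theirs) {
          strictlyGreater = true;
        }
      }
      return strictlyGreater;
    }

    /** Sort key: if a.isAfter(b) then a.sum() > b.sum(), so sorting by sum respects the partial order. */
    long sum() {
      long total = 0;
      for (long value : progress.values()) {
        total += value;
      }
      return total;
    }
  }

  /** Merges both groups and returns them so that no file precedes a file it is after. */
  static List<FileInfo> topologicalOrder(List<FileInfo> sequential, List<FileInfo> outOfOrder) {
    List<FileInfo> all = new ArrayList<>(sequential);
    all.addAll(outOfOrder);
    all.sort((a, b) -> {
      int bySum = Long.compare(a.sum(), b.sum());
      // The name is only a deterministic tie-breaker for files with equal sums.
      return bySum != 0 ? bySum : a.name.compareTo(b.name);
    });
    return all;
  }

  /** Convenience helper for inspecting the resulting order. */
  static List<String> names(List<FileInfo> files) {
    List<String> result = new ArrayList<>();
    for (FileInfo file : files) {
      result.add(file.name);
    }
    return result;
  }
}
```

For example, with sequential files `s1` at `{n1: 10}` and `s2` at `{n1: 20}` and one out-of-order file `u1` at `{n1: 15}`, the sketch returns `s1, u1, s2`. Sending `s1, s2, u1` instead would record progress 20 before `u1` is sent, which is the failure described in the problem statement.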