Feature(2.3): add compaction task for delta files #1894
Conversation
Force-pushed 090d2fd to 565015f
Force-pushed 565015f to e189a94
```rust
@@ -116,6 +116,41 @@ impl TimeRange {
        self.min_ts = self.min_ts.min(other.min_ts);
        self.max_ts = self.max_ts.max(other.max_ts);
    }
```
Try optimizing this with the merge-intervals algorithm.
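A minimal sketch of the merge-intervals approach the reviewer suggests, assuming a simplified `TimeRange` with `min_ts`/`max_ts` fields (not CnosDB's actual implementation): sort by `min_ts`, then fold overlapping ranges in a single pass.

```rust
// Simplified stand-in for the crate's TimeRange; field names assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TimeRange {
    min_ts: i64,
    max_ts: i64,
}

/// Merge overlapping ranges: sort by start, then extend or push in one pass.
fn compact(time_ranges: &mut Vec<TimeRange>) {
    if time_ranges.is_empty() {
        return;
    }
    time_ranges.sort_by_key(|r| r.min_ts);
    let mut merged: Vec<TimeRange> = Vec::with_capacity(time_ranges.len());
    merged.push(time_ranges[0]);
    for r in &time_ranges[1..] {
        let last = merged.last_mut().unwrap();
        if r.min_ts <= last.max_ts {
            // Overlapping (or touching): extend the current merged range.
            last.max_ts = last.max_ts.max(r.max_ts);
        } else {
            merged.push(*r);
        }
    }
    *time_ranges = merged;
}

fn main() {
    let mut ranges = vec![
        TimeRange { min_ts: 5, max_ts: 9 },
        TimeRange { min_ts: 0, max_ts: 3 },
        TimeRange { min_ts: 2, max_ts: 6 },
    ];
    compact(&mut ranges);
    // The three overlapping ranges collapse into 0..=9.
    println!("{:?}", ranges);
}
```

Sorting makes the pass O(n log n) overall, versus repeatedly re-scanning for overlaps.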
```rust
        .context(error::ReadTsmSnafu)?
    }
};
if let Some((min_ts, _max_ts)) = data_block.time_range() {
```
Simplify the code.
```rust
    }
}

#[derive(Debug, Default, Clone, PartialEq, Eq, Ord, PartialOrd, Hash)]
```
Define an enum for the compaction type to replace `CompactTaskKey`.
Please see the new PR #1945.
## Rationale for this change

Related #1244.

## Conclusion
## Others

- `DataBlock` - Add methods `split_at(self, index) -> (DataBlock, DataBlock)` and `intersection(self, time_range) -> Option<DataBlock>`.
- `ColumnFile`, `LevelInfo` - Implement `std::fmt::Display`; also add `ColumnFiles<F: AsRef<ColumnFile>>(&[F])` and `LevelInfos(&[LevelInfo])` to reduce code when we need to log them.
- `TimeRange` - Add function `compact(time_ranges: &mut Vec<TimeRange>)` to compact time ranges.
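A hedged sketch of the semantics of the new `DataBlock::split_at` and `DataBlock::intersection` methods, using a toy block of parallel timestamp/value columns (the real CnosDB `DataBlock` is richer; field names and the tuple time-range are assumptions here):

```rust
// Toy stand-in for DataBlock: parallel, timestamp-sorted columns.
#[derive(Debug, Clone, PartialEq)]
struct DataBlock {
    ts: Vec<i64>,
    val: Vec<u64>,
}

impl DataBlock {
    /// Split into two blocks at `index`: left gets rows [0, index), right the rest.
    fn split_at(self, index: usize) -> (DataBlock, DataBlock) {
        let mut ts = self.ts;
        let mut val = self.val;
        let ts_right = ts.split_off(index);
        let val_right = val.split_off(index);
        (
            DataBlock { ts, val },
            DataBlock { ts: ts_right, val: val_right },
        )
    }

    /// Keep only rows whose timestamp falls in the inclusive `time_range`;
    /// return None if no rows remain.
    fn intersection(self, time_range: (i64, i64)) -> Option<DataBlock> {
        let (min_ts, max_ts) = time_range;
        let (ts, val): (Vec<i64>, Vec<u64>) = self
            .ts
            .into_iter()
            .zip(self.val)
            .filter(|(t, _)| *t >= min_ts && *t <= max_ts)
            .unzip();
        if ts.is_empty() { None } else { Some(DataBlock { ts, val }) }
    }
}

fn main() {
    let block = DataBlock { ts: vec![1, 2, 3, 4], val: vec![10, 20, 30, 40] };
    let (left, right) = block.clone().split_at(2);
    println!("{:?} / {:?}", left.ts, right.ts); // [1, 2] / [3, 4]
    let hit = block.intersection((2, 3)).unwrap();
    println!("{:?}", hit.ts); // [2, 3]
}
```

Both methods take `self` by value, matching the signatures above, so callers split or trim blocks without extra copies.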
## Compaction

- Change "db", "ts_family", "level" to "db", "ts_family", "in_level", "out_level".
- Add `delta_compact.rs` for delta compaction:
  1. Get `out_level_max_ts` of the out_level.
  2. Build the `Vec<CompactingBlockMetaGroup>`.
  3. For each `CompactingBlockMetaGroup`, get merge-split blocks (only data in time_range `0..=out_level_max_ts` is needed) and write them into files in the out_level; store these files in the `VersionEdit`.
  4. Write tombstone files `xxxxxxx.tombstone.compact.tmp`, each including all its field_ids and time_range `0..=out_level_max_ts`, because data in that time_range is already merged into the out_level. Tombstones will be compacted by `compact(time_ranges: &mut Vec<TimeRange>)` before writing.
  5. Apply the `VersionEdit` into the summary, and rename these `xxxxxxx.tombstone.compact.tmp` files to `xxxxxxx.tombstone`.
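The core split in step 3 above - only data at or below `out_level_max_ts` moves to the out_level, the rest stays in the delta files - can be sketched as follows (a simplified illustration on a sorted timestamp column, not the actual `delta_compact.rs` code; the function name is hypothetical):

```rust
/// Partition sorted timestamps: those in 0..=out_level_max_ts are merged
/// into the out_level; later timestamps remain in the level-0 delta files.
fn partition_by_out_level(ts: &[i64], out_level_max_ts: i64) -> (Vec<i64>, Vec<i64>) {
    // partition_point does a binary search, valid because `ts` is sorted.
    let idx = ts.partition_point(|&t| t <= out_level_max_ts);
    (ts[..idx].to_vec(), ts[idx..].to_vec())
}

fn main() {
    let ts = vec![1, 5, 9, 14, 20];
    let (compacted, remaining) = partition_by_out_level(&ts, 9);
    println!("{:?} {:?}", compacted, remaining); // [1, 5, 9] [14, 20]
}
```

The tombstone written in step 4 then covers exactly the compacted portion (`0..=out_level_max_ts`), so readers skip the stale copies in the source level-0 files.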
## Task definition
Some changes in `CompactTask`.

## Pick
- Remove the trait `Picker: Send + Sync + Debug` and add two functions instead of it:
  - `pick_level_compaction(version: Arc<Version>) -> Option<CompactReq>`
  - `pick_delta_compaction(version: Arc<Version>) -> Option<CompactReq>`
- `pick_delta_compaction` is only for picking delta files: it partly merges data in level-0 files to the destination level, and the merged data will leave a tombstone file for the source level-0 files.

## Scheduler
- `schedule_compaction()` not only checks whether a tseries family is cold but also checks the number of level-0 files against the config `compact_trigger_file_num` or the constant `DEFAULT_COMPACT_TRIGGER_DETLA_FILE_NUM`. (This change made some unit tests in summary.rs hang, so I fixed it by some changes to the unit test cases in summary.rs.)
- `schedule_compaction()` holds a shared reference of `Arc<TseriesFamily>`, so now we must manually stop it with `TseriesFamily::close()`. The `Drop` implementation for `Database` will call that method on all its tseries families.
- `compaction::job::CompactProcessor` is changed to hold a map of `(TsFamilyId, IsDeltaCompaction)` to `ShouldFlushBeforeCompaction`.
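The trigger condition described above - compact when the family is cold, or when level-0 delta files pile up past the configured threshold - can be sketched as a simple predicate (an illustration only; the constant value and function name here are assumptions, not CnosDB's actual code):

```rust
// Assumed default, standing in for DEFAULT_COMPACT_TRIGGER_DETLA_FILE_NUM.
const DEFAULT_DELTA_FILE_NUM: usize = 16;

/// Illustrative scheduler check: trigger compaction if the tseries family
/// is cold, or if level-0 file count reaches the configured threshold
/// (`compact_trigger_file_num`), falling back to the default constant.
fn should_compact(
    is_cold: bool,
    level0_files: usize,
    compact_trigger_file_num: Option<usize>,
) -> bool {
    let threshold = compact_trigger_file_num.unwrap_or(DEFAULT_DELTA_FILE_NUM);
    is_cold || level0_files >= threshold
}

fn main() {
    println!("{}", should_compact(false, 20, None)); // true: too many delta files
    println!("{}", should_compact(true, 0, Some(4))); // true: family is cold
    println!("{}", should_compact(false, 2, Some(4))); // false: neither condition
}
```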
## TODOs

- There may be some debug logs that should be deleted.
## Are there any user-facing changes?

No.