Introduced MergeTask and MutateTask #25165
Merged: nikitamikhaylov merged 124 commits into ClickHouse:master from nikitamikhaylov:merge-task-save on Sep 16, 2021
Commits
e05dd10 save (nikitamikhaylov)
aa66c7f save (nikitamikhaylov)
5c22168 save (nikitamikhaylov)
ae82fe2 save (nikitamikhaylov)
18708e0 save (nikitamikhaylov)
27ec511 mvp (nikitamikhaylov)
6fdc94d fix tests (nikitamikhaylov)
fb9d661 fix tests (nikitamikhaylov)
37a7e60 fix tests (nikitamikhaylov)
2923f70 fix tests (nikitamikhaylov)
6246262 add boost future (nikitamikhaylov)
ec9e9fd better (nikitamikhaylov)
e47c53b add contib changes (nikitamikhaylov)
8816955 get rid of priority in thread pool and fix pvs check (nikitamikhaylov)
08b7dd2 get rid of wait_time_microseconds, because it is not used anywhere (nikitamikhaylov)
3d51ee9 added PrioritizedThreadPool (nikitamikhaylov)
c82aa7f MergeTask for merge from log entry (replicated) (nikitamikhaylov)
2fa018e support plain merge tree (nikitamikhaylov)
573caf7 fix msan (nikitamikhaylov)
ec70a18 fix pvs check (nikitamikhaylov)
2753cc1 fix gcc build (nikitamikhaylov)
298b4ad fix deadlock (nikitamikhaylov)
4163384 fix use after free (maybe) (nikitamikhaylov)
1d61791 a little suppression (nikitamikhaylov)
161520f fix nullptr dereference (nikitamikhaylov)
cae874c improve thread pool (nikitamikhaylov)
b7900fc better (nikitamikhaylov)
ac5b5ca remove boost (nikitamikhaylov)
a25fab4 better (nikitamikhaylov)
4943806 get rid of exceptions (nikitamikhaylov)
614b8e9 fix build (nikitamikhaylov)
10075d1 better (nikitamikhaylov)
7aa878d fix tests (nikitamikhaylov)
5c6cad9 rebased (nikitamikhaylov)
1ae9715 better (nikitamikhaylov)
e4744d1 fix build (nikitamikhaylov)
25c7fc9 better (nikitamikhaylov)
47c7017 fix test 00721 (nikitamikhaylov)
981cdc2 better (nikitamikhaylov)
d62af5c test impl (nikitamikhaylov)
7e99527 fix (nikitamikhaylov)
50daa38 fix tests (nikitamikhaylov)
9ed3aac better (nikitamikhaylov)
c76d6b4 fix nullptr dereference (nikitamikhaylov)
967f40b fix tests with encrypted disk (nikitamikhaylov)
7b6910b try fix race (nikitamikhaylov)
89ea88c delete some code about memory tracker (nikitamikhaylov)
28a875c try fix race (nikitamikhaylov)
ff41123 delete some files (nikitamikhaylov)
fac351f clean up code (nikitamikhaylov)
43d2534 updated ya.make (nikitamikhaylov)
48543c4 Update 01532_execute_merges_on_single_replica.sql (nikitamikhaylov)
103b02e Update 01532_execute_merges_on_single_replica.sql (nikitamikhaylov)
6f3eddc execute MergeTask with BackgroundJobsExecutor (nikitamikhaylov)
786139d add normal priorities (nikitamikhaylov)
58c6c46 add mutate task (nikitamikhaylov)
dabb7e4 added MutatePlainMergeTree task (nikitamikhaylov)
1cc8cf4 support for replicated mutate task (nikitamikhaylov)
bc54b1f increase queue size (nikitamikhaylov)
c969faf fix scheduleOrThrowOnError (nikitamikhaylov)
f4280d4 fix segfault (nikitamikhaylov)
1c224b0 remove extra copying of MutationCommands (nikitamikhaylov)
c3273f1 fix build (nikitamikhaylov)
6b7d2f8 fixed empty MutationCommands creation (nikitamikhaylov)
d7c248e fix PVS check (nikitamikhaylov)
522e1ee fix build (nikitamikhaylov)
efa7161 fix build (nikitamikhaylov)
2ac8869 added submit continuation method and fixed build (nikitamikhaylov)
4a44ea0 fix deadlock (nikitamikhaylov)
7cfd80a add exception to a log entry (nikitamikhaylov)
9c35863 extract common code for both tasks (nikitamikhaylov)
0185a97 try to split mutations on tasks (nikitamikhaylov)
2657d64 Merge branch 'master' into merge-task-save (nikitamikhaylov)
c6ab58c save (nikitamikhaylov)
c3c7651 save (nikitamikhaylov)
5120703 Fix build (nikitamikhaylov)
a0e87e6 finish mutations (nikitamikhaylov)
78a4e28 Merge branch 'master' of github.com:ClickHouse/ClickHouse into merge-… (nikitamikhaylov)
9fa856f fix build (nikitamikhaylov)
47de66f fix nullptr dereference (nikitamikhaylov)
566b34e fix projection test (nikitamikhaylov)
1482e9f Merge branch 'master' of github.com:ClickHouse/ClickHouse into merge-… (nikitamikhaylov)
eba2780 Merge branch 'master' of github.com:ClickHouse/ClickHouse into merge-… (nikitamikhaylov)
b1c6fc0 Merge with master (nikitamikhaylov)
7dad110 added an executor instead of thread pool (nikitamikhaylov)
ba8f98e fix selecting task scheduling (nikitamikhaylov)
a91152c better BackgroundJobExecutor (nikitamikhaylov)
3b9b9d1 Merge branch 'master' of github.com:ClickHouse/ClickHouse into merge-… (nikitamikhaylov)
e6ae751 fix deadlock (nikitamikhaylov)
a4dbbdb Merge branch 'master' of github.com:ClickHouse/ClickHouse into merge-… (nikitamikhaylov)
6765777 save (nikitamikhaylov)
b09dff4 added global MergeTreeBackgroundExecutor (nikitamikhaylov)
f8d1c1b better (nikitamikhaylov)
9c54bd9 avoid cycles of shared_ptrs (nikitamikhaylov)
458d05c add backoff (nikitamikhaylov)
4f9dfa6 better (nikitamikhaylov)
0b1cb94 Merge branch 'master' of github.com:ClickHouse/ClickHouse into merge-… (nikitamikhaylov)
3faa509 better (nikitamikhaylov)
daf3486 better (nikitamikhaylov)
d4cb671 better (nikitamikhaylov)
c319717 Substact a number of removed tasks from global counter (nikitamikhaylov)
10663c4 fix background moves (nikitamikhaylov)
46af7cf better (nikitamikhaylov)
7e62078 Merge upstream/master into merge-task-save (using imerge) (nikitamikhaylov)
47d13c4 Fix build (nikitamikhaylov)
32fb6d5 Merge upstream/master into merge-task-save (using imerge) (nikitamikhaylov)
592c0c3 Fix build (nikitamikhaylov)
4656b29 better (nikitamikhaylov)
ad667ed Merge upstream/master into merge-task-save (using imerge) (nikitamikhaylov)
b5516bc Fixes after merge (nikitamikhaylov)
353a1a9 Another fix after merge (nikitamikhaylov)
2b09472 Fix race + delete file (nikitamikhaylov)
986c248 Merge upstream/master into merge-task-save (using imerge) (nikitamikhaylov)
67bbf91 Style (nikitamikhaylov)
64d74e5 Better (nikitamikhaylov)
187e450 Merge upstream/master into merge-task-save (using imerge) (nikitamikhaylov)
7ff200f get rid of state (nikitamikhaylov)
780714c Save changes (nikitamikhaylov)
d1931e9 Refactor MergeTask (nikitamikhaylov)
6f6a48a Try to fix race and fix build (nikitamikhaylov)
279c86a Review fixes (nikitamikhaylov)
adce5eb Fix style, build and PVS check (nikitamikhaylov)
45124a0 Better (nikitamikhaylov)
c4dfd3b Better test (nikitamikhaylov)
@@ -0,0 +1,49 @@
#pragma once

#include "Storages/MergeTree/IMergeTreeDataPart.h"


namespace DB
{

/* Allow to compute more accurate progress statistics */
class ColumnSizeEstimator
{
    using ColumnToSize = MergeTreeDataPartInMemory::ColumnToSize;
    ColumnToSize map;
public:

    /// Stores approximate size of columns in bytes
    /// Exact values are not required since it used for relative values estimation (progress).
    size_t sum_total = 0;
    size_t sum_index_columns = 0;
    size_t sum_ordinary_columns = 0;

    ColumnSizeEstimator(ColumnToSize && map_, const Names & key_columns, const Names & ordinary_columns)
        : map(std::move(map_))
    {
        for (const auto & name : key_columns)
            if (!map.count(name)) map[name] = 0;
        for (const auto & name : ordinary_columns)
            if (!map.count(name)) map[name] = 0;

        for (const auto & name : key_columns)
            sum_index_columns += map.at(name);

        for (const auto & name : ordinary_columns)
            sum_ordinary_columns += map.at(name);

        sum_total = std::max(static_cast<decltype(sum_index_columns)>(1), sum_index_columns + sum_ordinary_columns);
    }

    Float64 columnWeight(const String & column) const
    {
        return static_cast<Float64>(map.at(column)) / sum_total;
    }

    Float64 keyColumnsWeight() const
    {
        return static_cast<Float64>(sum_index_columns) / sum_total;
    }
};
}
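
For readers who want to try the weighting logic outside ClickHouse, here is a minimal standalone sketch of the same idea. It is not the PR's code: `ColumnWeights` is a hypothetical name, and `std::map`, `std::string`, `std::vector` and `double` stand in for `ColumnToSize`, `String`, `Names` and `Float64`, which is an assumption made only so that the sketch compiles on its own.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical standalone analogue of ColumnSizeEstimator.
class ColumnWeights
{
    std::map<std::string, size_t> sizes;
    size_t sum_index_columns = 0;
    size_t sum_ordinary_columns = 0;
    size_t sum_total = 0;

public:
    ColumnWeights(std::map<std::string, size_t> sizes_,
                  const std::vector<std::string> & key_columns,
                  const std::vector<std::string> & ordinary_columns)
        : sizes(std::move(sizes_))
    {
        // operator[] inserts a zero size for columns that are missing from the map,
        // so unknown columns simply get zero weight.
        for (const auto & name : key_columns)
            sum_index_columns += sizes[name];
        for (const auto & name : ordinary_columns)
            sum_ordinary_columns += sizes[name];

        // Clamp to 1 so that the divisions below never divide by zero.
        sum_total = std::max<size_t>(1, sum_index_columns + sum_ordinary_columns);
    }

    // Share of the total size taken by one column.
    double columnWeight(const std::string & column) const
    {
        return static_cast<double>(sizes.at(column)) / sum_total;
    }

    // Share of the total size taken by all key columns together.
    double keyColumnsWeight() const
    {
        return static_cast<double>(sum_index_columns) / sum_total;
    }
};
```

With sizes of 100 and 300 bytes for two key columns and 600 bytes for one ordinary column, the key columns carry a weight of 0.4 and the ordinary column 0.6, which is the kind of relative figure the progress estimate needs.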
@@ -0,0 +1,92 @@
#include "Storages/MergeTree/FutureMergedMutatedPart.h"


namespace DB
{

namespace ErrorCodes
{
    extern const int LOGICAL_ERROR;
}

void FutureMergedMutatedPart::assign(MergeTreeData::DataPartsVector parts_)
{
    if (parts_.empty())
        return;

    size_t sum_rows = 0;
    size_t sum_bytes_uncompressed = 0;
    MergeTreeDataPartType future_part_type = MergeTreeDataPartType::UNKNOWN;
    for (const auto & part : parts_)
    {
        sum_rows += part->rows_count;
        sum_bytes_uncompressed += part->getTotalColumnsSize().data_uncompressed;
        future_part_type = std::min(future_part_type, part->getType());
    }

    auto chosen_type = parts_.front()->storage.choosePartTypeOnDisk(sum_bytes_uncompressed, sum_rows);
    future_part_type = std::min(future_part_type, chosen_type);
    assign(std::move(parts_), future_part_type);
}

void FutureMergedMutatedPart::assign(MergeTreeData::DataPartsVector parts_, MergeTreeDataPartType future_part_type)
{
    if (parts_.empty())
        return;

    for (const MergeTreeData::DataPartPtr & part : parts_)
    {
        const MergeTreeData::DataPartPtr & first_part = parts_.front();

        if (part->partition.value != first_part->partition.value)
            throw Exception(
                "Attempting to merge parts " + first_part->name + " and " + part->name + " that are in different partitions",
                ErrorCodes::LOGICAL_ERROR);
    }

    parts = std::move(parts_);

    UInt32 max_level = 0;
    Int64 max_mutation = 0;
    for (const auto & part : parts)
    {
        max_level = std::max(max_level, part->info.level);
        max_mutation = std::max(max_mutation, part->info.mutation);
    }

    type = future_part_type;
    part_info.partition_id = parts.front()->info.partition_id;
    part_info.min_block = parts.front()->info.min_block;
    part_info.max_block = parts.back()->info.max_block;
    part_info.level = max_level + 1;
    part_info.mutation = max_mutation;

    if (parts.front()->storage.format_version < MERGE_TREE_DATA_MIN_FORMAT_VERSION_WITH_CUSTOM_PARTITIONING)
    {
        DayNum min_date = DayNum(std::numeric_limits<UInt16>::max());
        DayNum max_date = DayNum(std::numeric_limits<UInt16>::min());
        for (const auto & part : parts)
        {
            /// NOTE: getting min and max dates from part names (instead of part data) because we want
            /// the merged part name be determined only by source part names.
            /// It is simpler this way when the real min and max dates for the block range can change
            /// (e.g. after an ALTER DELETE command).
            DayNum part_min_date;
            DayNum part_max_date;
            MergeTreePartInfo::parseMinMaxDatesFromPartName(part->name, part_min_date, part_max_date);
            min_date = std::min(min_date, part_min_date);
            max_date = std::max(max_date, part_max_date);
        }

        name = part_info.getPartNameV0(min_date, max_date);
    }
    else
        name = part_info.getPartName();
}

void FutureMergedMutatedPart::updatePath(const MergeTreeData & storage, const IReservation * reservation)
{
    path = storage.getFullPathOnDisk(reservation->getDisk()) + name + "/";
}

}
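
The way the second `assign` overload derives the merged part's info can be illustrated in isolation. The sketch below is an assumption-laden simplification, not the PR's code: `PartInfo`, `mergedPartInfo` and `partName` are hypothetical names, and the name layout used here (`<partition_id>_<min_block>_<max_block>_<level>`, with `_<mutation>` appended only for a non-zero mutation version) is assumed for the non-V0 format rather than taken from this diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified mirror of the part info fields used in assign().
struct PartInfo
{
    std::string partition_id;
    int64_t min_block;
    int64_t max_block;
    uint32_t level;
    int64_t mutation;
};

// Info of the part produced by merging `parts` (ordered by block number):
// the block range spans the first to the last source part, the level is one
// above the highest source level, and the mutation is the highest one seen.
PartInfo mergedPartInfo(const std::vector<PartInfo> & parts)
{
    assert(!parts.empty());

    uint32_t max_level = 0;
    int64_t max_mutation = 0;
    for (const auto & part : parts)
    {
        max_level = std::max(max_level, part.level);
        max_mutation = std::max(max_mutation, part.mutation);
    }

    PartInfo result;
    result.partition_id = parts.front().partition_id;
    result.min_block = parts.front().min_block;
    result.max_block = parts.back().max_block;
    result.level = max_level + 1;
    result.mutation = max_mutation;
    return result;
}

// Assumed name layout; see the note above.
std::string partName(const PartInfo & info)
{
    std::string name = info.partition_id + "_" + std::to_string(info.min_block) + "_"
        + std::to_string(info.max_block) + "_" + std::to_string(info.level);
    if (info.mutation != 0)
        name += "_" + std::to_string(info.mutation);
    return name;
}
```

Under these assumptions, merging parts covering blocks 1, 2 and 3 to 5 of partition `202109`, with levels 0, 1 and 0, gives a part covering blocks 1 to 5 at level 2. Because the result depends only on the source parts' info, every replica that merges the same parts derives the same name.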
It is not clear why we use shared_ptr for minmax_idx now. Also, maybe unique_ptr would do?
We use shared_ptr because it has a copy constructor, which is useful for merging projections: there we construct a bunch of MergeTasks inside a MergeTask, all with the same MinMaxIndex object. Before this change there was a reference.
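
The ownership argument in this reply can be shown with a small sketch. The types here are hypothetical stand-ins (`MinMaxIndex` is reduced to two integers, and `Task` is not the PR's `MergeTask`); the sketch only illustrates why a copyable `std::shared_ptr` lets nested tasks share one index object, where a `std::unique_ptr` member could not be copied into each child.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the MinMaxIndex discussed in the review thread.
struct MinMaxIndex
{
    int min_value;
    int max_value;
};

using MinMaxIndexPtr = std::shared_ptr<MinMaxIndex>;

// Hypothetical task type. Each task co-owns the index, so the index stays
// alive for as long as any task that uses it is alive.
struct Task
{
    MinMaxIndexPtr minmax_idx;
    std::vector<std::unique_ptr<Task>> projection_tasks;

    explicit Task(MinMaxIndexPtr idx) : minmax_idx(std::move(idx)) {}

    void addProjectionTask()
    {
        // Copying the shared_ptr hands the same index object to the child task
        // and increments the reference count. A unique_ptr could only be moved,
        // which would leave the parent without its index.
        projection_tasks.push_back(std::make_unique<Task>(minmax_idx));
    }
};
```

Compared with holding a plain reference, co-ownership removes the requirement that the index outlive every task, which is harder to guarantee once tasks are queued and executed in the background.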