CLICKHOUSE-606: query deduplication based on parts' UUID #17348
Conversation
}
catch (const DB::Exception & ex)
{
    if (ex.code() == ErrorCodes::DUPLICATED_UUIDS)
I have an awful failure case for this where it will never make any progress:
- 3 shard cluster
- shard 1 has part 1, shard 2 has part 2 and shard 3 has both part 1 and part 2 (2 concurrent part moves).
- the packet with part uuids is received first from shard 1
- then from shard 3
- now this query is resent as it contains part duplicates (in flight)
- packet with part uuids is received from shard 2
- packet with part uuids is received from shard 3 which again contains duplicates
One solution is multiple retries with a very awful worst case.
I'm ok with ignoring this case for now but it is worth being mentioned.
> One solution is multiple retries with a very awful worst case.
From the initial proposal #13574
duplicated parts can be found during the query processing only during a short period of time:
- destination shard attaches a recently moved part
X - (DEDUPLICATION HAPPENS HERE) - source shard detaches moving part
In multi-part movement it's a known tradeoff -> better to fail and retry later, when the switching is done.
Surely, the number of retries can be made configurable later if necessary, but I don't think we should proceed with this now.
And max_distributed_connections=1 is a workaround for such a case, right?
auto prev_parts = parts;
parts.clear();

Context & query_context = context.hasQueryContext() ?
    const_cast<Context &>(context).getQueryContext() : const_cast<Context &>(context);
Better to make get*PartUUIDs const than to const_cast the context each time.
but there's initialization inside getter:
auto lock = getLock();
if (!part_uuids)
    part_uuids = std::make_shared<PartUUIDs>();
hence this dirty trick...
}

/// populate UUIDs and exclude ignored parts if enabled
if (query_context.getSettingsRef().allow_experimental_query_deduplication && part->uuid != UUIDHelpers::Nil)
As far as I know, currently, all parts have UUIDs. So we will send all of them each time?
Only when the setting allow_experimental_query_deduplication=1 is enabled. Later, when the part-moving logic is implemented, it will be possible to look at a memtable/ZK to tell which parts are switching and should be deduplicated.
@@ -1749,6 +1749,9 @@ class Client : public Poco::Util::Application

switch (packet.type)
{
    case Protocol::Server::PartUuids:
        return true;
It seems the client should not receive them (if so, LOGICAL_ERROR looks better), or am I missing something?
It can, potentially: the server can have parts with UUIDs, and if the setting is enabled, it will send them.
I'd just ignore this packet for now, rather than change the protocol so that the client explicitly asks for those UUIDs to be sent back (which is currently done via the setting).
committed a changed submodule by mistake, fixed
* add query data deduplication, excluding duplicated parts in MergeTree family engines. Query deduplication is based on parts' UUIDs, which should first be enabled with the merge_tree setting assign_part_uuids=1; the allow_experimental_query_deduplication setting enables part deduplication and defaults to false. A data part UUID is a mechanism for giving a data part a unique identifier. Having UUIDs and a deduplication mechanism provides the potential of moving parts between shards while preserving data consistency on the read path: duplicated UUIDs will cause the root executor to retry the query against one of the replicas, explicitly asking to exclude the encountered duplicated fingerprints during distributed query execution. NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation/merge will update the part's UUID.

* add _part_uuid virtual column, allowing the use of UUIDs in predicates.

Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com>

address comments
Ok, the code is quite isolated, so we can merge it with the experimental flag.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Distributed query deduplication is a follow-up to #16033 and partially resolves the proposal #13574.

Query deduplication is based on parts' UUIDs, which should first be enabled with the merge_tree setting assign_part_uuids=1. The allow_experimental_query_deduplication setting enables part deduplication; it defaults to false.

A data part UUID is a mechanism for giving a data part a unique identifier. Having UUIDs and a deduplication mechanism provides the potential of moving parts between shards while preserving data consistency on the read path: duplicated UUIDs will cause the root executor to retry the query against one of the replicas, explicitly asking to exclude the encountered duplicated fingerprints during distributed query execution.

NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation/merge will update the part's UUID.