CLICKHOUSE-606: query deduplication based on parts' UUID #17348

Merged

Contributor

@xjewer xjewer commented Nov 24, 2020

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Distributed query deduplication is a follow-up to #16033 and partially resolves the proposal in #13574.

  • Add query data deduplication that excludes duplicated parts in MergeTree family engines.

Query deduplication is based on parts' UUIDs, which must be enabled first with the merge_tree setting
assign_part_uuids=1.

The allow_experimental_query_deduplication setting enables part deduplication; it defaults to false.

A data part UUID is a mechanism for giving a data part a unique identifier.
Having UUIDs and a deduplication mechanism makes it possible to move parts
between shards while preserving data consistency on the read path:
duplicated UUIDs cause the root executor to retry the query against one of the replicas, explicitly
asking it to exclude the duplicated fingerprints encountered during distributed query execution.

NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation or merge will
update the part's UUID.

  • Add the _part_uuid virtual column, which allows UUIDs to be used in predicates.
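
For readers following the description, here is a minimal sketch of the root-side bookkeeping it implies: remember every part UUID reported by a shard and flag the ones that arrive a second time. It is an illustration only, not the code from this PR; `PartUUID` as a string and the `SeenPartUUIDs` class are hypothetical stand-ins.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in: a real part UUID is a 128-bit value, not a string.
using PartUUID = std::string;

// Hypothetical helper: tracks UUIDs reported so far during one distributed query.
class SeenPartUUIDs
{
public:
    // Records `uuids` and returns those that an earlier packet already reported.
    std::vector<PartUUID> add(const std::vector<PartUUID> & uuids)
    {
        std::vector<PartUUID> duplicates;
        for (const auto & uuid : uuids)
        {
            // insert().second is false when the UUID was already present.
            if (!seen.insert(uuid).second)
                duplicates.push_back(uuid);
        }
        return duplicates;
    }

private:
    std::set<PartUUID> seen;
};
```

A non-empty result from `add` is the point at which, per the description, the query would be retried against a replica with those UUIDs excluded.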

    }
    catch (const DB::Exception & ex)
    {
        if (ex.code() == ErrorCodes::DUPLICATED_UUIDS)
Contributor

I have an awful failure case for this where it will never make any progress:

  • 3 shard cluster
  • shard 1 has part 1, shard 2 has part 2 and shard 3 has both part 1 and part 2 (2 concurrent part moves).
  • the packet with part uuids is received first from shard 1
    • then from shard 3
      • now this query is resent as it contains part duplicates (in flight)
  • packet with part uuids is received from shard 2
  • packet with part uuids is received from shard 3 which again contains duplicates

One solution is multiple retries with a very awful worst case.

I'm ok with ignoring this case for now, but it is worth mentioning.
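
The scenario above can be replayed with a small simulation. This is a sketch under simplifying assumptions, not the executor's real logic: a response that contains duplicates is discarded as a whole, its duplicated UUIDs are added to that shard's ignore list, and the retried response arrives after all responses still pending. All names are hypothetical.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <set>
#include <string>
#include <vector>

using PartUUID = std::string;  // hypothetical stand-in for a real UUID
using ShardParts = std::map<int, std::vector<PartUUID>>;

// Returns the number of retries needed, or -1 if `max_retries` is exceeded.
int countRetries(const ShardParts & shards, const std::vector<int> & arrival_order, int max_retries)
{
    std::set<PartUUID> seen;                    // UUIDs accepted so far
    std::map<int, std::set<PartUUID>> ignored;  // per-shard exclusion lists
    std::deque<int> pending(arrival_order.begin(), arrival_order.end());
    int retries = 0;

    while (!pending.empty())
    {
        int shard = pending.front();
        pending.pop_front();

        // The shard reports every part it holds except the ones it was told to ignore.
        std::vector<PartUUID> reported;
        for (const auto & uuid : shards.at(shard))
            if (ignored[shard].count(uuid) == 0)
                reported.push_back(uuid);

        bool has_duplicates = false;
        for (const auto & uuid : reported)
        {
            if (seen.count(uuid) != 0)
            {
                ignored[shard].insert(uuid);
                has_duplicates = true;
            }
        }

        if (!has_duplicates)
        {
            seen.insert(reported.begin(), reported.end());
            continue;
        }

        // Duplicates found: drop this response and ask the shard again.
        if (++retries > max_retries)
            return -1;
        pending.push_back(shard);
    }
    return retries;
}
```

With shard 1 holding part 1, shard 2 holding part 2, shard 3 holding both, and arrival order 1, 3, 2, this model needs two retries, so a single retry is not enough. In this simplified model each retry adds at least one UUID to an ignore list, so the number of retries is bounded by the number of duplicated parts.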

Contributor Author

One solution is multiple retries with a very awful worst case.

From the initial proposal #13574:

duplicated parts can be found during query processing only within a short period of time:

  1. destination shard attaches a recently moved part
    X - (DEDUPLICATION HAPPENS HERE)
  2. source shard detaches the moving part

With multi-part movement it's a known tradeoff: better to fail and retry later, once the switch is done.

Surely, the number of retries can be made configurable later if necessary, but I don't think we should pursue that now.

Collaborator

And max_distributed_connections=1 is a workaround for such a case, right?

@xjewer xjewer force-pushed the alex/CLICKHOUSE-606_deduplication_UUID branch from 99bfb0a to b1ca682 Compare November 24, 2020 15:58
@alesapin alesapin self-assigned this Nov 25, 2020
@xjewer xjewer force-pushed the alex/CLICKHOUSE-606_deduplication_UUID branch from b1ca682 to 17ffbe8 Compare November 25, 2020 11:45
    auto prev_parts = parts;
    parts.clear();
    Context & query_context = context.hasQueryContext() ?
        const_cast<Context &>(context).getQueryContext() : const_cast<Context &>(context);
Member

Better to make get*PartUUIDs const than to const_cast the context each time.

Contributor Author

But there's initialization inside the getter:

    auto lock = getLock();
    if (!part_uuids)
        part_uuids = std::make_shared<PartUUIDs>();

hence this dirty trick...
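
One common C++ way to reconcile a const getter with lazy initialization is to mark the lazily created member and its lock as `mutable`. The sketch below only illustrates that general technique. It is not what this PR does, and `QueryContextSketch` and the string-based `PartUUIDs` are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <set>
#include <string>

// Hypothetical simplification of the real container of part UUIDs.
using PartUUIDs = std::set<std::string>;

class QueryContextSketch
{
public:
    // const getter: `mutable` members may still be modified for lazy initialization.
    std::shared_ptr<PartUUIDs> getPartUUIDs() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (!part_uuids)
            part_uuids = std::make_shared<PartUUIDs>();
        return part_uuids;
    }

private:
    mutable std::mutex mutex;
    mutable std::shared_ptr<PartUUIDs> part_uuids;
};
```

Callers holding only a `const QueryContextSketch &` can then obtain and fill the shared container without a `const_cast`.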

}

/// populate UUIDs and exclude ignored parts if enabled
if (query_context.getSettingsRef().allow_experimental_query_deduplication && part->uuid != UUIDHelpers::Nil)
Member

As far as I know, currently, all parts have UUIDs. So we will send all of them each time?

Contributor Author

Only when the setting allow_experimental_query_deduplication=1 is used. Later, when the part-moving logic is implemented, it will be possible to look at the memtable/zk to tell which parts are switching and should be deduplicated.
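
The shard-side behaviour discussed in this thread (report UUIDs and skip ignored parts only when the setting is on and the part has a non-nil UUID) might look roughly like the following. This is a hedged sketch with hypothetical names (`PartSketch`, `selectPartsSketch`, an empty string as the nil UUID), not the actual MergeTreeDataSelectExecutor code.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using PartUUID = std::string;   // hypothetical stand-in for a real UUID
const PartUUID NIL_UUID = "";   // assumption: empty string plays the role of the nil UUID

struct PartSketch
{
    std::string name;
    PartUUID uuid;
};

struct SelectResult
{
    std::vector<PartSketch> parts_to_read;
    std::set<PartUUID> reported_uuids;  // what would be sent back to the initiator
};

SelectResult selectPartsSketch(
    const std::vector<PartSketch> & parts,
    bool deduplication_enabled,
    const std::set<PartUUID> & ignored_uuids)
{
    SelectResult result;
    for (const auto & part : parts)
    {
        if (deduplication_enabled && part.uuid != NIL_UUID)
        {
            // Skip parts the initiator asked this replica to exclude.
            if (ignored_uuids.count(part.uuid) != 0)
                continue;
            result.reported_uuids.insert(part.uuid);
        }
        result.parts_to_read.push_back(part);
    }
    return result;
}
```

With the flag off, nothing is reported and nothing is filtered, which matches the reply that UUIDs are sent only when the setting is enabled.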

@@ -1749,6 +1749,9 @@ class Client : public Poco::Util::Application

    switch (packet.type)
    {
        case Protocol::Server::PartUuids:
            return true;
Collaborator

It seems the client should not receive them (if so, LOGICAL_ERROR looks better), or am I missing something?

Contributor Author

It potentially can, because the server can have parts with UUIDs, and if the setting is enabled it will send them.

I'd just ignore this packet for now, rather than changing the protocol so that the client explicitly asks for those UUIDs to be sent back (which is currently done via the setting).
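
As an illustration of the "just ignore this packet" choice, a packet loop can treat the UUID packet as a no-op that keeps the read loop going. The enum, state struct, and function below are hypothetical and far simpler than the real client.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical, reduced set of server packet types.
enum class ServerPacketSketch { Data, PartUUIDs, EndOfStream };

struct ClientStateSketch
{
    int data_packets = 0;
};

// Returns true if the client should keep reading packets.
bool processPacketSketch(ServerPacketSketch type, ClientStateSketch & state)
{
    switch (type)
    {
        case ServerPacketSketch::Data:
            ++state.data_packets;
            return true;
        case ServerPacketSketch::PartUUIDs:
            // The client has no use for part UUIDs: skip the packet and continue.
            return true;
        case ServerPacketSketch::EndOfStream:
            return false;
    }
    throw std::logic_error("Unknown packet type");
}
```

The alternative raised in the review would be to throw in the `PartUUIDs` branch instead, on the assumption that a client should never receive such a packet.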

@xjewer xjewer force-pushed the alex/CLICKHOUSE-606_deduplication_UUID branch from 4dc30fe to 2b29c56 Compare December 7, 2020 18:07
@robot-clickhouse robot-clickhouse added the submodule changed At least one submodule changed in this PR. label Dec 7, 2020
@xjewer xjewer force-pushed the alex/CLICKHOUSE-606_deduplication_UUID branch from 2b29c56 to 1e55c7e Compare December 7, 2020 18:15
Contributor Author

xjewer commented Dec 7, 2020

committed changed submodule by mistake, fixed

@robot-clickhouse robot-clickhouse removed the submodule changed At least one submodule changed in this PR. label Dec 7, 2020
@xjewer xjewer force-pushed the alex/CLICKHOUSE-606_deduplication_UUID branch from 0545759 to 98c3f25 Compare December 15, 2020 18:06
* Add query data deduplication that excludes duplicated parts in MergeTree family engines.

Query deduplication is based on parts' UUIDs, which must be enabled first with the merge_tree setting
assign_part_uuids=1.

The allow_experimental_query_deduplication setting enables part deduplication; it defaults to false.

A data part UUID is a mechanism for giving a data part a unique identifier.
Having UUIDs and a deduplication mechanism makes it possible to move parts
between shards while preserving data consistency on the read path:
duplicated UUIDs cause the root executor to retry the query against one of the replicas, explicitly
asking it to exclude the duplicated fingerprints encountered during distributed query execution.

NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation or merge will
update the part's UUID.

* Add the _part_uuid virtual column, which allows UUIDs to be used in predicates.

Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com>

address comments
@xjewer xjewer force-pushed the alex/CLICKHOUSE-606_deduplication_UUID branch from 4cf8449 to d05c644 Compare February 3, 2021 00:07
Member

@alesapin alesapin left a comment

Ok, the code is quite isolated, so we can merge it with the experimental flag.

@alesapin alesapin merged commit 011109c into ClickHouse:master Feb 5, 2021
@xjewer xjewer deleted the alex/CLICKHOUSE-606_deduplication_UUID branch February 5, 2021 20:06
Labels
pr-feature Pull request with new product feature
5 participants