
add system.distributed_ddl_queue table #17656

Merged Jan 11, 2021 (20 commits)

Conversation


@bharatnc bharatnc commented Dec 1, 2020

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Adds a new table called system.distributed_ddl_queue that displays the queries in the DDL worker queue.

Detailed description / Documentation draft:

This PR adds a new table, system.distributed_ddl_queue, that lists all the queries currently in the DDL worker queue.

To accomplish this, the ZooKeeper path configured as distributed_ddl.path (default: /clickhouse/task_queue/ddl/) is polled for all queries, and for each query the subpaths /active and /finished are queried to get the list of nodes present under the active and finished ZooKeeper paths. The data is then populated into the table as follows:

Querying the system.distributed_ddl_queue table from one of the shards:

Querying the system.distributed_ddl_queue table from one of the shards:

SELECT *
FROM system.distributed_ddl_queue
WHERE cluster = 'test_cluster'
LIMIT 1
FORMAT Vertical

Query id: a1c53d35-5a84-4e3d-8176-13b1f4791141

Row 1:
──────
entry:             query-0000000000
host_name:         clickhouse01
host_address:      172.23.0.11
port:              9000
status:            Finished
cluster:           test_cluster
query:             CREATE DATABASE test_db UUID '4a82697e-c85e-4e5b-a01e-a36f2a758456' ON CLUSTER test_cluster
initiator:         clickhouse01:9000
query_start_time:  2020-12-30 13:07:51
query_finish_time: 2020-12-30 13:07:51
query_duration_ms: 6
exception_code:    ZOK

1 rows in set. Elapsed: 0.038 sec. 
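The polling described above can be sketched roughly as follows. This is a hypothetical Python illustration, not the actual C++ implementation: the ZooKeeper client is stubbed with an in-memory dict, and the entry names and hosts are illustrative only.

```python
# Hypothetical sketch of gathering the table's data: list entries under
# the DDL queue path, then read the /active and /finished children for
# each entry. The znode tree is stubbed as a dict (path -> child names).
DDL_PATH = "/clickhouse/task_queue/ddl"  # default distributed_ddl.path

znodes = {
    DDL_PATH: ["query-0000000000", "query-0000000001"],
    DDL_PATH + "/query-0000000000/active": [],
    DDL_PATH + "/query-0000000000/finished": ["clickhouse01:9000"],
    DDL_PATH + "/query-0000000001/active": ["clickhouse02:9000"],
    DDL_PATH + "/query-0000000001/finished": [],
}

def get_children(path):
    # stands in for a ZooKeeper getChildren(path) call
    return znodes.get(path, [])

def collect_queue_rows(ddl_path):
    rows = []
    for entry in get_children(ddl_path):
        base = f"{ddl_path}/{entry}"
        rows.append({
            "entry": entry,
            "active": get_children(base + "/active"),
            "finished": get_children(base + "/finished"),
        })
    return rows

rows = collect_queue_rows(DDL_PATH)
```

In the real table, the active/finished child lists are further unfolded into one row per host, as discussed in the review below.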

relates to #17082

@robot-clickhouse robot-clickhouse added doc-alert pr-feature Pull request with new product feature labels Dec 1, 2020
@tavplubix tavplubix self-assigned this Dec 1, 2020
@bharatnc bharatnc marked this pull request as ready for review December 6, 2020 19:17

bharatnc commented Dec 6, 2020

@tavplubix, @alesapin this is my current implementation of the ddl_worker_queue table. It's now ready for review.

The table structure is as follows:

:) describe system.ddl_worker_queue;

DESCRIBE TABLE system.ddl_worker_queue

Query id: 5c760afb-f63c-4ee9-b359-e4ee38c663f9

┌─name─────┬─type──────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
│ name     │ String        │              │                    │         │                  │                │
│ active   │ Array(String) │              │                    │         │                  │                │
│ finished │ Array(String) │              │                    │         │                  │                │
└──────────┴───────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘

3 rows in set. Elapsed: 0.004 sec. 

Also, I rely on ZooKeeper for fetching the list of queries and the entries under the active and finished paths. I'm pretty sure there might be a better way to accomplish this end result; the table could probably also include more columns, or perhaps the structure is not the one you would have visualized. Do let me know in your reviews.

@azat azat (Collaborator) left a comment


also the table can include more columns probably

How about adding the following columns?

  • query itself (from query-X znode)
  • query_duration_ms (mtime for the parent znode (query-X) and query-X/finished/{node})
  • exception_code (finished/{node} znode)

This way it can be used to track errors and whether a query was eventually executed (since right now, if the timeout is reached, you will not know the status of the query in the client).

Also, since znodes in ZooKeeper are periodically cleaned up, adding system.ddl_worker_queue_log looks useful (but can be done as a separate step, I guess). It should not be enabled by default, of course, since the content is the same for the whole cluster.

And by the way, it's worth mentioning this new table in system-tables.md.
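The duration and exception-code derivation suggested here could work roughly as sketched below. This is a hedged illustration with stubbed values, not the actual implementation; a ZooKeeper znode's stat carries an mtime in milliseconds since epoch, and the finished/{host} znode data is assumed here to hold the exception code as text.

```python
# Sketch of deriving query_duration_ms from znode mtimes and
# exception_code from the finished/{host} znode data. The concrete
# mtime values and payloads below are illustrative only.

def query_duration_ms(entry_mtime_ms, finished_mtime_ms):
    # mtime of the parent query-X znode marks entry creation;
    # mtime of query-X/finished/{host} marks when that host finished.
    return finished_mtime_ms - entry_mtime_ms

def exception_code(finished_node_data):
    # assume the finished/{host} znode stores the code as text; 0 is OK
    return int(finished_node_data)

duration = query_duration_ms(1609333671000, 1609333671006)
code = exception_code(b"0")
```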

@tavplubix (Member)

Maybe unfold arrays in table structure? I mean something like:

┌─entry────────────┬─host───────┬─status──────┬─some_column─┐
│ query-0000000042 │ node1:9000 │ active      │ ...         │
│ query-0000000042 │ node2:9000 │ finished    │ ...         │
│ query-0000000042 │ node3:9000 │ unknown     │ ...         │
│ ...              │ ...        │ ...         │ ...         │
└──────────────────┴────────────┴─────────────┴─────────────┘

And I suggest the following columns:

  • entry String (or entry_name) instead of name
  • cluster String - cluster name, we can extract it from query, but it's easier to use where cluster='smth' in select
  • host String (or maybe two separate columns with host and port) - name of the host which should execute the query. We can obtain it from the hosts list in the DDL entry (query-xxxxxxxxxx node data). However, the hosts list in the DDL entry contains IP addresses rather than hostnames, so I'm not sure it will be convenient to use. We could also try to resolve them back to hostnames (see DDLWorker::parseQueryAndResolveHost(...)), but that's too complicated. Anyway, DDLQueryStatusInputStream relies on this hosts list when waiting for the query to be executed on all hosts, so we should take it into account and show all involved hosts in system.ddl_worker_queue, even if some of them have not started executing the query yet (and have not created the corresponding active node yet).
  • status Enum('unknown' = 1, 'active' = 2, 'finished' = 3, 'error' = 4, ...) - status of query execution on each host; unknown is for hosts which have created neither an active nor a finished node in zk, error is for finished nodes with a non-zero exception code.
  • query, query_duration_ms and exception_code will be useful too
  • Maybe some additional columns with name of replica which executes query and tries_to_execute count (if query should be executed on leader, see DDLWorker::tryExecuteQueryOnLeaderReplica(...))

As for system.ddl_worker_queue_log: all such queries are now logged into system.query_log with a special comment, so it's possible to get query execution status with select ... from clusterAllReplicas('cluster', 'system.query_log') where query like '/* ddl_entry=query-xxxxxxxxxx */%'. But that's a slightly different thing, and I agree that system.ddl_worker_queue_log may also be useful in some cases.

Another consideration: DDLWorker is a name of class in code, so name ddl_worker_queue is clear for developers, but not for users. Maybe distributed_ddl_queue or on_cluster_ddl_queue is more suitable?

@azat, @alesapin, do you have any comments?
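The per-host status resolution proposed above could be sketched like this. It's a hypothetical Python illustration of the suggested Enum semantics, with made-up host names and exception codes; the actual implementation is C++.

```python
# Sketch of the suggested status logic: every host in the DDL entry's
# hosts list gets a row, with status derived from the active/finished
# znodes. finished_codes maps host -> exception code from its
# finished/{host} znode; active_hosts holds hosts with an active node.

def host_status(host, active_hosts, finished_codes):
    if host in finished_codes:
        return "error" if finished_codes[host] != 0 else "finished"
    if host in active_hosts:
        return "active"
    # the host has created neither an active nor a finished node yet
    return "unknown"

hosts = ["node1:9000", "node2:9000", "node3:9000", "node4:9000"]
active = {"node1:9000"}
finished = {"node2:9000": 0, "node3:9000": 999}  # host -> exception code

statuses = {h: host_status(h, active, finished) for h in hosts}
```

Note how node4 gets "unknown" even though it has created no znode at all, matching the requirement to show all involved hosts from the DDL entry.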


azat commented Dec 7, 2020

host String (or maybe two separate columns with host and port) - name of host, which should execute the query

system.clusters has host_address/host_name and port, so better to use the same naming.

Maybe unfold arrays in table structure? I mean something like:

Completely agree with everything and unfolding arrays in particular.


bharatnc commented Dec 7, 2020

Thank you for the review, @azat and @tavplubix. Let me take some time to read the reviews more closely and make these changes.


bharatnc commented Dec 15, 2020

@tavplubix and @azat

I have implemented the following as per your review:

  • entry String (or entry_name) instead of name
  • cluster String - cluster name, we can extract it from query, but it's easier to use where cluster='smth' in select
  • host String (or maybe two separate columns with host and port) - name of host, which should execute the query .....
  • status Enum('unknown' = 1, 'active' = 2, 'finished' = 3, 'error' = 4, ...) - status of query execution on each host, .....
  • query, query_duration_ms and exception_code will be useful too
  • Maybe some additional columns with name of replica which executes query and tries_to_execute count (if query should be executed on leader, see DDLWorker::tryExecuteQueryOnLeaderReplica(...))
  • replaced getChildren() with tryGetChildren()
  • used the table name system.distributed_ddl_queue - let me know if you want me to change the file names to match as well, but I think DDLWorkerQueue sounds clearer for developers, as pointed out.

Additional points:

  • Current query output looks like the following:
Row 1:
──────
entry:             query-0000000000
host_name:         clickhouse01
host_address:      172.23.0.11
port:              9000
status:            finished
cluster:           test_cluster
values:            version: 1
query: CREATE DATABASE test_db UUID '40ac7692-70d3-48a9-bc29-4ade18957f59' ON CLUSTER test_cluster
hosts: ['clickhouse01:9000','clickhouse02:9000','clickhouse03:9000','clickhouse04:9000']
initiator: clickhouse01:9000
query_start_time:  2020-12-15 10:06:35
query_finish_time: 2020-12-15 10:06:35
query_duration_ms: 7
exception_code:    ZOK

1 rows in set. Elapsed: 0.037 sec. 
  • I've written the checks in such a way that if any of the columns is unavailable, due to an exception for example, the columns are populated on a best-effort basis.

Concerns:

  • I'm worried about the complexity of the code I've written. The current flow looks like:
    • I'm now using WHERE cluster='cluster' in the query.
    • For each cluster, I'm querying all replicas.
    • For all the queries, I'm populating all the relevant fields using the replicas, and also iterating through the list of queries to determine the status of each query.

Let me know if the implementation looks better now (it's doing more now, though). Thank you!

@bharatnc bharatnc changed the title add system.ddl_worker_queue table add system.distributed_ddl_queue table Dec 16, 2020
@bharatnc

@tavplubix I pushed a second round of changes since your last review and have updated the PR description to contain the latest output. Let me know if it looks good to you. I hope I've addressed most of the concerns.

One note about exception handling: if a zk exception occurs anywhere, the exception code is updated and we leave the column empty instead of throwing the exception. This is so that we don't fail all results when one of the zk queries fails or times out.
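That best-effort behavior can be illustrated with a small sketch. ZkError, the fetchers, and the -4 code are stand-ins for illustration, not the real ZooKeeper API or the PR's C++ code.

```python
# Sketch of best-effort column population: each ZooKeeper read is
# wrapped so a failure records the exception code and leaves the
# column at its default instead of aborting the whole result set.

class ZkError(Exception):
    def __init__(self, code):
        self.code = code

def best_effort(fetch, default=""):
    """Return (value, exception_code); never raise."""
    try:
        return fetch(), 0  # 0 stands for ZOK here
    except ZkError as e:
        return default, e.code

def failing_fetch():
    raise ZkError(-4)  # stand-in for a connection-loss style code

value, code = best_effort(lambda: "query-0000000000")
missing, err = best_effort(failing_fetch)
```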


bharatnc commented Jan 7, 2021

@tavplubix @azat just following up - let me know if you have any other inputs. I have implemented the changes based on your latest reviews.

@alexey-milovidov

@tavplubix is on vacation until Jan 11.

@azat azat (Collaborator) left a comment

LGTM

Review comments on src/Storages/System/StorageSystemDDLWorkerQueue.cpp and src/Interpreters/DDLWorker.h (outdated, resolved).