Forward profile events to client #26177

Closed
alexey-milovidov opened this issue Jul 10, 2021 · 0 comments · Fixed by #28364


alexey-milovidov commented Jul 10, 2021

The server should send data to the client as collections of records with the fields
host_name, current_time, thread_id, type, name, value.
The type is either increment or gauge.
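
As a rough illustration of the record layout described above, here is a minimal Python sketch. The class names, the Unix-timestamp representation of current_time, and the metric name `MemoryUsage` are assumptions made for the example; the issue specifies only the six field names and the two record types.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    INCREMENT = "increment"  # delta of a ProfileEvents counter
    GAUGE = "gauge"          # current value, e.g. from MemoryTracker


@dataclass(frozen=True)
class ProfileEventRecord:
    host_name: str
    current_time: int  # assumed here to be a Unix timestamp in seconds
    thread_id: int
    type: EventType
    name: str
    value: int


# One "collection" of records, as it might arrive in a single packet.
# The metric names below are illustrative only.
packet = [
    ProfileEventRecord("host1", 1625900000, 42, EventType.INCREMENT, "SelectedRows", 1000),
    ProfileEventRecord("host1", 1625900000, 42, EventType.GAUGE, "MemoryUsage", 1048576),
]
```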

This data contains information from ProfileEvents (as increments) and from MemoryTracker (as gauges).
The data is sent at intervals, similarly to the Progress packets.
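
One possible way to turn cumulative counters into per-interval increments is to keep a snapshot of what was last sent and emit only the difference. The sketch below assumes counters are held in a plain dict; the function name `take_increments` is hypothetical and not part of ClickHouse. Zero deltas are skipped here, in line with the non-zero rule stated later in this issue.

```python
def take_increments(current, last_sent):
    """Return non-zero deltas since the previous send and update the snapshot.

    `current` maps counter name -> cumulative value.
    `last_sent` maps counter name -> value at the previous send (mutated in place).
    """
    deltas = {}
    for name, value in current.items():
        delta = value - last_sent.get(name, 0)
        if delta != 0:
            deltas[name] = delta
        last_sent[name] = value
    return deltas


# Two consecutive sends; the counter names are illustrative only.
snapshot = {}
first = take_increments({"SelectedRows": 100, "SelectedBytes": 0}, snapshot)
second = take_increments({"SelectedRows": 250, "SelectedBytes": 0}, snapshot)
```

Each call covers everything that happened since the previous send, so the sending interval can vary without losing counts.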

A new packet type will be introduced to send this data. It will be sent only to clients that support it (determined by checking the client version).
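
The version check could look roughly like the following sketch. The revision constant, the packet representation as dicts, and the function name are all hypothetical; the issue does not say which protocol revision would introduce the packet.

```python
# Hypothetical revision number, chosen only for illustration.
MIN_REVISION_WITH_PROFILE_EVENTS = 100


def packets_for_client(client_revision, packets):
    """Drop profile-event packets for clients that predate the new packet type."""
    if client_revision >= MIN_REVISION_WITH_PROFILE_EVENTS:
        return list(packets)
    return [p for p in packets if p["kind"] != "ProfileEvents"]


packets = [{"kind": "Progress"}, {"kind": "ProfileEvents"}, {"kind": "Data"}]
old_client = packets_for_client(99, packets)
new_client = packets_for_client(100, packets)
```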

In distributed query processing, every server sends data about itself and also forwards all data already received from other servers. The server does not sum increments; it only forwards them to the client (although accumulating them over the interval between consecutive sends is OK). It also does not sum data by host or by thread. It is the responsibility of the final client application (clickhouse-client or similar) to interpret and possibly sum the increments. Gauges and increments are sent only if they are non-zero.
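
The division of labor described above can be sketched as follows: the server concatenates its own non-zero records with the forwarded ones and sums nothing, while the client decides how to aggregate. Records are plain tuples in the field order given in this issue. The function names, the metric names, and the choice to keep the latest gauge per (host, thread, name) are assumptions for illustration, not a prescribed client behavior.

```python
from collections import defaultdict

# Record layout: (host_name, current_time, thread_id, type, name, value)


def build_outgoing(own_records, forwarded_records):
    """Server side: own non-zero records plus forwarded records, unchanged and unsummed."""
    own = [r for r in own_records if r[5] != 0]
    return own + list(forwarded_records)


def summarize(records):
    """Client side: sum increments per metric; keep the latest gauge per (host, thread, name)."""
    increments = defaultdict(int)
    latest_gauge = {}
    for host, ts, thread_id, kind, name, value in records:
        if kind == "increment":
            increments[name] += value
        else:
            key = (host, thread_id, name)
            if key not in latest_gauge or ts >= latest_gauge[key][0]:
                latest_gauge[key] = (ts, value)
    gauges = {key: value for key, (_, value) in latest_gauge.items()}
    return dict(increments), gauges


own = [
    ("shard1", 10, 7, "increment", "SelectedRows", 100),
    ("shard1", 10, 7, "increment", "SelectedBytes", 0),  # zero, so not sent
]
forwarded = [
    ("shard2", 10, 3, "increment", "SelectedRows", 50),
    ("shard2", 10, 3, "gauge", "MemoryUsage", 4096),
]
outgoing = build_outgoing(own, forwarded)
increments, gauges = summarize(outgoing)
```

Because the server keeps host_name and thread_id on every record, the client remains free to group by host, by thread, or not at all.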

Some metrics may not be per-thread; for those, thread_id = 0 is provided.

The data will be used for the following purposes:

  • display some metrics in clickhouse-client in real time (e.g. the total number of CPU cores that participated in query processing will look cool and help users of a cluster stay aware of the vast amount of computing resources);
  • extend quotas to allow arbitrary metrics to limit query complexity (imagine you want to limit the number of page faults in a 5-minute interval for some user: not a practical use case, but you get the idea);
  • a building block for Minimal implementation of resource pools (RFC) #8449.

Note that all of this data is already available in the query_log and query_thread_log.

Additional notes

The implementation can be somewhat similar to how logs are sent. The data can be serialized in the form of a Block.
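
To illustrate what serializing the records as a Block implies, the sketch below converts row-oriented records into a column-oriented layout, one list per field. A plain dict of lists stands in for a ClickHouse Block here; this is an assumption for illustration, not the actual Block API, and the function name is hypothetical.

```python
COLUMNS = ("host_name", "current_time", "thread_id", "type", "name", "value")


def records_to_block(records):
    """Transpose row tuples into a dict of equally long column lists."""
    block = {column: [] for column in COLUMNS}
    for record in records:
        for column, field in zip(COLUMNS, record):
            block[column].append(field)
    return block


# Illustrative records; the metric names are not taken from the issue.
records = [
    ("host1", 10, 7, "increment", "SelectedRows", 100),
    ("host1", 10, 0, "gauge", "MemoryUsage", 4096),
]
block = records_to_block(records)
```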

We can also introduce another system log that will contain these values, e.g. to draw a graph of the metrics of a selected query with fine-grained resolution.

A small refactoring of ThreadStatus/ThreadGroupStatus will be needed to obtain the values of ProfileEvents from every thread.
