-
Notifications
You must be signed in to change notification settings - Fork 6.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance ClickHouse Profile: generate a uniq id for steps and processors #63518
base: master
Are you sure you want to change the base?
Conversation
74135e0
to
74e07ca
Compare
This is an automated comment for commit ee5d22c with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
74e07ca
to
7c9afe1
Compare
Not sure this idea is work for ClickHouse, if worked, maybe I will add more test case for this. |
7c9afe1
to
09dada1
Compare
for explain plan and pipeline I don't think there is a lot of confusion, since they already contain formatting that displays hierarchy. also we have a lot of tests that check plan or pipeline specifically. they all will break. |
In fact, I believe that the explain plan plays a crucial role in this PR. When we use processor_profile_log or open telemetry_span_log to identify a specific process or step with slow execution, how can we determine the corresponding details for this steps? Therefore, I think it's essential for us. I have observed that it break some stateless test cases. I believe it's worthwhile to modify the test case content. It's not hard to change. |
I actually think, that it's make more sense to add them (ids) in json output for those statements, which is more suitable format for consumption by program, and introduction of new field simpler here. |
I also add a field in explain json. I will fix it later. |
09dada1
to
8be1aec
Compare
Summary this feature:
|
@@ -1336,7 +1336,14 @@ class Context: public ContextData, public std::enable_shared_from_this<Context> | |||
std::shared_ptr<Clusters> getClustersImpl(std::lock_guard<std::mutex> & lock) const; | |||
|
|||
/// Throttling | |||
|
|||
size_t step_count = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not put it inside Context
. e.g. it could be a static data member of IQueryPlanStep
. the same for IProcessor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put in context can make explain result more stable, every time we explain same query will get same id.
If we put in static data, we can not get stable result, so I put it in context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we better have unstable results than polluting context with this random counters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For stateless case 01786_explain_merge_tree
, if not stable, the result may not have fixed result. So How to fix this case? just disable json output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just disable json output
I guess it makes no difference to the test logic what output format we use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just worries background thread call Plan. I will remove json output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we better have unstable results than polluting context with this random counters
Recently I think stable result is a important feature, If we worry about polluting context, how about put a point of int in CurrentThread::ThreadStatus. This should not polluting context and get a better stable index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recently I think stable result is a important feature
what value exactly do you see in it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's easy to Debugging.
When I found some processors slow, we can use explain query to identify its' steps. if we use static value , It hard to do it if query is complex.
If you think it is not a important, I will change it to static value.
8be1aec
to
ae16ea5
Compare
ae16ea5
to
ee5d22c
Compare
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
uniform step and pipeline
Clickhouse current profiling are sometimes confuse for me:
explain plan
we got step nameexplain pipeline
we got processor namesystem.processor_profile_log
/system.opentelemetry_span_log, we got an pointer addressWhen we need analyze a complex query with dup name, it's hard to identify those things.
I think we need generate an uniq ID for every processor and step which should meaningful and can not change for every query. I use
${NAME}_${INDEX}
patten as uniq ID style.${NAME}
means step/processor name,${INDEX}
is generated by generated time.After this PR, for a query
select * from t1 as t join t1 as t2 on t.a=t2.a where t.a=1
Documentation entry for user-facing changes
Modify your CI run
NOTE: If your merge the PR with modified CI you MUST KNOW what you are doing
NOTE: Checked options will be applied if set before CI RunConfig/PrepareRunConfig step
Include tests (required builds will be added automatically):
Exclude tests:
Extra options:
Only specified batches in multi-batch jobs: