Better logging for MSQ worker task by cryptoe · Pull Request #13790 · apache/druid

cryptoe · 2023-02-10T12:56:18Z

Adds more log statements to the worker implementation to make it easier to debug.

Key changed/added classes in this PR

WorkerImpl
WorkerStageKernel

This PR has:

been self-reviewed.
been tested in a test Druid cluster.

paul-rogers · 2023-02-12T21:10:47Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerImpl.java

+          if (log.isDebugEnabled()) {
+            log.debug("Processing work order: %s", context.jsonMapper().writeValueAsString(kernel.getWorkOrder()));
+          } else {
+            log.info("Processing work order for stage[%d]", kernel.getStageDefinition().getStageNumber());


I would recommend always emitting the general info-level message first and, if debug is enabled, emitting the work order as well. Otherwise, anyone searching for this message has to know which log level was set in order to know which message to search for.
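A minimal sketch of the ordering suggested above, not the code that was merged. To stay self-contained it collects the lines in a list instead of calling Druid's logger; the class name `WorkOrderLogSketch`, the method `messagesFor`, and the `INFO`/`DEBUG` prefixes are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class WorkOrderLogSketch {
    // Hypothetical helper: returns the lines that would be logged for one work order.
    public static List<String> messagesFor(int stageNumber, String workOrderJson, boolean debugEnabled) {
        List<String> lines = new ArrayList<>();
        // Always emit the stable info-level line, so the same text can be
        // searched for regardless of the configured log level.
        lines.add(String.format("INFO Processing work order for stage[%d]", stageNumber));
        // Emit the (potentially large) serialized work order only when debug is on.
        if (debugEnabled) {
            lines.add(String.format("DEBUG Processing work order: %s", workOrderJson));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : messagesFor(1, "{\"stageNumber\":1}", true)) {
            System.out.println(line);
        }
    }
}
```

With this shape the info line is identical at every log level, and the debug line is purely additive.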

paul-rogers · 2023-02-12T21:12:45Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerImpl.java

  @Override
  public void postWorkOrder(final WorkOrder workOrder)
  {
+    log.info("Got work order for stage[%d]", workOrder.getStageNumber());


Nit: better to use standard English spacing: a space between "stage" and the bracket: stage [%d].

The brackets are really only needed for items that can contain spaces, so we know what is the message and what is the value. But this is a number, so: stage %d.

cryptoe · 2023-02-14

In most places in MSQ we use this [] convention, and I would hate to change it only partially. If we decide to drop the brackets for numbers, we should do it in a separate PR that touches all MSQ code.
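To make the two conventions discussed above concrete, here is a small sketch using plain `String.format`. The class and method names are hypothetical; it only illustrates why brackets help delimit values that may contain spaces and add little for numbers.

```java
public class BracketConventionSketch {
    // Bracketed style, as used in the diff: label[value].
    public static String bracketed(String label, Object value) {
        return String.format("%s[%s]", label, value);
    }

    // Plain style, as suggested in the review for numeric values: label value.
    public static String plain(String label, Object value) {
        return String.format("%s %s", label, value);
    }

    public static void main(String[] args) {
        // For a number, both forms are unambiguous.
        System.out.println("Got work order for " + bracketed("stage", 3));
        System.out.println("Got work order for " + plain("stage", 3));
        // For a value containing spaces, brackets show where the value ends.
        System.out.println("Reading " + bracketed("table", "my table") + " now");
        System.out.println("Reading " + plain("table", "my table") + " now");
    }
}
```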

paul-rogers · 2023-02-12T21:14:02Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerImpl.java

  @Override
  public ClusterByStatisticsSnapshot fetchStatisticsSnapshot(StageId stageId)
  {
+    log.info("Fetching statistics for stage:[%d]", stageId.getStageNumber());


As above. Inconsistent use of colon. No colon is needed here: stage %d.

Does the logger automagically fill in the task ID? If not, would be good to include that so we can find events for a specific task.

cryptoe · 2023-02-14

Each worker can have only one task ID, and the task log is generally pulled from the Druid console.
If people are using popular log aggregation tools, they should add a taskId dimension to each log line via the constructs the tool provides; deriving the task ID from the log file name is one such approach.
Hence I am not adding the task ID everywhere, which would just bloat the messages.
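As an illustration of the file-name approach mentioned above, the sketch below derives a task ID from a log file path and tags a line with it. It assumes, purely for illustration, that each task log file is named after its task ID with a `.log` suffix; the real layout depends on the deployment and the log shipper. All names here are hypothetical.

```java
import java.nio.file.Paths;

public class TaskIdFromLogPathSketch {
    // Assumption: the log file is named "<taskId>.log".
    public static String taskIdFromLogPath(String logPath) {
        String fileName = Paths.get(logPath).getFileName().toString();
        return fileName.endsWith(".log")
            ? fileName.substring(0, fileName.length() - ".log".length())
            : fileName;
    }

    // Adds the derived task ID as a prefix, leaving the original message untouched.
    public static String tagLine(String logPath, String line) {
        return String.format("taskId[%s] %s", taskIdFromLogPath(logPath), line);
    }

    public static void main(String[] args) {
        // Hypothetical path and task ID.
        System.out.println(tagLine("/tmp/task-logs/example-task-1.log", "Fetching statistics for stage[0]"));
    }
}
```

In practice a log aggregation agent would usually do this tagging through its own configuration rather than in application code.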

paul-rogers · 2023-02-12T21:14:32Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerImpl.java

  public ClusterByStatisticsSnapshot fetchStatisticsSnapshotForTimeChunk(StageId stageId, long timeChunk)
  {
+    log.debug(
+        "Fetching statistics for stage:[%d] with timechunk:[%d]",


Edit as above: stage %d, time chunk %d.

paul-rogers · 2023-02-12T21:15:13Z

...re/multi-stage-query/src/main/java/org/apache/druid/msq/kernel/worker/WorkerStageKernel.java

  private void transitionTo(final WorkerStagePhase newPhase)
  {
    if (newPhase.canTransitionFrom(phase)) {
+      log.info(


Same comments as above.

cryptoe · 2023-02-21T14:00:57Z

@paul-rogers This PR is waiting on review/approval from you. Let me know if anything else needs to be addressed.

paul-rogers

LGTM

Adding more logs to MSQ worker.

9850e00

cryptoe changed the title ~~Better logging to MSQ worker task~~ Better logging for MSQ worker task Feb 10, 2023

cryptoe requested a review from rohangarg February 11, 2023 05:50

paul-rogers reviewed Feb 12, 2023

a2l007 added the Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 label Feb 14, 2023

cryptoe added 3 commits February 14, 2023 10:19

Review comments.

95f0c8d

Review comments.

ca85a79

Review comments.

e6deb97

paul-rogers approved these changes Feb 24, 2023

cryptoe merged commit 6bb5eff into apache:master Feb 25, 2023

clintropolis added this to the 26.0 milestone Apr 10, 2023

techdocsmith mentioned this pull request Apr 12, 2023

[DRAFT] 26.0.0 release notes #14064

Closed

cryptoe merged 4 commits into apache:master from cryptoe:master
