Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logs and metrics for append segments upgraded by a concurrent replace task #16563

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

kfaraz
Copy link
Contributor

@kfaraz kfaraz commented Jun 6, 2024

Changes

  • Add metrics segment/upgraded/count and segment/upgradedRealtime/count
    emitted with dimensions taskId, dataSource, taskType, groupId, interval, etc.
  • Add new logs and cleanup existing ones
  • Minor method renames

Copy link
Contributor

@AmatyaAvadhanula AmatyaAvadhanula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kfaraz. These metrics will be very helpful in monitoring concurrent append and replace.
I have a minor comment about changing realtime to pending in the metric name and description.
Also, do you think that we should also add a metric in SegmentTransactionalAppendAction whenever an upgrade segments entry is created because of a segment append within an interval locked by a REPLACE lock?

@@ -280,6 +280,8 @@ If the JVM does not support CPU time measurement for the current thread, `ingest
|`segment/added/bytes`|Size in bytes of new segments created.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/moved/bytes`|Size in bytes of segments moved/archived via the Move Task.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/nuked/bytes`|Size in bytes of segments deleted via the Kill Task.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
| `segment/upgraded/count`| Number of published segments upgraded by a replace task.|`dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
| `segment/upgradedRealtime/count`| Number of realtime segments upgraded by a replace task.|`dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this to pendingSegment/upgraded/count or something similar.
Upgraded pending segments need not correspond only to realtime tasks.

@@ -50,16 +50,22 @@ public class SegmentPublishResult
@Nullable
private final String errorMsg;
@Nullable
private final List<DataSegment> upgradedAppendSegments;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just upgradedSegments seems sufficient.

);
versionToNumUpgradedSegments.forEach(
(version, numUpgradedSegments) -> log.info(
"Task[%s] of datasource[%s] upgraded [%d] append segments to replace version[%s].",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... upgraded [%d] segments ...

docs/operations/metrics.md Outdated Show resolved Hide resolved
docs/operations/metrics.md Outdated Show resolved Hide resolved
Comment on lines +2452 to +2455
final List<List<DataSegment>> segmentsLists = Lists.partition(
new ArrayList<>(segments),
SEGMENT_INSERT_BATCH_SIZE
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it is within the line limit.

Suggested change
final List<List<DataSegment>> segmentsLists = Lists.partition(
new ArrayList<>(segments),
SEGMENT_INSERT_BATCH_SIZE
);
final List<List<DataSegment>> segmentsLists = Lists.partition(new ArrayList<>(segments), SEGMENT_INSERT_BATCH_SIZE);

Copy link
Contributor Author

@kfaraz kfaraz Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The soft limit for better readability is actually 80 chars.
120 chars is the hard limit which we should not exceed.


Obviously, all of this is subjective and depends on personal coding style.
But generally, I too avoid breaking lines in method/constructor calls if

  • they can be fit "comfortably" in a single line
  • they are not difficult to follow

The end goal is to have easily readable code.

So I would prefer

performAction(arg1, arg2, arg3, arg4, arg5);

over

performAction(
    arg1,
    arg2,
    arg3,
    arg4,
    arg5
);    

but NOT

performAction(computeFirstArg(), object.getSecondArg(), arg3, arg4, thirdArgIfNonNull());

over

performAction(
    computeFirstArg(),
    object.getSecondArg(),
    arg3,
    arg4,
    thirdArgIfNonNull()
);

For the case at hand, we have two options:

Option 1:

final List<List<DataSegment>> segmentsLists = Lists.partition(
        new ArrayList<>(segments),
        SEGMENT_INSERT_BATCH_SIZE
);

Option 2:

final List<List<DataSegment>> segmentsLists = Lists.partition(new ArrayList<>(segments), SEGMENT_INSERT_BATCH_SIZE);

I feel Option 1 here reads better for 2 reasons:

  • Option 2 is definitely too long, it's right at the 120 char limit.
  • Option 2 is a little more complex cognitively, it has too much stuff happening in a single line.

Hope you find that helpful (and not too verbose!). 🙂

Co-authored-by: Vishesh Garg <vishesh.garg@imply.io>
@kfaraz
Copy link
Contributor Author

kfaraz commented Jun 10, 2024

Thanks for the suggestions, @AmatyaAvadhanula ! Let me try to rename the added metrics and add another one for the transactional append action.

Copy link

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Sep 12, 2024
@kfaraz kfaraz removed the stale label Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants