Add logs and metrics for append segments upgraded by a concurrent replace task #16563
base: master
Thanks @kfaraz. These metrics will be very helpful in monitoring concurrent append and replace.
I have a minor comment about changing realtime to pending in the metric name and description.
Also, do you think we should add a metric in SegmentTransactionalAppendAction whenever an upgrade segments entry is created because a segment is appended within an interval locked by a REPLACE lock?
docs/operations/metrics.md
@@ -280,6 +280,8 @@ If the JVM does not support CPU time measurement for the current thread, `ingest
|`segment/added/bytes`|Size in bytes of new segments created.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/moved/bytes`|Size in bytes of segments moved/archived via the Move Task.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/nuked/bytes`|Size in bytes of segments deleted via the Kill Task.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/upgraded/count`|Number of published segments upgraded by a replace task.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/upgradedRealtime/count`|Number of realtime segments upgraded by a replace task.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
Please rename this to pendingSegment/upgraded/count or something similar.
Upgraded pending segments need not correspond only to realtime tasks.
@@ -50,16 +50,22 @@ public class SegmentPublishResult
  @Nullable
  private final String errorMsg;
  @Nullable
  private final List<DataSegment> upgradedAppendSegments;
Just upgradedSegments seems sufficient.
);
versionToNumUpgradedSegments.forEach(
    (version, numUpgradedSegments) -> log.info(
        "Task[%s] of datasource[%s] upgraded [%d] append segments to replace version[%s].",
... upgraded [%d] segments ...
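The per-version log line discussed here can be sketched with the JDK alone. This is only an illustration, not the code in the PR: the `Segment` class, the `countByVersion` and `buildLogLines` helpers, and the sample task and datasource names are all hypothetical, and `String.format` stands in for the `log.info` call in the diff. The message uses the shorter wording suggested above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class UpgradeLogSketch
{
  /** Hypothetical stand-in for a segment; only the version matters here. */
  static final class Segment
  {
    final String id;
    final String version;

    Segment(String id, String version)
    {
      this.id = id;
      this.version = version;
    }
  }

  /** Counts upgraded segments per replace version (TreeMap keeps versions sorted). */
  static Map<String, Integer> countByVersion(List<Segment> upgradedSegments)
  {
    final Map<String, Integer> versionToNumUpgradedSegments = new TreeMap<>();
    for (Segment segment : upgradedSegments) {
      versionToNumUpgradedSegments.merge(segment.version, 1, Integer::sum);
    }
    return versionToNumUpgradedSegments;
  }

  /** Builds one log line per replace version, in sorted version order. */
  static List<String> buildLogLines(String taskId, String dataSource, List<Segment> upgradedSegments)
  {
    final List<String> lines = new ArrayList<>();
    countByVersion(upgradedSegments).forEach(
        (version, numUpgradedSegments) -> lines.add(
            String.format(
                "Task[%s] of datasource[%s] upgraded [%d] segments to replace version[%s].",
                taskId,
                dataSource,
                numUpgradedSegments,
                version
            )
        )
    );
    return lines;
  }

  public static void main(String[] args)
  {
    // Sample data only; ids and versions are made up for the sketch.
    final List<Segment> upgraded = List.of(
        new Segment("seg1", "v2"),
        new Segment("seg2", "v2"),
        new Segment("seg3", "v3")
    );
    buildLogLines("task_1", "wiki", upgraded).forEach(System.out::println);
  }
}
```

Logging one line per version rather than one per segment keeps the output bounded when a replace task upgrades many segments.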
… into add_segment_upgrade_metrics
...src/main/java/org/apache/druid/indexing/common/actions/SegmentTransactionalAppendAction.java
final List<List<DataSegment>> segmentsLists = Lists.partition(
    new ArrayList<>(segments),
    SEGMENT_INSERT_BATCH_SIZE
);
I think this fits within the line limit.
final List<List<DataSegment>> segmentsLists = Lists.partition(
    new ArrayList<>(segments),
    SEGMENT_INSERT_BATCH_SIZE
);
final List<List<DataSegment>> segmentsLists = Lists.partition(new ArrayList<>(segments), SEGMENT_INSERT_BATCH_SIZE);
The soft limit for better readability is actually 80 chars.
120 chars is the hard limit which we should not exceed.
Obviously, all of this is subjective and depends on personal coding style.
But generally, I too avoid breaking lines in method/constructor calls if
- they can be fit "comfortably" in a single line
- they are not difficult to follow
The end goal is to have easily readable code.
So I would prefer
performAction(arg1, arg2, arg3, arg4, arg5);
over
performAction(
    arg1,
    arg2,
    arg3,
    arg4,
    arg5
);
but NOT
performAction(computeFirstArg(), object.getSecondArg(), arg3, arg4, thirdArgIfNonNull());
over
performAction(
    computeFirstArg(),
    object.getSecondArg(),
    arg3,
    arg4,
    thirdArgIfNonNull()
);
For the case at hand, we have two options:
Option 1:
final List<List<DataSegment>> segmentsLists = Lists.partition(
    new ArrayList<>(segments),
    SEGMENT_INSERT_BATCH_SIZE
);
Option 2:
final List<List<DataSegment>> segmentsLists = Lists.partition(new ArrayList<>(segments), SEGMENT_INSERT_BATCH_SIZE);
I feel Option 1 here reads better for 2 reasons:
- Option 2 is definitely too long; it's right at the 120 char limit.
- Option 2 is a little more complex cognitively; it has too much happening in a single line.
Hope you find that helpful (and not too verbose!). 🙂
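For readers unfamiliar with the call being formatted: Guava's `Lists.partition` splits a list into consecutive sublists of a fixed size, with a possibly smaller final sublist. The sketch below approximates that behavior with the JDK only. The class name, the `partition` helper, and the batch size of 3 are assumptions for illustration; the actual value of `SEGMENT_INSERT_BATCH_SIZE` is not shown in the diff. Unlike Guava, which returns views over the original list, this sketch copies each batch.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchPartitionSketch
{
  // Hypothetical batch size; the real SEGMENT_INSERT_BATCH_SIZE is not shown in the diff.
  static final int BATCH_SIZE = 3;

  /** Splits items into consecutive batches of at most batchSize elements. */
  static <T> List<List<T>> partition(List<T> items, int batchSize)
  {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    final List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < items.size(); start += batchSize) {
      final int end = Math.min(start + batchSize, items.size());
      // Copy the slice so each batch is independent of the source list.
      batches.add(new ArrayList<>(items.subList(start, end)));
    }
    return batches;
  }

  public static void main(String[] args)
  {
    final List<String> segments = List.of("s1", "s2", "s3", "s4", "s5", "s6", "s7");
    final List<List<String>> segmentsLists = partition(
        new ArrayList<>(segments),
        BATCH_SIZE
    );
    segmentsLists.forEach(System.out::println);
  }
}
```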
Co-authored-by: Vishesh Garg <vishesh.garg@imply.io>
Thanks for the suggestions, @AmatyaAvadhanula! Let me try to rename the added metrics and add another one for the transactional append action.
This pull request has been marked as stale due to 60 days of inactivity.
Changes
segment/upgraded/count and segment/upgradedRealtime/count emitted with dimensions taskId, dataSource, taskType, groupId, interval, etc.
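As a rough picture of what such a metric looks like, the following self-contained sketch builds a count metric carrying the dimensions listed above. It does not use Druid's emitter classes: `MetricEvent`, `upgradedCount`, and the sample dimension values are hypothetical names chosen for the example, and the real change would go through the project's own metric-emission code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class UpgradeMetricSketch
{
  /** Hypothetical, simplified metric event: a name, a value and string dimensions. */
  static final class MetricEvent
  {
    final String metric;
    final long value;
    final Map<String, String> dimensions;

    MetricEvent(String metric, long value, Map<String, String> dimensions)
    {
      this.metric = metric;
      this.value = value;
      this.dimensions = Collections.unmodifiableMap(new LinkedHashMap<>(dimensions));
    }
  }

  /** Builds a segment/upgraded/count event with the dimensions named in the PR description. */
  static MetricEvent upgradedCount(
      long count,
      String dataSource,
      String taskId,
      String taskType,
      String groupId,
      String interval
  )
  {
    final Map<String, String> dimensions = new LinkedHashMap<>();
    dimensions.put("dataSource", dataSource);
    dimensions.put("taskId", taskId);
    dimensions.put("taskType", taskType);
    dimensions.put("groupId", groupId);
    dimensions.put("interval", interval);
    return new MetricEvent("segment/upgraded/count", count, dimensions);
  }

  public static void main(String[] args)
  {
    // All values below are made-up sample data.
    final MetricEvent event = upgradedCount(
        5,
        "wiki",
        "task_1",
        "compact",
        "group_1",
        "2024-01-01/2024-01-02"
    );
    System.out.println(event.metric + " = " + event.value + " " + event.dimensions);
  }
}
```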