Emit metrics for S3UploadThreadPool #16616
This is not the right method of emitting metrics. We shouldn't need to write a wrapper around ExecutorService for every new set of metrics that needs to be emitted.
There are 2 ways of emitting metrics in Druid:
- Directly from the relevant code
- Using a Monitor, e.g. TaskCountStatsMonitor. Monitors are typically useful when we want the metrics to be emitted optionally and not always.
Changes required in this PR
- Remove the wrapper WaitTimeMonitoringExecutorService.
- Just emit metrics directly from the S3UploadManager. If you face difficulties in emitting some metrics here, we can discuss that.
- Use constant Strings for the metric names so that they are easy to identify.
- Add documentation for the new metrics in metrics.md.
- Add a heading "Release Note" in the PR description which describes the new metrics.
I needed a way to get the time spent by a task in the queue before execution. Can you suggest how to achieve that without adding a wrapper?
Yes, I suspected this was the motivation 🙂.
Do the same thing, just not in the wrapper, i.e. in [...]. To determine queue size, you can keep an [...]. Hope that helps!
Makes sense, let me try it out! Thanks!
Why not use [...]?
To do this, you need to update the interface to return a [...]. Compared to this, maintaining the queue count in an integer is an easy workaround and not unclean either.
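For readers following the thread, here is a minimal sketch of the approach described above: record the submission time when a task is handed to the pool, compute the queued time when it starts running, and keep the queue count in an integer. This is not the code from the PR. The class name QueueTimingSketch, the method names, and the ObjLongConsumer used as a metric sink are all hypothetical stand-ins (the sink takes the place of Druid's emitter); only the metric names come from the release note.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ObjLongConsumer;

// Hypothetical class for illustration; not the PR's S3UploadManager.
class QueueTimingSketch
{
  // Metric names as listed in the release note, kept as constants.
  static final String METRIC_QUEUED_TIME = "s3/upload/part/queuedTime";
  static final String METRIC_QUEUE_SIZE = "s3/upload/part/queueSize";
  static final String METRIC_PART_TIME = "s3/upload/part/time";

  private final ExecutorService executor;
  // Assumption: a simple (metricName, value) sink standing in for a real emitter.
  private final ObjLongConsumer<String> metricSink;
  // Number of tasks submitted but not yet started.
  private final AtomicInteger queuedTasks = new AtomicInteger(0);

  QueueTimingSketch(ExecutorService executor, ObjLongConsumer<String> metricSink)
  {
    this.executor = executor;
    this.metricSink = metricSink;
  }

  Future<?> submitUpload(Runnable uploadPart)
  {
    final long submittedAtNanos = System.nanoTime();
    metricSink.accept(METRIC_QUEUE_SIZE, queuedTasks.incrementAndGet());
    try {
      return executor.submit(() -> {
        // The task has left the queue: update the count and report the wait.
        final long startedAtNanos = System.nanoTime();
        queuedTasks.decrementAndGet();
        metricSink.accept(
            METRIC_QUEUED_TIME,
            TimeUnit.NANOSECONDS.toMillis(startedAtNanos - submittedAtNanos)
        );
        try {
          uploadPart.run();
        }
        finally {
          // Report the run time even if the upload throws.
          metricSink.accept(
              METRIC_PART_TIME,
              TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startedAtNanos)
          );
        }
      });
    }
    catch (RejectedExecutionException e) {
      // The task never entered the queue, so undo the increment.
      queuedTasks.decrementAndGet();
      throw e;
    }
  }

  int getQueuedTaskCount()
  {
    return queuedTasks.get();
  }
}
```

The timing logic lives at the call site that submits the task, so no ExecutorService wrapper is needed, and the counter avoids changing any interface just to expose the pool's internal queue.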
@kfaraz Fair enough, making the change! Thanks for the quick guidance on this!
@kfaraz I have made the suggested change and updated the PR description accordingly. Can you please take another look? Thanks!
Final suggestions, rest looks good.
@Akshat-Jain, some ITs have failed. The failures seem genuine but unrelated. @findingrish, the failures are for centralized schema ITs. Have you seen these before?
@kfaraz Seeing the CDS test group failures for the first time. I see these failures on this PR https://github.com/apache/druid/actions/runs/9567431225/job/26382234870?pr=16619 as well. I don't see them failing for any of the merged PRs. I will look into the failures.
The task failures are unrelated. Going ahead with merge.
Description
This pull request introduces the functionality to emit metrics for the S3UploadThreadPool. This aims to provide better visibility into the behavior of the thread pool and the tasks submitted to it.
Test plan
I verified that the following metrics are being emitted:
Release note
5 new metrics have been added to provide better visibility into the behavior of the thread pool used for uploading parts (of a multi-part upload) to S3 when durable storage is enabled.
- s3/upload/part/queuedTime: Milliseconds spent by a task in queue before it starts uploading a part (of a multi-part upload) to S3 when durable storage is enabled.
- s3/upload/part/queueSize: The number of tasks that are currently queued and waiting to upload a part (of a multi-part upload) to S3 when durable storage is enabled.
- s3/upload/part/time: Milliseconds taken by a task to upload a part (of a multi-part upload) to S3 when durable storage is enabled.
- s3/upload/total/time: The total time taken in milliseconds for uploading all parts of a file to S3 when durable storage is enabled.
- s3/upload/total/bytes: The total number of bytes uploaded across all parts of a file to S3 when durable storage is enabled.
This PR has: