-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add timestamp field to SegmentMetadataEvent #16613
base: master
Are you sure you want to change the base?
Add timestamp field to SegmentMetadataEvent #16613
Conversation
d5b1e4b
to
9dd0457
Compare
9dd0457
to
07e23b3
Compare
@@ -85,7 +86,7 @@ public SegmentMetadataEvent( | |||
) | |||
{ | |||
this.dataSource = dataSource; | |||
this.createdTime = createdTime; | |||
this.createdTime = createdTime != null ? createdTime : DateTimes.nowUtc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I don't think this field is ever passed as null
in the current usages of this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It is never passed as null right now. The current and only caller of the SegmentMetadataEvent.create pass DateTimes.nowUtc() but I want to be extra safe. This is in case some new user comes along and call SegmentMetadataEvent.create without setting the createdTime (either though using the SegmentMetadataEvent.create without passing eventTime or calling the SegmentMetadataEvent constructor without setting createdTime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it makes sense to ensure that the timestamp
is always non-null.
But if it is important for the field to be non-null, then we should be more explicit about the value we use for it. We should probably avoid silently converting a null to "now" (though there are other places in the Druid code that do it 😅 ).
So I would advise to do the following:
- Retain the public constructor but do a validation of this field thus forcing callers to pass a non-null value.
- Remove arg
eventTime
from method.create()
and always useDateTime.nowUtc()
. - Add a javadoc to both the constructor and
create()
clarifying this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am doing something similar to what ServiceMetricEvent
does, which most other metrics use. This is what @gianm suggested at https://apachedruidworkspace.slack.com/archives/C0309C9L90D/p1715638010942329?thread_ts=1715634244.240639&cid=C0309C9L90D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, this comment was not a blocker anyway. We can proceed with this merge and cleanup usage of both ServiceMetricEvent
and SegmentMetadataEvent
later.
@@ -104,6 +105,7 @@ public EventMap toMap() | |||
|
|||
return EventMap.builder() | |||
.put(FEED, getFeed()) | |||
.put("timestamp", createdTime.toString()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to set the timestamp
field, but if we are emitting this with the SegmentMetadataEvent
, we can probably stop emitting the createdTime
since it contains the same value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The createdTime may already be used by whoever/whatever is consuming this metric. I want to make sure we do not create regression and not remove/modified any existing fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I assumed that you retained it for backwards compat.
But I think it is fine to make a breaking change in this case as I don't think there would be a lot of downstream applications relying on this particular event. At least I have not come across any major use cases. Moreover, as you point out in the description, this event was already behaving differently than the other emitted events. So it's okay to fix that up.
Although, if you feel differently and have come across applications that depend on the createdTime
field, we can retain it while marking it as deprecated so that we can get rid of it eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not come across applications that depend on the createdTime field but I am also not the person who added (or even use) this metric haha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. In that case, I would advise we get rid of the extra field since I couldn't find in the docs either. So I don't think we are required to have backward compatibility in this case.
But if you want to go ahead with this PR, we can do that later too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pending comments are non-blockers.
@maytasm , there are some test failures. Please take a look. |
Add timestamp field to SegmentMetadataEvent
Description
Every other metrics emitted by Druid has a timestamp field. The fact that the SegmentMetadataEvent metric does not have a timestamp field can cause problem for the metric system like Druid. If you are ingesting Druid's emitted to Druid, then timestamp is very important. If you are running Druid version earlier than 27 (which doesn't have #14413), then ingestion will fail if there is no timestamp. While if you are running Druid version later than 27, you may be dropping these no timestamp metrics. (Note that the above is assuming you do not have missingValue set for your Druid's supervisor that is ingesting Druid emitted metrics).
Anyway, let's just add a timestamp so that all metrics are consistent.
This PR has: