
OBSDATA-8643 Adding a column in druid metric#316

Merged
Indrajeet Garse (igarse) merged 29 commits into 30.0.1-confluent from obs8643-late-arriving-data
Apr 23, 2025

Conversation


@igarse Indrajeet Garse (igarse) commented Apr 8, 2025

Description

Adds two columns, 'deviated_minutes' and 'deviated_seconds', to the metrics. They represent the time difference between emission by the Telemetry Kafka emitter and ingestion into Druid, in minutes and seconds respectively.
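The computation at the heart of this change can be sketched as a minimal standalone version; the class name, method signature, and constants here are illustrative assumptions, not the PR's exact code:

```java
// Illustrative sketch only; names and signature are assumptions, not the PR's code.
public class DeviationDemo {
    static final long NANOS_TO_MILLIS = 1_000_000L;
    static final long MILLIS_PER_SECOND = 1_000L;

    /**
     * Lag between the Kafka record's create time (epoch millis) and the
     * datapoint's own timestamp (epoch nanos), in whole seconds.
     */
    static long deviatedSeconds(long createdTimeMillis, long timeUnixNano) {
        return (createdTimeMillis - timeUnixNano / NANOS_TO_MILLIS) / MILLIS_PER_SECOND;
    }

    public static void main(String[] args) {
        long timeUnixNano = 1_700_000_000_000_000_000L; // datapoint emitted (epoch nanos)
        long createdTime = 1_700_000_090_000L;          // record created 90 s later (epoch millis)
        long secs = deviatedSeconds(createdTime, timeUnixNano);
        System.out.println(secs + " s, " + secs / 60 + " min"); // prints "90 s, 1 min"
    }
}
```

The minutes value is simply the seconds value integer-divided by 60, which is why the review below questions whether both columns are needed.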

Release note

For tips about how to write a good release note, see Release notes.


Key changed/added classes in this PR
  • OpenTelemetryMetricsProtobufReader
  • KafkaRecordEntity
  • KafkaEntity

This PR has:

  • been self-reviewed.
  • using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@pagrawal10 Parth Agrawal (pagrawal10) marked this pull request as ready for review April 11, 2025 03:35
@pagrawal10 Parth Agrawal (pagrawal10) requested a review from a team as a code owner April 11, 2025 03:35
event.put("delayed_minutes", delayMinutes);
}
catch (Exception e) {
log.warn(e, "Could not extract create_time from KafkaRecordEntity");
Member

This should be an error log.

long timeUnixNano = dataPoint.getTimeUnixNano();
try {
// Get the getRecord method reflectively
Method getRecordMethod = entity.getClass().getMethod("getRecord");
Member

We should avoid using Java reflection; it can cause problems at runtime.
Instead, typecast the entity to KafkaConsumerRecordEntity.

Something like:

if (entity instanceof KafkaConsumerRecordEntity) {
    KafkaConsumerRecordEntity ke = (KafkaConsumerRecordEntity) entity;
    ConsumerRecord<?, ?> record = ke.record();
    long timestamp = record.timestamp();
}

@igarse Indrajeet Garse (igarse) marked this pull request as draft April 21, 2025 11:20
COPY --chown=druid:druid --from=builder /opt /opt
COPY distribution/docker/druid.sh /druid.sh
COPY distribution/docker/peon.sh /peon.sh
COPY --chown=druid:druid distribution/docker/extra_jars/ /opt/druid/lib/
Member

Why is this removed?

Author

Checking locally, I will add it again.

catch (ClassCastException e) {
log.error(e, "Failed to cast source entity to TimestampedEntity.");
}
catch (NullPointerException e) {
Member

We should not catch NPEs.
Instead, check that the value is non-null before accessing it.


try {
long timeUnixNano = dataPoint.getTimeUnixNano();
long createdTime = ((TimestampedEntity) source.getEntity()).getRecordTimestampMillis();
Member

It's better to check the instance type here instead of catching the exception and throwing away the event:

Object entity = source.getEntity();

  if (!(entity instanceof TimestampedEntity)) {
    throw new ClassCastException("Entity is not TimestampedEntity");
  }

Let me know your thoughts?
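The instanceof guard can be written out with local stand-in types (these are not the real Druid or Kafka classes); as an illustrative variant, this version returns null instead of throwing:

```java
// Stand-in types for illustration; not the real Druid/Kafka classes.
public class EntityGuardDemo {
    interface TimestampedEntity {
        long getRecordTimestampMillis();
    }

    static class KafkaLikeEntity implements TimestampedEntity {
        public long getRecordTimestampMillis() {
            return 1_000L;
        }
    }

    /** Returns the record timestamp, or null when the entity does not carry one. */
    static Long timestampOf(Object entity) {
        if (!(entity instanceof TimestampedEntity)) {
            return null; // no cast attempted, so no ClassCastException to swallow
        }
        return ((TimestampedEntity) entity).getRecordTimestampMillis();
    }

    public static void main(String[] args) {
        System.out.println(timestampOf(new KafkaLikeEntity())); // prints 1000
        System.out.println(timestampOf(new Object()));          // prints null
    }
}
```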

long deviated_seconds = (createdTime - (timeUnixNano / NANOS_TO_MILLIS)) / MILLIS_PER_SECOND;
long deviated_minutes = deviated_seconds / 60;
event.put("deviated_seconds", deviated_seconds);
event.put("deviated_minutes", deviated_minutes);
Member

Why another derived column for minutes?

Author

Earlier we added only 'minutes', but later it was decided to also add 'seconds', for more clarity and easier bucketing.

Member

Bucketing will anyway be done using a transformSpec during ingestion into Druid.
IMO, deviated_seconds alone would have been sufficient.
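For reference, the bucketing the reviewer mentions could be expressed with a Druid expression transform roughly like the fragment below (illustrative only; the column names assume the ones added in this PR):

```json
{
  "transformSpec": {
    "transforms": [
      {
        "type": "expression",
        "name": "deviated_minutes",
        "expression": "floor(deviated_seconds / 60)"
      }
    ]
  }
}
```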

Comment on lines +24 to +25
* This provides a way for extensions to safely access record timestamp information
* across ClassLoader boundaries without causing ClassCastExceptions.
Member

Let's not mention this in JavaDocs.

Suraj Goel (suraj-goel) and others added 6 commits April 23, 2025 14:07
* run "services" tests separately in semaphore

* Avoid local artifacts for running tests of "services"

* update command for "services" tests

* Run "services" tests sequentially

* pass params in MAVEN_OPTS

* fix MAVEN_OPTS in "Services"
…cient (#309)

* Add maxInterval to kill config and make kill tasks efficient

* resolve CI
@igarse Indrajeet Garse (igarse) merged commit 106cbd7 into 30.0.1-confluent Apr 23, 2025
2 checks passed
@igarse Indrajeet Garse (igarse) deleted the obs8643-late-arriving-data branch April 23, 2025 10:30
@igarse Indrajeet Garse (igarse) changed the title [WIP] OBSDATA-8643 Adding a column in druid metric OBSDATA-8643 Adding a column in druid metric Apr 23, 2025
