Add EIGHT_HOUR into possible list of Granularities. #12717

Merged
merged 5 commits into from
Jul 5, 2022

Conversation

didip
Contributor

@didip didip commented Jun 29, 2022

Description

We have a situation where our upstream mixed up PST and UTC. Luckily, the difference is always 8 hours (ignoring Daylight Saving Time). In a heterogeneous environment, you sometimes have no control over the upstream setup.

Because of this, one of our ingestions always spans 2 days: from today + 8 hours to tomorrow + 8 hours.

In this situation, we cannot use the following segmentGranularities: DAY or SIX_HOUR.

Without this patch, we are forced to use segmentGranularities: HOUR, which is suboptimal. EIGHT_HOUR is the next best thing we can have.
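The arithmetic behind this can be checked with a short sketch. It is not Druid code: it assumes fixed-size buckets anchored at the Unix epoch in UTC, and the names `bucket_start` and `whole_buckets` are made up for illustration. It counts how many buckets of each size exactly tile a 24-hour window that starts 8 hours after midnight.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Fixed-size buckets only; the names mirror the granularities discussed above.
BUCKETS = {
    "HOUR": timedelta(hours=1),
    "SIX_HOUR": timedelta(hours=6),
    "EIGHT_HOUR": timedelta(hours=8),
    "DAY": timedelta(days=1),
}


def bucket_start(ts, bucket):
    """Floor ts to the start of its bucket, counting buckets from the epoch."""
    return ts - (ts - EPOCH) % bucket


def whole_buckets(start, end, bucket):
    """Number of buckets that exactly tile [start, end), or None if misaligned."""
    if bucket_start(start, bucket) != start or bucket_start(end, bucket) != end:
        return None
    return (end - start) // bucket


# Example window: one day of data shifted by 8 hours (the date is arbitrary).
window_start = datetime(2022, 6, 29, 8, tzinfo=timezone.utc)
window_end = window_start + timedelta(days=1)

for name, bucket in BUCKETS.items():
    print(name, whole_buckets(window_start, window_end, bucket))
```

Under these assumptions, DAY and SIX_HOUR boundaries do not line up with the shifted window, HOUR needs 24 buckets, and EIGHT_HOUR needs 3.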


Key changed/added classes in this PR
  • Granularities
  • GranularityType

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@suneet-s
Contributor

Without this patch, we are forced to use segmentGranularities: HOUR, which is suboptimal. EIGHT_HOUR is the next best thing we can have.

I think you do not need this change to achieve what you are trying to do. Have you tried using period granularities? See https://druid.apache.org/docs/latest/querying/granularities.html#period-granularities

PT8H should be equivalent to what this PR does.

@didip
Contributor Author

didip commented Jun 30, 2022

@suneet-s But isn’t it only available during query time?

@kfaraz
Contributor

kfaraz commented Jul 1, 2022

@didip, you can use it for segment granularity during ingestion too.
Maybe it is not called out clearly in the docs.

screenie

screenie2
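For reference, a period-based segment granularity in an ingestion spec might look roughly like the fragment below, built here as a Python dict and printed as JSON. This is a sketch, not taken from the PR: the `queryGranularity` and `rollup` values are placeholders, and the `timeZone` entry is optional and shown only for clarity.

```python
import json

# Period-based granularity object, following the period granularity docs linked above.
eight_hour = {"type": "period", "period": "PT8H", "timeZone": "UTC"}

# Hypothetical granularitySpec fragment; queryGranularity and rollup are placeholder values.
granularity_spec = {
    "type": "uniform",
    "segmentGranularity": eight_hour,
    "queryGranularity": "hour",
    "rollup": True,
}

print(json.dumps(granularity_spec, indent=2))
```

The printed JSON would sit under the `granularitySpec` key of the ingestion spec's `dataSchema`.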

@kfaraz
Contributor

kfaraz commented Jul 1, 2022

Created an issue to improve these docs: #12726

@suneet-s
Contributor

suneet-s commented Jul 1, 2022

Thanks for providing screenshots @kfaraz!

I got to those docs about segmentGranularity by navigating from https://druid.apache.org/docs/latest/ingestion/compaction.html#compaction-granularity-spec so I think the compaction docs on their own are pretty informative.

@gianm
Contributor

gianm commented Jul 1, 2022

There is one reason we do need to add new explicit granularities. Currently, the allocation logic (for streaming ingest tasks, and batch tasks in append mode) only works properly with segments that use one of the predefined granularities. The reason is that it uses Granularity.granularitiesFinerThan, which references GranularityType.values(). If we can fix that, then this patch isn't necessary. But if we can't fix that, then this patch is still useful.
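To illustrate the limitation, here is a toy model in Python. It is not Druid's implementation: `PREDEFINED` stands in for the fixed enum and `finer_than` is a hypothetical helper. The point is only that a lookup which iterates a fixed list can never return a granularity that is missing from that list, such as an ad hoc 8-hour period.

```python
from datetime import timedelta

# Stand-in for a fixed enum of granularities (a deliberately incomplete toy subset).
PREDEFINED = {
    "HOUR": timedelta(hours=1),
    "SIX_HOUR": timedelta(hours=6),
    "DAY": timedelta(days=1),
}


def finer_than(limit, known):
    """Names of known granularities no coarser than `limit`, coarsest first."""
    matches = [(duration, name) for name, duration in known.items() if duration <= limit]
    return [name for duration, name in sorted(matches, reverse=True)]


# Without an explicit entry, an 8-hour bucket is never a candidate.
before = finer_than(timedelta(days=1), PREDEFINED)

# Adding the entry to the fixed list makes it visible to the lookup.
after = finer_than(timedelta(days=1), {**PREDEFINED, "EIGHT_HOUR": timedelta(hours=8)})

print(before)
print(after)
```

In this model, the 8-hour granularity only appears among the candidates once it is added to the predefined set, which mirrors the reason given for adding EIGHT_HOUR explicitly.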

Anyone got some time to look into the allocation thing?

Contributor

@gianm gianm left a comment


LGTM, after CI passes and if we don't end up adjusting allocation logic to make this unnecessary. (If we do adjust the logic, then as others have pointed out, this could be done by using a Period-based granularity for segmentGranularity.)

@didip
Contributor Author

didip commented Jul 2, 2022

@gianm looks like the Travis error is unrelated.

@gianm gianm merged commit 06251c5 into apache:master Jul 5, 2022
@abhishekagarwal87 abhishekagarwal87 added this to the 24.0.0 milestone Aug 26, 2022