[refactor] IncrementalIndex extension preparations #12122

Closed


@liran-funaro liran-funaro commented Jan 5, 2022

Description

This PR refactors the code in preparation for new extensions for IncrementalIndex (i.e., #10001).
It encapsulates common IncrementalIndex functionality in IncrementalIndex.java as much as possible.
In addition, it avoids exposing the dimensions storage outside of IncrementalIndexRow. That is, callers fetch each dimension directly instead of fetching the entire dimension array.

The modifications to IncrementalIndex include:

  1. Changed the getMetric methods to accept IncrementalIndexRow instead of int rowOffset, because OakIncrementalIndex does not keep a mapping from row index to an actual row.
    This does not affect the performance of the other implementations, because wherever these methods are used, the caller already has an IncrementalIndexRow object.
  2. Moved CachingColumnSelectorFactory to IncrementalIndex.
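The signature change in item 1 can be illustrated with a minimal sketch. It is not the PR's code: `RowKey`, `MetricStore`, `ArrayMetricStore`, and `KeyedMetricStore` are hypothetical stand-ins for IncrementalIndexRow and the index implementations, and the metric accessor is reduced to a single long-valued method. The point is only that a row-based accessor serves both an implementation that keeps a row-index mapping and one that does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for IncrementalIndexRow; only the fields this sketch needs.
final class RowKey {
  final long timestamp;
  final int rowIndex; // only meaningful when the index keeps a row-index mapping

  RowKey(long timestamp, int rowIndex) {
    this.timestamp = timestamp;
    this.rowIndex = rowIndex;
  }
}

// Hypothetical accessor that takes the row itself rather than an int rowOffset.
abstract class MetricStore {
  abstract long getMetricLongValue(RowKey row, int aggIndex);
}

// On-heap style: the row still carries its index, so lookup cost is unchanged.
final class ArrayMetricStore extends MetricStore {
  private final long[][] aggs;

  ArrayMetricStore(long[][] aggs) {
    this.aggs = aggs;
  }

  @Override
  long getMetricLongValue(RowKey row, int aggIndex) {
    return aggs[row.rowIndex][aggIndex];
  }
}

// Key-addressed style (assumed here to resemble an index without row offsets):
// the metrics are found through the row key, never through an offset.
final class KeyedMetricStore extends MetricStore {
  private final Map<Long, long[]> byTimestamp = new HashMap<>();

  void put(RowKey row, long[] values) {
    byTimestamp.put(row.timestamp, values);
  }

  @Override
  long getMetricLongValue(RowKey row, int aggIndex) {
    return byTimestamp.get(row.timestamp)[aggIndex];
  }
}
```

Callers that already hold a row object pass it through unchanged, which is why the change is not expected to add cost for the existing implementations.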

The modifications to IncrementalIndexRow allow lazy evaluation of off-heap keys without adding overhead to the on-heap case. We added or modified the following methods:

  1. Changed public Object[] getDims() to public Object getDim(int index)
  2. Added public int getDimsLength()
  3. Added public boolean isDimNull(int index)
  4. Added public IndexedInts getIndexedDim(int index, @Nullable IndexedInts cachedIndexedInts)

The purpose of getIndexedDim() is to support a lazily evaluated IndexedInts view of a dimension, instead of the array of integers returned by getDim().
The modified implementation in StringDimensionIndexer uses this method and queries through IndexedInts instead of the array. This allows our implementation (OakIncrementalIndex) to use the lazy-evaluation approach, while the other implementations keep using the existing integer array, without performance degradation.
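A rough sketch of how the on-heap side of these accessors might look is shown below. It is an illustration under stated assumptions, not the PR's implementation: `SimpleIndexedInts` is a reduced stand-in for Druid's IndexedInts, and `ArrayIndexedInts` and `OnHeapRow` are hypothetical names. The `cached` argument lets the caller hand back a previously returned wrapper so that no new object is allocated per row.

```java
// Reduced stand-in for Druid's IndexedInts; the real interface has more methods.
interface SimpleIndexedInts {
  int size();

  int get(int index);
}

// Hypothetical reusable view over an int[]; the array is swapped, not copied.
final class ArrayIndexedInts implements SimpleIndexedInts {
  private int[] values = new int[0];

  void setValues(int[] values) {
    this.values = values;
  }

  @Override
  public int size() {
    return values.length;
  }

  @Override
  public int get(int index) {
    return values[index];
  }
}

// Hypothetical on-heap row: dimensions stay private and are read one at a time.
final class OnHeapRow {
  private final Object[] dims;

  OnHeapRow(Object[] dims) {
    this.dims = dims;
  }

  int getDimsLength() {
    return dims.length;
  }

  boolean isDimNull(int index) {
    return index >= dims.length || dims[index] == null;
  }

  Object getDim(int index) {
    return index < dims.length ? dims[index] : null;
  }

  // Returns an indexed view of a dimension that is stored as int[] (an assumption
  // of this sketch), reusing the caller's cached wrapper when one is supplied.
  SimpleIndexedInts getIndexedDim(int index, SimpleIndexedInts cached) {
    if (isDimNull(index)) {
      return null;
    }
    ArrayIndexedInts wrapper =
        cached instanceof ArrayIndexedInts ? (ArrayIndexedInts) cached : new ArrayIndexedInts();
    wrapper.setValues((int[]) dims[index]);
    return wrapper;
  }
}
```

An off-heap implementation could return a view from the same method that decodes values from a buffer only when `get` is called, which is the lazy path the description refers to.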


Key changed/added classes in this commit
  • Changed:
    • IncrementalIndex
    • IncrementalIndexRow
    • OnheapIncrementalIndex
    • StringDimensionIndexer

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.

…reparations

# Conflicts:
#	processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndex.java
#	processing/src/main/java/org/apache/druid/segment/incremental/OnheapIncrementalIndex.java

stale bot commented Apr 16, 2022

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.


github-actions bot commented Dec 8, 2023

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Dec 8, 2023