rework cursor creation #16533

Open
wants to merge 9 commits into
base: master

clintropolis
Member

@clintropolis clintropolis commented Jun 1, 2024

Description

This PR reworks Cursor and VectorCursor creation from CursorFactory (StorageAdapter) to be more efficient and to push additional details about the query into the process through a newly introduced CursorBuildSpec. A new CursorMaker interface has been added; it is constructed by a new method, CursorFactory.asCursorMaker, which accepts the CursorBuildSpec and replaces the arguments of CursorFactory.makeCursors and CursorFactory.makeVectorCursor.

The primary goal here is to make it easy to push these details down to cursor creation for an upcoming feature under development, projections, which are effectively a type of materialized view that lives within segments and is created at ingestion time. More details will come once I finish writing the proposal, but basically we need to know things like which grouping columns, filters, aggregations, etc. are involved in a query in order to automatically select the appropriate projection available within the segment, or fall back to the regular segment rows if no matching projection is available.

In addition to the primary goal, this refactor also allows cursor creation to be more efficient. Since CursorMaker is created from the CursorBuildSpec, the canVectorize, makeCursors, and makeVectorCursor methods can all share and re-use the same resources.
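A rough sketch of the resource sharing described above, not the code from this PR: `SketchSpec`, `SketchResources`, and `SketchCursorMaker` are hypothetical stand-ins for `CursorBuildSpec`, whatever per-maker state the real implementation holds, and `CursorMaker`. Strings stand in for real cursors, and the rule that descending order disables vectorization is assumed here purely for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for CursorBuildSpec: only the fields this sketch needs.
final class SketchSpec
{
  final String filter;
  final boolean descending;

  SketchSpec(String filter, boolean descending)
  {
    this.filter = filter;
    this.descending = descending;
  }
}

// Hypothetical stand-in for expensive state that every method of a maker needs.
final class SketchResources
{
  // Counts constructions so the sharing is observable.
  static final AtomicInteger BUILD_COUNT = new AtomicInteger();

  final String analyzedFilter;

  SketchResources(SketchSpec spec)
  {
    BUILD_COUNT.incrementAndGet();
    this.analyzedFilter = "analyzed(" + spec.filter + ")";
  }
}

// Hypothetical stand-in for CursorMaker: no-argument methods derived from one spec.
final class SketchCursorMaker
{
  private final SketchSpec spec;
  private SketchResources resources; // built lazily, at most once per maker

  SketchCursorMaker(SketchSpec spec)
  {
    this.spec = spec;
  }

  private SketchResources resources()
  {
    if (resources == null) {
      resources = new SketchResources(spec);
    }
    return resources;
  }

  boolean canVectorize()
  {
    // Assumption for illustration only: descending order is not vectorizable.
    return !spec.descending && resources().analyzedFilter != null;
  }

  String makeCursors()
  {
    return "cursor over " + resources().analyzedFilter;
  }

  String makeVectorCursor()
  {
    return "vector cursor over " + resources().analyzedFilter;
  }
}
```

Calling `canVectorize()` and then either cursor method on the same maker constructs `SketchResources` once, whereas the old argument-taking methods would each have to repeat that work.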

StorageAdapter is a @PublicApi and extends CursorFactory, so a default implementation of asCursorMaker is provided, which returns a CursorMaker backed by the old canVectorize, makeCursors, and makeVectorCursor methods. That said, all of the built-in query engines and tests have been migrated to use asCursorMaker, all CursorFactory implementations have implemented asCursorMaker, and the old methods of CursorFactory have been marked as @Deprecated.

summary of changes

  • Added CursorBuildSpec, which captures all of the 'interesting' stuff that goes into producing a cursor, as a replacement for the method arguments of CursorFactory.canVectorize, CursorFactory.makeCursors, and CursorFactory.makeVectorCursor
  • Added a new interface CursorMaker and a new method asCursorMaker on CursorFactory, which takes a CursorBuildSpec as an argument and replaces CursorFactory.canVectorize, CursorFactory.makeCursors, and CursorFactory.makeVectorCursor
  • Deprecated CursorFactory.canVectorize, CursorFactory.makeCursors, and CursorFactory.makeVectorCursor
  • Updated all CursorFactory implementations to implement asCursorMaker
  • Updated all query engines to use asCursorMaker

Release note

(for developers)

There is a change to the StorageAdapter interface, which is a 'public' API that lets extension writers add additional types of segments to participate in Druid's query engines. StorageAdapter extends the CursorFactory interface, whose methods canVectorize, makeCursors, and makeVectorCursor have been deprecated. These methods are replaced with a new method, asCursorMaker, which accepts a CursorBuildSpec and returns a new interface, CursorMaker, which defines no-argument versions of canVectorize, makeCursors, and makeVectorCursor. A default implementation of asCursorMaker is provided that uses the existing deprecated methods, so no immediate action is needed for StorageAdapter implementors on upgrade, but implementors should plan to migrate to implementing asCursorMaker directly in the future.
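For extension writers, the backwards-compatible bridge described in this release note has roughly the following shape. This is a simplified sketch with hypothetical names (`BuildSpecSketch`, `MakerSketch`, `FactorySketch`, `OldStyleAdapterSketch`) and reduced signatures; the real `CursorFactory` methods take more arguments and return real cursors rather than strings.

```java
// Hypothetical stand-in for CursorBuildSpec, reduced to two fields.
final class BuildSpecSketch
{
  final String filter;
  final boolean descending;

  BuildSpecSketch(String filter, boolean descending)
  {
    this.filter = filter;
    this.descending = descending;
  }
}

// Hypothetical stand-in for CursorMaker: no-argument versions of the old methods.
interface MakerSketch
{
  boolean canVectorize();

  String makeCursors();
}

// Hypothetical stand-in for CursorFactory.
interface FactorySketch
{
  @Deprecated
  boolean canVectorize(String filter, boolean descending);

  @Deprecated
  String makeCursors(String filter, boolean descending);

  // Default bridge: unpacks the spec and delegates to the deprecated methods,
  // so existing implementations keep working without overriding this.
  default MakerSketch asCursorMaker(final BuildSpecSketch spec)
  {
    return new MakerSketch()
    {
      @Override
      public boolean canVectorize()
      {
        return FactorySketch.this.canVectorize(spec.filter, spec.descending);
      }

      @Override
      public String makeCursors()
      {
        return FactorySketch.this.makeCursors(spec.filter, spec.descending);
      }
    };
  }
}

// An "old style" implementor that only knows the deprecated methods.
final class OldStyleAdapterSketch implements FactorySketch
{
  @Override
  public boolean canVectorize(String filter, boolean descending)
  {
    return !descending;
  }

  @Override
  public String makeCursors(String filter, boolean descending)
  {
    return "cursor[" + filter + ", descending=" + descending + "]";
  }
}
```

A caller written against the new API, such as `new OldStyleAdapterSketch().asCursorMaker(spec).makeCursors()`, works unchanged against the old-style implementor. Migrating later means overriding `asCursorMaker` directly instead of relying on the default.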


Key changed/added classes in this PR
  • CursorFactory
  • CursorMaker
  • CursorBuildSpec

This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@Override
public boolean canVectorize()
{
return CursorFactory.this.canVectorize(spec.getFilter(), spec.getVirtualColumns(), spec.isDescending());

Check notice: Code scanning / CodeQL

Deprecated method or constructor invocation (Note)

Invoking CursorFactory.canVectorize should be avoided because it has been deprecated.
Member Author


this is intended so that existing CursorFactory/StorageAdapter implementations can keep working with query engines without implementing asCursorMaker

Comment on lines +55 to +62
return CursorFactory.this.makeCursors(
spec.getFilter(),
spec.getInterval(),
spec.getVirtualColumns(),
spec.getGranularity(),
spec.isDescending(),
spec.getQueryMetrics()
);

Check notice: Code scanning / CodeQL

Deprecated method or constructor invocation (Note)

Invoking CursorFactory.makeCursors should be avoided because it has been deprecated.
Member Author


this is intended so that existing CursorFactory/StorageAdapter implementations can keep working with query engines without implementing asCursorMaker

Comment on lines +68 to +75
return CursorFactory.this.makeVectorCursor(
spec.getFilter(),
spec.getInterval(),
spec.getVirtualColumns(),
spec.isDescending(),
spec.getQueryContext().getVectorSize(),
spec.getQueryMetrics()
);

Check notice: Code scanning / CodeQL

Deprecated method or constructor invocation (Note)

Invoking CursorFactory.makeVectorCursor should be avoided because it has been deprecated.
Member Author


this is intended so that existing CursorFactory/StorageAdapter implementations can keep working with query engines without implementing asCursorMaker

@@ -668,4 +724,43 @@
return new DescendingTimestampCheckingOffset(baseOffset.clone(), timestamps, timeLimit, allWithinThreshold);
}
}

private final class CursorResources implements Closeable

Check notice: Code scanning / CodeQL

Inner class could be static (Note)

CursorResources could be made static, since the enclosing instance is used only in its constructor.
newFilter = null;
}
} else {
newFilter = new AndFilter(ImmutableList.of(spec.getFilter(), filterOnDataSource.toFilter()));

Check warning: Code scanning / CodeQL

Dereferenced variable may be null (Warning)

Variable filterOnDataSource may be null at this access, as suggested by this null guard.
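One possible way to address a warning of this kind is to handle all four null/non-null combinations in a single helper. The sketch below is only an illustration of that idea and not the fix applied in this PR: `NullSafeFilters` is a hypothetical name, and `java.util.function.Predicate` stands in for Druid's filter types.

```java
import java.util.function.Predicate;

// Hypothetical helper; Predicate stands in for the real filter interface.
final class NullSafeFilters
{
  private NullSafeFilters()
  {
  }

  // Returns null when both inputs are null (no filtering), the single non-null
  // input when only one is present, and their conjunction otherwise.
  static <T> Predicate<T> and(Predicate<T> queryFilter, Predicate<T> dataSourceFilter)
  {
    if (queryFilter == null) {
      return dataSourceFilter;
    }
    if (dataSourceFilter == null) {
      return queryFilter;
    }
    return queryFilter.and(dataSourceFilter);
  }
}
```

With a helper shaped like this, neither argument is dereferenced before it has been checked, which is the condition the scanner is asking for.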
Member

@asdf2014 asdf2014 left a comment


This method is never used, so it needs to be removed to pass the intellij-inspections check in GitHub Actions.
