-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pinot-query-planner module #8340
Add pinot-query-planner module #8340
Conversation
pinot-core/src/main/java/org/apache/pinot/core/routing/RouteManager.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/catalog/PinotCatalog.java
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/nodes/MailboxReceiveNode.java
Show resolved
Hide resolved
Please add a link to the design doc in your PRs, thanks. |
bf796b8
to
105a255
Compare
Codecov Report
@@ Coverage Diff @@
## multi_stage_query_engine #8340 +/- ##
===========================================================
Coverage ? 30.47%
===========================================================
Files ? 1642
Lines ? 86111
Branches ? 12999
===========================================================
Hits ? 26246
Misses ? 57485
Partials ? 2380
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
Outdated
Show resolved
Hide resolved
pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/routing/RouteManager.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/calcite/jdbc/CalciteSchemaBuilder.java
Show resolved
Hide resolved
|
||
|
||
/** | ||
* The {@code QueryEnvironment} contains the main entrypoint for query planning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for completeness and readers to be aware, can you also add info into javadoc on the mapping between QueryEnvironment and a SQL query ? Is this created on a per query basis ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this actually is a good question on a 2nd thought. some of the components are actually not reusable. let me rethink this design
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline with @walterddr - This is global but there is serialization needed at the Calcite planner level. So there is still a major TODO to make this scalable. Probably it is ok to instantiate the planner on each call (and eat that cost) to planQuery() and still keep QueryEnvironment global to avoid catalog instantiation per query
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Show resolved
Hide resolved
SqlNode validated = _validator.validate(parsed); | ||
if (null == validated || !validated.getKind().belongsTo(SqlKind.QUERY)) { | ||
throw new IllegalArgumentException( | ||
String.format("unsupported SQL query, cannot validate out a valid sql from:\n%s", parsed)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we include the original SQL query as well ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
original query will be wrapped in upper-level planQuery
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically I wanted to include the original string SQL query in string. May be SqlNode.toString() takes care of that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. one will not directly call validate. when it goes throw planQuery it will print the original query string. b/c exception will be caught and rethrow with the sqlString attached to the message.
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
* | ||
* <p>It provide the higher level entry interface to convert a SQL string into a {@link QueryPlan}. | ||
*/ | ||
public class QueryEnvironment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be called as PinotQueryPlanner
or QueryPlanner
since it is not just an environment holder and does the entire planning ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reason why I called it environment is because it holds some stateful info during the planning and it is not stateless. but let me think more on the naming for this (and javadoc)
pinot-query-planner/src/main/java/org/apache/pinot/query/context/PlannerContext.java
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
return this.createSqlType(SqlTypeName.VARCHAR); | ||
case BYTES: | ||
return this.createSqlType(SqlTypeName.VARBINARY); | ||
case JSON: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON is a recognized type in Pinot so we should not throw Unsup for that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON will also be implemented as Struct type i suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO is fine for now I guess. We must handle JSON since it's a first class type in Pinot now
return builder.build(); | ||
} | ||
|
||
private RelDataType toRelDataType(FieldSpec fieldSpec) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we handle/factor array / MV ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calcite supports Map, Array and Struct type. but we are throwing here until operator support is added.
/** | ||
* Extends Java-base TypeFactory from Calcite. | ||
*/ | ||
public class TypeFactory extends JavaTypeFactoryImpl { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we need to extend from JavaTypeFactoryImpl
instead of RelDataTypeFactory
- The interface is experimental and subject to change in future as per Calcite.
- JavaTypeFactory is not experimental but the purpose of that interface seems to be to map a type / recordType to a java class ? Why do we need that model ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is for conveniency on implementing the JavaTypeFactory. but yes we can definitely create our own clean impl in future optimization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be add a TODO here ?
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/catalog/PinotCatalog.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/config/provider/TableCache.java
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Show resolved
Hide resolved
public class Validator extends SqlValidatorImpl { | ||
|
||
public Validator(SqlOperatorTable opTab, SqlValidatorCatalogReader catalogReader, RelDataTypeFactory typeFactory) { | ||
super(opTab, catalogReader, typeFactory, Config.DEFAULT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current CalciteSqlParser code in Pinot uses SqlConformanceEnum.BABEL
and IIUC it was done during migration from PQL to SQL to relax few things on syntax and semantics.
Should we use BABEL here as well instead of DEFAULT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given PQL is deprecated. i dont think we should use BABEL solely because of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant to say was after we migrated to SQL, we use BABEL conformance in CalciteSqlParser code.
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/nodes/StageNode.java
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
import org.apache.calcite.rel.RelDistribution; | ||
|
||
|
||
public class MailboxSendNode extends AbstractStageNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the discussion thread in design doc, I think we should have an abstraction of ExchangeNode. ExchangeNode should encapsulate sender and receiver node.
Similarly, there should be an abstraction for sender and receiver themselves.
Something like following....
Exchange
- BroadcastExchange
- SingleMergeExchange
- HashPartitionExchange
Sender
- BroadcastSender
- SingleSender
- HashPartitionSender
Receiver
- OrderedReceiver
- UnorderedReceiver
BroadcastExchange encapsulates
- BroadcastSender
- SomeReceiver
HashPartitionExchange encapsulates
- HashPartitionSender
- SomeReceiver
So ideally MailboxSend and MailboxReceive should be modeled as sender and receiver abstractions respectively as opposed to concrete implementations imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree that we need to add more attributes to the stage nodes.
should we consider start simple and add attributes to the ExchangeNode
?
to me the only thing we need to separate is SendExchangeNode
and ReceiveExchangeNode
. the items you mentioned above can be inferred by the Exchange.Type
- e.g. a
SendExchangeNode
withExchange.Type == BROADCAST
result in a broadcastSender
benefit of having this is we can add more attributes to the ExchangeNode
without exploding the combination of possible attributes. say later we want to have a HashPartitionButOrderedWithinPartitionSender
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/nodes/TableScanNode.java
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/nodes/TableScanNode.java
Outdated
Show resolved
Hide resolved
|
||
public abstract class AbstractStageNode implements StageNode { | ||
|
||
protected final String _stageId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int (to reduce heap usage) ? or do we think this is arbitrary bytes and String is better ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline - we will consider this during serialization design
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding TODO to the PR description
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/nodes/AbstractStageNode.java
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/StageNodeConverter.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/StagePlanner.java
Show resolved
Hide resolved
return new QueryPlan(_queryStageMap, _stageMetadataMap); | ||
} | ||
|
||
// non-threadsafe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a thread safety related question on QueryEnvironment. If that class is instantiated per compiled query, then it implies calls to StagePlanner should be thread safe ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the non-threadsafe-ness comes from more on the calcite's planner (and RelNode). my interpretation for this is - there cannot be 2 queries in planning at the same time. but the planner can be reused.
a5b1c9b
to
66de3ba
Compare
9f2c550
to
9b6ccc2
Compare
- fix calcite upgrade compilation issue - fix query compilation runtime after calcite 1.29 upgrade - linter
9b6ccc2
to
8898a5b
Compare
@Override | ||
public Expression getExpression(@Nullable SchemaPlus parentSchema, String name) { | ||
requireNonNull(parentSchema, "parentSchema"); | ||
return Schemas.subSchemaExpression(parentSchema, name, getClass()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a flat namespace as of now so we don't support sub-schema and the calcite root schema is created with empty name so what is this code doing with sub-schema ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code is a default implementation. in our case it is as good as returning null since we don't support it.
|
||
@Override | ||
public RelProtoDataType getType(String name) { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@walterddr For this and all below functions, we should ideally throw UnsupOperationException
instead of returning null or empty list as we probably can't predict from where and all calcite planning code will call them and if it does, better to fail the compilation through this exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to https://calcite.apache.org/javadocAggregate/org/apache/calcite/schema/Schema.html. calcite excepts null if not found. same logic exist in its default https://calcite.apache.org/javadocAggregate/org/apache/calcite/schema/impl/AbstractSchema.html
|
||
@Override | ||
public Schema getSubSchema(String name) { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@walterddr It looks like Calcite doesn't expect this to be null ?
As per calcite docs, during query validation, calcite will call getSubSchema()
on the registered root schema and then on the retrieved Schema
, it will call getTable(schemaName)
to get Table / PinotTable ?
Our root schema should be PinotCatalog but based on the above, I wonder how query validation is going to work when this function is invoked ?
On the other hand, if we create a dummy root schema with exactly one child / sub-schema as PinotCatalog
, then this seems to work
- dummyRootSchema.getSubSchema("Pinot")
- returns instance of PinotCatalog
- catalog.getTable(tableName)
- returns corresponding PinotTable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only true if we registered the user-facing Schema
class and hoisted the contents up to CalciteSchema
for example, the user overrided schema class contains tables and user-defined functions, one can register those by extract all the tables into CalciteSchema.tableMap
, functions into CalciteSchema.functionMap
, etc.
This is not ideal for pinot because PinotCatalog is actually backed by TableCache, which is sort of a ever changing list of tables.
Therefore we use the SimpleCalciteSchema
which doesn't go through the protected member variables inside CalciteSchema, instead directly falls through to the user-facing schema to acquire the data.
e.g. instead of getTable() { return tableMap.get(tableName); }
it instead directly calls the Schema.getTable()
.
This way we dont have to create a calcite schema object everytime a new query comes in. one of the reason why we can have one query environment and reuse it on multiple queries.
obviously there's a drawback, if the schema/table changes in the middle of planning there potentially can be a race condition. but IMO we are better of in this case fail the query and retry since schema/table config change doens't happen so often --> the overhead to recreate an entire planner context takes more valuable E2E latency overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
* add pinot-query-planner - fix calcite upgrade compilation issue - fix query compilation runtime after calcite 1.29 upgrade - linter * address diff comments and add more TODOs Co-authored-by: Rong Rong <rongr@startree.ai>
* add pinot-query-planner - fix calcite upgrade compilation issue - fix query compilation runtime after calcite 1.29 upgrade - linter * address diff comments and add more TODOs Co-authored-by: Rong Rong <rongr@startree.ai>
* add pinot-query-planner - fix calcite upgrade compilation issue - fix query compilation runtime after calcite 1.29 upgrade - linter * address diff comments and add more TODOs Co-authored-by: Rong Rong <rongr@startree.ai>
* add pinot-query-planner - fix calcite upgrade compilation issue - fix query compilation runtime after calcite 1.29 upgrade - linter * address diff comments and add more TODOs Co-authored-by: Rong Rong <rongr@startree.ai>
* add pinot-query-planner - fix calcite upgrade compilation issue - fix query compilation runtime after calcite 1.29 upgrade - linter * address diff comments and add more TODOs Co-authored-by: Rong Rong <rongr@startree.ai>
* add pinot-query-planner - fix calcite upgrade compilation issue - fix query compilation runtime after calcite 1.29 upgrade - linter * address diff comments and add more TODOs Co-authored-by: Rong Rong <rongr@startree.ai>
Summary
initial commit for the multi-stage query planner.
Design Doc
https://docs.google.com/document/d/10-vL_bUrI-Pi2oYudWyUlQl9Kf0cLrW-Z8hGczkCPik/edit#heading=h.f7j5q82j0slb
TODO