-
Notifications
You must be signed in to change notification settings - Fork 980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRILL-6381: Add support for index based planning and execution #1512
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
1. Secondary Index planning interfaces and abstract classes like DBGroupScan, DbSubScan, IndexDecriptor etc. 2. Statistics and Cost model interfaces/classes: PluginCost, Statistics, StatisticsPayload, AbstractIndexStatistics 3. ScanBatch and RecordReader to support repeatable scan 4. Secondary Index execution related interfaces: RangePartitionSender, RowKeyJoin, PartitionFunction 5. MD-3979: Query using cast index plan fails with NPE Co-authored-by: Aman Sinha <asinha@maprtech.com> Co-authored-by: chunhui-shi <cshi@maprtech.com> Co-authored-by: Gautam Parai <gparai@maprtech.com> Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com> Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com> Conflicts: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillTable.java protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java protocol/src/main/java/org/apache/drill/exec/proto/beans/CoreOperatorType.java protocol/src/main/protobuf/UserBitShared.proto
1. MD-3960: Update Drill to build with MapR-6.0.1 libraries 2. MD-3995: Do not pushdown limit 0 past project with CONVERT_FROMJSON 3. MD-4054: Restricted scan limit is changed to dynamically read rows using the rowcount of the rightside instead of 4096. 4. MD-3688: Impersonating a view owner doesn't work with security disabled in 6.0 5. MD-4492: Missing limit pushdown changes in JsonTableGroupScan Co-authored-by: chunhui-shi <cshi@maprtech.com> Co-authored-by: Gautam Parai <gparai@maprtech.com> Co-authored-by: Vlad Rozov <vrozov@mapr.com> Conflicts: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBFormatPlugin.java contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBSubScan.java contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/BinaryTableGroupScan.java contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/CompareFunctionsProcessor.java contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonConditionBuilder.java contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableGroupScan.java contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/MaprDBJsonRecordReader.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java pom.xml
…Secondary Indexes 1. Index Planning Rules and Plan generators - DbScanToIndexScanRule: Top level physical planning rule that drives index planning for several relational algebra patterns. - DbScanSortRemovalRule: Physical planning rule for index planning for Sort-based operations. - Plan Generators: Covering, Non-Covering and Intersect physical plan generators. - Support planning with functional indexes such as CAST functions. - Enhance PlannerSettings with several configuration options for indexes. 2. Index Selection and Statistics - An IndexSelector that support cost-based index selection of covering and non-covering indexes using statistics and collation properties. - Costing of index intersection for comparison with single-index plans. 3. Planning and execution operators - Support RangePartitioning physical operator during query planning and execution. - Support RowKeyJoin physical operator during query planning and execution. - HashTable and HashJoin changes to support RowKeyJoin and Index Intersection. - Enhance Materializer to keep track of subscan association with a particular rowkey join. 4. Index Planning utilities - Utility classes to perform RexNode analysis, including conversion to and from SchemaPath. - Utility class to analyze filter condition and an input collation to determine output collation. - Helper classes to maintain index contexts for logical and physical planning phase. - IndexPlanUtils utility class for various helper methods. 5. Miscellaneous - Separate physical rel for DirectScan. - Modify LimitExchangeTranspose rule to handle SingleMergeExchange. - MD-3880: Return correct status from RangePartitionRecordBatch setupNewSchema Co-authored-by: Aman Sinha <asinha@maprtech.com> Co-authored-by: chunhui-shi <cshi@maprtech.com> Co-authored-by: Gautam Parai <gparai@maprtech.com> Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com> Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com> Conflicts: exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/HashJoinPOP.java exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashPartition.java exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTable.java exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableTemplate.java exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillMergeProjectRule.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjectIntoScanRule.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillScanRel.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/BroadcastExchangePrel.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/DrillDistributionTrait.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetPushDownFilter.java exec/java-exec/src/main/resources/drill-module.conf logical/src/main/java/org/apache/drill/common/logical/StoragePluginConfig.java Resolve merge comflicts and compilation issues.
…dary indexes 1. Implementation of the index descriptor for MapR-DB. 2. MapR-DB specific costing for covering and non-covering indexes. 3. Discovery componenent to discover the indexes available for a MapR-DB table including CAST functional indexes. 4. Utility functions to build a canonical index descriptor. 5. Statistics: fetch and initialize statistcs from MapR-DB for a query condition. Maintain a query-scoped cache for the statistics. Utility functions to compute selectivity. 6. Range Partitioning: partitioning function that takes into account the tablet map to find out where a particular rowkey belongs. 7. Restricted Scan: support doing restricted (i.e skip) scan through lookups on the rowkey. Added a group-scan and record reader for this. 8. MD-3726: Simple Order by queries (without limit) when an index is used are showing regression. 9. MD-3995: Do not pushdown limit 0 past project with CONVERT_FROMJSON 10. MD-4259 : Account for limit during hashcode computation Co-authored-by: Aman Sinha <asinha@maprtech.com> Co-authored-by: chunhui-shi <cshi@maprtech.com> Co-authored-by: Gautam Parai <gparai@maprtech.com> Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com> Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com> Conflicts: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBFormatMatcher.java contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBPushProjectIntoScan.java contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableGroupScan.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/index/rules/DbScanSortRemovalRule.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SortPrel.java exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/TopNPrel.java Fix additional compilation issues.
DRILL-6381: Add missing joinControl logic for INTERSECT_DISTINCT. - Modified HashJoin's probe phase to process INTERSECT_DISTINCT. - NOTE: For build phase, the functionality will be same as for SemiJoin when it is added later. DRILL-6381: Address code review comment for intersect_distinct. DRILL-6381: Rebase on latest master and fix compilation issues. DRILL-6381: Generate protobuf files for C++ native client. DRILL-6381: Use shaded Guava classes. Add more comments and Javadoc.
Since the original PR has received +1 from committers, I will be merging this one directly. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR is a replacement for the original PR #1466. Please see that PR for all review comments and resolutions. I have created this new PR after addressing review comments, doing a full rebase and resolving merge conflicts and squashing a subset of the commits. Pushing these to the original PR would have caused some comments to be lost, hence creating the new one.