Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-17432: Enable join and aggregate materialized view rewriting #245

Closed
wants to merge 1 commit into from

Conversation

jcamachor
Copy link
Contributor

No description provided.

@jcamachor jcamachor force-pushed the calcite-mvs branch 2 times, most recently from b2338e3 to a0ebb58 Compare September 10, 2017 03:07
if (crtView.getSerde() == null) {
if (storageHandler == null) {
LOG.info("Default to LazySimpleSerDe for materialized view " + crtView.getViewName());
tbl.setSerializationLib(LazySimpleSerDe.class.getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look safe to assume serialization to be LazySimple. My suggestion is to throw in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is similar to the 'create table' statement logic:


If user does not specifies a custom serde, we follow same logic. I will factor out the code in fact so the method is reused for both (and if default changes for tables, it changes for materialized views too).

tbl.setInputFormatClass(crtView.getInputFormat());
tbl.setOutputFormatClass(crtView.getOutputFormat());
if (crtView.getInputFormat() != null && !crtView.getInputFormat().isEmpty()) {
tbl.getSd().setInputFormat(tbl.getInputFormatClass().getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth adding a comment that we need to call setIF on both tbl and tbl.getSD()

List<String> parentTableQualifiedName = Lists.newArrayList(parentFullyQualifiedName);
Table parentTab = null;
try {
parentTab = Hive.get().getTable(parentDatabaseName, parentTableName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a cache for Table objects in SemanticAnalyzer::getTableObjectByName() We need to move that cache elsewhere and use it from places like this.
Could be a follow-up: Leave a TODO

ImmutableList.Builder<IntPair> keys = ImmutableList.builder();
for (ForeignKeyCol fkCol : fkCols) {
int fkPos = -1;
for (int i = 0; i < rowType.getFieldNames().size(); i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (String fieldName : rowType.getFieldNames()) will be easier to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrote it to make it more readable and avoid two counters, though I still needed at least one.

}
int pkPos = -1;
for (int i = 0; i < parentTab.getAllCols().size(); i++) {
if (parentTab.getAllCols().get(i).getName().equals(fkCol.parentColName)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be rewritten to make it easier to read.

@@ -1501,6 +1523,10 @@ public RelOptMaterialization apply(RelOptMaterialization materialization) {
// Add view-based rewriting rules to planner
planner.addRule(HiveMaterializedViewRule.INSTANCE_PROJECT_FILTER);
planner.addRule(HiveMaterializedViewRule.INSTANCE_FILTER);
planner.addRule(HiveMaterializedViewRule.INSTANCE_PROJECT_JOIN);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding so many rules will slow down compilation. All these rules slightly vary in pattern they are looking for and roughly does same thing once they apply.
I suspect we have to create so many rules because Calcite's pattern matching on operator tree isnt powerful. We shall try to improve on this otherwise we traverse tree multiple times which could be avoided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I guess that will depend on how the planner (Volcano in this case) handles internally the triggering of rules, we should aim at triggering them doing a single pass over the query plan.
Although currently Calcite has no way to indicate that a rule should not be triggered only once over a specific set of equivalent expressions, I think we should be safe on that end because we have only the MV rewriting specific rules in this group, thus they will only work over current plan.

SessionState.getSessionConf(), crtViewDesc.getStorageHandler());
}

Class<? extends Deserializer> serdeClass = LazySimpleSerDe.class;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem like a good idea to use lazysimple if none is specified. Shall we throw in some cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is also similar to the one for the normal table descriptors, so we use the same strategy. I will consolidate the code here too.

@jcamachor
Copy link
Contributor Author

@ashutoshc , I have addressed the comments in the first PR (sorry, I did not realized and I have pushed -f instead of creating a new commit on top of the old one).

@asfgit asfgit closed this in 073e847 Oct 3, 2017
@jcamachor jcamachor deleted the calcite-mvs branch October 3, 2017 23:37
SirOibaf pushed a commit to SirOibaf/hive that referenced this pull request Apr 9, 2019
…sus Camacho Rodriguez, reviewed by Ashutosh Chauhan)

Close apache#245
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants