Generate query digests to enable caching query results #2645

Closed
wants to merge 5 commits into from

Conversation

denizdemir

  • A query digest is generated from the optimized plan and returned as part of the query results. If two digests are equal, executing the corresponding queries must produce identical results.
  • If the generated digest matches the digest provided via X-Presto-Digest, the query state machine terminates in the DIGEST_MATCHED state, which is added to QueryState as a new terminal state.
  • Connectors compute the digest for the partitions or the table. The Hive connector computes the digest from the names and last-modification timestamps of either the partitions or the table, depending on the number of partitions whose metadata it needs to fetch.
  • Plan generation is randomized for certain types of queries, especially those with joins, so the same query can produce different digests.
  • Query digests are logged as part of the query completion event.
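
To make the connector-side idea concrete, here is a minimal sketch of hashing partition names together with their last-modification timestamps, then comparing the result with a client-supplied digest. It is not the code from this PR: `DigestSketch`, `sourceDigest`, and `matches` are hypothetical names, SHA-256 is an assumed hash function, and treating X-Presto-Digest as a value sent by the client is an assumption. Entries are sorted before hashing so that the order in which metadata is returned does not change the digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Presto.
final class DigestSketch {
    private DigestSketch() {}

    // Hashes the table name plus each (partition name, last-modified) pair.
    // For an unpartitioned table, a caller could pass a single entry that
    // carries the table's own modification time (an assumption of this sketch).
    static String sourceDigest(String tableName, Map<String, Long> lastModifiedByPartition) {
        MessageDigest sha = newSha256();
        update(sha, tableName);
        // TreeMap iterates in key order, so the input map's order is irrelevant.
        for (Map.Entry<String, Long> entry : new TreeMap<>(lastModifiedByPartition).entrySet()) {
            update(sha, entry.getKey());
            update(sha, Long.toString(entry.getValue()));
        }
        return toHex(sha.digest());
    }

    // True when the client supplied a digest and it equals the computed one.
    static boolean matches(String clientDigest, String computedDigest) {
        return clientDigest != null && clientDigest.equals(computedDigest);
    }

    // Length-prefixes each field so ("ab", "c") and ("a", "bc") hash differently.
    private static void update(MessageDigest sha, String field) {
        byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
        sha.update(Integer.toString(bytes.length).getBytes(StandardCharsets.UTF_8));
        sha.update((byte) ':');
        sha.update(bytes);
    }

    private static MessageDigest newSha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Under these assumptions, a change to any partition's modification time changes the digest, while reordering the same partitions does not. A real plan digest would also have to cover the plan itself, which this sketch leaves out.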

@electrum
Contributor

electrum commented Apr 7, 2015

The commit messages are too long. Please use the standard format: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html

@yuananf
Contributor

yuananf commented Apr 8, 2015

I tried this PR, but I cannot query any unpartitioned table in Hive.
error: loadAll failed to return a value for HivePartitionName

@denizdemir
Author

@yuananf, do you have any log lines?

@yuananf
Contributor

yuananf commented Apr 8, 2015

presto:orc> select * from lineitem limit 10;
Query 20150408_034246_00066_qk436 failed: loadAll failed to return a value for HivePartitionName{hiveTableName=HiveTableName{databaseName=orc, tableName=lineitem}, partitionName=}
com.google.common.cache.CacheLoader.InvalidCacheLoadException: loadAll failed to return a value for HivePartitionName{hiveTableName=HiveTableName{databaseName=orc, tableName=lineitem}, partitionName=}
at com.google.common.cache.LocalCache.getAll(LocalCache.java:3992)
at com.google.common.cache.LocalCache$LocalLoadingCache.getAll(LocalCache.java:4838)
at com.facebook.presto.hive.metastore.CachingHiveMetastore.getAll(CachingHiveMetastore.java:258)
at com.facebook.presto.hive.metastore.CachingHiveMetastore.getPartitionsByNames(CachingHiveMetastore.java:586)
at com.facebook.presto.hive.HiveSplitManager.computeDigest(HiveSplitManager.java:238)
at com.facebook.presto.spi.classloader.ClassLoaderSafeConnectorSplitManager.computeDigest(ClassLoaderSafeConnectorSplitManager.java:53)
at com.facebook.presto.split.SplitManager.computeDigest(SplitManager.java:83)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitTableScan(PlanDigestGenerator.java:107)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitTableScan(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.TableScanNode.accept(TableScanNode.java:175)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitLimit(PlanDigestGenerator.java:218)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitLimit(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.LimitNode.accept(LimitNode.java:72)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitSources(PlanDigestGenerator.java:533)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitExchange(PlanDigestGenerator.java:180)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitExchange(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.ExchangeNode.accept(ExchangeNode.java:140)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitLimit(PlanDigestGenerator.java:218)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitLimit(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.LimitNode.accept(LimitNode.java:72)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitOutput(PlanDigestGenerator.java:201)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitOutput(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.OutputNode.accept(OutputNode.java:79)
at com.facebook.presto.sql.planner.PlanDigestGenerator.generate(PlanDigestGenerator.java:88)
at com.facebook.presto.execution.SqlQueryExecution.analyzeQuery(SqlQueryExecution.java:215)
at com.facebook.presto.execution.SqlQueryExecution.start(SqlQueryExecution.java:150)
at com.facebook.presto.execution.QueuedExecution.lambda$start$45(QueuedExecution.java:68)
at com.facebook.presto.execution.QueuedExecution$$Lambda$147/467879886.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

@denizdemir
Author

Thanks, @yuananf. I'll fix it.

@martint
Contributor

martint commented Apr 29, 2015

This needs to be rebased and migrated to the new TableLayout API.

@martint
Contributor

martint commented May 6, 2016

Closing. We'll probably tackle this once the new optimizer is in place.

I'm keeping the code under https://github.com/martint/presto/tree/query-digest for reference.

@martint closed this May 6, 2016