Generate query digests to enable caching query results #2645

denizdemir · 2015-04-07T03:54:30Z

The query digest is generated from the optimized plan and returned as part of the query results. If two queries produce the same digest, executing them must yield identical results.
If the generated digest matches the digest supplied in the X-Presto-Digest header, the query state machine terminates in the DIGEST_MATCHED state, a new terminal state added to QueryState.
Connectors compute the digest for the partitions or the table. The Hive connector computes it from the names and last modification timestamps of either the partitions or the table, depending on the number of partitions whose metadata it needs to fetch.
Plan generation is randomized for certain types of queries, especially those with joins, so the same query can produce different digests.
Query digests are logged as part of the query completion event.
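
To illustrate the idea only, here is a minimal sketch of a connector-style digest built from partition names and last modification timestamps, plus the comparison against a previously returned digest. It is not the code in this PR: the class name, method names, the choice of SHA-256, and the field separator are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and hashing scheme are assumptions, not the PR's implementation.
public class PartitionDigestSketch {
    // Hashes the table name plus each (partition name, last modification time) pair.
    // Partitions are sorted by name so the digest does not depend on metastore ordering.
    static String computeDigest(String tableName, Map<String, Long> partitionModificationTimes) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            sha.update(tableName.getBytes(StandardCharsets.UTF_8));
            for (Map.Entry<String, Long> entry : new TreeMap<>(partitionModificationTimes).entrySet()) {
                sha.update((byte) 0); // separator keeps adjacent fields from running together
                sha.update(entry.getKey().getBytes(StandardCharsets.UTF_8));
                sha.update((byte) 0);
                sha.update(Long.toString(entry.getValue()).getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    // Mirrors the DIGEST_MATCHED check: true only when a digest was supplied and it equals the new one.
    static boolean digestMatched(String suppliedDigest, String generatedDigest) {
        return suppliedDigest != null && suppliedDigest.equals(generatedDigest);
    }

    public static void main(String[] args) {
        Map<String, Long> partitions = new HashMap<>();
        partitions.put("ds=2015-04-06", 1428300000L);
        partitions.put("ds=2015-04-07", 1428386400L);
        String first = computeDigest("orc.lineitem", partitions);

        // Unchanged metadata gives the same digest, so cached results could be reused.
        System.out.println("unchanged: " + digestMatched(first, computeDigest("orc.lineitem", partitions)));

        // A newer modification time on one partition changes the digest.
        partitions.put("ds=2015-04-07", 1428390000L);
        System.out.println("modified: " + digestMatched(first, computeDigest("orc.lineitem", partitions)));
    }
}
```

An unpartitioned table would pass an empty map here, so the digest would have to fold in the table's own modification time instead; the sketch leaves that case out.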

electrum · 2015-04-07T15:19:36Z

The commit messages are too long. Please use the standard format: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html

yuananf · 2015-04-08T02:56:18Z

I tried this PR, but I cannot query any unpartitioned table in Hive.
Error: loadAll failed to return a value for HivePartitionName

denizdemir · 2015-04-08T03:40:08Z

@yuananf, do you have any log lines?

yuananf · 2015-04-08T03:43:57Z

presto:orc> select * from lineitem limit 10;
Query 20150408_034246_00066_qk436 failed: loadAll failed to return a value for HivePartitionName{hiveTableName=HiveTableName{databaseName=orc, tableName=lineitem}, partitionName=}
com.google.common.cache.CacheLoader.InvalidCacheLoadException: loadAll failed to return a value for HivePartitionName{hiveTableName=HiveTableName{databaseName=orc, tableName=lineitem}, partitionName=}
at com.google.common.cache.LocalCache.getAll(LocalCache.java:3992)
at com.google.common.cache.LocalCache$LocalLoadingCache.getAll(LocalCache.java:4838)
at com.facebook.presto.hive.metastore.CachingHiveMetastore.getAll(CachingHiveMetastore.java:258)
at com.facebook.presto.hive.metastore.CachingHiveMetastore.getPartitionsByNames(CachingHiveMetastore.java:586)
at com.facebook.presto.hive.HiveSplitManager.computeDigest(HiveSplitManager.java:238)
at com.facebook.presto.spi.classloader.ClassLoaderSafeConnectorSplitManager.computeDigest(ClassLoaderSafeConnectorSplitManager.java:53)
at com.facebook.presto.split.SplitManager.computeDigest(SplitManager.java:83)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitTableScan(PlanDigestGenerator.java:107)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitTableScan(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.TableScanNode.accept(TableScanNode.java:175)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitLimit(PlanDigestGenerator.java:218)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitLimit(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.LimitNode.accept(LimitNode.java:72)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitSources(PlanDigestGenerator.java:533)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitExchange(PlanDigestGenerator.java:180)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitExchange(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.ExchangeNode.accept(ExchangeNode.java:140)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitLimit(PlanDigestGenerator.java:218)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitLimit(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.LimitNode.accept(LimitNode.java:72)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitOutput(PlanDigestGenerator.java:201)
at com.facebook.presto.sql.planner.PlanDigestGenerator$Visitor.visitOutput(PlanDigestGenerator.java:92)
at com.facebook.presto.sql.planner.plan.OutputNode.accept(OutputNode.java:79)
at com.facebook.presto.sql.planner.PlanDigestGenerator.generate(PlanDigestGenerator.java:88)
at com.facebook.presto.execution.SqlQueryExecution.analyzeQuery(SqlQueryExecution.java:215)
at com.facebook.presto.execution.SqlQueryExecution.start(SqlQueryExecution.java:150)
at com.facebook.presto.execution.QueuedExecution.lambda$start$45(QueuedExecution.java:68)
at com.facebook.presto.execution.QueuedExecution$$Lambda$147/467879886.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

denizdemir · 2015-04-08T03:59:07Z

thanks @yuananf. I'll fix it.

martint · 2015-04-29T16:43:55Z

This needs to be rebased and migrated to the new TableLayout API

martint · 2016-05-06T19:54:54Z

Closing. We'll probably tackle this once the new optimizer is in place.

I'm keeping the code under https://github.com/martint/presto/tree/query-digest for reference.

facebook-github-bot added the CLA Signed label Apr 7, 2015

electrum removed the CLA label Apr 8, 2015

denizdemir added 5 commits April 12, 2015 14:31

2b7df80 Generate digest for a query
1e968f3 Support digest in hive connector
5994650 Support digest in API and CLI
ebb2f70 Add digest into query completion event
a14e68c Add tests for digest generation

cberner assigned martint Apr 20, 2015

cberner mentioned this pull request Apr 27, 2015

Dose Presto have the plan to implement cache hierarchy? #2084

Closed

martint closed this May 6, 2016
