Commit 06d27cb

Author: Erik Froseth
WL#2241 Implement hash join

This patch adds hash joins to MySQL. We have chosen to implement hash join with spill to disk if the input(s) become too big to fit into memory. We start out by trying to do everything in memory. If we run out of memory, we split each input into smaller partitions, also called hash join chunks. Each pair of partitions is then processed as a normal in-memory hash join. The amount of memory available is controlled by the variable "join_buffer_size".

In this round, we will replace block-nested loop (BNL) (both with and without join conditions) with hash join whenever possible. That is, if the join condition is an equi-join where the two sides of the condition refer to the two different sides of the join, and the plan can be executed using the iterator execution engine, we will replace BNL with hash join. This version of hash join only supports inner joins; outer joins will be available at a later stage.

There have been no changes to the join optimizer, so the optimizer still makes execution plans on the assumption that it is going to execute a BNL. This should clearly be changed, but that is left to a later patch/worklog.

Note that we rely on the new iterator execution engine to execute hash joins. This means that any query that executes using the old execution engine won't be able to use hash joins.

We have benchmarked hash join against both block-nested loop and index lookups. Hash join is faster than BNL in almost all cases, especially with larger inputs; in many queries, hash join completes in seconds where BNL is still running after several hours. Compared with index lookups, we have seen cases in DBT-3 where an execution plan with hash join is on par with or faster than a nested loop with index lookup. But the differences here are not that big, and there are many cases where an index lookup is faster than hash join.

Change-Id: Ib3b3daaa8abc93567f99bf36bd1ef2c7633368a7
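
The strategy described above (in-memory first, partition when the input is too big) can be illustrated with a small standalone sketch. This is not the code from this patch: `Row`, `InMemoryHashJoin`, `Partition`, `HashJoin`, and the `max_build_rows` limit (standing in for "join_buffer_size") are hypothetical names chosen for illustration, and the partitions are kept in memory rather than written to disk.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical row type for illustration: a join key plus a payload.
struct Row {
  int key;
  std::string payload;
};

using JoinedRow = std::pair<Row, Row>;  // (build row, probe row)

// Classic in-memory hash join: build a hash table on one input,
// then probe it with every row of the other input.
std::vector<JoinedRow> InMemoryHashJoin(const std::vector<Row> &build,
                                        const std::vector<Row> &probe) {
  std::unordered_multimap<int, const Row *> table;
  table.reserve(build.size());
  for (const Row &r : build) table.emplace(r.key, &r);

  std::vector<JoinedRow> out;
  for (const Row &p : probe) {
    auto range = table.equal_range(p.key);
    for (auto it = range.first; it != range.second; ++it) {
      out.emplace_back(*it->second, p);
    }
  }
  return out;
}

// Split rows into partitions by hashing the join key. Rows with equal
// keys always land in the same partition index on both inputs.
std::vector<std::vector<Row>> Partition(const std::vector<Row> &rows,
                                        std::size_t num_partitions) {
  std::vector<std::vector<Row>> parts(num_partitions);
  for (const Row &r : rows) {
    parts[std::hash<int>{}(r.key) % num_partitions].push_back(r);
  }
  return parts;
}

// Assumption: max_build_rows is a stand-in for the memory limit, and
// num_partitions must be greater than zero. A real implementation would
// write partitions to disk and handle partitions that are still too big.
std::vector<JoinedRow> HashJoin(const std::vector<Row> &build,
                                const std::vector<Row> &probe,
                                std::size_t max_build_rows,
                                std::size_t num_partitions = 4) {
  if (build.size() <= max_build_rows) {
    return InMemoryHashJoin(build, probe);
  }
  auto build_parts = Partition(build, num_partitions);
  auto probe_parts = Partition(probe, num_partitions);

  std::vector<JoinedRow> out;
  for (std::size_t i = 0; i < num_partitions; ++i) {
    // Each pair of partitions is processed as a normal in-memory join.
    auto part = InMemoryHashJoin(build_parts[i], probe_parts[i]);
    out.insert(out.end(), part.begin(), part.end());
  }
  return out;
}
```

Because both inputs are partitioned with the same hash function, matching rows can only meet within the same partition pair, so the partitioned path should return the same set of rows as the in-memory path (possibly in a different order).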
1 parent 43a68bf commit 06d27cb

82 files changed, +7163 -1650 lines changed

include/map_helpers.h

+15
@@ -316,6 +316,21 @@ class memroot_unordered_map
             Memroot_allocator<std::pair<const Key, Value>>(mem_root)) {}
 };
 
+// std::unordered_multimap, but allocated on a MEM_ROOT.
+template <class Key, class Value, class Hash,
+          class KeyEqual = std::equal_to<Key>>
+class memroot_unordered_multimap
+    : public std::unordered_multimap<
+          Key, Value, Hash, KeyEqual,
+          Memroot_allocator<std::pair<const Key, Value>>> {
+ public:
+  memroot_unordered_multimap(MEM_ROOT *mem_root, Hash hash)
+      : std::unordered_multimap<Key, Value, Hash, KeyEqual,
+                                Memroot_allocator<std::pair<const Key, Value>>>(
+            /*bucket_count=*/10, hash, KeyEqual(),
+            Memroot_allocator<std::pair<const Key, Value>>(mem_root)) {}
+};
+
 /**
   std::unordered_map, but collation aware and allocated on a MEM_ROOT.
 */
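
To show the pattern behind `memroot_unordered_multimap` outside the MySQL tree, here is a minimal standalone sketch of a `std::unordered_multimap` backed by a custom arena allocator. `Arena`, `ArenaAllocator`, and `arena_unordered_multimap` are hypothetical stand-ins for `MEM_ROOT`, `Memroot_allocator`, and the class added above; the sketch assumes an arena that frees everything at once and treats per-object deallocation as a no-op.

```cpp
#include <cstddef>
#include <functional>
#include <new>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for MEM_ROOT: hands out memory and releases
// all of it when the arena itself is destroyed.
class Arena {
 public:
  Arena() = default;
  Arena(const Arena &) = delete;
  Arena &operator=(const Arena &) = delete;
  ~Arena() {
    for (void *p : blocks_) ::operator delete(p);
  }

  void *Alloc(std::size_t bytes) {
    void *p = ::operator new(bytes);
    blocks_.push_back(p);
    allocated_bytes_ += bytes;
    return p;
  }
  std::size_t allocated_bytes() const { return allocated_bytes_; }

 private:
  std::vector<void *> blocks_;
  std::size_t allocated_bytes_ = 0;
};

// Minimal allocator that forwards to an Arena (illustrative only).
template <class T>
class ArenaAllocator {
 public:
  using value_type = T;

  explicit ArenaAllocator(Arena *arena) noexcept : arena_(arena) {}
  template <class U>
  ArenaAllocator(const ArenaAllocator<U> &other) noexcept
      : arena_(other.arena()) {}

  T *allocate(std::size_t n) {
    return static_cast<T *>(arena_->Alloc(n * sizeof(T)));
  }
  // Memory is reclaimed when the arena is destroyed, so this is a no-op.
  void deallocate(T *, std::size_t) noexcept {}

  Arena *arena() const noexcept { return arena_; }

 private:
  Arena *arena_;
};

template <class T, class U>
bool operator==(const ArenaAllocator<T> &a, const ArenaAllocator<U> &b) {
  return a.arena() == b.arena();
}
template <class T, class U>
bool operator!=(const ArenaAllocator<T> &a, const ArenaAllocator<U> &b) {
  return !(a == b);
}

// Same shape as the class in the diff, with the arena types swapped in.
template <class Key, class Value, class Hash = std::hash<Key>,
          class KeyEqual = std::equal_to<Key>>
class arena_unordered_multimap
    : public std::unordered_multimap<
          Key, Value, Hash, KeyEqual,
          ArenaAllocator<std::pair<const Key, Value>>> {
  using Alloc = ArenaAllocator<std::pair<const Key, Value>>;
  using Base = std::unordered_multimap<Key, Value, Hash, KeyEqual, Alloc>;

 public:
  explicit arena_unordered_multimap(Arena *arena, Hash hash = Hash())
      : Base(/*bucket_count=*/10, hash, KeyEqual(), Alloc(arena)) {}
};
```

A multimap rather than a map fits a join hash table because several build rows can share the same join key. In this sketch the container must be destroyed before the arena it allocates from, since the container still runs element destructors on memory owned by the arena.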

mysql-test/include/join_cache.inc

+1-1
@@ -53,7 +53,7 @@ SELECT city.Name, country.Name, countrylanguage.Language
         city.Name LIKE 'L%' AND country.Population > 3000000 AND
         countrylanguage.Percentage > 50;
 
-set join_buffer_size=256;
+set join_buffer_size=2048;
 show variables like 'join_buffer_size';
 
 EXPLAIN
