The repartition method of insert-select (#910)

citusdata · Mar 17, 2020 · aa37db0 · aa37db0
1 parent b349566
commit aa37db0
Show file tree

Hide file tree

Showing 2 changed files with 20 additions and 4 deletions.
diff --git a/develop/api_guc.rst b/develop/api_guc.rst
@@ -376,6 +376,13 @@ citus.enable_repartition_joins (boolean)
 
 Ordinarily, attempting to perform :ref:`repartition_joins` with the adaptive executor will fail with an error message. However setting ``citus.enable_repartition_joins`` to true allows Citus to temporarily switch into the task-tracker executor to perform the join. The default value is false.
 
+.. _enable_repartitioned_insert_select:
+
+citus.enable_repartitioned_insert_select (boolean)
+**************************************************
+
+By default, an INSERT INTO … SELECT statement that cannot be pushed down will attempt to repartition rows from the SELECT statement and transfer them between workers for insertion. However, if the target table has too many shards then repartitioning will probably not perform well. The overhead of processing the shard intervals when determining how to partition the results is too great. Repartitioning can be disabled manually by setting ``citus.enable_repartitioned_insert_select`` to false.
+
 Adaptive executor configuration
 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
 

diff --git a/develop/reference_dml.rst b/develop/reference_dml.rst
@@ -47,13 +47,22 @@ Sometimes it's convenient to put multiple insert statements together into a sing
 
 Citus also supports ``INSERT … SELECT`` statements -- which insert rows based on the results of a select query. This is a convenient way to fill tables and also allows "upserts" with the ``ON CONFLICT`` clause, the easiest way to do :ref:`distributed rollups <rollups>`.
 
-In Citus there are two ways that inserting from a select statement can happen. The first is if the source tables and destination table are :ref:`colocated <colocation>`, and the select/insert statements both include the distribution column. In this case Citus can push the ``INSERT … SELECT`` statement down for parallel execution on all nodes.
+In Citus there are three ways that inserting from a select statement can happen. The first is if the source tables and destination table are :ref:`colocated <colocation>`, and the select/insert statements both include the distribution column. In this case Citus can push the ``INSERT … SELECT`` statement down for parallel execution on all nodes.
 
-The second way of executing an ``INSERT … SELECT`` statement is selecting the results from worker nodes, pulling the data up to the coordinator node, and then issuing an INSERT statement from the coordinator with the data. Citus is forced to use this approach when the source and destination tables are not colocated. Because of the network overhead, this method is not as efficient.
+The second way of executing an ``INSERT … SELECT`` statement is by repartioning the results of the result set into chunks, and sending those chunks among workers to matching destination table shards. Each worker node can insert the values into local destination shards.
 
-If upserts are an important operation in your application, the ideal solution is to model the data so that the source and destination tables are colocated, and so that the distribution column can be part of the GROUP BY clause in the upsert statement (if aggregating). This allows the operation to run in parallel across worker nodes for maximum speed.
+The repartitioning optimization can happen when the SELECT query doesn't require a merge step on the coordinator. It doesn't work with the following SQL features, which require a merge step:
 
-When in doubt about which method Citus is using, use the EXPLAIN command, as described in :ref:`postgresql_tuning`.
+* ORDER BY
+* LIMIT
+* OFFSET
+* GROUP BY when distribution column is not part of the group key
+* Window functions when partition by a non-distribution column in the source table(s)
+* Joins between non-colocated tables (i.e. repartition joins)
+
+When the source and destination tables are not colocated, and the repartition optimization cannot be applied, then Citus uses the third way of executing ``INSERT … SELECT``. It selects the results from worker nodes, and pulls the data up to the coordinator node. The coordinator redirects rows back down to the appropriate shard. Because all the data must pass through a single node, this method is not as efficient.
+
+When in doubt about which method Citus is using, use the EXPLAIN command, as described in :ref:`postgresql_tuning`. When the target table has a very large shard count, it may be wise to disable repartitioning, see :ref:`enable_repartitioned_insert_select`.
 
 COPY Command (Bulk load)
 ~~~~~~~~~~~~~~~~~~~~~~~~