Permalink
Browse files

Fast ALTER TABLE ADD COLUMN with a non-NULL default

Currently adding a column to a table with a non-NULL default results in
a rewrite of the table. For large tables this can be both expensive and
disruptive. This patch removes the need for the rewrite as long as the
default value is not volatile. The default expression is evaluated at
the time of the ALTER TABLE and the result stored in a new column
(attmissingval) in pg_attribute, and a new column (atthasmissing) is set
to true. Any existing row when fetched will be supplied with the
attmissingval. New rows will have the supplied value or the default and
so will never need the attmissingval.

Any time the table is rewritten all the atthasmissing and attmissingval
settings for the attributes are cleared, as they are no longer needed.

The most visible code change from this is in heap_attisnull, which
acquires a third TupleDesc argument, allowing it to detect a missing
value if there is one. In many cases where it is known that there will
not be any (e.g.  catalog relations) NULL can be passed for this
argument.

Andrew Dunstan, heavily modified from an original patch from Serge
Rielau.
Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley.

Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com
  • Loading branch information...
adunstan committed Mar 28, 2018
1 parent ef1978d commit 16828d5c0273b4fe5f10f42588005f16b415b2d8
Showing with 1,898 additions and 244 deletions.
  1. +27 −0 doc/src/sgml/catalogs.sgml
  2. +17 −17 doc/src/sgml/ref/alter_table.sgml
  3. +498 −131 src/backend/access/common/heaptuple.c
  4. +60 −0 src/backend/access/common/tupdesc.c
  5. +1 −1 src/backend/catalog/aclchk.c
  6. +160 −11 src/backend/catalog/heap.c
  7. +3 −1 src/backend/catalog/index.c
  8. +11 −1 src/backend/commands/cluster.c
  9. +1 −1 src/backend/commands/functioncmds.c
  10. +2 −2 src/backend/commands/indexcmds.c
  11. +38 −13 src/backend/commands/tablecmds.c
  12. +1 −1 src/backend/commands/typecmds.c
  13. +1 −1 src/backend/executor/execExprInterp.c
  14. +11 −2 src/backend/executor/execMain.c
  15. +26 −2 src/backend/executor/execTuples.c
  16. +2 −0 src/backend/executor/execUtils.c
  17. +2 −2 src/backend/optimizer/util/clauses.c
  18. +4 −4 src/backend/optimizer/util/plancat.c
  19. +2 −1 src/backend/rewrite/rewriteHandler.c
  20. +1 −1 src/backend/statistics/extended_stats.c
  21. +15 −14 src/backend/utils/adt/ri_triggers.c
  22. +3 −3 src/backend/utils/adt/ruleutils.c
  23. +70 −12 src/backend/utils/cache/relcache.c
  24. +2 −2 src/backend/utils/fmgr/fmgr.c
  25. +4 −4 src/backend/utils/fmgr/funcapi.c
  26. +3 −1 src/include/access/htup_details.h
  27. +3 −0 src/include/access/tupdesc.h
  28. +29 −0 src/include/access/tupdesc_details.h
  29. +5 −1 src/include/catalog/heap.h
  30. +20 −10 src/include/catalog/pg_attribute.h
  31. +1 −1 src/include/catalog/pg_class.h
  32. +1 −3 src/test/regress/expected/event_trigger.out
  33. +515 −0 src/test/regress/expected/fast_default.out
  34. +1 −1 src/test/regress/parallel_schedule
  35. +1 −0 src/test/regress/serial_schedule
  36. +357 −0 src/test/regress/sql/fast_default.sql
@@ -1149,6 +1149,19 @@
</entry>
</row>
<row>
<entry><structfield>atthasmissing</structfield></entry>
<entry><type>bool</type></entry>
<entry></entry>
<entry>
This column has a value which is used where the column is entirely
missing from the row, as happens when a column is added with a
non-volatile <literal>DEFAULT</literal> value after the row is created.
The actual value used is stored in the
<structfield>attmissingval</structfield> column.
</entry>
</row>
<row>
<entry><structfield>attidentity</structfield></entry>
<entry><type>char</type></entry>
@@ -1229,6 +1242,20 @@
</entry>
</row>
<row>
<entry><structfield>attmissingval</structfield></entry>
<entry><type>anyarray</type></entry>
<entry></entry>
<entry>
This column has a one element array containing the value used when the
column is entirely missing from the row, as happens when the column is
added with a non-volatile <literal>DEFAULT</literal> value after the
row is created. The value is only used when
<structfield>atthasmissing</structfield> is true. If there is no value
the column is null.
</entry>
</row>
</tbody>
</tgroup>
</table>
@@ -1184,26 +1184,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</para>
<para>
When a column is added with <literal>ADD COLUMN</literal>, all existing
rows in the table are initialized with the column's default value
(NULL if no <literal>DEFAULT</literal> clause is specified).
If there is no <literal>DEFAULT</literal> clause, this is merely a metadata
change and does not require any immediate update of the table's data;
the added NULL values are supplied on readout, instead.
When a column is added with <literal>ADD COLUMN</literal> and a
non-volatile <literal>DEFAULT</literal> is specified, the default is
evaluated at the time of the statement and the result stored in the
table's metadata. That value will be used for the column for all existing
rows. If no <literal>DEFAULT</literal> is specified, NULL is used. In
neither case is a rewrite of the table required.
</para>
<para>
Adding a column with a <literal>DEFAULT</literal> clause or changing the type of
an existing column will require the entire table and its indexes to be
rewritten. As an exception when changing the type of an existing column,
if the <literal>USING</literal> clause does not change the column
contents and the old type is either binary coercible to the new type or
an unconstrained domain over the new type, a table rewrite is not needed;
but any indexes on the affected columns must still be rebuilt. Adding or
removing a system <literal>oid</literal> column also requires rewriting the entire
table. Table and/or index rebuilds may take a significant amount of time
for a large table; and will temporarily require as much as double the disk
space.
Adding a column with a volatile <literal>DEFAULT</literal> or
changing the type of an existing column will require the entire table and
its indexes to be rewritten. As an exception, when changing the type of an
existing column, if the <literal>USING</literal> clause does not change
the column contents and the old type is either binary coercible to the new
type or an unconstrained domain over the new type, a table rewrite is not
needed; but any indexes on the affected columns must still be rebuilt.
Adding or removing a system <literal>oid</literal> column also requires
rewriting the entire table. Table and/or index rebuilds may take a
significant amount of time for a large table; and will temporarily require
as much as double the disk space.
</para>
<para>
Oops, something went wrong.

0 comments on commit 16828d5

Please sign in to comment.