Pg::Reindex - rebuild postgresql indexes concurrently without locking.
Version 0.01
use Pg::Reindex qw(prepare rebuild);
prepare($dbh, \@namespaces, \@tables, \@indexes);
rebuild($dbh, \%options, $dryrun);
perl `perldoc -l Pg::Reindex` \
[--help] \
[--server=localhost] \
[--port=5432] \
[--user=postgres] \
[--password=PASSWORD] \
[--table=TABLE] ... \
[--namespace=NAMESPACE] ... \
[--index=INDEX] ... \
[--[no]validate] \
[--high_txn_lag=BYTES] \
[--log_txn_lag=BYTES] \
[--[no]dryrun] \
[prepare|continue]
Postgresql indexes should be rebuilt on a regular basis for good performance.
This can be done with the REINDEX
command, however, building indexes this way
requires an exclusive lock on the table. On the other hand, using
CREATE INDEX CONCURRENTLY
avoids this lock.
Pg::Reindex
builds new indexes using CREATE INDEX CONCURRENTLY
. Then it
starts a transaction for each index in which it drops the old index and
renames the new one.
It handles normal indexes and PRIMARY KEY
, FOREIGN KEY
and UNIQUE
constraints.
Before creating the next index, the streaming replication lag is checked to be below a certain limit. If so, nothing special happens and the index is built.
Otherwise, rebuild
waits for the replicas to catch up. When the lag
drops under a second limit, the rebuild
does not immediately continue.
Instead it waits for another 30 seconds and checks the lag every second
within that period. Only if the lag stays below the limit for the whole
time, execution is continued. This grace period is to deal with the fact
that a wal sender process may suddenly disappear and reappear after a
few seconds. Without the grace period the program may encounter a false
drop below the limit and hence continue. For large indexes this adds a
lot of lag.
To use Pg::Reindex as a module, first you need to load the Pg::Reindex module:
use Pg::Reindex qw(prepare rebuild);
use strict;
(The use strict;
isn't required but is strongly recommended.)
Then you need to "prepare" the indexes that you want rebuilt. You can filter by combinations of namespace, tables, and indexes.
prepare($dbh, \@opt_namespaces,\@opt_tables, \@opt_indexes);
After "preparing" the set of indexes to be rebuilt, then you rebuild them:
rebuild( $dbh, { ThrottleOn => 10000000,
ThrottleOff => 100000, Validate => 1 }, $opt_dryrun);
prepare
determines the list of indexes that would be re-indexed, and
sets up the data structures used by rebuild
. prepare
must be called
before rebuild
is called.
prepare
creates a new schema named reindex
with 2 tables,
worklist
and log
. Worklist
is created as UNLOGGED
table. prepare
saves information on all indexes that need to be rebuilt
to worklist
. The information in worklist
is used by rebuild
.
-
$dbh
DBI database handle to the database whose indexes are to be reindexed.
-
\@namespaces
Rebuild only indexes in the
namespaces
. Ifnamespaces
is empty, indexes in all namespaces except the following are considered: those beginning withpg_
, ininformation_schema
i, or aresequences
namespaces. -
\@tables
Rebuild only indexes that belong to the specified tables.
-
\@indexes
List of indexes to reindex.
If tables
, namespaces
and indexes
are given simultaneously,
only indexes satisfying all conditions are considered.
-
$dbh
DBI database handle to the database whose indexes are to be reindexed.
-
\%options
ThrottleOn ThrottleOff Validate
-
$dryrun
To use Pg::Reindex as a perl script you need to have perl run it. The command
below would do that by using perldoc
to determine Pg::Reindex
's location.
perl `perldoc -l Pg::Reindex` \
[--help] \
[--server=localhost] \
[--port=5432] \
[--user=postgres] \
[--password=PASSWORD] \
[--table=TABLE] ... \
[--namespace=NAMESPACE] ... \
[--index=INDEX] ... \
[--[no]validate] \
[--high_txn_lag=BYTES] \
[--log_txn_lag=BYTES] \
[--[no]dryrun] \
[prepare|continue]
Options can be abbreviated.
-
--server
Hostname / IP address or directory path to use to connect to the Postgres server. If you want to use a local UNIX domain socket, specify the socket directory path.
Default: localhost
-
--port
The port to connect to.
Default: 5432
-
--user
The user.
Default: postgres
-
--password
a file name or open file descriptor where to read the password from. If the parameter value consists of only digits, it's evaluated as file descriptor.
There is no default.
A convenient way to specify the password on the BASH command line is
reindex.pl --password=3 3<<<my_secret
That way the password appears in
.bash_history
. But that file is usually only readable to the owner. -
--table
Reindex only indexes that belong to the specified table.
This option can be given multiple times.
If
--table
,--namespace
and--index
are given simultaneously, only indexes satisfying all conditions are considered. -
--namespace
Without this option only namespaces are considered that are not in beginning with
pg_
. Alsoinformation_schema
orsequences
namespaces are omitted.If
--table
,--namespace
and--index
are given simultaneously, only indexes satisfying all conditions are considered. -
--index
If
--table
,--namespace
and--index
are given simultaneously, only indexes satisfying all conditions are considered. -
--[no]validate
validate
FOREIGN KEY
constraints or leave themNOT VALID
. Default it to validate. -
--[no]dryrun
don't modify the database but print the essential SQL statements.
-
--high-txn-lag
the upper limit streaming replicas may lag behind in bytes.
Default is 10,000,000.
-
--low-txn-lag
the lower limit in bytes when execution may be continued after it has been interrupted due to exceeding
high_txn_lag
.Default is 100,000
-
--help
print this help
BINARY, <binary at cpan.org>
Please report any bugs or feature requests to bug-pg-reindex at rt.cpan.org
, or through
the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Pg-Reindex. I will be notified, and then you'll
automatically be notified of progress on your bug as I make changes.
Copyright 2015 Binary Ltd.
This program is free software; you can redistribute it and/or modify it under the terms of the the Artistic License (2.0). You may obtain a copy of the full license at:
http://www.perlfoundation.org/artistic_license_2_0
Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.
If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.
This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.
This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.
Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.