- Concepts
- Installation
- Supported Postgres versions
- Tutorial
- Command line reference
- Operations reference
pgroll
introduces a few concepts that are important to understand before using the tool.
pgroll
migrations are applied in two steps, following an expand/contract pattern.
During the migration start phase, pgroll
will perform only additive changes to the database schema. This includes: creating new tables, adding new columns, and creating new indexes. In the cases where a required change is not backwards compatible, pgroll
will take the necessary steps to ensure that the current schema is still valid. For example, if a new column is added to a table with a NOT NULL
constraint, pgroll
will backfill the new column with a default value.
After a successful migration start, the database will contain two versions of the schema: the old version and the new version. The old version of the schema is still available to client applications. This allows client applications to be updated to use the new version of the schema without any downtime.
Once all client applications have been updated to use the latest version of the schema, the complete phase can be run. During the complete phase pgroll
will perform all non-additive changes to the database schema. This includes: dropping tables, dropping columns, and dropping indexes. Effectively breaking the old version of the schema.
pgroll
maintains multiple versions of the database schema side-by-side. This is achieved by creating a new Postgres schema for each migration that is applied to the database. The schema will contain views on the underlying tables. These views are used to expose different tables or columns to client applications depending on which version of the schema they are configured to use.
For instance, a rename column migration will create a new schema containing a view on the underlying table with the new column name. This allows for the new version of the schema to become available without breaking existing client applications that are still using the old name. In the migration complete phase, the old schema is dropped and the actual column is renamed (views are updated to point to the new column name automatically).
For other more complex changes, like adding a NOT NULL
constraint to a column, pgroll
will duplicate the affected column and backfill it with the values from the old one. For some time the old & new columns will coexist in the same table. This allows for the new version of the schema to expose the column that fulfils the constraint, while the old version still uses the old column. pgroll
will take care of copying the values from the old column to the new one, and vice versa, as needed, both by executing the backfill or installing triggers to keep the columns in sync during updates.
In order to work with the multiple versioned schema that pgroll
creates, clients need to be configured to work with one of them.
This is done by having client applications configure the search path when they connect to the Postgres database.
For example, this fragment for a Go client application shows how to set the search_path
after a connection is established:
db, err := sql.Open("postgres", "postgres://postgres:postgres@localhost:5432/postgres?sslmode=disable")
if err != nil {
return nil, err
}
searchPath := "public_02_add_assignee_column"
log.Printf("Setting search path to %q", searchPath)
_, err = db.Exec(fmt.Sprintf("SET search_path = %s", pq.QuoteIdentifier(searchPath)))
if err != nil {
return nil, fmt.Errorf("failed to set search path: %s", err)
}
In practice, the searchPath
variable would be provided to the application as an environment variable.
If an application doesn't set the search_path
for the connection, the search_path
defaults to the public
schema, meaning that the application will be working with the underlying tables directly rather than accessing them through the versioned views.
Binaries are available for Linux, macOS & Windows on our Releases page.
To install pgroll
from source, run the following command:
go install github.com/xataio/pgroll@latest
Note: requires Go 1.21 or later.
To install pgroll
with homebrew, run the following command:
# macOS or Linux
brew tap xataio/pgroll
brew install pgroll
pgroll
supports Postgres versions >= 14.
pgroll
's versioned views. This is because pgroll
is unable to create the views with the (security_invoker = true)
option, as the ability to do so was added in Postgres 15. If you use RLS in Postgres 14 pgroll
is likely a poor choice of migration tool. All other pgroll
features are fully supported across all supported Postgres versions.
This section will walk you through applying your first migrations using pgroll
.
We'll apply two migrations to a fresh database and have a look at what pgroll
does under the hood.
Prerequisites:
pgroll
installed and accessible somewhere on your$PATH
- A fresh Postgres instance against which to run migrations
A good way to get a throw-away Postgres instance for use in the tutorial is to use Docker. Start a Postgres instance in Docker with:
docker run --rm --name for-pgroll -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:16
The remainder of the tutorial assumes that you have a local Postgres instance accessible on port 5432.
pgroll
needs to store its own internal state somewhere in the target Postgres database. Initializing pgroll
configures this store and makes pgroll
ready for first use:
pgroll init
You should see a success message indicating that pgroll
has been configured.
What data does pgroll
store?
pgroll
stores its data in the pgroll
schema. In this schema it creates:
- A
migrations
table containing the version history for each schema in the database - Functions to capture the current database schema for a given schema name
- Triggers to capture DDL statements run outside of
pgroll
migrations
With pgroll
initialized, let's run our first migration. Here is a migration to create a table:
{
"name": "01_create_users_table",
"operations": [
{
"create_table": {
"name": "users",
"columns": [
{
"name": "id",
"type": "serial",
"pk": true
},
{
"name": "name",
"type": "varchar(255)",
"unique": true
},
{
"name": "description",
"type": "text",
"nullable": true
}
]
}
}
]
}
Take this file and save it as sql/01_create_users_table.json
.
The migration will create a users
table with three columns. It is equivalent to the following SQL DDL statement:
CREATE TABLE users(
id SERIAL PRIMARY KEY,
name VARCHAR(255) UNIQUE NOT NULL,
description TEXT
)
To apply the migration to the database run:
pgroll start sql/01_create_users_table.json --complete
What does the --complete
flag do here?
pgroll
divides migration application into two steps: start and complete. During the start phase, both old and new versions of the database schema are available to client applications. After the complete phase, only the most recent schema is available.
As this is the first migration there is no old schema to maintain, so the migration can safely be started and completed in one step.
For more details about pgroll
's two-step migration process, see the Multiple schema versions section.
Now let's add some users to our new table:
INSERT INTO users (name, description)
SELECT
'user_' || suffix,
CASE
WHEN random() < 0.5 THEN 'description for user_' || suffix
ELSE NULL
END
FROM generate_series(1, 100000) AS suffix;
Execute this SQL to insert 10^5 users into the users
table. Roughly half of the users will have descriptions and the other half will have NULL
descriptions.
Now that we have our users
table, lets make a non backwards-compatible change to the schema and see how pgroll
helps us by maintaining the old and new schema versions side by side.
Some of the users in our users
table have descriptions and others don't. This is because our initial migration set the description
column as nullable: true
, allowing some users to have NULL
values in the description field.
We'd like to change the users
table to disallow NULL
values in the description
field. We also want a description
to be set explicitly for all new users, so we don't want to specify a default value for the column.
There are two things that make this migration difficult:
- We have existing
NULL
values in ourdescription
column that need to be updated to something non-NULL
- Existing applications using the table are still running and may be inserting more
NULL
descriptions
pgroll
helps us solve both problems by maintaining old and new versions of the schema side-by-side and copying/rewriting data between them as required.
Here is the pgroll
migration that will perform the migration to make the description
column NOT NULL
:
{
"name": "02_user_description_set_nullable",
"operations": [
{
"alter_column": {
"table": "users",
"column": "description",
"nullable": false,
"up": "(SELECT CASE WHEN description IS NULL THEN 'description for ' || name ELSE description END)",
"down": "description"
}
}
]
}
Save this migration as sql/02_user_description_set_nullable.json
and start the migration:
pgroll start 02_user_description_set_nullable.json
After some progress updates you should see a message saying that the migration has been started successfully.
At this point it's useful to look at the table data and schema to see what pgroll
has done. Let's look at the data first:
SELECT * FROM users ORDER BY id LIMIT 10
You should see something like this:
+-----+----------+-------------------------+--------------------------+
| id | name | description | _pgroll_new_description |
+-----+----------+-------------------------+--------------------------+
| 1 | user_1 | <null> | description for user_1 |
| 2 | user_2 | description for user_2 | description for user_2 |
| 3 | user_3 | <null> | description for user_3 |
| 4 | user_4 | description for user_4 | description for user_4 |
| 5 | user_5 | <null> | description for user_5 |
| 6 | user_6 | description for user_6 | description for user_6 |
| 7 | user_7 | <null> | description for user_7 |
| 8 | user_8 | <null> | description for user_8 |
| 9 | user_9 | description for user_9 | description for user_9 |
| 10 | user_10 | description for user_10 | description for user_10 |
pgroll
has added a _pgroll_new_description
field to the table and populated the field for all rows using the up
SQL from the 02_user_description_set_nullable.json
file:
"up": "(SELECT CASE WHEN description IS NULL THEN 'description for ' || name ELSE description END)",
This has copied over all description
values into the _pgroll_new_description
field, rewriting any NULL
values using the provided SQL.
Now let's look at the table schema:
DESCRIBE users
You should see something like this:
+-------------------------+------------------------+-----------------------------------------------------------------+
| Column | Type | Modifiers |
+-------------------------+------------------------+-----------------------------------------------------------------+
| id | integer | not null default nextval('_pgroll_new_users_id_seq'::regclass) |
| name | character varying(255) | not null |
| description | text | |
| _pgroll_new_description | text | |
+-------------------------+------------------------+-----------------------------------------------------------------+
Indexes:
"_pgroll_new_users_pkey" PRIMARY KEY, btree (id)
"_pgroll_new_users_name_key" UNIQUE CONSTRAINT, btree (name)
Check constraints:
"_pgroll_add_column_check_description" CHECK (_pgroll_new_description IS NOT NULL) NOT VALID
Triggers:
_pgroll_trigger_users__pgroll_new_description BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE FUNCTION _pgroll_trigger_users__pgroll_new_description>
_pgroll_trigger_users_description BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE FUNCTION _pgroll_trigger_users_description()
The _pgroll_new_description
column has a NOT NULL
CHECK
constraint, but the old description
column is still nullable.
We'll talk about what the two triggers on the table do later.
For now, let's look at the schemas in the database:
\dn
You should see something like this:
+-----------------------------------------+-------------------+
| Name | Owner |
+-----------------------------------------+-------------------+
| pgroll | postgres |
| public | pg_database_owner |
| public_01_create_users_table | postgres |
| public_02_user_description_set_nullable | postgres |
+-----------------------------------------+-------------------+
We have two schemas: one corresponding to the old schema, public_01_create_users_table
, and one for the migration we just started, public_02_user_description_set_nullable
. Each schema contains one view on the users
table. Let's look at the view in the first schema:
\d+ public_01_create_users_table.users
The output should contain something like this:
SELECT users.id,
users.name,
users.description
FROM users;
and for the second view:
\d+ public_02_user_description_set_nullable.users
The output should contain something like this:
SELECT users.id,
users.name,
users._pgroll_new_description AS description
FROM users;
The second view exposes the same three columns as the first, but its description
field is mapped to the _pgroll_new_description
field in the underlying table.
By choosing to access the users
table through either the public_01_create_users_table.users
or public_02_user_description_set_nullable.users
view, applications have a choice of which version of the schema they want to see; either the old version without the NOT NULL
constraint on the description
field or the new version with the constraint.
When we looked at the schema of the users
table, we saw that pgroll
has created two triggers:
_pgroll_trigger_users__pgroll_new_description BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE FUNCTION _pgroll_trigger_users__pgroll_new_description>
_pgroll_trigger_users_description BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE FUNCTION _pgroll_trigger_users_description()
These triggers are used by pgroll
to ensure that any values written into the old description
column are copied over to the _pgroll_new_description
column (rewriting values using the up
SQL from the migration) and to copy values written to the _pgroll_new_description
column back into the old description
column (rewriting values using thedown
SQL from the migration).
Let's see the first of those triggers in action.
First set the search path for your Postgres session to use the old schema:
SET search_path = 'public_01_create_users_table'
Now insert some data into the users
table through the users
view:
INSERT INTO users(name, description) VALUES ('Alice', 'this is Alice'), ('Bob', NULL)
This inserts two new users into the users
table, one with a description
and one without.
Let's check that the data was inserted:
SELECT * FROM users WHERE name = 'Alice' or name = 'Bob'
Running this query should show:
+--------+-------+---------------------+
| id | name | description |
+--------+-------+---------------------+
| 100001 | Alice | this is Alice |
| 100002 | Bob | NULL |
+--------+-------+---------------------+
The trigger should have copied the data that was just written into the old description
column (without the NOT NULL
constraint) into the _pgroll_new_description
column (with the NOT NULL
constraint )using the up
SQL from the migration.
Let's check. Set the search path to the new version of the schema:
SET search_path = 'public_02_user_description_set_nullable'
and find the users we just inserted:
SELECT * FROM users WHERE name = 'Alice' or name = 'Bob'
The output should look like this:
+--------+-------+---------------------+
| id | name | description |
+--------+-------+---------------------+
| 100001 | Alice | this is Alice |
| 100002 | Bob | description for Bob |
+--------+-------+---------------------+
Notice that the trigger installed by pgroll
has rewritten the NULL
value inserted into the old schema by using the up
SQL from the migration definition.
Once the old version of the database schema is no longer required (perhaps the old applications that depend on the old schema are no longer in production) the current migration can be completed:
pgroll complete
After the migration has completed, the old version of the schema is no longer present in the database:
\dn
shows something like:
+-----------------------------------------+-------------------+
| Name | Owner |
+-----------------------------------------+-------------------+
| pgroll | postgres |
| public | pg_database_owner |
| public_02_user_description_set_nullable | postgres |
+-----------------------------------------+-------------------+
Only the new version schema public_02_user_description_set_nullable
remains in the database.
Let's look at the schema of the users
table to see what's changed there:
DESCRIBE users
shows something like:
+-------------+------------------------+-----------------------------------------------------------------+----------+--------------+-------------+
| Column | Type | Modifiers | Storage | Stats target | Description |
+-------------+------------------------+-----------------------------------------------------------------+----------+--------------+-------------+
| id | integer | not null default nextval('_pgroll_new_users_id_seq'::regclass) | plain | <null> | <null> |
| name | character varying(255) | not null | extended | <null> | <null> |
| description | text | not null | extended | <null> | <null> |
+-------------+------------------------+-----------------------------------------------------------------+----------+--------------+-------------+
Indexes:
"_pgroll_new_users_pkey" PRIMARY KEY, btree (id)
"_pgroll_new_users_name_key" UNIQUE CONSTRAINT, btree (name)
The extra _pgroll_new_description
has been renamed to description
and the old description
column has been removed. The column is now marked as NOT NULL
.
pgroll
has allowed us to safely roll out this change to the description
column.
We've seen:
- how to apply a couple of
pgroll
migrations to a database. - how
pgroll
separates migrations intostart
andcomplete
phases. - how data is backfilled to meet constraints at the beginning of the
start
phase. - that during the
start
phase,pgroll
uses multiple schema to present different versions of an underlying table to client applications. - that data written into the old schema version is copied over into the new schema, and vice-versa.
- that completing a migration removes the old schema version and cleans up the underlying table, putting it in its final state.
The pgroll
CLI offers the following subcommands:
The pgroll
CLI has the following top-level flags:
--postgres-url
: The URL of the postgres instance against which migrations will be run.--schema
: The Postgres schema in which migrations will be run (default"public"
).--pgroll-schema
: The Postgres schema in whichpgroll
will store its internal state (default:"pgroll"
).--lock-timeout
: The Postgreslock_timeout
value to use for allpgroll
DDL operations, specified in milliseconds (default500
).--role
: The Postgres role to use for allpgroll
DDL operations (default:""
, which doesn't set any role).
Each of these flags can also be set via an environment variable:
PGROLL_PG_URL
PGROLL_SCHEMA
PGROLL_STATE_SCHEMA
PGROLL_LOCK_TIMEOUT
PGROLL_ROLE
The CLI flag takes precedence if a flag is set via both an environment variable and a CLI flag.
pgroll init
initializes pgroll
for first use.
$ pgroll init
This will create a new schema in the database called pgroll
(or whatever value is specified with the --pgroll-schema
switch).
The tables and functions in this schema store pgroll
's internal state and are not intended to be modified outside of pgroll
CLI.
pgroll start
starts a pgroll
migration:
$ pgroll start sql/03_add_column.json
This starts the migration defined in the sql/01_create_table.json
file.
After starting a migration there will be two schema versions in the database; one for the old schema before the migration and one for the new version with the schema changes.
A migration can be started and completed with one command by specifying the --complete
flag:
$ pgroll start sql/03_add_column.json --complete
This is equivalent to running pgroll start
immediately followed by pgroll complete
.
--complete
flag is appropriate only when there are no applications running against the old database schema. In most cases, the recommended workflow is to run pgroll start
, then gracefully shut down old applications before running pgroll complete
as a separate step.
pgroll complete
completes a pgroll
migration, removing the previous schema and leaving only the latest schema.
$ pgroll complete
This completes the most recently started migration.
Running pgroll complete
when there is no migration in progress is a no-op.
Completing a pgroll
migration removes the previous schema version from the database, leaving only the latest version of the schema.
pgroll complete
ensure that all applications that depend on the old version of the database schema are no longer live. Prematurely running pgroll complete
can cause downtime of old application instances that depend on the old schema.
pgroll rollback
rolls back the currently active migration.
$ pgroll rollback
This rolls back the currently active migration (an active migration is one that has been started but not yet completed).
Rolling back a pgroll
migration means removing the new schema version. The old schema version was still present throughout the migration period and does not require modification.
Migrations cannot be rolled back once completed. Attempting to roll back a migration that has already been completed is a no-op.
pgroll rollback
ensure that any new versions of applications that depend on the new database schema are no longer live. Prematurely running pgroll rollback
can cause downtime of new application instances that depend on the new schema.
pgroll status
shows the current status of pgroll
within a given schema:
$ pgroll status
{
"Schema": "public",
"Version": "27_drop_unique_constraint",
"Status": "Complete"
}
The status field can be one of the following values:
"No migrations"
- no migrations have been applied in this schema yet."In progress"
- a migration has been started, but not yet completed."Complete"
- the most recent migration was completed.
The Version
field gives the name of the latest schema version.
If a migration is In progress
the schemas for both the latest version indicated by the Version
field and the previous version will exist in the database.
If a migration is Complete
only the latest version of the schema will exist in the database.
The top-level --schema
flag can be used to view the status of pgroll
in a different schema:
$ pgroll status --schema schema_a
{
"Schema": "schema_a",
"Version": "01_create_tables",
"Status": "Complete"
}
pgroll
migrations are specified as JSON files. All migrations follow the same basic structure:
{
"name": "0x_migration_name",
"operations": [...]
}
See the examples directory for examples of each kind of operation.
pgroll
supports the following migration operations:
- Add column
- Alter column
- Create index
- Create table
- Drop column
- Drop constraint
- Drop index
- Drop table
- Raw SQL
- Rename table
- Set replica identity
An add column operation creates a new column on an existing table.
add column operations have this structure:
{
"add_column": {
"table": "name of table to which the column should be added",
"up": "SQL expression",
"column": {
"name": "name of column",
"type": "postgres type",
"comment": "postgres comment for the column",
"nullable": true|false,
"unique": true|false,
"pk": true|false,
"default": "default value for the column",
"check": {
"name": "name of check constraint",
"constraint": "constraint expression"
},
"references": {
"name": "name of foreign key constraint",
"table": "name of referenced table",
"column": "name of referenced column"
}
}
}
}
Default values are subject to the usual rules for quoting SQL expressions. In particular, string literals should be surrounded with single quotes.
Example add column migrations:
- 03_add_column.json
- 06_add_column_to_sql_table.json
- 17_add_rating_column.json
- 26_add_column_with_check_constraint.json
- 30_add_column_simple_up.json
An alter column operation alters the properties of a column. The operation supports several sub-operations, described below.
A rename column operation renames a column.
rename column operations have this structure:
{
"alter_column": {
"table": "table name",
"column": "old column name",
"name": "new column name"
}
}
Example rename column migrations:
A change type operation changes the type of a column.
change type operations have this structure:
{
"alter_column": {
"table": "table name",
"column": "column name",
"type": "new type of column",
"up": "SQL expression",
"down": "SQL expression"
}
}
Example change type migrations:
An add check constraint operation adds a CHECK
constraint to a column.
add check constraint migrations have this structure:
{
"alter_column": {
"table": "table name",
"column": "column name",
"check": {
"name": "check constraint name",
"constraint": "constraint expression"
},
"up": "SQL expression",
"down": "SQL expression"
}
}
Example add check constraint migrations:
Add foreign key operations add a foreign key constraint to a column.
add foreign key constraints have this structure:
{
"alter_column": {
"table": "table name",
"column": "column name",
"references": {
"name": "name of foreign key reference",
"table": "name of referenced table",
"column": "name of referenced column"
},
"up": "SQL expression",
"down": "SQL expression"
}
}
Example add foreign key migrations:
Add not null operations add a NOT NULL
constraint to a column.
add not null operations have this structure:
{
"alter_column": {
"table": "table name",
"column": "column name",
"nullable": false,
"up": "SQL expression",
"down": "SQL expression"
}
}
Example add not null migrations:
Drop not null operations drop a NOT NULL
constraint from a column.
drop not null operations have this structure:
{
"alter_column": {
"table": "table name",
"column": "column name",
"nullable": true,
"up": "SQL expression",
"down": "SQL expression"
}
}
Example drop not null migrations:
Add unique operations add a UNIQUE
constraint to a column.
add unique operations have this structure:
{
"alter_column": {
"table": "table name",
"column": "column name",
"unique": {
"name": "name of unique constraint"
},
"up": "SQL expression",
"down": "SQL expression"
}
}
Example add unique migrations:
A create index operation creates a new btree index on a set of columns.
create index operations have this structure:
{
"create_index": {
"table": "name of table on which to define the index",
"name": "index name",
"columns": [ "names of columns on which to define the index" ]
}
}
Example create index migrations:
A create table operation creates a new table in the database.
create table operations have this structure:
{
"create_table": {
"name": "name of new table",
"columns": [...]
]
}
}
where each column
is defined as:
{
"name": "column name",
"type": "postgres type",
"comment": "postgres comment for the column",
"nullable": true|false,
"unique": true|false,
"pk": true|false,
"default": "default value"
"check": {
"name": "name of check constraint"
"constraint": "constraint expression"
}
"references": {
"name": "name of foreign key constraint"
"table": "name of referenced table"
"column": "name of referenced column"
}
},
Default values are subject to the usual rules for quoting SQL expressions. In particular, string literals should be surrounded with single quotes.
Example create table migrations:
- 01_create_tables.json
- 02_create_another_table.json
- 08_create_fruits_table.json
- 12_create_employees_table.json
- 14_add_reviews_table.json
- 19_create_orders_table.json
- 20_create_posts_table.json
- 25_add_table_with_check_constraint.json
- 28_different_defaults.json
A drop column operation drops a column from an existing table.
drop column operations have this structure:
{
"drop_column": {
"table": "name of table",
"column": "name of column to drop",
"down": "SQL expression"
}
}
Example drop column migrations:
A drop constraint operation drops a constraint from an existing table.
Only CHECK
, FOREIGN KEY
, and UNIQUE
constraints can be dropped.
drop constraint operations have this structure:
{
"drop_constraint": {
"table": "name of table",
"column": "name of column on which constraint is defined",
"name": "name of constraint to drop",
"up": "SQL expression",
"down": "SQL expression"
}
}
Example drop constraint migrations:
A drop index operation drops an index from a table.
drop index operations have this structure:
{
"drop_index": {
"name": "name of index to drop"
}
}
Example drop index migrations:
A drop table operation drops a table.
drop table operations have this structure:
{
"drop_table": {
"name": "name of table to drop"
}
}
Example drop table migrations:
A raw SQL operation runs arbitrary SQL against the database. This is intended as an 'escape hatch' to allow a migration to perform operations that are otherwise not supported by pgroll
.
pgroll
is unable to guarantee that raw SQL migrations are safe and will not result in application downtime.
sql operations have this structure:
{
"sql": {
"up": "SQL expression",
"down": "SQL expression"
}
}
Example raw SQL migrations:
A rename table operation renames a table.
rename table operations have this structure:
{
"rename_table": {
"from": "old column name",
"to": "new column name"
}
}
Example rename table migrations:
A set replica identity operation sets the replica identity for a table.
set replica identity operations have this structure:
{
"set_replica_identity": {
"table": "name of the table",
"identity": {
"type": "full | default | nothing | index"
"index": "name of the index, if type is 'index'"
}
}
}
Example set replica identity migrations: