Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrading between major versions? #37

Open
roosmaa opened this issue Nov 23, 2014 · 65 comments
Open

Upgrading between major versions? #37

roosmaa opened this issue Nov 23, 2014 · 65 comments
Labels

Comments

@roosmaa
Copy link

@roosmaa roosmaa commented Nov 23, 2014

There doesn't seem to be a good way to upgrade between major versions of postgres. When sharing the volume with a new container with a newer version of postgres it won't run as the data directory hasn't been upgraded. pg_upgrade on the other hand requires (?) old installation binary files, so upgrading the data files from new server container is also difficult.

It would be nice if there was some suggested way of doing this in the readme. Maybe even some meta container which does the upgrading from version to version?

@md5
Copy link
Contributor

@md5 md5 commented Nov 23, 2014

The "Usage" section of this document describes what pg_upgrade does: http://www.postgresql.org/docs/9.4/static/pgupgrade.html

@roosmaa
Copy link
Author

@roosmaa roosmaa commented Nov 23, 2014

I guess my concern does not communicate well in the first comment... Lets try again.

pg_upgrade functionality is perfectly clear: you need to have both postgres [old] and [new] versions installed at the same time to upgrade the data structures on disk. But what is not clear is the best way of going about the upgrade when using containers, since every container only has a single version of postgres available/installed.

  • Should one create a custom upgrade-from-X-to-Y container with both X and Y versions of postgres installed?
  • Or should one just apt-get install new postgres version into the existing container, upgrade the data and then replace the container with one of the "vanilla" containers?
  • Or some even more clever way of doing this?

It would be fairly useful to have a section in the Readme regarding how to approach this. (And what to avoid?)

@tianon
Copy link
Member

@tianon tianon commented Nov 23, 2014

I've been wondering about/looking out for a good way to handle this
myself. If someone's got a good flow that works well for them, I'm very
interested in getting something documented (and "transition images" added
if that's what's necessary here -- this is a common enough thing IMO that
it doesn't really make sense to have everyone create their own for their
one-off needs).

Perhaps for each "supported" version, we could add a variant that also
installs the latest major release, and then we document how to create your
own one-off transition container if you need a transition more esoteric
than that?

@md5
Copy link
Contributor

@md5 md5 commented Nov 23, 2014

@roosmaa: my comment wasn't so much directed at you as intended as background.

@kajmagnus
Copy link

@kajmagnus kajmagnus commented Jan 13, 2015

+1 for major version upgrade instructions.

Yet another alternative, for smaller databases (?), should be that the old container runs pg_dumpall and saves the dump somewhere in the volume. Then one stops the container, starts a new one with the new major version, and it imports the dump.

@mpe
Copy link

@mpe mpe commented Mar 20, 2015

Yeah this is a major pain ... at the moment. Would be great to get some official way to do it.

@belak
Copy link

@belak belak commented Jun 16, 2015

Yeah, just nuked my postgres install on accident with an upgrade. Are there any plans for this?

@mkarg
Copy link

@mkarg mkarg commented Dec 29, 2015

I am not a docker guru, but I could imagine that it is possible to add some script to the container that is started when the container starts. That script should check the database files, and when it detects the correct version, simply startup the postgresql engine. If not, it should decide to run pgupgrade or dump/reload. Shouldn't be rocket science?

@tianon
Copy link
Member

@tianon tianon commented Dec 29, 2015

@dcbishop
Copy link

@dcbishop dcbishop commented Jan 31, 2016

+1.

I'm just going to nuke what I have since it's just a personal server I don't really use, but this seems like a major pain maintenance wise.

@Dirbaio
Copy link

@Dirbaio Dirbaio commented Feb 5, 2016

I've had success upgrading launching another postgres instance with the new version, and then using dumpall and psql to move all the data piping:

docker exec postgres-old pg_dumpall -U postgres | docker exec -i postgres-new psql -U postgres

It's rather simple :)

pg_upgrade is supposed to be faster though, I'm interested if someone has an easy way of using it.

@asprega
Copy link

@asprega asprega commented Mar 8, 2016

I was attempting to hack up a bash script to manage the upgrade process between two containers of different versions by using pg_upgrade, but I hit a roadblock:

#!/bin/bash

#
# Script to migrate PostgreSQL data from one version to another
#

set -e

OLD_CONTAINER=$1
OLD_VOLUME=$2

NEW_CONTAINER=$3
NEW_VOLUME=$4

if [ -z "$OLD_CONTAINER" ] || [ -z "$OLD_VOLUME" ] || [ -z "$NEW_CONTAINER" ] || [ -z "$NEW_VOLUME" ]; then
  echo -e
  echo -e "Usage: ./pg_upgrade_docker.sh [old container name] [old volume name] [new container name] [new volume name]"
  echo -e "Example: ./pg_upgrade_docker.sh postgres94 postgres94_data postgres95 postgres95_data"
  echo -e
  exit 1;
fi

# Get the major version and the pg binaries out of the old postgres container
(docker start "$OLD_CONTAINER" || true) > /dev/null
OLD_MAJOR="$(docker exec "$OLD_CONTAINER" bash -c 'echo "$PG_MAJOR"')"

OLD_BIN_DIR="/tmp/pg_upgrade/bin/$OLD_MAJOR"
mkdir -p "$OLD_BIN_DIR"
rm -rf "$OLD_BIN_DIR"/*
docker cp "$OLD_CONTAINER":"/usr/lib/postgresql/$OLD_MAJOR/bin/." "$OLD_BIN_DIR"

(docker stop "$OLD_CONTAINER" || true) > /dev/null

# Get the major version out of the new postgres container
(docker start "$NEW_CONTAINER" || true) > /dev/null
NEW_MAJOR="$(docker exec "$NEW_CONTAINER" bash -c 'echo "$PG_MAJOR"')"
(docker stop "$NEW_CONTAINER" || true) > /dev/null

# Create a temp container running the new postgres version which we'll just use to migrate data from one volume to another.
# This container will use the old binaries we just extracted from the old container.
# We can't reuse the existing "new" container because we have to bind extra volumes for the update to work.
NEW_IMAGE="$(docker ps -a --filter "name=$NEW_CONTAINER" --format "{{.Image}}")"
docker run -v "$OLD_BIN_DIR":/tmp/old-pg-bin -v "$OLD_VOLUME":/tmp/old-pg-data -v "$NEW_VOLUME":/tmp/new-pg-data \
  --name temp_postgres_util "$NEW_IMAGE" su - postgres -c "cd /tmp && /usr/lib/postgresql/$NEW_MAJOR/bin/pg_upgrade \
    -b /tmp/old-pg-bin -B /usr/lib/postgresql/$NEW_MAJOR/bin \
    -d /tmp/old-pg-data/ -D /tmp/new-pg-data/ \
    -o \"-c config_file=/tmp/old-pg-data/postgresql.conf\" -O \"-c config_file=/tmp/new-pg-data/postgresql.conf\""

# Remove temp container
(docker stop temp_postgres_util) > /dev/null
(docker rm temp_postgres_util) > /dev/null

rm -rf "$OLD_BIN_DIR"

echo -e "Data migration from $OLD_MAJOR to $NEW_MAJOR is complete!"

the idea is a bit convoluted, because I'm extracting the binaries from the old version and mounting them into a new temporary container created from the same image of the container with the new version, along with data volumes from existing containers (the old and the new ones).
My idea was then to use that container to run pg_upgrade, then throw it away (and data would have been migrated through the two volumes).
When running the script, though, I get the following error:

Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok

*failure*
Consult the last few lines of "pg_upgrade_server.log" for
the probable cause of the failure.

connection to database failed: could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/tmp/.s.PGSQL.50432"?


could not connect to old postmaster started with the command:
"/tmp/old-pg-bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/tmp/old-pg-data/" -o "-p 50432 -b -c config_file=/tmp/old-pg-data/postgresql.conf -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/tmp'" start
Failure, exiting

Any ideas?

@donnykurnia
Copy link

@donnykurnia donnykurnia commented Mar 11, 2016

I'm using @Dirbaio tricks to dump all data from old container and restore it into the new container. Of course, this needs separate data volume and run fast with small data sets. I hope that pr_upgrade can be more 'intelligent', even better if the Postgres itself could do the upgrade by itself. I mean, when installing the new version, the apps should also know how old version format looks like. Maybe also include a necessary old version binary to do the data 'upgrade' then after the upgrade finished, just delete that old version binary.

@julienmathevet
Copy link

@julienmathevet julienmathevet commented Apr 8, 2016

+1

@mkarg
Copy link

@mkarg mkarg commented May 20, 2016

I think we should have a complete solution or none at all, as I doubt that most people always use the latest version.

From: Jonas Thiem [mailto:notifications@github.com]
Sent: Freitag, 20. Mai 2016 08:53
To: docker-library/postgres
Cc: Markus KARG; Comment
Subject: Re: [docker-library/postgres] Upgrading between major versions? (#37)

Ok so why can't the container be changed to simply have both the latest and the second-to-latest postgresql version installed? (e.g. just put a chroot into the docker container and install the older package from the distribution there or something. Once a script for that has be made that just needs to be passed the name of the respective older apt package, it shouldn't really be a lengthy process) At least that would cover the majority of users and it would allow to successfully convert the database for them automatically by the container.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub #37 (comment) https://github.com/notifications/beacon/ABn3tyqP3YQcazFg6O98ZNdeqZx_FYJwks5qDVpUgaJpZM4C_kVy.gif

@etc0de
Copy link

@etc0de etc0de commented May 23, 2016

Why is none at all any better? I would assume most people running a production server with some basic security consciousness will probably use at least some version of somewhere in the latest release cycle - such major upgrades aren't happening that often, right? (I'm thinking of the usual docker environment here optimized for flexibility and repeatability where upgrading and reverting if it goes wrong is usually less painful than for a regular oldschool server administrator) And if doing it for all versions is not feasible, at least doing it for the respective second-to-newest one should be doable..

If for the other cases where the upgrade can't be done there is a descriptive error message like there is now, I don't see how this wouldn't be at least a considerable improvement - although I agree that being able to upgrade from any arbitrary previous version automatically is of course better if it can be done.

@mkarg
Copy link

@mkarg mkarg commented May 23, 2016

It binds resources better invested into developing a general solution, and it possibly could stand in the way when coming up with the general solution.

-Markus

From: Jonas Thiem [mailto:notifications@github.com]
Sent: Montag, 23. Mai 2016 17:53
To: docker-library/postgres
Cc: Markus KARG; Comment
Subject: Re: [docker-library/postgres] Upgrading between major versions? (#37)

Why is none at all any better? I would assume most people running a production server with some basic security consciousness will probably use at least the latest release - they aren't happening that often, right? And if doing it for all versions is not feasible, at least doing it for the respective second-to-newest one should be doable..

If for the other cases where the upgrade can't be done there is a descriptive error message like there is now, I don't see how this wouldn't be at least a considerable improvement - although I agree that being able to upgrade from any arbitrary previous version automatically is of course better if it can be done.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub #37 (comment) https://github.com/notifications/beacon/ABn3t8Q4uNkFUuMUTDo-1BtrJaS5KDrRks5qEc1GgaJpZM4C_kVy.gif

@mkarg
Copy link

@mkarg mkarg commented May 23, 2016

Yes, resignation is the right word. We just do not see how it could be solved in a good way.

@mkarg
Copy link

@mkarg mkarg commented May 24, 2016

If you have time you can do whatever you like -- this is the nature of open source. Whether or not the project leads will accept your PR is a different question and not up to me to decide.

@tianon
Copy link
Member

@tianon tianon commented May 24, 2016

@tianon
Copy link
Member

@tianon tianon commented May 25, 2016

I spent a little time working on a generic way to run pg_upgrade for somewhat arbitrary version combinations (only increasing, ie 9.0 to 9.5, but not the reverse), and here's the result:
tianon/postgres-upgrade
(might be easier to read directly over at https://github.com/tianon/docker-postgres-upgrade/blob/master/README.md)

@rosenfeld
Copy link

@rosenfeld rosenfeld commented Aug 23, 2016

I haven't tested this yet but, since it's built on top of Debian, maybe an option would be to add some script to /docker-entrypoint-initdb.d like this, during the upgrade process (ideally making an snapshot of the original data first when using BTRFS for example):

Considering the current DB is on 9.4 and you want to migrate it to 9.5

apt-get update
apt-get install -y postgresql-9.5 postgresql-contrib-9.5
pg_dropcluster 9.5 main
pg_upgradecluster 9.4 main -m upgrade -k

This is the basic idea. However, this will add some unneccessary downtime. We could save some time by creating a temporary image:

FROM postgres:9.4
RUN apt-get update && apt-get install -y postgresql-9.5 postgresql-contrib-9.5
RUN echo "pg_dropcluster 9.5 main; pg_upgradecluster 9.4 main -m upgrade -k" > /docker-entrypoint-initdb.d/zzz-upgrade.sh

This is just to simplify the explanation, I'd probably use COPY rather than echo ... > ......

The container running from this new image would then only execute the remaining steps and would be run only once during the upgrade after stopping the previous 9.4 container and before starting the new 9.5 container. pg_upgradecluster will reuse the previous configuration when creating the new cluster.

Maybe to reduce the impact on users, one might want to run the 9.4 cluster in a read-only transaction during the upgrade so that it would mostly work and handle the write errors in the application to tell the users the database is being upgraded and that it's currently not possible to write new data to it... If upgrading the cluster would take a while even with the --link/-k option, maybe a read-only mode might give users a better experience rather than presenting them a static maintainance page for a long while... Maybe it could switch the header to let users know that it's in read-only mode during the upgrade in the meanwhile... It's more effort, but a better user experience for some sort of applications.

@yosifkit
Copy link
Member

@yosifkit yosifkit commented Oct 3, 2016

@Jonast, The only complete way is to have images for each set of supported versions; to ensure users can upgrade between them. @tianon, has a set of them in tianon/docker-postgres-upgrade. The other major problem is that it requires two folders (volumes) in specific locations to upgrade your database, since pg_upgrade requires both servers to be running and cannot just upgrade in place. Tianon's repo has more information on the requirements of this, which includes how to do it with one volume (properly setup), so that pg_upgrade can take advantage of it being a single drive and "use hard links instead of copying files to the new cluster".

The reason mariadb works to "upgrade automatically" is that the mariadb team has to put in extra work to make newer versions able to read older data directories, whereas postgres can make backwards incompatible changes in newer releases and not have to worry about file formats from older versions. This is why postgres data is usually stored under a directory of its version, but that would over complicate volume mounts if every version of postgres had a different volume.

@rosenfeld, unfortunately pg_dropcluster and pg_upgradecluster are specific tooling for the Debian provided packages and do not work with the Postgres provided apt/deb packages. Even after writing a new entrypoint to run the upgrade I just get failures, since it is assume specific directory structures and probably an init system:

$ # script:
#!/bin/bash

gosu postgres pg_ctl -D "$PGDATA" \
    -o "-c listen_addresses='localhost'" \
    -w start

gosu postgres pg_dropcluster 9.5 main
gosu postgres pg_upgradecluster 9.4 main -m upgrade -k

gosu postgres pg_ctl -D "$PGDATA" -m fast -w stop
exec gosu postgres "$@"

$ # shortened ouput (after initializing the db folder with a regular postgres image)
...
waiting for server to start....LOG:  database system was shut down at 2016-10-03 18:22:33 UTC
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
 done
server started
Error: specified cluster does not exist
Error: specified cluster does not exist
@rosenfeld
Copy link

@rosenfeld rosenfeld commented Oct 4, 2016

Hi @yosifkit. I don't understand why you think pg_upgradecluster being specific to Debian is a problem since the official PG images are based on Debian. As a matter of fact this is how I upgraded my PG servers from 9.5.4 to 9.6:

docker pull postgresql:9.6 # so we can easily start it after the upgrade is finished
docker run -it --rm --name pg-upgrade -v /mnt/pg/data:/var/lib/postgresql \
  -v /mnt/pg/config:/etc/postgresql --entrypoint bash postgres:9.5.4

# in the bash session in the container I installed 9.6:
apt-get update && apt-get install -y postgresql-9.6 postgresql-contrib-9.6
# Then, in another session, I stopped the server running current PG in order to
# start the upgrade (it's always a good idea to perform a back-up of the data
# before upgrading anyway).
pg_upgradecluster -m upgrade --link 9.5 main
# optionally: pg_dropcluster 9.5 main

After the upgrade is finished just start the new 9.6 container with the same arguments.

Here's how I run the container:

docker run --name pg -v /mnt/pg/scripts:/pg-scripts \
  -v /mnt/pg/data:/var/lib/postgresql \
  -v /mnt/pg/config:/etc/postgresql -p 5432:5432 \
  postgresql:9.6 /pg-scripts/start-pg 9.6 10.0.1.0/24

I stop it with docker exec pg pg_ctlcluster -f 9.6 main stop.

Here's how start-pg looks like:

#!/bin/bash
version=$1
net=$2
setup_db(){
  pg_createcluster $version main -o listen_addresses='*' -o wal_level=hot_standby \
    -o max_wal_senders=3 -o hot_standby=on -- -A trust
  pghba=/etc/postgresql/$version/main/pg_hba.conf
  echo -e "host\tall\tpguser\t$net\ttrust" >> $pghba
  echo -e "host\treplication\tpguser\t$net\ttrust" >> $pghba
  pg_ctlcluster $version main start
  psql -U postgres -c '\du' postgres|grep -q pguser || \
    createuser -U postgres -l -s pguser
  pg_ctlcluster $version main stop
}
[ -d /var/lib/postgresql/$version/main ] || setup_db
exec pg_ctlcluster --foreground $version main start

The server is actually managed by some systemd unit files, but this is basically how it works behind the scene.

@fabiand
Copy link

@fabiand fabiand commented Oct 16, 2016

What about a solution where the new container is using an image with an older postgresql version to launch the older version, then use the regular flow to perform the update?
(This is basically nesting the older postgresql container inside the new postgresql container).

It might be tricky to get it right, but at least the new postgresql image does not need to contain any previous postgresql version.

@dschilling
Copy link

@dschilling dschilling commented Oct 26, 2016

I love the simplicity of @Dirbaio's approach. You wouldn't have to know what two versions you're migrating between for it to work. If you're worried about the speed of that approach, it sounds like the Upgrade via Replication approach described in the docs would be the best option, over pg_upgrade, in terms of actual downtime.

@rosenfeld
Copy link

@rosenfeld rosenfeld commented Oct 26, 2016

@dschilling yes, if you can afford the time to set up slony (I started to read its documentation once but found it very complicated) or if you can't afford any downtime window, that's probably the best plan. Since I'm not comfortable with setting up Slony I preferred to use the method I explained above because the downtime is much smaller when you use pg_upgrade with --link. Since it can only upgrade the full cluster rather than a single database I decided to use our production database in a dedicated cluster so that the downtime would be as small as possible while using the pg_upgrade approach.

@thijslemmens
Copy link

@thijslemmens thijslemmens commented Nov 4, 2016

Instead of Slony, you could use pglogical:
https://2ndquadrant.com/en/resources/pglogical/

I plan to try this approach, using a HAProxy to switch from the old to the new version. Pglogical should be easier to use than Slony, but I still need to validate.

@yosifkit
Copy link
Member

@yosifkit yosifkit commented Nov 14, 2018

Or there needs to be a docker image to do the upgrade, like postgress:db10to11 or something

This PoC seems to work: https://github.com/tianon/docker-postgres-upgrade

@bwbroersma
Copy link

@bwbroersma bwbroersma commented Oct 12, 2019

I have made a Proof of Concept of a new entrypoint.sh script, to perform both major PG version upgrades (with pg_upgrade) and schema updates. It is far from perfect, see the todo 🙂.

The benefit is that it just upgrades with:

$ docker-compose up -d

https://github.com/bwbroersma/docker-postgres-upgrade

@johnmanko
Copy link

@johnmanko johnmanko commented Nov 26, 2019

This is also an issue with image tags. I used 12 not thinking that beta releases would ever be paired with release tags. I'm now stuck on 12-beta3 having to deal with updating to 12/12.0. Such a pain.

frafra referenced this issue in GeoNode/geonode Nov 26, 2019
aguestuser added a commit to team-friendo/signalboost that referenced this issue Apr 9, 2020
* this prevents not being able to read the db if we have to rebuild
  from scratch after the postgres docker image is upgraded.
* see: docker-library/postgres#37
aguestuser added a commit to team-friendo/signalboost that referenced this issue Apr 9, 2020
* this prevents not being able to read the db if we have to rebuild
  from scratch after the postgres docker image is upgraded.
* see: docker-library/postgres#37
@tianon tianon pinned this issue Nov 18, 2020
@djbrown
Copy link

@djbrown djbrown commented Jan 17, 2021

It's 2021 now, is there any progress on this issue? 🙏🏾

@tianon
Copy link
Member

@tianon tianon commented Jan 19, 2021

It's 2021 now, is there any progress on this issue? 🙏🏾

As far as I know, none of the upstream PostgreSQL upstream limitations on upgrading have changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet