This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

[BUG] pg_dump/pg_restore does not work #5630

Closed
1 of 3 tasks
vnghia opened this issue Dec 11, 2023 · 35 comments
Labels
bug (Something isn't working), needs triage (Bug that needs triage from maintainer)

Comments

@vnghia

vnghia commented Dec 11, 2023

The bug

When running pg_dump and later pg_restore, the restoration process throws an error:

pg_restore: error: COPY failed for table "geodata_places": ERROR:  type "earth" does not exist
LINE 1: ...ians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth
                                                                  ^
QUERY:  SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth
CONTEXT:  SQL function "ll_to_earth" during inlining
COPY geodata_places, line 1: "3038999	Soldeu	1.66769	42.57688	AD	02		2017-11-06"
pg_restore: warning: errors ignored on restore: 1

I think PR #5301 causes this issue.

The OS that Immich Server is running on

Ubuntu 22.04

Version of Immich Server

v1.89.0

Version of Immich Mobile App

v1.89.0

Platform with the issue

  • Server
  • Web
  • Mobile

Your docker-compose.yml content

N/A.

Your .env content

N/A.

Reproduction steps

1. Immich with some geodata
2. `pg_dump -Fc -U immich -w -d immich -f immich.dump`
3. `pg_restore -U immich -w -d immich --clean immich.dump`
...

Additional information

After some googling, I found this: diogob/activerecord-postgres-earthdistance#30 (comment). Basically, we have to replace `earth` with `public.earth`.

@vnghia vnghia added the bug (Something isn't working) and needs triage (Bug that needs triage from maintainer) labels Dec 11, 2023
@bo0tzz
Member

bo0tzz commented Dec 11, 2023

Cc @zackpollard

@nebulade
Contributor

Came to report the same issue, as our Cloudron packaging tests hit this.

@nebulade
Contributor

To add some more findings: looking at the postgres dump created from the database, it contains the following for earthdistance:

CREATE EXTENSION IF NOT EXISTS earthdistance WITH SCHEMA public;

All other occurrences of earthdistance extension functions seem to be correctly prefixed with public.

@alextran1502
Contributor

@vnghia What Postgres version were you using when performing pg_dump and pg_restore?

@vnghia
Author

vnghia commented Dec 11, 2023

I'm using 16, but I think other versions are affected as well, since pg_dump and pg_restore are version-independent, at least for the SQL dump.

@nebulade
Contributor

This also happens on postgres 14

@alextran1502
Contributor

Does the following guide work?

https://immich.app/docs/administration/backup-and-restore

@nebulade
Contributor

Comparing that with our case, it seems pg_dumpall vs pg_dump is the culprit. In Cloudron only pg_dump is performed. We'll have to see what the potential side effects of changing that are.

@ModestTG

Reporting that streaming data out of Cloudnative PG and into a new cluster also runs into this same issue. It seems to fail during the pg_restore step.

My Cloudnative PG cluster config, trying to get data out of a PG14 database into a new cluster that has pgvecto.rs installed by default: https://github.com/ModestTG/jace-cluster/blob/main/kubernetes/apps/database/cloudnative-pg/cluster/cluster2.yaml

@nebulade
Contributor

nebulade commented Dec 11, 2023

To add a few more findings: adding SET search_path TO "$user",public; (this may vary depending on your setup; it can be fetched from a working instance with SHOW search_path) at the top of the SQL dump makes the restore work. I don't have a solution yet for making pg_dump do this automatically, so one doesn't have to edit the file before restoring.

The behavior of unsetting search_path is part of https://www.postgresql.org/docs/release/10.3/

I guess the question is why ll_to_earth() uses earth() without a prefix, but I am really not a postgres extension expert :/
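A small shell sketch of that manual workaround (file names are illustrative, and the search_path value should match what SHOW search_path reports on a working instance):

```shell
# Prepend a search_path line to a plain-text dump so unqualified names
# like earth() resolve during restore.
fix_dump() {
  # $1: plain-text dump from pg_dump/pg_dumpall; writes $1.fixed
  printf 'SET search_path TO "$user",public;\n' > "$1.fixed"
  cat "$1" >> "$1.fixed"
}
# Usage: fix_dump dump.sql && psql -U immich -w -d immich -f dump.sql.fixed
```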

@vnghia
Author

vnghia commented Dec 11, 2023

Just want to add that SET search_path TO "$user",public; is only usable if we dump the database as plain text (an SQL script). In my case (binary format), it seems there is no way to restore the backup besides converting the binary format to plain text beforehand.
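One possible escape hatch for the binary case (a sketch, assuming the archive was made with pg_dump -Fc as in the reproduction steps; file names are illustrative): pg_restore can emit the archive as a plain SQL script, which can then be edited and replayed with psql.

```shell
# Written out as a script; requires the PostgreSQL client tools.
cat > convert-and-restore.sh <<'EOF'
#!/bin/sh
set -e
# Convert the custom-format archive to an editable plain SQL script
pg_restore -f immich.sql immich.dump
# ...edit immich.sql here to fix the search_path, with sed or an editor...
# Replay the patched script
psql -U immich -w -d immich -f immich.sql
EOF
```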

@jrasm91
Contributor

jrasm91 commented Dec 11, 2023

I think it is worth noting that there are two separate things that need to happen:

  1. Schema needs to be re-created and
  2. Data needs to be reloaded

Normally (1) happens via the migrations, which do correctly create the required extensions.

When you run the pg_dump command you have the option to include only the data, only the schema, or both the schema and the data. The command in the docs exports the schema and the data together. We should maybe update the documentation to instead recommend:

  1. Export data only
  2. Start up new install
  3. Import backup
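A sketch of that flow (assuming database and user named immich; --data-only is the standard pg_dump/pg_restore flag for steps 1 and 3):

```shell
cat > data-only-flow.sh <<'EOF'
#!/bin/sh
set -e
# 1. Export data only; the schema (and extensions) are recreated by
#    Immich's migrations on a fresh install.
pg_dump -Fc --data-only -U immich -w -d immich -f immich-data.dump
# 2. Start up the new install and let the migrations run, then:
# 3. Import the data-only backup
pg_restore --data-only -U immich -w -d immich immich-data.dump
EOF
```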

@nebulade
Contributor

nebulade commented Dec 11, 2023

The default recommended backup/restore path using pg_dumpall actually works fine, so I don't think it needs a fix for such deployments. However, we only hit issues when a non-default deployment and database setup is used. For those cases, relying on a special process like "install fresh (to create the schema) -> import data" adds more requirements to the backup/restore strategy, which I think should rather be avoided.

@vnghia
Author

vnghia commented Dec 11, 2023

Based on this Stackoverflow answer and this thread, this is an earthdistance bug: it uses the unprefixed "earth" type, and there's nothing we can do about it.

The workaround is to move cube and earthdistance to the pg_catalog schema, which is always searched even when search_path is unset:

For a newly-created database:

CREATE EXTENSION cube WITH SCHEMA pg_catalog;
CREATE EXTENSION earthdistance WITH SCHEMA pg_catalog;

For a migrated database:

ALTER EXTENSION cube SET SCHEMA pg_catalog;
ALTER EXTENSION earthdistance SET SCHEMA pg_catalog;

This only needs to be done once; there is no need to modify the SQL dump, it works with binary dumps as well, and the easiest way to do it is to add a migration:

ALTER EXTENSION cube SET SCHEMA pg_catalog;
ALTER EXTENSION earthdistance SET SCHEMA pg_catalog;

but moving something to pg_catalog is considered bad practice :/

I cannot restore even with pg_dumpall, so if we decide not to move the extensions, we should at least add a warning to the backup-and-restore doc.

Happy to send a PR if needed 😄

Edit: On second thought, moving the extensions will affect only the current immich database, which is fine I guess :/
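To double-check the result, the system catalogs can be queried for the schema each extension lives in (a read-only query using the standard pg_extension and pg_namespace catalogs):

```sql
-- After the ALTER EXTENSION statements, cube and earthdistance
-- should report pg_catalog here.
SELECT e.extname, n.nspname AS schema
FROM pg_extension e
JOIN pg_namespace n ON n.oid = e.extnamespace
WHERE e.extname IN ('cube', 'earthdistance');
```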

@ModestTG

I tried the SQL statements for a newly-created database and didn't have any success. I'm bootstrapping with Cloudnative PG using initdb and I'm still getting the type "earth" does not exist error. My cluster is defined here: https://github.com/ModestTG/jace-cluster/blob/main/kubernetes/apps/database/cloudnative-pg/cluster/cluster2.yaml

I know my use case is outside the defined docker-compose approach, but I think most people running K8S are doing something similar to what I'm doing. I'd like to stream the data into a new cluster instead of just drop-in replacing my PG 14 image with another image that has pgvecto.rs enabled. I'm (probably foolishly) running all of my self-hosted apps on a shared psql cluster, so I'd like to not disturb what's already working if I can. Thanks again for the discussion.

@vnghia
Author

vnghia commented Dec 12, 2023

@ModestTG, can you try the

CREATE EXTENSION cube WITH SCHEMA pg_catalog;
CREATE EXTENSION earthdistance WITH SCHEMA pg_catalog;

or

ALTER EXTENSION cube SET SCHEMA pg_catalog;
ALTER EXTENSION earthdistance SET SCHEMA pg_catalog;

on your old PG cluster (the source database) and stream it again? The schema is defined inside the source database, so if you execute these statements on the new database, they will be overwritten by the schema from the old one.

@ModestTG

@vnghia That seems to have done the trick. The cluster has come up! Thank you so much!

@vnghia
Author

vnghia commented Dec 12, 2023

So I guess the best solution right now is adding a migration:

ALTER EXTENSION cube SET SCHEMA pg_catalog;
ALTER EXTENSION earthdistance SET SCHEMA pg_catalog;

WDYT @zackpollard ?

@zackpollard
Contributor

zackpollard commented Dec 12, 2023

I'll need to spend some time looking into this myself. It was mentioned above (by you) that moving things into pg_catalog is considered bad practice, so I would want to understand what implications this could have.

@ModestTG

ModestTG commented Dec 12, 2023

So I migrated the data and I'm trying to get immich stood up in the new cluster, and I keep getting <Table> already exists errors plus errors saying the server can't connect to the database. Perhaps my new psql cluster doesn't have the right permissions? When the migration happened, the database name did not migrate over; I had to rename the database.

EDIT: I just noticed that all of the tables also have the wrong owner. I'll fix that. I would have thought the bootstrap initdb from Cloudnative PG would keep all of that info, but I guess it doesn't. Maybe I'm not understanding it correctly.

@nebulade
Contributor

For anyone wanting to understand the root cause a bit better: pg_dump sets the search_path to "", which was done due to https://wiki.postgresql.org/wiki/A_Guide_to_CVE-2018-1058%3A_Protect_Your_Search_Path
This causes the restore to fail, as it can't look up symbols like earth() unless they are fully prefixed.

After all this, I don't think it's a good idea to register the extensions into pg_catalog as a solution to the problem.

For a trusted environment, adding public to the search_path during restore is not a security issue as far as I understand the CVE, so I think it's the preferred solution: it requires no immich changes and is also in line with the preferred backup/restore procedure for immich, which uses pg_dumpall anyway.

@ModestTG

I was able to get immich migrated from one Cloudnative PG cluster to another Cloudnative PG cluster. Here is the method I used:

  1. Shut down Immich to stop writes to the DB
  2. Run the following commands on the existing DB
ALTER EXTENSION cube SET SCHEMA pg_catalog;
ALTER EXTENSION earthdistance SET SCHEMA pg_catalog;
  3. Create the new cluster using the bootstrap initdb method
spec:
  instances: 3
  imageName: ghcr.io/bo0tzz/cnpgvecto.rs:14-v0.1.10 # CNPG compatible image with pgvecto.rs extension installed
  bootstrap:
    initdb:
      import:
        type: microservice
        databases:
          - immich
        source:
          externalCluster: postgres # name of cluster where existing immich data lives
(...)
  externalClusters:
    - name: postgres
      connectionParameters:
        host: postgres-ro.database.svc.cluster.local
        user: postgres
        dbname: immich
      password:
        name: cloudnative-pg-secret
        key: password
  4. For me, this created a database in the new cluster called app, owned by app. You could either update your immich HR to point at this new db with the new role, or update the db to fit the existing config (the choice I took)
  5. To update the new cluster, perform the following psql commands on it:
ALTER DATABASE app RENAME TO immich; --or whatever your db name is
CREATE ROLE immich; --or whatever your role is
ALTER ROLE immich WITH LOGIN;
\c immich;
CREATE EXTENSION vectors; --add extension for v1.91.0 release
REASSIGN OWNED BY app TO immich;
  6. Update Immich HR secrets/config to point at the new postgres cluster
  7. Bring up Immich and verify all data is present.

@nebulade
Contributor

For Cloudron, we have decided to remove the search_path statement that newer pg_dump versions add to the SQL dump. The added line looks like SELECT pg_catalog.set_config('search_path', '', false); and if it is removed, pg_restore will pick up the default search_path from the main postgres config. This makes the restore work correctly, but given the CVE, it is only recommended in a trusted environment.

From our side no immich changes are required then.
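For reference, that line removal can be scripted; a minimal sketch (trusted environments only, per the CVE caveat above; the dump file name is illustrative):

```shell
# Delete the restrictive search_path line that newer pg_dump versions emit.
strip_search_path() {
  # $1: plain-text SQL dump; edited in place
  sed -i "/SELECT pg_catalog.set_config('search_path', '', false);/d" "$1"
}
# Usage: strip_search_path dump.sql, then restore with psql as usual.
```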

@vnghia
Author

vnghia commented Dec 13, 2023

So I think we should add a warning to the "Backup and restore" section that pg_dump will produce an unusable backup unless the extensions are moved into pg_catalog or the dump is manually edited, and then we can close this issue.

@vnwnv

vnwnv commented Dec 21, 2023

I'm testing disaster recovery, and I hit the same problem when using pg_dumpall.

I'm using postgres 15 from Debian and immich v1.90.2.

I back up and restore like this:

# backup
pg_dumpall -c -U postgres > dump.sql

# restore
psql -f dump.sql postgres 

@alextran1502
Contributor

@wlmqpsc Please try to follow the guide here https://immich.app/docs/administration/backup-and-restore/

@vnwnv

vnwnv commented Dec 21, 2023

OK, I tried this. @alextran1502

For my install I needed to use pg_dumpall -c -U postgres | gzip > "./dump.sql.gz" to back up and, on the new database, gunzip < "/import/dump.sql.gz" | psql -U postgres -d immich to restore.

And I got the same error:

ERROR:  type "earth" does not exist
LINE 1: ...ians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth
                                                                  ^
QUERY:  SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth
CONTEXT:  SQL function "ll_to_earth" during inlining
COPY geodata_places, line 1: "3038999   Soldeu  1.66769 42.57688        AD      02              2017-11-06"

@vktr2b

vktr2b commented Dec 21, 2023

I've also encountered this issue, but I solved it by changing this line in my dump.sql so it uses both search paths, not just the default one.

SELECT pg_catalog.set_config('search_path', '', false)

changed to

SELECT pg_catalog.set_config('search_path', 'public, pg_catalog', true)

This sed command sets the search path to include both the public schema and the pg_catalog schema, preventing the issue from occurring during import:

sed -i "s/SELECT pg_catalog.set_config('search_path', '', false);/SELECT pg_catalog.set_config('search_path', 'public, pg_catalog', true);/g" dump.sql

To verify that the change has been successfully applied, grep can be used:

grep "SELECT pg_catalog.set_config" dump.sql

I also tested removing the line altogether, as mentioned above. It worked without issues, but I was more comfortable just leaving it there and adding those two schemas.

@vnghia
Author

vnghia commented Dec 26, 2023

I got a similar error while trying to restore a backup made with 1.91.1; these lines

CREATE INDEX IF NOT EXISTS face_index ON asset_faces
USING vectors (embedding cosine_ops) WITH (options = $$

and

CREATE INDEX IF NOT EXISTS clip_index ON smart_search
USING vectors (embedding cosine_ops) WITH (options = $$

should be USING public.vectors instead. Although removing SELECT pg_catalog.set_config('search_path', '', false) makes the issue go away, it would be better if we could fix this somehow.

/cc @jrasm91

@erikvanoosten

Same error. I am on Immich v1.91.4 and I followed the guide https://immich.app/docs/administration/backup-and-restore almost to the letter. (I had to do docker-compose up --no-start instead of docker compose create, otherwise the network was not created.)

What procedure can be followed when access to the source database is lost?

@erikvanoosten

erikvanoosten commented Dec 29, 2023

With the sed command suggested in #5630 (comment), this works:

Change

gunzip < "/path/to/backup/dump.sql.gz" | \
  docker exec -i immich_postgres psql -U postgres -d immich    # Restore Backup

to:

gunzip < "/path/to/backup/dump.sql.gz" | \
  sed "s/SELECT pg_catalog.set_config('search_path', '', false);/SELECT pg_catalog.set_config('search_path', 'public, pg_catalog', true);/g" | \
  docker exec -i immich_postgres psql -U postgres -d immich    # Restore Backup

@Zakmaf

Zakmaf commented Jan 16, 2024

Thank you so much @erikvanoosten, your solution worked right away.

@bjrne

bjrne commented Feb 4, 2024

The default recommended backup/restore path using pg_dumpall actually works fine (@nebulade)

This was not the case for me. I dumped with the suggested dumpall from version 1.93.3 and used the official restore procedure exactly. I have a very plain docker compose setup: all immich containers, nothing custom. I also restored to 1.93.3. However, I still got the error during the restore:

ERROR:  type "earth" does not exist              
LINE 1: ...ians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth                           
                                                                  ^                               
QUERY:  SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth                                                 
CONTEXT:  SQL function "ll_to_earth" during inlining                                              
COPY geodata_places, line 1: "3038999   Soldeu  1.66769 42.57688        AD      02              2017-11-06"

I haven't found anything broken in the restored instance though; should there have been some trouble?

@reefland

reefland commented Feb 9, 2024

Ran into this using Cloudnative-PG while doing a v14 to v16 migration. My Postgres is not dedicated to Immich; I was able to list all databases except Immich, and they all migrated cleanly via the Cloudnative-PG bootstrap "monolith" method.

Slightly different steps, but same theme overall:

Step 1 - Create Manual Cluster Database Dump of PG14

cd ~/kubernetes/cloudnative-pg/backups
kubectl exec postgres14-1 -c postgres -n cloudnative-pg -- pg_dumpall -c -U postgres > cluster14.pg_dumpall

Step 2 - Use Script to Extract Immich Database from the dumpall file

https://nicolaiarocci.com/how-to-restore-a-single-postgres-database-from-a-pg_dumpall-dump/

Simple script:

cat pg_extract.sh

#!/bin/bash
# Usage: pg_extract.sh <postgresql dump> <dbname>
# Prints everything from the \connect line for <dbname> up to the
# "PostgreSQL database dump complete" marker.
[ $# -lt 2 ] && { echo "Usage: $0 <postgresql dump> <dbname>"; exit 1; }
sed "/connect.*$2/,\$!d" "$1" | sed "/PostgreSQL database dump complete/,\$d"

Used as:

./pg_extract.sh cluster14.pg_dumpall immich > immich.dump

Step 3 - Fix Search Path:

  • Could probably add this as a sed step in the script above
  • Edit the dump file: immich.dump
  • Locate the line starting with SELECT pg_catalog.set_config (about line 8)
  • Add the word "public" as below:
SELECT pg_catalog.set_config('search_path', 'public', false);
  • Save changes

Step 4 - Manually create Immich database and extensions as others documented.

Step 5 - Import Immich Dump File:

PGPASSWORD=<app_user_password> psql -p 5432 -h 192.168.10.247 -d immich -U app < immich.dump
  • Where PGPASSWORD is the password of the app user/role
  • Where -h is the host name, in this case the IP address of the new PG cluster load balancer
  • Where -d points to the immich database
  • Where -U points to the app user/role

This imported cleanly with no SQL Errors.

@immich-app immich-app locked and limited conversation to collaborators Feb 9, 2024
@jrasm91 jrasm91 converted this issue into discussion #7004 Feb 9, 2024
