[Bug]: Better logging/error detection when recovery fails #8501

@AntonOfTheWoods

Description

Is there an existing issue already for this bug?

  • I have searched for an existing issue, and could not find anything. I believe this is a new bug.

I have read the troubleshooting guide

  • I have read the troubleshooting guide and I think this is a new bug.

I am running a supported version of CloudNativePG

  • I am running a currently supported version of CloudNativePG.

Contact Details

anton.melser@outlook.com

Version

1.27 (latest patch)

What version of Kubernetes are you using?

1.33

What is your Kubernetes environment?

Self-managed: k3s

How did you install the operator?

YAML manifest

What happened?

Restoring from a backup taken with version 0.6.0 of the barman plugin failed.

There is basically no useful information in the log: the only error is a generic "General error (exit code 4)". Re-running the backup (against the live restore source) succeeds.

It would be good to get more relevant detail, at least enough to tell whether it was a network issue, a problem restoring files, etc.
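
For what it's worth, these are the places I looked for more context, none of which said anything more specific (pod and container names are taken as they appear in the log header below, so this is only a sketch, and only works while the recovery pod still exists):

# The instance manager log pasted below; kubectl defaults to the
# "full-recovery" container, as the first line of the output shows.
kubectl logs supabase-cluster-1-full-recovery -c full-recovery

# The barman plugin runs in its own container in the same pod; any output
# from the underlying barman-cloud commands should end up here.
kubectl logs supabase-cluster-1-full-recovery -c plugin-barman-cloud

# Cluster events occasionally carry a bit more detail.
kubectl describe cluster.postgresql.cnpg.io supabase-cluster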

Cluster resource

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: supabase-cluster
spec:
  instances: 1
  imageName: supabase/postgres:17.4.1.077
  # imageName: supabase/postgres:17.5.1.020-orioledb

  imagePullPolicy: IfNotPresent
  superuserSecret:
    name: supabase-postgres
  postgresUID: 101  # 105 for 17.4.1.062 and below
  postgresGID: 102  # 106 for 17.4.1.062 and below

  plugins:
  - name: barman-cloud.cloudnative-pg.io
    isWALArchiver: true
    parameters:
      barmanObjectName: minio-store

  # To bootstrap a new database (needs supabase-migratejob.yaml afterwards)
  # bootstrap:
  #   initdb:
  #     database: supabase
  #     owner: supabase_admin

  # To restore from a backup
  bootstrap:
    recovery:
      source: bootstrap-source
  externalClusters:
  - name: bootstrap-source
    plugin:
      name: barman-cloud.cloudnative-pg.io
      parameters:
        barmanObjectName: minio-store-prod

  projectedVolumeTemplate:
    sources:
      - secret:
          name: supabase-pgsodium
          items:
            # available at /projected/postgresql-custom/pgsodium_root.key
            - key: pgsodium_root.key
              path: postgresql-custom/pgsodium_root.key
      - configMap:
          name: pgsodium-getkey
          items:
            # available at /projected/postgresql-custom/pgsodium_getkey.sh
            - key: pgsodium_getkey.sh
              path: postgresql-custom/pgsodium_getkey.sh
              mode: 511
  managed:
    services:
      disabledDefaultServices: ["ro", "r"]
      additional:
        - selectorType: rw
          serviceTemplate:
            metadata:
              name: "supabasedb-svc"
              labels:
                "istio.io/use-waypoint": "transcrobes-supabase-waypoint"
    roles:
    - name: supabase_admin
      ensure: present
      comment: "Supabase Admin"
      login: true
      superuser: true
      createdb: true
      createrole: true
      replication: true
      bypassrls: true
      passwordSecret:
        name: supabase-superuser
  storage:
    pvcTemplate:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 40Gi
      storageClassName: manual-db
      volumeMode: Filesystem

  enableSuperuserAccess: true
  postgresql:
    parameters:
      cron.database_name: supabase
      pg_net.database_name: supabase
      vault.getkey_script: '/projected/postgresql-custom/pgsodium_getkey.sh'
      pgsodium.getkey_script: '/projected/postgresql-custom/pgsodium_getkey.sh'
      auto_explain.log_min_duration: 10s
      supautils.extensions_parameter_overrides: '{"pg_cron":{"schema":"pg_catalog"}}'
      supautils.policy_grants: '{"postgres":["auth.audit_log_entries","auth.identities","auth.refresh_tokens","auth.sessions","auth.users","realtime.messages","storage.buckets","storage.migrations","storage.objects","storage.s3_multipart_uploads","storage.s3_multipart_uploads_parts"]}'
      supautils.drop_trigger_grants: '{"postgres":["auth.audit_log_entries","auth.identities","auth.refresh_tokens","auth.sessions","auth.users","realtime.messages","storage.buckets","storage.migrations","storage.objects","storage.s3_multipart_uploads","storage.s3_multipart_uploads_parts"]}'
      supautils.privileged_extensions: 'address_standardizer, address_standardizer_data_us, autoinc, bloom, btree_gin, btree_gist, citext, cube, dblink, dict_int, dict_xsyn, earthdistance, fuzzystrmatch, hstore, http, hypopg, index_advisor, insert_username, intarray, isn, ltree, moddatetime, orioledb, pg_buffercache, pg_cron, pg_graphql, pg_hashids, pg_jsonschema, pg_net, pg_prewarm, pg_repack, pg_stat_monitor, pg_stat_statements, pg_tle, pg_trgm, pg_walinspect, pgaudit, pgcrypto, pgjwt, pgroonga, pgroonga_database, pgrouting, pgrowlocks, pgsodium, pgstattuple, pgtap, plcoffee, pljava, plls, plpgsql_check, postgis, postgis_raster, postgis_sfcgal, postgis_tiger_geocoder, postgis_topology, postgres_fdw, refint, rum, seg, sslinfo, supabase_vault, supautils, tablefunc, tcn, tsm_system_rows, tsm_system_time, unaccent, uuid-ossp, vector, wrappers'
      supautils.privileged_extensions_custom_scripts_path: '/etc/postgresql-custom/extension-custom-scripts'
      supautils.privileged_extensions_superuser: 'supabase_admin'
      supautils.privileged_role: 'postgres'
      supautils.privileged_role_allowed_configs: 'auto_explain.*, log_lock_waits, log_min_duration_statement, log_min_messages, log_replication_commands, log_statement, log_temp_files, pg_net.batch_size, pg_net.ttl, pg_stat_statements.*, pgaudit.log, pgaudit.log_catalog, pgaudit.log_client, pgaudit.log_level, pgaudit.log_relation, pgaudit.log_rows, pgaudit.log_statement, pgaudit.log_statement_once, pgaudit.role, pgrst.*, plan_filter.*, safeupdate.enabled, session_replication_role, track_io_timing, wal_compression'
      supautils.reserved_memberships: 'pg_read_server_files, pg_write_server_files, pg_execute_server_program, supabase_admin, supabase_auth_admin, supabase_storage_admin, supabase_read_only_user, supabase_realtime_admin, supabase_replication_admin, dashboard_user, pgbouncer, authenticator'
      supautils.reserved_roles: 'supabase_admin, supabase_auth_admin, supabase_storage_admin, supabase_read_only_user, supabase_realtime_admin, supabase_replication_admin, dashboard_user, pgbouncer, service_role*, authenticator*, authenticated*, anon*'

    pg_hba:
      # ripped from supabase/postgres/ansible/files/postgresql_config/pg_hba.conf
      - local all  supabase_admin       scram-sha-256
      - local all  all                  peer map=supabase_map
      - host  all  all  127.0.0.1/32    trust
      - host  all  all  ::1/128         trust
      - host  all  all  10.0.0.0/8      scram-sha-256
      - host  all  all  172.16.0.0/12   scram-sha-256
      - host  all  all  192.168.0.0/16  scram-sha-256
      - host  all  all  0.0.0.0/0       scram-sha-256
      - host  all  all  ::0/0           scram-sha-256
    pg_ident:
      # ripped from supabase/postgres/ansible/files/postgresql_config/pg_ident.conf
      - supabase_map  postgres   postgres
      - supabase_map  gotrue     supabase_auth_admin
      - supabase_map  postgrest  authenticator
      - supabase_map  adminapi   postgres
    # ripped from supabase/postgres/ansible/files/postgresql_config/postgresql.conf
    shared_preload_libraries:
      [
        pg_stat_statements,
        pg_stat_monitor,
        pgaudit,
        plpgsql,
        plpgsql_check,
        pg_cron,
        pg_net,
        # orioledb,
        auto_explain,
        pg_tle,
        supautils,
        pgsodium,
        supabase_vault,
        plan_filter
      ]

Relevant log output

Defaulted container "full-recovery" out of: full-recovery, bootstrap-controller (init), plugin-barman-cloud (init)
{"level":"info","ts":"2025-09-01T03:37:33.9719199Z","msg":"Starting webserver","logging_pod":"supabase-cluster-1-full-recovery","address":"localhost:8010","hasTLS":false}
{"level":"info","ts":"2025-09-01T03:37:34.075084558Z","msg":"pg_controldata check on existing directory failed, cleaning up folders","pgdata":"/var/lib/postgresql/data/pgdata","logging_pod":"supabase-cluster-1-full-recovery","err":"while executing pg_controldata: exit status 1","out":""}
{"level":"info","ts":"2025-09-01T03:37:34.075116432Z","msg":"cleaning up existing data directory","pgdata":"/var/lib/postgresql/data/pgdata","pgwal":"","logging_pod":"supabase-cluster-1-full-recovery"}
{"level":"info","ts":"2025-09-01T03:37:34.277487332Z","msg":"Restore through plugin detected, proceeding...","logging_pod":"supabase-cluster-1-full-recovery"}
{"level":"error","ts":"2025-09-01T03:39:33.023347596Z","msg":"Error while restoring a backup","logging_pod":"supabase-cluster-1-full-recovery","error":"rpc error: code = Unknown desc = General error (exit code 4)","stacktrace":"github.com/cloudnative-pg/machinery/pkg/log.(*logger).Error\n\tpkg/mod/github.com/cloudnative-pg/machinery@v0.3.1/pkg/log/log.go:125\ngithub.com/cloudnative-pg/cloudnative-pg/internal/cmd/manager/instance/restore.restoreSubCommand\n\tinternal/cmd/manager/instance/restore/restore.go:79\ngithub.com/cloudnative-pg/cloudnative-pg/internal/cmd/manager/instance/restore.(*restoreRunnable).Start\n\tinternal/cmd/manager/instance/restore/restore.go:62\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/manager/runnable_group.go:226"}
{"level":"info","ts":"2025-09-01T03:39:33.023505826Z","msg":"Stopping and waiting for non leader election runnables"}
{"level":"info","ts":"2025-09-01T03:39:33.02353804Z","msg":"Stopping and waiting for leader election runnables"}
{"level":"info","ts":"2025-09-01T03:39:33.023699636Z","msg":"Webserver exited","logging_pod":"supabase-cluster-1-full-recovery","address":"localhost:8010"}
{"level":"info","ts":"2025-09-01T03:39:33.023740792Z","msg":"Stopping and waiting for caches"}
{"level":"info","ts":"2025-09-01T03:39:33.023810241Z","msg":"Stopping and waiting for webhooks"}
{"level":"info","ts":"2025-09-01T03:39:33.023827138Z","msg":"Stopping and waiting for HTTP servers"}
{"level":"info","ts":"2025-09-01T03:39:33.023837684Z","msg":"Wait completed, proceeding to shutdown the manager"}
{"level":"error","ts":"2025-09-01T03:39:33.02385908Z","msg":"restore error","logging_pod":"supabase-cluster-1-full-recovery","error":"while restoring cluster: rpc error: code = Unknown desc = General error (exit code 4)","stacktrace":"github.com/cloudnative-pg/machinery/pkg/log.(*logger).Error\n\tpkg/mod/github.com/cloudnative-pg/machinery@v0.3.1/pkg/log/log.go:125\ngithub.com/cloudnative-pg/cloudnative-pg/internal/cmd/manager/instance/restore.NewCmd.func1\n\tinternal/cmd/manager/instance/restore/cmd.go:101\ngithub.com/spf13/cobra.(*Command).execute\n\tpkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1015\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tpkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1148\ngithub.com/spf13/cobra.(*Command).Execute\n\tpkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1071\nmain.main\n\tcmd/manager/main.go:71\nruntime.main\n\t/opt/hostedtoolcache/go/1.24.6/x64/src/runtime/proc.go:283"}
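
Since the error gives no hint whether the object store was even reachable, one sanity check is listing the backups directly with barman-cloud-backup-list. A rough sketch of that check (the endpoint, bucket and server name here are hypothetical placeholders, not values from the manifests above):

# Hypothetical object store coordinates; substitute whatever the
# minio-store-prod ObjectStore resource actually points at.
barman-cloud-backup-list \
  --cloud-provider aws-s3 \
  --endpoint-url https://minio.example.com \
  s3://backup-bucket/ \
  supabase-cluster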

Code of Conduct

  • I agree to follow this project's Code of Conduct
