CLOS-2132: Detect postgresql.conf unix_socket_directories pre-upgrade and add explicit migration steps#65

Merged
prilr merged 2 commits into cloudlinux:cloudlinux from prilr:CLOS-2132-elevate-postgresql-cannot-be-started-aft on May 19, 2026

@prilr prilr commented May 15, 2026

Add a second report in postgresqlcheck that fires when postgresql-server is installed AND /var/lib/pgsql/data/postgresql.conf contains an active "unix_socket_directories" (plural) setting.

Root cause confirmed from 3 separate cases.

CL7's postgresql-server 9.2 ships with a forward-compatible patch that accepts the newer (PG 9.3+) plural parameter name "unix_socket_directories", so postgres runs fine on CL7 with that setting. RHEL-8's postgresql-upgrade package, which provides the old 9.2 server binary that pg_upgrade launches to read the data-old cluster, ships an unpatched 9.2 build that rejects the plural form with:

  LOG:  unrecognized configuration parameter "unix_socket_directories" in file "/var/lib/pgsql/data-old/postgresql.conf"
  FATAL: configuration file "/var/lib/pgsql/data-old/postgresql.conf" contains errors
  pg_ctl: could not start server

The default RHEL-7 postgresql.conf ships with the plural line commented out, but admin edits, cPanel tooling, and config-management tools commonly uncomment it (it's "the modern form" per current PG docs).

Users hit by this saw their migration fail with a confusing socket-connection error; the actual cause is buried in pg_upgrade_server.log.

The new check:

  • Reads /var/lib/pgsql/data/postgresql.conf if it exists.
  • Matches "^\s*unix_socket_directories\s*=" (active, not commented).
  • Emits a HIGH-severity report with a concrete sed remediation (rename plural -> singular) for both pre-upgrade and post-upgrade cases.
  • Treats missing/unreadable files as "not affected" - if the data dir was never initialized (no initdb), the bug can't trigger.
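
The detection steps listed above could be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `has_active_plural_setting` is hypothetical, and the pattern is the one quoted in the commit message (`^\s*unix_socket_directories\s*=`).

```python
import io
import re

# Pattern as quoted in the commit message: an active (uncommented) plural setting.
_ACTIVE_PLURAL = re.compile(r"^\s*unix_socket_directories\s*=")

# Hypothetical helper name; the default path is the one named in the PR description.
def has_active_plural_setting(conf_path="/var/lib/pgsql/data/postgresql.conf"):
    """Return True if conf_path contains an uncommented unix_socket_directories line."""
    try:
        # errors="replace" keeps stray non-UTF-8 bytes from aborting the scan.
        with io.open(conf_path, encoding="utf-8", errors="replace") as conf:
            return any(_ACTIVE_PLURAL.match(line) for line in conf)
    except (IOError, OSError):
        # Missing or unreadable file: treated as "not affected".
        return False
```

Matching line by line with an anchored pattern means a commented-out line (`#unix_socket_directories = ...`) and the singular `unix_socket_directory` are both ignored, while leading whitespace and a setting buried deep in the file are still caught.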

Tests cover the new detection function across plural / commented / singular / whitespace / buried / missing-file cases, and verify that the new report fires when postgresql-server is present and does not fire when the package is absent.
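
For readers who want to see what the plural -> singular rename amounts to, here is a rough Python equivalent of that remediation. It is a sketch under assumptions, not the sed command the report actually prints: the function name `rename_plural_to_singular` is hypothetical, and it assumes the configured value names a single directory, since the singular `unix_socket_directory` accepts only one path.

```python
import re

# Active (uncommented) plural setting; groups keep the original indentation and "=" spacing.
_PLURAL = re.compile(r"^(\s*)unix_socket_directories(\s*=)")

# Hypothetical helper: operates on the file contents, not on a path.
def rename_plural_to_singular(conf_text):
    """Rename active unix_socket_directories lines to unix_socket_directory."""
    fixed = []
    for line in conf_text.splitlines(True):  # True keeps the line endings
        fixed.append(_PLURAL.sub(r"\1unix_socket_directory\2", line, count=1))
    return "".join(fixed)
```

Commented-out lines are left untouched, so the file's documentation comments stay as shipped.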

prilr and others added 2 commits May 14, 2026 23:15
Replace the generic "follow the docs" hint with explicit post-upgrade
steps that walk the user through the manual PostgreSQL data migration:

  1. dnf install postgresql-upgrade
  2. postgresql-setup --upgrade
  3. systemctl start postgresql

Also call out that the data stays in the old 9.2 format and that
postgresql will refuse to start until the migration is done - this is
the proximate confusion the customer described in the ticket
(systemd showed "An old version of the database format was found" but
the recommended migration command failed with a missing-package error,
leaving the customer stuck).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a second report in postgresqlcheck that fires when postgresql-server
is installed AND /var/lib/pgsql/data/postgresql.conf contains an active
"unix_socket_directories" (plural) setting.

Root cause confirmed from 3 customer tickets (ZD-197576, ZD-204621,
ZD-228015) and reproduced end-to-end on a CL7 nopanel test VM:

CL7's postgresql-server 9.2 ships with a forward-compatible patch that
accepts the newer (PG 9.3+) plural parameter name
"unix_socket_directories" - so postgres runs fine on CL7 with that
setting. RHEL-8's postgresql-upgrade package, which provides the old
9.2 server binary that pg_upgrade launches to read the data-old
cluster, ships an UN-patched 9.2 build that rejects the plural form
with:

  LOG:  unrecognized configuration parameter "unix_socket_directories"
        in file "/var/lib/pgsql/data-old/postgresql.conf"
  FATAL: configuration file "/var/lib/pgsql/data-old/postgresql.conf"
         contains errors
  pg_ctl: could not start server

The default RHEL-7 postgresql.conf ships with the plural line commented
out, but admin edits, cPanel tooling, and config-management tools
commonly uncomment it (it's "the modern form" per current PG docs).
Customers hit by this saw their migration fail with a confusing
socket-connection error; the actual cause is buried in
pg_upgrade_server.log, which customers don't usually know to read.

The new check:
  - Reads /var/lib/pgsql/data/postgresql.conf if it exists.
  - Matches "^\s*unix_socket_directories\s*=" (active, not commented).
  - Emits a HIGH-severity report with a concrete sed remediation
    (rename plural -> singular) for both pre-upgrade and post-upgrade
    cases.
  - Treats missing/unreadable files as "not affected" - if the data dir
    was never initialized (no initdb), the bug can't trigger.

Tests cover the new detection function across plural / commented /
singular / whitespace / buried / missing-file cases, and verify that
the new report fires when postgresql-server is present and does not
fire when the package is absent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@azheregelya azheregelya self-requested a review May 19, 2026 09:14
@prilr prilr merged commit 86f6da3 into cloudlinux:cloudlinux May 19, 2026
1 check passed
@prilr prilr deleted the CLOS-2132-elevate-postgresql-cannot-be-started-aft branch May 19, 2026 19:31