
Background worker fails with "empty host" when PGHOST is a Unix socket path (CNPG/Kubernetes) #266

Description

@kkyehit

postgres_connection_string() produces malformed URL when PGHOST is a Unix socket path (CNPG / Kubernetes environments)

Summary

When running pg_durable inside a CloudNativePG (CNPG) managed PostgreSQL cluster, the background worker fails to start and all workflow instances remain in pending state. It seems like the root cause is that CNPG sets PGHOST to a Unix socket directory path (e.g. /controller/run) rather than a TCP hostname. The postgres_connection_string() function in src/types.rs appears to use PGHOST verbatim in a URL-format connection string, which may produce an invalid URL and trigger an "empty host" error.

Environment

  • pg_durable (latest main)
  • PostgreSQL 17 (Debian Bookworm)
  • CloudNativePG (CNPG) v0.28.3 on Kubernetes (Minikube v1.35.1)
  • CNPG sets PGHOST=/controller/run (Unix socket directory)

Steps to Reproduce

  1. Deploy a CNPG cluster with pg_durable loaded via shared_preload_libraries.
  2. Create the extension: CREATE EXTENSION IF NOT EXISTS pg_durable;
  3. Start a simple workflow:
    SELECT df.start(
      'SELECT ''step 1 done''' ~> 'SELECT ''step 2 done'''
    );
  4. Check the instance status — it stays pending.
  5. Inspect the connection string used by the extension:
    SELECT * FROM df.debug_connection();

Actual Behavior

debug_connection() returns:

postgres://postgres@/controller/run:5432/postgres

The PostgreSQL pod logs report:

ERROR: empty host

The background worker cannot connect, so no workflow ever advances beyond pending.

Expected Behavior

The connection string should be valid regardless of whether PGHOST is a TCP hostname/IP or a Unix socket directory path. Possible correct forms for a Unix socket path:

  • TCP fallback: postgres://postgres@127.0.0.1:5432/postgres
  • libpq keyword/value: host=/controller/run port=5432 user=postgres dbname=postgres
  • Percent-encoded URL: postgres://postgres@%2Fcontroller%2Frun:5432/postgres
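As a rough illustration of the percent-encoded form listed above, the following sketch encodes every byte outside the RFC 3986 unreserved set, so socket directories containing characters other than / are also covered. The names percent_encode_host and socket_url are hypothetical and not part of pg_durable; user and database are left unencoded here for brevity, which is an assumption that they contain only URL-safe characters.

```rust
/// Percent-encode every byte that is not an RFC 3986 "unreserved" character,
/// so a socket directory such as `/controller/run` can sit in the URL host slot.
/// Hypothetical helper, not part of pg_durable.
fn percent_encode_host(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len() * 3);
    for byte in raw.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Everything else, including '/', becomes %XX.
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Hypothetical helper: build the URL form with the socket directory encoded.
/// Assumes `user` and `database` need no encoding.
fn socket_url(socket_dir: &str, port: u16, user: &str, database: &str) -> String {
    let host = percent_encode_host(socket_dir);
    format!("postgres://{user}@{host}:{port}/{database}")
}
```

Calling socket_url("/controller/run", 5432, "postgres", "postgres") yields the percent-encoded URL shown in the list. Whether the client library used by the background worker decodes the host back into a socket path would still need to be confirmed.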

Root Cause

It looks like postgres_connection_string() in src/types.rs unconditionally interpolates PGHOST into a postgres:// URL:

pub fn postgres_connection_string() -> String {
    let host = std::env::var("PGHOST").unwrap_or_else(|_| "127.0.0.1".to_string());
    let port = unsafe { pgrx::pg_sys::PostPortNumber };
    let user = get_worker_role();
    let database = get_database();

    format!("postgres://{user}@{host}:{port}/{database}")
}

When PGHOST is /controller/run, this produces:

postgres://postgres@/controller/run:5432/postgres

This is not a valid URL. The authority component ends at the first /, so the socket path is read as the URL path and the host after postgres@ is empty. The connection-string parser then rejects the string with "empty host".
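To make the empty-host effect concrete, here is a deliberately simplified splitter. It is only an illustration of how the authority is cut at the first /; it is not the parser the extension uses, it ignores IPv6 literals and other edge cases, and naive_host is a hypothetical name.

```rust
/// Simplified illustration only: split `scheme://[user@]host[:port]/path`
/// and return the host part. Real URL parsers handle many more cases
/// (IPv6 literals, passwords containing '@', and so on).
fn naive_host(url: &str) -> Option<&str> {
    let rest = url.split_once("://")?.1;
    // The authority ends at the first '/', so a raw socket path cuts it short.
    let authority = rest.split('/').next()?;
    // Drop the optional `user@` prefix.
    let host_port = authority.rsplit_once('@').map_or(authority, |(_, hp)| hp);
    // Drop the optional `:port` suffix.
    let host = host_port.split(':').next()?;
    Some(host)
}
```

With the string reported by debug_connection(), the authority is just postgres@, so the extracted host is the empty string. With a TCP host or a percent-encoded socket path, the host survives intact.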

The same issue affects get_host():

pub fn get_host() -> String {
    std::env::var("PGHOST").unwrap_or_else(|_| "127.0.0.1".to_string())
}
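One way to keep both call sites consistent would be to classify the PGHOST value once and let each caller decide how to render it. The sketch below is an assumption about how that could look, not existing pg_durable code: HostKind and classify_host are hypothetical names, and treating an empty PGHOST like an unset one is a choice made here that differs from the current unwrap_or_else behavior.

```rust
/// Hypothetical classification of a PGHOST value; not part of pg_durable.
#[derive(Debug, PartialEq)]
enum HostKind {
    /// PGHOST is an absolute path, i.e. a Unix socket directory.
    UnixSocketDir(String),
    /// PGHOST is a hostname or IP address (or was unset or empty).
    Tcp(String),
}

/// Classify the raw PGHOST value. An absolute path is treated as a socket
/// directory; unset or empty values fall back to loopback (an assumption).
fn classify_host(pghost: Option<&str>) -> HostKind {
    match pghost {
        Some(h) if h.starts_with('/') => HostKind::UnixSocketDir(h.to_string()),
        Some(h) if !h.is_empty() => HostKind::Tcp(h.to_string()),
        _ => HostKind::Tcp("127.0.0.1".to_string()),
    }
}
```

A caller could pass std::env::var("PGHOST").ok().as_deref() and then match on the result, so that get_host() and postgres_connection_string() cannot disagree about what counts as a socket path.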

Suggested Fix

Some kind of guard is needed to detect whether PGHOST starts with / (indicating a Unix socket directory) and handle it differently from a regular TCP hostname. The following is just one possible approach to illustrate the idea — I am not requesting this exact implementation, and I trust the maintainers to choose the most appropriate solution:

pub fn postgres_connection_string() -> String {
    let host = std::env::var("PGHOST").unwrap_or_else(|_| "127.0.0.1".to_string());
    let port = unsafe { pgrx::pg_sys::PostPortNumber };
    let user = get_worker_role();
    let database = get_database();

    if host.starts_with('/') {
        // Unix socket: encode the path or use keyword/value format
        // Option A – percent-encode the socket directory in the URL host
        let encoded = host.replace('/', "%2F");
        format!("postgres://{user}@{encoded}:{port}/{database}")
        // Option B – fall back to loopback TCP (simpler, but requires TCP access on loopback)
        // format!("postgres://{user}@127.0.0.1:{port}/{database}")
    } else {
        format!("postgres://{user}@{host}:{port}/{database}")
    }
}

The key point is that when PGHOST is a socket path, the current string interpolation produces an invalid URL. Any approach that avoids inserting a raw socket path into the host component of a postgres:// URL would resolve the issue. Option A preserves the socket path via percent-encoding; Option B falls back to TCP loopback, which is normally reachable within the same pod but depends on listen_addresses including the loopback address and on pg_hba.conf allowing the worker role to connect over TCP.
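The keyword/value form mentioned under Expected Behavior is not shown in the snippet above, so here is a minimal sketch of it. It assumes the client library accepts libpq-style keyword/value strings, which has not been verified for pg_durable. The names quote_conninfo_value and keyword_value_conninfo are hypothetical. The quoting follows the libpq rule that values containing whitespace, or empty values, are wrapped in single quotes, with embedded single quotes and backslashes escaped by a backslash.

```rust
/// Quote a value for a libpq-style keyword/value connection string.
/// Plain values pass through; empty values and values with whitespace,
/// single quotes, or backslashes are wrapped in single quotes and escaped.
/// Hypothetical helper, not part of pg_durable.
fn quote_conninfo_value(value: &str) -> String {
    let needs_quotes = value.is_empty()
        || value
            .chars()
            .any(|c| c.is_whitespace() || c == '\'' || c == '\\');
    if !needs_quotes {
        return value.to_string();
    }
    let mut out = String::from("'");
    for c in value.chars() {
        if c == '\'' || c == '\\' {
            out.push('\\');
        }
        out.push(c);
    }
    out.push('\'');
    out
}

/// Build `host=... port=... user=... dbname=...`. A socket directory can be
/// passed as `host` unchanged, since no URL parsing is involved.
fn keyword_value_conninfo(host: &str, port: u16, user: &str, database: &str) -> String {
    format!(
        "host={} port={} user={} dbname={}",
        quote_conninfo_value(host),
        port,
        quote_conninfo_value(user),
        quote_conninfo_value(database)
    )
}
```

For the CNPG case, keyword_value_conninfo("/controller/run", 5432, "postgres", "postgres") produces the keyword/value string listed under Expected Behavior. The appeal of this form is that the socket path needs no encoding at all; the cost is that it only helps if the connection code accepts non-URL strings.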

Additional Context

CNPG (and possibly other Kubernetes-native PostgreSQL operators) seems to configure PGHOST as a socket directory path rather than a hostname by default. It also appears that modifying this environment variable from within the cluster may be restricted, making it difficult to work around the issue on the user side.
