2.25.1.0-b146

@arpang arpang tagged this 08 Jan 04:07
Summary:
initdb spawns two types of backend processes:
- bootstrap backend process: responsible for initializing pg_data dir and template1 database
- regular backend process: responsible for creating the remaining catalog objects

initdb in YugabyteDB is executed in two modes:
- global initdb: responsible for creating objects that must be created once per cluster. Spawns both bootstrap and regular backend processes.
- local initdb: responsible for initializing the pg_data directory on every node. Spawns only the bootstrap backend process (skips template1 database creation).

This revision makes the following improvements to initdb:
- Logs the stdout and stderr of both the bootstrap and regular backend processes to a separate `initdb.log` file inside the `FLAGS_log_dir` directory, if `FLAGS_log_dir` is set.
- Yugabyte has a wrapper over pclose(), named yb_pclose_check(). It incorrectly assumes that a backend process can only terminate by exiting normally. Consequently, if a backend process is terminated by a signal (for example, aborted), yb_pclose_check() returns 0. This makes local initdb appear successful even when the backend process fails to initialize the pg_data directory. Fix yb_pclose_check() so that it returns 0 only when pclose() is successful or the process exits with YB_INITDB_ALREADY_DONE_EXIT_CODE.
- When the bootstrap backend process receives an abort signal, it does not delete the shared memory it allocated, causing a leak. PG doesn't care about this because it doesn't execute initdb in a loop. But in YB, with the fix described above, local initdb will fail (when the bootstrap backend is aborted) and be retried in a loop by the tserver. Hence, in YB it is important to fix this memory leak. Do so by registering a signal handler that deletes the shared memory when the bootstrap backend receives SIGABRT.
Jira: DB-13919
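
The pclose() status handling described above can be illustrated with a minimal sketch. This is not the actual yb_pclose_check() code, which is not shown in this summary. The names `sketch_pclose_check`, `sketch_run`, and `SKETCH_ALREADY_DONE_EXIT_CODE` are hypothetical, and the value 125 is an assumed placeholder rather than the real value of YB_INITDB_ALREADY_DONE_EXIT_CODE.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose popen()/pclose() under strict -std modes */
#endif
#include <stdio.h>
#include <sys/wait.h>

/* Assumed placeholder; the real YB_INITDB_ALREADY_DONE_EXIT_CODE value is not given here. */
#define SKETCH_ALREADY_DONE_EXIT_CODE 125

/*
 * Decode a wait status as returned by pclose().
 * Returns 0 only for a normal exit with code 0 or the "already done" code;
 * every other outcome, including death by signal, is reported as nonzero.
 */
static int
sketch_pclose_check(int status)
{
    if (status == -1)
        return -1;                      /* pclose() itself failed */

    if (WIFEXITED(status))
    {
        int code = WEXITSTATUS(status);

        if (code == 0 || code == SKETCH_ALREADY_DONE_EXIT_CODE)
            return 0;
        return code;                    /* ordinary nonzero exit */
    }

    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);  /* killed by a signal, e.g. SIGABRT */

    return -1;                          /* stopped or otherwise unexpected */
}

/* Run a shell command through popen() and classify how it ended. */
static int
sketch_run(const char *cmd)
{
    FILE *pipe_fp = popen(cmd, "r");

    if (pipe_fp == NULL)
        return -1;
    return sketch_pclose_check(pclose(pipe_fp));
}
```

The key point is that WIFEXITED() and WIFSIGNALED() are checked separately, so a process that was aborted is never mistaken for one that exited with code 0. Mapping a signal to `128 + signal number` follows the common shell convention and is a choice made for this sketch only.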

Test Plan:
Triggered an intentional PANIC during local initdb by mimicking `global/pg_control` file creation failure via:

```
diff --git a/src/postgres/src/backend/access/transam/xlog.c b/src/postgres/src/backend/access/transam/xlog.c
index ef3a3870d0..784a959499 100644
--- a/src/postgres/src/backend/access/transam/xlog.c
+++ b/src/postgres/src/backend/access/transam/xlog.c
@@ -3963,6 +3963,7 @@ WriteControlFile(void)

        fd = BasicOpenFile(XLOG_CONTROL_FILE,
                                           O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
+       fd = -1;
        if (fd < 0)
                ereport(PANIC,
                                (errcode_for_file_access(),
```

Verified that:
- stderr goes to the initdb.log file
- local initdb is unsuccessful and retried
- no shared memory is leaked during the retries (monitored with `ipcs -m`)
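
The shared memory check in the last item can also be reproduced in isolation. The sketch below is not the PostgreSQL/YugabyteDB code: it assumes a plain System V segment created with shmget(), and all `sketch_*` names are hypothetical. Note also that shmctl() is not on the POSIX list of async-signal-safe functions; calling it from a handler is an assumption that relies on it being a thin system call wrapper.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose System V IPC and sigaction() under strict -std modes */
#endif
#include <signal.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Segment id to remove on SIGABRT; -1 means nothing to clean up. */
static volatile sig_atomic_t sketch_shmid = -1;

/* SIGABRT handler: remove the segment, then die with the default action. */
static void
sketch_on_abort(int signo)
{
    if (sketch_shmid != -1)
        shmctl((int) sketch_shmid, IPC_RMID, NULL);
    signal(signo, SIG_DFL);
    raise(signo);
}

/* Create a private segment and arm the cleanup handler. Returns the id or -1. */
static int
sketch_create_segment(size_t size)
{
    struct sigaction sa;
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);

    if (id == -1)
        return -1;
    sketch_shmid = id;

    sa.sa_handler = sketch_on_abort;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGABRT, &sa, NULL) == -1)
    {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    return id;
}

/*
 * Fork a child that creates a segment and aborts.
 * Returns 1 if the segment is gone afterwards, 0 if it leaked,
 * and -1 if the experiment could not be set up.
 */
static int
sketch_abort_removes_segment(void)
{
    int fds[2];
    int id = -1;
    int status;
    struct shmid_ds ds;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0)
    {
        /* Child: report the segment id to the parent, then abort. */
        close(fds[0]);
        id = sketch_create_segment(4096);
        if (write(fds[1], &id, sizeof(id)) != (ssize_t) sizeof(id) || id == -1)
            _exit(2);
        abort();
    }

    /* Parent: learn the id, wait for the child, then probe the segment. */
    close(fds[1]);
    if (read(fds[0], &id, sizeof(id)) != (ssize_t) sizeof(id))
        id = -1;
    close(fds[0]);
    if (waitpid(pid, &status, 0) == -1 || id == -1)
        return -1;
    if (!WIFSIGNALED(status) || WTERMSIG(status) != SIGABRT)
        return -1;

    /* IPC_STAT fails once the segment has been removed. */
    return shmctl(id, IPC_STAT, &ds) == -1 ? 1 : 0;
}
```

Without the handler, the segment would outlive the aborted child and show up in `ipcs -m` after every retry. With it, the child still terminates with SIGABRT, so the parent continues to see the failure.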

Reviewers: telgersma, myang

Reviewed By: telgersma, myang

Subscribers: smishra, yql

Differential Revision: https://phorge.dev.yugabyte.com/D40903