Implement support for core dump creation and back trace extraction in CI #218

GeoffMontee · 2019-10-14T20:33:09Z

It might be a nice improvement to our CI if we were able to create core dumps when PostgreSQL crashes and to automatically extract back traces from them. This would probably make it easier to debug crashes like the one we saw in #213.

This idea was originally mentioned in a comment on #214.

This would probably require at least the changes listed below.

Changes Required in `ci-build`

We would have to make sure that ci-build compiles tds_fdw with the -ggdb option specified in PG_CPPFLAGS, so that tds_fdw is built with debugging symbols.

Changes Required in `ci-setup`

We would have to make sure that ci-setup installs debuginfo packages for PostgreSQL.

For example:

sudo yum install postgresql12-debuginfo

We would have to make sure that ci-setup grants unlimited size core dumps to the PostgreSQL process.

For OSes that use systemd, that would probably look like this:

sudo tee /etc/systemd/system/postgresql-12.service.d/limitcore.conf <<EOF
[Service]

LimitCORE=infinity
EOF
sudo systemctl daemon-reload

For other OSes, that would probably look like this:

sudo tee /etc/security/limits.d/postgres_core.conf <<EOF
postgres soft core unlimited
postgres hard core unlimited
EOF

We would also have to make sure that ci-setup sets up some other parameters related to core dumps.

For example:

sudo tee /etc/sysctl.d/postgres_core.conf <<EOF
# Write core dumps into the /core_dumps directory, using "core" as the base file name
kernel.core_pattern = /core_dumps/core

# Add the PID to the end of the file name
kernel.core_uses_pid = 1

# Allow setuid processes to dump core. Is this necessary for Postgres?
fs.suid_dumpable = 2
EOF
sudo sysctl -p /etc/sysctl.d/postgres_core.conf

We would also have to make sure that ci-setup creates any paths that we depend on.

For example:

sudo mkdir -p /core_dumps
sudo chmod 0777 /core_dumps

Changes Required in `tds_fdw`

We would have to change tests/postgresql-tests.py in tds_fdw to make it detect PostgreSQL crashes. One option would be to scan the PostgreSQL log for lines like this:

2019-10-02 02:28:48.702 UTC [50] LOG:  server process (PID 292) was terminated by signal 11: Segmentation fault
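
As a rough sketch of that log scan (not what tests/postgresql-tests.py does today), something like the following could work. The names `CRASH_RE` and `find_crashes` are hypothetical, and the regular expression is an assumption based only on the sample log line above.

```python
import re

# Matches the log line that reports a server process killed by a signal.
# The pattern is derived from the sample line above and is an assumption.
CRASH_RE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<signal>\d+)"
)


def find_crashes(log_text):
    """Return a list of (pid, signal) tuples, one per crash reported in log_text."""
    return [
        (int(match.group("pid")), int(match.group("signal")))
        for match in CRASH_RE.finditer(log_text)
    ]


sample = (
    "2019-10-02 02:28:48.702 UTC [50] LOG:  server process (PID 292) "
    "was terminated by signal 11: Segmentation fault"
)
print(find_crashes(sample))  # prints [(292, 11)]
```

Returning the PID along with the signal number might help match a crash to the core file it produced, since the PID is appended to the core file name.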

If tests/postgresql-tests.py detects a PostgreSQL crash, then it would have to get the value of kernel.core_pattern.

For example:

sysctl kernel.core_pattern

When tests/postgresql-tests.py has the value of kernel.core_pattern, it could check the path for core dumps.
When tests/postgresql-tests.py finds a core dump, it could get all backtraces from it.
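
A minimal sketch of those two steps, under some assumptions: it reads /proc/sys/kernel/core_pattern, which on Linux exposes the same value that `sysctl kernel.core_pattern` reports, and it treats every regular file in the target directory as a core dump. The function names are hypothetical.

```python
import glob
import os


def read_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Read the kernel's core_pattern setting (Linux only)."""
    with open(path) as handle:
        return handle.read()


def core_dump_dir(core_pattern):
    """Return the directory core files are written to, or None if unknown."""
    pattern = core_pattern.strip()
    # A leading "|" means the kernel pipes the dump to a helper program,
    # so there is no directory to scan.
    if not pattern or pattern.startswith("|"):
        return None
    directory = os.path.dirname(pattern)
    # A relative pattern is resolved against the crashing process's working
    # directory, which is not known here.
    return directory if os.path.isabs(directory) else None


def find_core_files(directory):
    """Return the regular files in directory, oldest first."""
    paths = [p for p in glob.glob(os.path.join(directory, "*")) if os.path.isfile(p)]
    return sorted(paths, key=os.path.getmtime)
```

The test script could call `core_dump_dir(read_core_pattern())` after detecting a crash and pass the result to `find_core_files`, skipping back trace extraction when the directory is `None`.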

For example:

sudo gdb --batch --eval-command="thread apply all bt full" $(which postmaster) ${core_file_path}
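
From Python, the same gdb invocation could be wrapped roughly as follows. This is a sketch only: the function names, the `use_sudo` switch, and the timeout value are assumptions, and the path to the server binary has to be supplied by the caller (for example from `which postmaster`, as above).

```python
import subprocess


def build_gdb_command(postgres_binary, core_file, use_sudo=False):
    """Build the argument list for a batch-mode gdb run over one core file."""
    prefix = ["sudo"] if use_sudo else []
    return prefix + [
        "gdb",
        "--batch",
        "--eval-command=thread apply all bt full",
        postgres_binary,
        core_file,
    ]


def extract_backtrace(postgres_binary, core_file, use_sudo=False, timeout=300):
    """Run gdb and return its combined stdout and stderr as text."""
    result = subprocess.run(
        build_gdb_command(postgres_binary, core_file, use_sudo),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,  # hypothetical limit so a hung gdb cannot stall CI
        check=False,      # gdb output is still useful if it exits non-zero
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems if the core file path ever contains spaces.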

juliogonzalez · 2019-11-19T23:28:07Z

I will start working on this as soon as we get rid of CentOS6 for the testing (so that we only have to support systemd, which is already used by Ubuntu 18.04).

I will also need to check how this would work inside the docker containers we use. Most probably no big deal, but you never know :-)

GeoffMontee mentioned this issue Dec 4, 2019

Seg fault when selecting from SQL Server foreign table in a plpgsql function with local variable #236

Open
