[Bug]: Shutdown stalled waiting for TimescaleDB Background Worker Scheduler #4200

Closed
mkindahl opened this issue Mar 29, 2022 · 5 comments · Fixed by #4199 or #4323

Comments

@mkindahl
Contributor

mkindahl commented Mar 29, 2022

What type of bug is this?

Locking issue

What subsystems and features are affected?

Background worker

What happened?

If a shutdown is slow, it can cause the scheduler to block while shutting down.

TimescaleDB version affected

2.5.0

PostgreSQL version used

13.5

What operating system did you use?

Ubuntu 20.04.3 LTS

What installation method did you use?

Source

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

strace

strace: Process 1104 attached
futex(0x7fedaf9afc60, FUTEX_WAIT_PRIVATE, 2, NULL

gdb

redacted bt full

#0  0x00007f0e4d1040eb in __lll_lock_wait_private () from target:/lib/x86_64-linux-gnu/libc.so.6
[...]
#3  malloc (size=53)
[...]
#7  0x000055b9212235b1 in errmsg ()
#8  0x00007f0e27bf27a8 in handle_sigterm (postgres_signal_arg=15) at /build/timescaledb/src/bgw/scheduler.c:841
#9  <signal handler called>
[...]
#13 free (ptr=<optimized out>)
#14 0x00007f0e4db12cb4 in OPENSSL_LH_free () from target:/lib/x86_64-linux-gnu/libcrypto.so.1.1
[...]

How can we reproduce the bug?

Run the following and be patient until the situation occurs.

while sleep 0.1
    do pg_ctl restart -m fast
done
@mkindahl mkindahl added the bug label Mar 29, 2022
mkindahl added a commit to mkindahl/timescaledb that referenced this issue Mar 29, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes timescale#4200
@mkindahl mkindahl self-assigned this Mar 29, 2022
mkindahl added a commit to mkindahl/timescaledb that referenced this issue Mar 29, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes timescale#4200
@feikesteenbergen
Member

feikesteenbergen commented Mar 30, 2022

We have observed a few situations where, when shutting down or restarting PostgreSQL using fast mode, the shutdown stalls indefinitely.

What we observe is that almost all PostgreSQL processes have been terminated, yet a few remain, for example:

ps xef
    PID TTY      STAT   TIME COMMAND
   1092 ?        S      0:29 /usr/lib/postgresql/14/bin/postgres [...]
   1095 ?        Ss     0:00  \_ postgres: cluster0: checkpointer                                                                                                                        
   1099 ?        Ss     0:00  \_ postgres: cluster0: archiver last was 00000002000000000000004C                                                                                          
   1100 ?        Ss     0:00  \_ postgres: cluster0: stats collector                                                                                                                     
   1104 ?        Ss     0:00  \_ postgres: cluster0: TimescaleDB Background Worker Scheduler 

The checkpointer, stats collector and archiver are expected in this situation, as PostgreSQL will shut them down last.
The TimescaleDB Background Worker Scheduler is at this point not responding to signals, so a:

kill 1104

Does not have any effect.

Using strace shows us what the process is currently doing:

strace -p 1104
strace: Process 1104 attached
futex(0x7fedaf9afc60, FUTEX_WAIT_PRIVATE, 2, NULL

Taking a deeper dive with gdb shows us that during a free (#13), a malloc (#3) is called, causing a deadlock:

redacted bt full

#0  0x00007f0e4d1040eb in __lll_lock_wait_private () from target:/lib/x86_64-linux-gnu/libc.so.6
[...]
#3  malloc (size=53)
[...]
#7  0x000055b9212235b1 in errmsg ()
#8  0x00007f0e27bf27a8 in handle_sigterm (postgres_signal_arg=15) at /build/timescaledb/src/bgw/scheduler.c:841
#9  <signal handler called>
[...]
#13 free (ptr=<optimized out>)
#14 0x00007f0e4db12cb4 in OPENSSL_LH_free () from target:/lib/x86_64-linux-gnu/libcrypto.so.1.1
[...]
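
For illustration, the following is a minimal stand-alone C sketch (not TimescaleDB code; the handler name merely mirrors the one in the backtrace) of the pattern that produces this deadlock: the interrupted code is inside free() and holds the glibc allocator lock, and the signal handler then calls malloc() and blocks on the same lock forever.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void
handle_sigterm(int sig)
{
	/*
	 * malloc() is not async-signal-safe: if SIGTERM arrived while the main
	 * code was inside malloc()/free() and holding the allocator lock, this
	 * call blocks on that lock and never returns.
	 */
	char *msg = malloc(64);

	if (msg != NULL)
	{
		snprintf(msg, 64, "terminating on signal %d\n", sig);
		write(STDERR_FILENO, msg, strlen(msg));
		free(msg);
	}
	_exit(1);
}

int
main(void)
{
	signal(SIGTERM, handle_sigterm);

	for (;;)
	{
		/*
		 * Constant allocator traffic makes it likely that a signal lands
		 * while the allocator lock is held, as in frame #13 of the backtrace.
		 */
		free(malloc(53));
	}
}

Under a repeated kill -TERM loop, such a process can eventually hang on a futex in the same way PID 1104 does above.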

We have been able to reproduce this in an environment with atypical memory allocation, which, as far as we can tell, increases the chances of hitting this kind of bug.

To reproduce, we executed the following and waited for the situation to occur:

while sleep 0.1
    do pg_ctl restart -m fast
done

In a regular environment we have not been able to reproduce this yet, but we have had reports of stalled shutdowns.

@feikesteenbergen feikesteenbergen changed the title [Bug]: Slow shutdown can cause BGW scheduler to block [Bug]: Shutdown stalled waiting for TimescaleDB Background Worker Scheduler Mar 30, 2022
@mkindahl
Contributor Author

@feikesteenbergen Is there a good way to monitor for a stalled shutdown other than inspecting the output of ps? It would be good to understand whether there is a way to create a monitoring rule for this situation.

mkindahl added a commit to mkindahl/timescaledb that referenced this issue Mar 30, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers and also adds calls to `gettext` to support translation.

Fixes timescale#4200
mkindahl added a commit to mkindahl/timescaledb that referenced this issue Mar 30, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes timescale#4200
@feikesteenbergen
Member

feikesteenbergen commented Mar 30, 2022

@mkindahl a stalled shutdown would present itself as a PostgreSQL server that is listening, but not accepting connections.

Therefore, to monitor this kind of situation, running pg_isready against the server frequently would make it possible to alert on this.

pg_isready returns 0 to the shell if the server is accepting connections normally, 1 if the server is rejecting connections (for example during startup), 2 if there was no response to the connection attempt, and 3 if no attempt was made (for example due to invalid parameters).

All fine exitcode=0

$ pg_isready
/var/run/postgresql:5432 - accepting connections

Shutting down exitcode=1

$ pg_isready
/var/run/postgresql:5432 - rejecting connections

NOTE: A PostgreSQL instance that is in recovery and isn't yet consistent, or doesn't have hot_standby=on, will also return this result.

Not running exitcode=2

$ pg_isready
/var/run/postgresql:5432 - no response

An alternative would be to alert on the absence of metrics; for example, pg_exporter can be used with Prometheus to detect the absence of useful metrics for a period of time.

mkindahl added a commit that referenced this issue Mar 30, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes #4200
@mkindahl mkindahl added this to the TimescaleDB 2.6.1 milestone Apr 5, 2022
RafiaSabih pushed a commit to RafiaSabih/timescaledb that referenced this issue Apr 7, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes timescale#4200
RafiaSabih pushed a commit to RafiaSabih/timescaledb that referenced this issue Apr 8, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes timescale#4200
mkindahl added a commit to RafiaSabih/timescaledb that referenced this issue Apr 8, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes timescale#4200
svenklemm pushed a commit that referenced this issue Apr 11, 2022
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes #4200
@mkindahl
Contributor Author

mkindahl commented May 12, 2022

It turns out that write_stderr is not signal-safe after all: it uses vfprintf internally, which allocates a buffer using malloc. This can, again, cause a process that is shutting down to become unresponsive because it is holding a lock.

@mkindahl mkindahl reopened this May 12, 2022
@mkindahl
Contributor Author

The function write_stderr is defined in elog.c and calls either write_console or write_eventlog depending on the platform. Unfortunately, both of these functions are static, so they are not callable from outside elog.c. This makes it difficult to create a signal-safe version without copying code or re-implementing the solution for all platforms.
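
For reference, an async-signal-safe message on Unix can only use primitives such as write(2) with a preformatted string; a rough sketch of such a re-implementation (the helper name write_stderr_sigsafe is hypothetical, and this covers only the Unix side, not the Windows event-log path) would look like this:

#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical Unix-only helper: no formatting and no allocation, just
 * write(2) with a string that was formatted outside signal context.
 */
void
write_stderr_sigsafe(const char *msg)
{
	int			save_errno = errno;	/* write() may clobber errno */

	(void) write(STDERR_FILENO, msg, strlen(msg));
	errno = save_errno;
}

This is exactly the kind of per-platform duplication the comment above tries to avoid, which is why the fix below takes a different route.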

mkindahl added a commit to mkindahl/timescaledb that referenced this issue May 12, 2022
In timescale#4199 existing calls of `ereport` were replaced with calls of
`write_stderr` to eliminate the use of signal-unsafe calls, in
particular calls to `malloc`. Unfortunately, `write_stderr` contains a
call to `vfprintf`, which allocates memory as well, occasionally
causing servers that are shutting down to become unresponsive.

Since the existing signal handlers just called `die` after printing out
a useful message, this commit fixes this by using `die` as a signal
handler.

Fixes timescale#4200
mkindahl added a commit to mkindahl/timescaledb that referenced this issue May 12, 2022
In timescale#4199 existing calls of `ereport` were replaced with calls of
`write_stderr` to eliminate the use of signal-unsafe calls, in
particular calls to `malloc`. Unfortunately, `write_stderr` contains a
call to `vfprintf`, which allocates memory as well, occasionally
causing servers that are shutting down to become unresponsive.

Since the existing signal handlers just called `die` after printing out
a useful message, this commit fixes this by using `die` as a signal
handler.

Fixes timescale#4200
mkindahl added a commit that referenced this issue May 12, 2022
In #4199 existing calls of `ereport` were replaced with calls of
`write_stderr` to eliminate the use of signal-unsafe calls, in
particular calls to `malloc`. Unfortunately, `write_stderr` contains a
call to `vfprintf`, which allocates memory as well, occasionally
causing servers that are shutting down to become unresponsive.

Since the existing signal handlers just called `die` after printing out
a useful message, this commit fixes this by using `die` as a signal
handler.

Fixes #4200
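
In outline, the approach described in the commit message amounts to the following sketch (not the actual TimescaleDB source; the entry-point name is illustrative). PostgreSQL's `die` handler only sets flags such as ProcDiePending, so the error reporting happens later from regular backend code instead of from signal context:

#include "postgres.h"

#include <signal.h>

#include "miscadmin.h"
#include "postmaster/bgworker.h"
#include "tcop/tcopprot.h"		/* declares die() */

/* Hypothetical background-worker entry point, for illustration only. */
void
scheduler_worker_main(Datum main_arg)
{
	/*
	 * die() only sets InterruptPending/ProcDiePending, so nothing
	 * signal-unsafe (such as malloc via ereport or vfprintf) runs in
	 * signal context.
	 */
	pqsignal(SIGTERM, die);
	BackgroundWorkerUnblockSignals();

	for (;;)
	{
		/* ... scheduler work ... */

		/*
		 * The pending termination request is acted on here, from regular
		 * code, where ereport() is safe to call.
		 */
		CHECK_FOR_INTERRUPTS();
	}
}

This sidesteps the signal-safety problem entirely rather than trying to make logging from the handler safe.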