[Bug]: decompress_chunk fails with "database system is in recovery mode" #6028

flexwende · 2023-08-30T08:31:40Z

What type of bug is this?

Unexpected error

What subsystems and features are affected?

Compression

What happened?

I tried compression to reduce storage.

compress_chunk works fine, but when I used decompress_chunk to revert the compression it fails with the error:

FATAL: the database system is in recovery mode

I tried the same with a minimal example and it worked. The hypertable chunks where the error occures were created with timescale 2.5.0. Could this be the problem?

TimescaleDB version affected

2.11.2

PostgreSQL version used

14.1

What operating system did you use?

Windows 10 x64

What installation method did you use?

Not applicable

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

2023-08-30 10:29:43.511 CEST [28896] LOG:  server process (PID 19872) was terminated by exception 0xC0000005
2023-08-30 10:29:43.511 CEST [28896] DETAIL:  Failed process was running: select decompress_chunk('_timescaledb_internal._hyper_16_413_chunk')
2023-08-30 10:29:43.511 CEST [28896] HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
2023-08-30 10:29:43.511 CEST [28896] LOG:  terminating any other active server processes
2023-08-30 10:29:43.534 CEST [8544] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.535 CEST [28896] LOG:  all server processes terminated; reinitializing
2023-08-30 10:29:43.570 CEST [22404] LOG:  database system was interrupted; last known up at 2023-08-30 10:28:55 CEST
2023-08-30 10:29:43.572 CEST [28932] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.586 CEST [19236] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.653 CEST [21988] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.668 CEST [18012] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.682 CEST [16564] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.698 CEST [16060] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.713 CEST [22432] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.728 CEST [3620] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.743 CEST [10428] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.759 CEST [29576] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.885 CEST [26464] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.900 CEST [23356] FATAL:  the database system is in recovery mode
2023-08-30 10:29:43.916 CEST [4208] FATAL:  the database system is in recovery mode
2023-08-30 10:29:44.044 CEST [13108] FATAL:  the database system is in recovery mode
2023-08-30 10:29:44.059 CEST [23668] FATAL:  the database system is in recovery mode
2023-08-30 10:29:44.075 CEST [6272] FATAL:  the database system is in recovery mode
2023-08-30 10:29:44.200 CEST [1636] FATAL:  the database system is in recovery mode
2023-08-30 10:29:44.561 CEST [22404] LOG:  database system was not properly shut down; automatic recovery in progress
2023-08-30 10:29:44.565 CEST [22404] LOG:  redo starts at 26/21256530
2023-08-30 10:29:44.565 CEST [22404] LOG:  invalid record length at 26/21256568: wanted 24, got 0
2023-08-30 10:29:44.565 CEST [22404] LOG:  redo done at 26/21256530 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2023-08-30 10:29:44.595 CEST [28896] LOG:  database system is ready to accept connections
2023-08-30 10:29:44.612 CEST [28088] LOG:  TimescaleDB background worker launcher connected to shared catalogs

How can we reproduce the bug?

no idea

The text was updated successfully, but these errors were encountered:

lkshminarayanan · 2023-08-30T14:39:28Z

Hello @flexwende,

Thank you bringing this issue to our attention!

The error message you mentioned, database system is in recovery mode, typically occurs when the database server is in the process of starting up and is undergoing recovery. This is a normal part of the startup procedure. According to the logs, it appears that the database server successfully completed its recovery phase and was ready to accept connections.

However, the primary concern here is the unexpected termination of the server process during the execution of the query select decompress_chunk('_timescaledb_internal._hyper_16_413_chunk'). The exception 0xC0000005 present in the logs suggests a possible segmentation fault as the cause of termination, although further investigation is required for confirmation. Following the termination, the postgres server automatically got restarted and entered the recovery mode. During this recovery period, any attempts you made to decompress data would naturally result in the database system is in recovery mode error.

Regarding chunks created in version 2.5.0, they should not be a problem. In my internal tests, I was able to compress and decompress chunks that were created in 2.5.0 and upgraded to 2.11.2, without any issues. It is possible that the problem might be specific to a particular schema configuration. And for this purpose, it will be good to have more detailed information such as the table schema, tuple count, compression settings, and any relevant policies associated with the hypertable. Can you please provide us with these information?

Considering that the database server successfully completed its recovery and is operational again, could you please attempt to run decompress_chunk once more? This will help determine whether the issue persists beyond the recovery phase. If the problem persists, would it be feasible for you to provide a coredump or a stacktrace from the moment of the crash? These additional details will help us to find the root cause of the unexpected termination.

Thank you!

flexwende · 2023-08-30T14:43:28Z

I solved the problem at the exact time of your answer :D

The table has an unique index. After droping the unique index decompress_chunk works as expected. To readd the unique index, compression had to be disabled for the hypertable.

Thank you for the quick response!

lkshminarayanan · 2023-08-30T14:53:54Z

Glad you were able to solve the issue! I'll mark this is as closed for now. Please do reopen if you face any further issues with this. Thank you!

flexwende added the bug label Aug 30, 2023

lkshminarayanan self-assigned this Aug 30, 2023

lkshminarayanan added the compression label Aug 30, 2023

lkshminarayanan added the waiting-for-author label Aug 30, 2023

timescale-automation removed the waiting-for-author label Aug 30, 2023

lkshminarayanan closed this as completed Aug 30, 2023

KamalAman mentioned this issue Mar 15, 2024

[Bug]: Crash on insert or deletion into compressed chunk #6753

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: decompress_chunk fails with "database system is in recovery mode" #6028

[Bug]: decompress_chunk fails with "database system is in recovery mode" #6028

flexwende commented Aug 30, 2023 •

edited

lkshminarayanan commented Aug 30, 2023

flexwende commented Aug 30, 2023 •

edited

lkshminarayanan commented Aug 30, 2023

[Bug]: decompress_chunk fails with "database system is in recovery mode" #6028

[Bug]: decompress_chunk fails with "database system is in recovery mode" #6028

Comments

flexwende commented Aug 30, 2023 • edited

What type of bug is this?

What subsystems and features are affected?

What happened?

TimescaleDB version affected

PostgreSQL version used

What operating system did you use?

What installation method did you use?

What platform did you run on?

Relevant log output and stack trace

How can we reproduce the bug?

lkshminarayanan commented Aug 30, 2023

flexwende commented Aug 30, 2023 • edited

lkshminarayanan commented Aug 30, 2023

flexwende commented Aug 30, 2023 •

edited

flexwende commented Aug 30, 2023 •

edited