-
Notifications
You must be signed in to change notification settings - Fork 1k
Closed
Labels
Description
What type of bug is this?
Crash
What subsystems and features are affected?
Query executor
What happened?
When I enabled hypercore access method on table with jsonb column, some operations started failing:
- ✅ EXPLAIN (ANALYZE) SELECT * FROM ... -- works
- 💣 EXPLAIN (ANALYZE, SERIALIZE TEXT) SELECT * FROM ... -- crash
- 💣 VACUUM FULL -- crash
Table have jsonb field, so may be this?
ALTER TABLE _timescaledb_internal._hyper_9_28_chunk SET ACCESS METHOD hypercore;ALTER TABLE
Time: 186,894 ms
EXPLAIN (analyze,verbose,buffers,serialize binary) SELECT * from device_data.raw_zigbee;server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
TimescaleDB version affected
2.19.3
PostgreSQL version used
17.4
What operating system did you use?
PostgreSQL 17.4 (Ubuntu 17.4-1.pgdg24.04+2) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, 64-bit
What installation method did you use?
Docker, Deb/Apt
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
2025-05-07 12:26:59.863 UTC [1007221] LOG: checkpoint starting: wal
realloc(): invalid next size
2025-05-07 12:27:43.739 UTC [1708] LOG: server process (PID 1009258) was terminated by signal 6: Aborted
2025-05-07 12:27:43.739 UTC [1708] DETAIL: Failed process was running: VACUUM FULL;
2025-05-07 12:27:43.739 UTC [1708] LOG: terminating any other active server processes
2025-05-07 12:27:43.741 UTC [1024851] admin@customer FATAL: the database system is in recovery mode
2025-05-07 12:27:43.909 UTC [1708] LOG: all server processes terminated; reinitializing
2025-05-07 12:27:43.911 UTC [1708] LOG: [pg_stat_monitor] pgsm_shmem_shutdown: Shutdown initiated.
2025-05-07 12:27:44.856 UTC [1024937] LOG: database system was interrupted; last known up at 2025-05-07 12:26:11 UTC
2025-05-07 12:27:55.753 UTC [1024937] LOG: database system was not properly shut down; automatic recovery in progress
2025-05-07 12:27:55.819 UTC [1024937] LOG: redo starts at 2/9BCD8470
2025-05-07 12:27:59.926 UTC [1024937] LOG: invalid magic number 0000 in WAL segment 0000000100000002000000E5, LSN 2/E5D00000, offset 13631488
2025-05-07 12:27:59.926 UTC [1024937] LOG: redo done at 2/E5CFFF68 system usage: CPU: user: 0.86 s, system: 1.18 s, elapsed: 4.10 s
2025-05-07 12:28:00.735 UTC [1024938] LOG: checkpoint starting: end-of-recovery immediate wait
2025-05-07 12:28:27.871 UTC [1024938] LOG: checkpoint complete: wrote 157486 buffers (30.0%); 0 WAL file(s) added, 12 removed, 62 recycled; write=19.220 s, sync=2.766 s, total=27.203 s; sync files=460, longest=0.773 s, average=0.007 s; distance=1212574 kB, estimate=1212574 kB; lsn=2/E5D00048, redo lsn=2/E5D00048
2025-05-07 12:28:27.993 UTC [1708] LOG: database system is ready to accept connections
2025-05-07 12:28:27.994 UTC [1026041] LOG: TimescaleDB background worker launcher connected to shared catalogs
2025-05-07 12:28:28.001 UTC [1026040] LOG: pg_cron scheduler started
realloc(): invalid next size
2025-05-07 12:29:50.454 UTC [1708] LOG: server process (PID 1026103) was terminated by signal 6: Aborted
2025-05-07 12:29:50.454 UTC [1708] DETAIL: Failed process was running: explain (analyze,verbose,buffers,serialize text) SELECT * from raw_zigbee;
2025-05-07 12:29:50.454 UTC [1708] LOG: terminating any other active server processes
2025-05-07 12:29:50.456 UTC [1028340] admin@customer FATAL: the database system is in recovery mode
2025-05-07 12:29:50.532 UTC [1708] LOG: all server processes terminated; reinitializing
2025-05-07 12:29:50.533 UTC [1708] LOG: [pg_stat_monitor] pgsm_shmem_shutdown: Shutdown initiated.
2025-05-07 12:29:51.527 UTC [1028353] LOG: database system was interrupted; last known up at 2025-05-07 12:28:27 UTC
2025-05-07 12:29:52.717 UTC [1028353] LOG: database system was not properly shut down; automatic recovery in progress
2025-05-07 12:29:52.744 UTC [1028353] LOG: redo starts at 2/E5D000C0
2025-05-07 12:29:52.744 UTC [1028353] LOG: invalid record length at 2/E5D02378: expected at least 24, got 0
2025-05-07 12:29:52.744 UTC [1028353] LOG: redo done at 2/E5D02340 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2025-05-07 12:29:52.953 UTC [1028354] LOG: checkpoint starting: end-of-recovery immediate wait
2025-05-07 12:29:53.257 UTC [1028354] LOG: checkpoint complete: wrote 5 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.071 s, sync=0.095 s, total=0.336 s; sync files=6, longest=0.046 s, average=0.016 s; distance=8 kB, estimate=8 kB; lsn=2/E5D02378, redo lsn=2/E5D02378
2025-05-07 12:29:53.295 UTC [1708] LOG: database system is ready to accept connections
2025-05-07 12:29:53.297 UTC [1028380] LOG: TimescaleDB background worker launcher connected to shared catalogs
2025-05-07 12:29:53.302 UTC [1028379] LOG: pg_cron scheduler started
realloc(): invalid next size
2025-05-07 12:31:38.879 UTC [1708] LOG: server process (PID 1029056) was terminated by signal 6: Aborted
2025-05-07 12:31:38.879 UTC [1708] DETAIL: Failed process was running: explain (analyze,verbose,buffers,serialize binary) SELECT * from device_data.raw_zigbee;
2025-05-07 12:31:38.879 UTC [1708] LOG: terminating any other active server processes
2025-05-07 12:31:38.881 UTC [1031198] admin@customer FATAL: the database system is in recovery mode
2025-05-07 12:31:38.883 UTC [1708] LOG: all server processes terminated; reinitializing
2025-05-07 12:31:38.884 UTC [1708] LOG: [pg_stat_monitor] pgsm_shmem_shutdown: Shutdown initiated.
2025-05-07 12:31:39.393 UTC [1031199] LOG: database system was interrupted; last known up at 2025-05-07 12:29:53 UTC
2025-05-07 12:31:40.532 UTC [1031199] LOG: database system was not properly shut down; automatic recovery in progress
2025-05-07 12:31:40.561 UTC [1031199] LOG: redo starts at 2/E5D023F0
2025-05-07 12:31:40.562 UTC [1031199] LOG: invalid record length at 2/E5D02428: expected at least 24, got 0
2025-05-07 12:31:40.562 UTC [1031199] LOG: redo done at 2/E5D023F0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2025-05-07 12:31:40.728 UTC [1031200] LOG: checkpoint starting: end-of-recovery immediate wait
2025-05-07 12:31:41.012 UTC [1031200] LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.070 s, sync=0.052 s, total=0.316 s; sync files=3, longest=0.020 s, average=0.018 s; distance=0 kB, estimate=0 kB; lsn=2/E5D02428, redo lsn=2/E5D02428
2025-05-07 12:31:41.049 UTC [1708] LOG: database system is ready to accept connections
2025-05-07 12:31:41.050 UTC [1031327] LOG: TimescaleDB background worker launcher connected to shared catalogs
2025-05-07 12:31:41.054 UTC [1031326] LOG: pg_cron scheduler started
2025-05-07 12:33:14.543 UTC [1031200] LOG: checkpoint starting: wal
2025-05-07 12:36:37.498 UTC [1031200] LOG: checkpoint complete: wrote 63852 buffers (12.2%); 0 WAL file(s) added, 0 removed, 33 recycled; write=200.724 s, sync=0.990 s, total=202.956 s; sync files=114, longest=0.646 s, average=0.009 s; distance=531810 kB, estimate=531810 kB; lsn=3/2078F2E8, redo lsn=3/645ACA8
2025-05-07 12:36:37.499 UTC [1031200] LOG: checkpoint starting: immediate force wait
2025-05-07 12:36:49.759 UTC [1031200] LOG: checkpoint complete: wrote 99104 buffers (18.9%); 0 WAL file(s) added, 0 removed, 26 recycled; write=10.965 s, sync=0.291 s, total=12.261 s; sync files=80, longest=0.164 s, average=0.004 s; distance=429265 kB, estimate=521555 kB; lsn=3/2078F3F0, redo lsn=3/2078F360
2025-05-07 12:38:19.310 UTC [1031200] LOG: checkpoint starting: immediate force wait
2025-05-07 12:38:20.423 UTC [1031200] LOG: checkpoint complete: wrote 3708 buffers (0.7%); 0 WAL file(s) added, 2 removed, 0 recycled; write=0.463 s, sync=0.381 s, total=1.114 s; sync files=81, longest=0.233 s, average=0.005 s; distance=28920 kB, estimate=472292 kB; lsn=3/223CD470, redo lsn=3/223CD418
2025-05-07 12:39:10.340 UTC [1031402] admin@customer ERROR: function hypertable_columnstore_stat(unknown) does not exist at character 15
2025-05-07 12:39:10.340 UTC [1031402] admin@customer HINT: No function matches the given name and argument types. You might need to add explicit type casts.
2025-05-07 12:39:10.340 UTC [1031402] admin@customer STATEMENT: SELECT * FROM hypertable_columnstore_stat('device_data.raw_zigbee');
2025-05-07 12:39:38.944 UTC [1031200] LOG: checkpoint starting: immediate force wait
2025-05-07 12:39:39.137 UTC [1031200] LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.006 s, sync=0.027 s, total=0.194 s; sync files=2, longest=0.026 s, average=0.014 s; distance=0 kB, estimate=425062 kB; lsn=3/223CD578, redo lsn=3/223CD520How can we reproduce the bug?
CREATE TABLE device_data.raw_zigbee (
created timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP,
ts timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP,
routing_key text NOT NULL,
data jsonb
);
SELECT * FROM timescale.create_hypertable(
'device_data.raw_zigbee',
timescale.by_range('created')
);
ALTER TABLE device_data.raw_zigbee SET (
timescaledb.enable_columnstore,
timescaledb.orderby = 'created DESC',
timescaledb.segmentby = 'routing_key'
);
CALL timescale.add_columnstore_policy(
'device_data.raw_zigbee',
interval '1 day',
hypercore_use_access_method => false, -- do not use hypercore access method, crashes DB!
initial_start => (CURRENT_DATE + INTERVAL '1 hour')::timestamptz
);
then add some data and run the job.