[Bug]: Corruption when inserting into compressed chunk #4655
The buggy behaviour happens when the plan is a generic cached plan.
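To make the trigger condition concrete, here is a toy model (a sketch, not PostgreSQL source code) of the documented plan-cache heuristic: with the default `plan_cache_mode = auto`, the first five executions of a prepared statement are replanned as custom plans, after which a cached generic plan is reused if it is not estimated to cost more than the average custom plan. The function name and cost inputs below are made up for illustration; the reproducer sidesteps the heuristic entirely by setting `force_generic_plan`.

```python
# Toy model of PostgreSQL's custom-vs-generic plan choice (simplified sketch;
# real PostgreSQL also accounts for planning overhead in the comparison).
def choose_plan(plan_cache_mode, num_custom_plans, avg_custom_cost, generic_cost):
    if plan_cache_mode == "force_generic_plan":
        return "generic"          # what the reproducer below sets
    if plan_cache_mode == "force_custom_plan":
        return "custom"
    # "auto": replan the first five executions, then compare estimated costs
    if num_custom_plans < 5:
        return "custom"
    return "generic" if generic_cost <= avg_custom_cost else "custom"

# Early executions replan every time, so the bug does not show up...
assert choose_plan("auto", 1, 10.0, 10.0) == "custom"
# ...but once the cached generic plan kicks in, the buggy path is taken.
assert choose_plan("auto", 5, 10.0, 10.0) == "generic"
assert choose_plan("force_generic_plan", 0, 10.0, 99.0) == "generic"
```

This is why the corruption can appear only after an application has executed the same prepared INSERT several times.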
It corrupts data in the actual compressed chunk, so reading directly from the compressed tables does not help either:
mats=# CREATE TABLE logs(ts TIMESTAMPTZ NOT NULL, msg TEXT);
CREATE TABLE
mats=# SELECT create_hypertable('logs', 'ts');
create_hypertable
--------------------
(11,public,logs,t)
(1 row)
mats=#
ALTER TABLE
mats=#
INSERT 0 1
compress_chunk
------------------------------------------
_timescaledb_internal._hyper_11_11_chunk
(1 row)
mats=# SET plan_cache_mode TO force_generic_plan;
SET
mats=# EXPLAIN INSERT INTO logs SELECT '2020-01-01', 'message' WHERE now() < '3000-01-01';
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Custom Scan (HypertableModify) (cost=0.01..0.01 rows=1 width=40)
-> Insert on logs (cost=0.01..0.01 rows=1 width=40)
-> Result (cost=0.01..0.01 rows=1 width=40)
One-Time Filter: (now() < '3000-01-01 00:00:00+00'::timestamp with time zone)
-> Custom Scan (ChunkDispatch) (cost=0.01..0.01 rows=1 width=40)
-> Result (cost=0.01..0.01 rows=1 width=40)
One-Time Filter: (now() < '3000-01-01 00:00:00+00'::timestamp with time zone)
(7 rows)
mats=# INSERT INTO logs SELECT '2020-01-01', 'message' WHERE now() < '3000-01-01';
INSERT 0 1
mats=# \dt _timescaledb_internal.*
List of relations
Schema | Name | Type | Owner
-----------------------+----------------------------+-------+-------
_timescaledb_internal | _compressed_hypertable_12 | table | mats
_timescaledb_internal | _hyper_11_11_chunk | table | mats
_timescaledb_internal | bgw_job_stat | table | mats
_timescaledb_internal | bgw_policy_chunk_stats | table | mats
_timescaledb_internal | compress_hyper_12_12_chunk | table | mats
(5 rows)
mats=# set plan_cache_mode to DEFAULT ;
SET
mats=# SELECT ts FROM _timescaledb_internal.compress_hyper_12_12_chunk ;
ERROR: invalid compression algorithm 7
mats=# SELECT msg FROM _timescaledb_internal.compress_hyper_12_12_chunk ;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The coredump shows the same stack as before (which is probably a red herring, since the corruption occurs when the row is inserted):
Inserting without a filter works fine, but once a filter is added, it breaks:
This is the workaround I've given to the support-dev collab.
Adding the tuple to a separate table and trying the query works, but as soon as we add an expression that requires a gating clause, it breaks again.
A short summary of the problem and why it occurs: as can be seen in the plan above, a gating clause is inserted between the ChunkDispatch node and the HypertableModify node. Since the target chunk is compressed, the tuple table slot is constructed as a compressed tuple table slot when the tuple is built in the ChunkDispatch node. The gating clause, however, has the types of an uncompressed tuple, so when HypertableModify takes the tuple and inserts it into the compressed chunk, it is treated as an uncompressed tuple and written in the wrong format. A later SELECT from the table then fails because the stored format is incorrect.

I tried a few different approaches, but they run into different issues:
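The format confusion described above can be illustrated with a hypothetical sketch (the header layout, algorithm id, and function names below are invented for illustration and do not match TimescaleDB's actual on-disk format): a writer stores a value with a compression header, and a reader that expects that header misinterprets raw, uncompressed bytes, yielding exactly the shape of error seen in the transcript.

```python
# Hypothetical sketch of a compressed-value layout: one header byte holding a
# compression algorithm id, followed by the payload.
GORILLA = 3  # made-up algorithm id for this sketch

def write_compressed(payload: bytes) -> bytes:
    # Writer that agrees on the format: prepend the algorithm id.
    return bytes([GORILLA]) + payload

def read_compressed(raw: bytes) -> bytes:
    # Reader that expects a compressed value: validate the header first.
    algo = raw[0]
    if algo != GORILLA:
        raise ValueError(f"invalid compression algorithm {algo}")
    return raw[1:]

# Correct round trip: writer and reader agree on the format.
assert read_compressed(write_compressed(b"message")) == b"message"

# The bug's shape: an uncompressed tuple lands in the compressed chunk, so
# the reader sees an arbitrary byte where the algorithm id should be --
# compare "ERROR: invalid compression algorithm 7" in the transcript above.
try:
    read_compressed(b"\x07message")
except ValueError as e:
    assert "invalid compression algorithm 7" in str(e)
```

In the real bug the mismatch goes the other way on insert (uncompressed bytes stored where compressed ones are expected), but the failure mode on read is the same: the header is garbage, or the decompressor walks off into invalid memory and crashes the backend.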
This will be fixed by the changes to support UPSERT/UPDATE/DELETE on compressed chunks, because we won't do on-the-fly compression anymore.
What type of bug is this?
Data corruption
What subsystems and features are affected?
Compression
What happened?
When inserting via a trigger with a WHERE clause into an already compressed chunk, there is data corruption when you read the data back. The reproduction steps are below. I have also obtained an ASAN backtrace with a setup similar to the reproducer's.
TimescaleDB version affected
2.7.2
PostgreSQL version used
13.7
What operating system did you use?
Ubuntu 20.04 x64
What installation method did you use?
Source
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
How can we reproduce the bug?