[Bug]: On-insert decompression after schema changes may not work properly #5577

Closed
kgyrtkirk opened this issue Apr 17, 2023 · 0 comments · Fixed by #5578

Comments

Contributor

kgyrtkirk commented Apr 17, 2023

What type of bug is this?

Incorrect result

What subsystems and features are affected?

Compression

What happened?

This could lead to the same kind of incorrect-result issues as #5553, from a slightly different angle.

  • a compressed table with at least 1 segmentby column, e.g. columns (time, some_col, segmentby_col)
  • a unique index on (time, segmentby_col)
  • the main table schema is altered: some_col is dropped
  • the insertion of a duplicate row is then not detected

--

What happens:

  • build_scankey works with 3 schema levels:
    • the slot is at the main hypertable level
    • decompressor->out_rel is at the chunk level (the standard inherited table of the hypertable)
      • the scankeys are acquired at this level
    • decompressor->in_rel is at the compressed chunk level
  • while the scankey is built, the actual value is extracted from the slot, but using the attribute numbers of the out_rel level. When the two levels differ, this can read the wrong column, and it can even segfault if the drops are arranged properly.
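
To make the mismatch concrete, here is a minimal Python model of the lookup, not the actual C code. It uses the column layout of the segfault reproduction in this report, and it assumes that a chunk created after the DROP COLUMN statements does not carry the dropped columns, so its attribute numbers are denser than the hypertable's. The names `attno`, `read_slot`, `key_value_buggy` and `key_value_by_name` are hypothetical and exist only for this illustration.

```python
# Column layouts as 1-based attribute positions; None marks a dropped column.
# Assumption: the hypertable keeps placeholders for dropped columns, while a
# chunk created after the drops does not.
HYPERTABLE_COLS = ["time", None, "candy", None, "battery_temperature"]
CHUNK_COLS = ["time", "candy", "battery_temperature"]


def attno(cols, name):
    """Return the 1-based attribute number of `name` in a column layout."""
    return cols.index(name) + 1


def read_slot(slot, attnum):
    """Read a value from a slot by 1-based attribute number."""
    return slot[attnum - 1]


# The row being inserted, laid out as the hypertable-level slot sees it.
slot = ["2022-11-11 11:11:11", None, 33, None, "0.1"]


def key_value_buggy(slot, name):
    # Uses the chunk-level attribute number against the hypertable-level slot.
    return read_slot(slot, attno(CHUNK_COLS, name))


def key_value_by_name(slot, name):
    # Resolves the attribute number at the same level the slot belongs to.
    return read_slot(slot, attno(HYPERTABLE_COLS, name))


print(key_value_buggy(slot, "battery_temperature"))
print(key_value_by_name(slot, "battery_temperature"))
```

In this model the chunk-level lookup returns the integer `candy` value where a text `battery_temperature` is expected, which is the kind of type confusion that could explain the segfault. In the first reproduction the same wrong lookup would land on a dropped column instead, so the duplicate would go undetected.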

TimescaleDB version affected

main

PostgreSQL version used

15.2

What operating system did you use?

Debian12

What installation method did you use?

Source

What platform did you run on?

Not applicable

Relevant log output and stack trace

select count(1) from readings where battery_temperature=0.2;
 count 
-------
     2
(1 row)

select * from readings;
             time             | battery_temperature 
------------------------------+---------------------
 Fri Nov 11 03:11:11 2022 PST |                    
 Fri Nov 11 11:11:11 2022 PST |                 0.2
 Fri Nov 11 11:11:11 2022 PST |                 0.2
(3 rows)

-- unique check failure during decompression
select decompress_chunk(show_chunks('readings'),true);
ERROR:  duplicate key value violates unique constraint "_hyper_1_3_chunk_readings_uniq_idx"

How can we reproduce the bug?

drop table if exists readings;
CREATE TABLE readings(
    "time"  TIMESTAMPTZ NOT NULL,
    battery_status  TEXT,
    battery_temperature  DOUBLE PRECISION
);

INSERT INTO readings ("time") VALUES ('2022-11-11 11:11:11-00');

SELECT create_hypertable('readings', 'time', chunk_time_interval => interval '12 hour', migrate_data=>true);

ALTER TABLE readings SET (timescaledb.compress,timescaledb.compress_segmentby = 'battery_temperature');
SELECT compress_chunk(show_chunks('readings'));

ALTER TABLE readings DROP COLUMN battery_status;
INSERT INTO readings ("time", battery_temperature) VALUES ('2022-11-11 11:11:11', 0.2);
SELECT readings FROM readings;

create unique index readings_uniq_idx on readings("time",battery_temperature);

SELECT decompress_chunk(show_chunks('readings'),true);
SELECT compress_chunk(show_chunks('readings'),true);

INSERT INTO readings ("time", battery_temperature) VALUES
    ('2022-11-11 11:11:11',0.2) -- same record as inserted 
;

select count(1) from readings where battery_temperature=0.2;

select * from readings;
-- unique check failure during decompression
select decompress_chunk(show_chunks('readings'),true);

Case producing a segfault:

drop table if exists readings;
CREATE TABLE readings(
    "time"  TIMESTAMPTZ NOT NULL,
    battery_status  TEXT,
    candy integer,
    battery_status2  TEXT,
    battery_temperature  TEXT
);

SELECT create_hypertable('readings', 'time', chunk_time_interval => interval '12 hour', migrate_data=>true);

ALTER TABLE readings SET (timescaledb.compress,timescaledb.compress_segmentby = 'battery_temperature');

ALTER TABLE readings DROP COLUMN battery_status;
ALTER TABLE readings DROP COLUMN battery_status2;
INSERT INTO readings ("time", candy,battery_temperature) VALUES ('2022-11-11 11:11:11', 88,'0.2');

create unique index readings_uniq_idx on readings("time",battery_temperature);

SELECT compress_chunk(show_chunks('readings'),true);

-- segfault
INSERT INTO readings ("time", candy,battery_temperature) VALUES
    ('2022-11-11 11:11:11',33,0.1) -- same "time" as the row inserted above
;
kgyrtkirk added a commit to kgyrtkirk/timescaledb that referenced this issue Apr 17, 2023
On compressed tables 3 schema levels are active simultaneously;
in build_scankeys all of them appear - and the hypertable level slot
was accessed using sibling table indices.

Fixes: timescale#5577
kgyrtkirk added a commit to kgyrtkirk/timescaledb that referenced this issue Apr 25, 2023
On compressed hypertables 3 schema levels are in use simultaneously
 * main - hypertable level
 * chunk - inheritance level
 * compressed chunk

In the build_scankeys method all of them appear - as the slot has its
fields represented as for a row of the main hypertable.

Accessing the slot by the attribute numbers of the chunks may lead to
indexing mismatches if there are differences between the schemas.

Fixes: timescale#5577
kgyrtkirk added a commit that referenced this issue Apr 27, 2023
On compressed hypertables 3 schema levels are in use simultaneously
 * main - hypertable level
 * chunk - inheritance level
 * compressed chunk

In the build_scankeys method all of them appear - as the slot has its
fields represented as for a row of the main hypertable.

Accessing the slot by the attribute numbers of the chunks may lead to
indexing mismatches if there are differences between the schemas.

Fixes: #5577