
Support 'ON CONFLICT DO NOTHING' for compressed chunks #4113

Closed
wiebeytec opened this issue Feb 21, 2022 · 7 comments
Assignees: sb230132
Labels: compression, ddl (Issues that involve DDL handling), enhancement (An enhancement to an existing feature for functionality), Team: Core Database

Comments

@wiebeytec

What type of enhancement is this?

User experience

What subsystems and features will be improved?

Compression

What does the enhancement do?

When there are compressed chunks and you do INSERT ... ON CONFLICT DO NOTHING, it says: "insert with ON CONFLICT or RETURNING clause is not supported on compressed chunks".

Currently, if you have an ingress system that processes duplicate values while back-dating, you can't use compression, even if you just want to ignore the duplicates.

This is a lighter-weight version of #3323: when unique keys are present, supporting only DO NOTHING would be enough. Then again, it may essentially be the same problem; I don't know.
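For illustration, a minimal sketch of the kind of statement this is about (table and column names are hypothetical; assume a hypertable with a unique index on (sensor_id, time) and compressed chunks):

-- hypothetical back-dated ingest; duplicates should simply be skipped
INSERT INTO metrics (time, sensor_id, value)
VALUES ('2022-02-01 00:00:00+00', 42, 0.5)
ON CONFLICT DO NOTHING;
-- currently rejected with:
-- ERROR: insert with ON CONFLICT or RETURNING clause is not supported on compressed chunks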

Implementation challenges

I tried patching the code as per the attached patch (or the GitHub compare, for as long as the link works). The problem is that, despite there being a unique index, duplicate values are inserted. I don't know whether fixing this (checking for the value in the compressed index) is essentially the same as #3323, or whether DO NOTHING can be implemented more easily.

timescaledb-compressed-do-nothing-fix.patch.txt

@wiebeytec wiebeytec added the enhancement label Feb 21, 2022
@gayyappan
Contributor

@wiebeytec This is essentially the same problem as #3323. We cannot support this enhancement without support for #3323.

@cocowalla

Also really keen to see a fix for this.

As with many (most?) systems, I can't guarantee there will be no duplicates, so without this enhancement (more of a bug fix TBH, since it was treated as such in #100 for non-compressed hypertables) I need to check if rows exist before inserting them, which absolutely decimates performance 😭
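To sketch the workaround being described (table names are illustrative, not from this issue): each batch has to be filtered against the existing data before it can be inserted, which adds a lookup per incoming row.

-- hypothetical check-before-insert workaround
INSERT INTO metrics (time, sensor_id, value)
SELECT s.time, s.sensor_id, s.value
FROM staging s
WHERE NOT EXISTS (
    SELECT 1 FROM metrics m
    WHERE m.sensor_id = s.sensor_id
      AND m.time = s.time
);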

@WalkerWalker

WalkerWalker commented Aug 11, 2022

Also, as discussed here on Slack: would love to see this feature.

@sb230132 sb230132 self-assigned this Aug 29, 2022
@sb230132
Contributor

Issue:

Consider the example table below:

CREATE TABLE sample_table (
    cpu double precision null,
    time TIMESTAMP WITH TIME ZONE NOT NULL,
    sensor_id INTEGER NOT NULL,
    name TEXT default 'this is a default string value',
    UNIQUE(sensor_id, time)
);

-- create hypertable
SELECT * FROM create_hypertable('sample_table', 'time',
    chunk_time_interval => INTERVAL '10 day');

-- populate hypertable
INSERT INTO sample_table
SELECT random() AS cpu, time + (INTERVAL '1 minute' * random()) AS time,
       sensor_id, 'very long string'
FROM
    generate_series(now() - INTERVAL '1 months', now() - INTERVAL '1 week', INTERVAL '1 minute') AS g1(time),
    generate_series(1, 25, 1) AS g2(sensor_id)
ORDER BY time;

-- show_chunks
_timescaledb_internal._hyper_1_1_chunk
_timescaledb_internal._hyper_1_2_chunk
_timescaledb_internal._hyper_1_3_chunk

The chunks above are child tables of sample_table, and they inherit the UNIQUE constraint from sample_table.
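One way to confirm this (a catalog query sketch, assuming the chunk layout shown above) is to list the unique constraints on the hypertable's child tables:

-- list unique constraints defined on the chunks of sample_table
SELECT conrelid::regclass AS chunk, conname
FROM pg_constraint
WHERE contype = 'u'
  AND conrelid IN (SELECT inhrelid FROM pg_inherits
                   WHERE inhparent = 'sample_table'::regclass);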

-- enable compression
ALTER TABLE sample_table SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'sensor_id',
    timescaledb.compress_orderby = 'time'
);

Enabling compression creates a new compressed table _timescaledb_internal._compressed_hypertable_2.

\d _timescaledb_internal._compressed_hypertable_2
              Table "_timescaledb_internal._compressed_hypertable_2"
        Column         |                 Type                  | Collation | Nullable | Default
-----------------------+---------------------------------------+-----------+----------+---------
 cpu                   | _timescaledb_internal.compressed_data |           |          |
 time                  | _timescaledb_internal.compressed_data |           |          |
 sensor_id             | integer                               |           |          |
 name                  | _timescaledb_internal.compressed_data |           |          |
 _ts_meta_count        | integer                               |           |          |
 _ts_meta_sequence_num | integer                               |           |          |
 _ts_meta_min_1        | timestamp with time zone              |           |          |
 _ts_meta_max_1        | timestamp with time zone              |           |          |
Indexes:
    "_compressed_hypertable_2_sensor_id__ts_meta_sequence_num_idx" btree (sensor_id, _ts_meta_sequence_num)
Triggers:
    ts_insert_blocker BEFORE INSERT ON _timescaledb_internal._compressed_hypertable_2 FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()

Note that the UNIQUE constraint is not created for this new compressed table.
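The same catalog check against the compressed table (a sketch, same assumptions as above) should come back empty:

-- no unique constraints exist on the compressed table
SELECT conname
FROM pg_constraint
WHERE contype = 'u'
  AND conrelid = '_timescaledb_internal._compressed_hypertable_2'::regclass;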

-- compress the chunks
SELECT COMPRESS_CHUNK(i) FROM SHOW_CHUNKS('sample_table') i;

-- INSERT duplicates:
INSERT into sample_table SELECT * FROM sample_table;
ERROR: insert into a compressed chunk that has primary or unique constraint is not supported

-- INSERT duplicates WITH ON CONFLICT
INSERT into sample_table SELECT * FROM sample_table ON CONFLICT DO NOTHING;
ERROR: insert with ON CONFLICT or RETURNING clause is not supported on compressed chunks

If I comment out the above error handling code, timescaledb ends up inserting duplicate rows. This is because there is no UNIQUE constraint on the compressed table.
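For reference, the resulting duplicates can be made visible by aggregating over the unique key columns:

-- show (sensor_id, time) pairs that occur more than once
SELECT sensor_id, time, count(*)
FROM sample_table
GROUP BY sensor_id, time
HAVING count(*) > 1;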

Questions:

  1. Should we copy the UNIQUE constraint defined on the base parent table to the compressed table, and let the compressed
     chunks inherit the same constraint?
     The challenge with this approach is that the "time" column's data type is _timescaledb_internal.compressed_data. This data type does not define any operator classes that provide comparison operators.
  2. Should the UNIQUE constraint check be performed after the data is decompressed, or should we do a binary comparison of the compressed data?

I will clarify the above questions with my team and then decide how to approach this problem.

@sb230132
Contributor

sb230132 commented Mar 6, 2023

Refer to #5252.
Will close this issue once #5252 is merged to the main branch.

@WalkerWalker

wow. cool. Super happy to be able to use this feature.

@antekresic
Contributor

#5252 has been merged.
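Assuming a TimescaleDB release that includes #5252, the statement from the reproduction above should now be accepted on compressed chunks:

-- previously raised an error; conflicting rows are now skipped
INSERT INTO sample_table
SELECT * FROM sample_table
ON CONFLICT DO NOTHING;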
