
Feature request: delete compressed chunks if condition matches segmentby #2692

Open
alex88 opened this issue Nov 27, 2020 · 4 comments

alex88 commented Nov 27, 2020

This is a follow-up to a Slack discussion.
In our scenario we have IoT devices sending data into a raw_data table structured like this:

 - machine_id int4
 - metric_name varchar
 - value real
 - timestamp timestamptz

We also have compression enabled for chunks older than 2 weeks, which is segmented by machine_id,metric_name.
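For context, the setup described above might look roughly like this (a hedged sketch; the table and policy names are taken from the comment, and the exact DDL is assumed, not quoted from the reporter):

```sql
-- Hypertable matching the columns listed above
CREATE TABLE raw_data (
  machine_id  int4,
  metric_name varchar,
  value       real,
  "timestamp" timestamptz NOT NULL
);
SELECT create_hypertable('raw_data', 'timestamp');

-- Compression segmented by machine_id, metric_name,
-- compressing chunks older than 2 weeks
ALTER TABLE raw_data SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'machine_id, metric_name'
);
SELECT add_compression_policy('raw_data', INTERVAL '2 weeks');
```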
When we have to delete a machine there are three options:

  • use a foreign key
  • sequentially decompress, delete, compress each chunk
  • leave the data there

The first option works, however:

  • having a foreign key slowed down our initial data import so much that it would have taken days to complete
  • deleting all the data linked to a machine across multiple hypertables via ON DELETE CASCADE caused slowdowns and locks (after 20 minutes I had to cancel the query)

The second and third options are sub-optimal.
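The second option (sequentially decompress, delete, recompress) can be sketched with TimescaleDB's chunk functions; this is an illustrative outline only, and the machine id is a placeholder:

```sql
-- Decompress every compressed chunk in the affected range
SELECT decompress_chunk(c)
FROM show_chunks('raw_data', older_than => INTERVAL '2 weeks') c;

-- Delete the machine's rows (42 is a placeholder id)
DELETE FROM raw_data WHERE machine_id = 42;

-- Recompress the chunks again
SELECT compress_chunk(c)
FROM show_chunks('raw_data', older_than => INTERVAL '2 weeks') c;
```

The cost of this round trip on large hypertables is exactly why the feature request asks to delete whole segments without decompressing.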

From my understanding of compression, data is stored in a columnar format, but there is one compressed row per value of the segmentby columns. So deleting based on a condition on the segmentby columns wouldn't need to modify the internal value arrays; it would just delete those rows?

It's probably not that simple, though. What are your thoughts about it?


k-rus commented Nov 27, 2020

Thank you for the feature request and describing your use case!

@NunoFilipeSantos NunoFilipeSantos added feature-request Feature proposal and removed community-request labels Sep 28, 2021

lasseste commented Jan 6, 2022

Is there any news on this feature?
We have a very similar use case:
We have time-series data stored in a hypertable with a deviceId, and different retention periods per device id.
For deleting our data we have a script that deletes old data according to its retention period, chunk by chunk (we only ever delete all data for one deviceId within a chunk, never parts of it).
That way we manage to delete data relatively efficiently, even though we cannot drop complete chunks.
We sometimes have to run a VACUUM FULL on each chunk to actually free disk space from old, nearly empty chunks.
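A per-chunk deletion script like the one described could look roughly like this (an illustrative sketch only, not the commenter's actual script; the hypertable name, device column, and retention interval are assumptions):

```sql
-- Delete one device's rows chunk by chunk in old chunks
DO $$
DECLARE
  chunk regclass;
BEGIN
  FOR chunk IN
    SELECT show_chunks('measurements', older_than => INTERVAL '90 days')
  LOOP
    -- touch only this chunk, so locking stays localized
    EXECUTE format('DELETE FROM %s WHERE device_id = 7', chunk);
  END LOOP;
END $$;

-- VACUUM FULL cannot run inside a transaction block, so it has to be
-- issued separately per chunk afterwards, e.g.:
-- VACUUM FULL <chunk_name>;
```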

However, we would now like to compress the data, as it compresses very well. For our deletion script that means we would need to decompress and recompress each chunk in the process.

Since the architecture already allows segmenting the data by the device id, we were wondering whether it would be possible to delete segments without decompressing first. That's how I found this feature request.


xvaara commented Oct 12, 2022

I created a function to delete from a compressed table using the segmentby column:
https://gist.github.com/xvaara/81990e8291019f931387492c1869fe84

It has a lot of debug output; just comment out the notices in production.


mfreed commented Dec 24, 2022

The team has been making progress on supporting DELETEs on compressed chunks. Please see this issue for discussion:

#2857
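For readers finding this thread later: recent TimescaleDB releases (2.11 and later, to my understanding; verify against the release notes for your version) support DML directly on compressed chunks, so the original use case may reduce to a plain statement like:

```sql
-- On versions supporting DELETE on compressed chunks
-- (machine id is a placeholder):
DELETE FROM raw_data WHERE machine_id = 42;
```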
