
Allow INSERTS/UPDATES/DELETES to compressed chunks #2857

Closed
srstsavage opened this issue Jan 24, 2021 · 33 comments

@srstsavage

Creating this issue to track a feature request for allowing INSERTS, UPDATES, and DELETES to compressed chunks within hypertables (i.e. performing those operations on hypertables with compressed chunks using data with times within the ranges of compressed chunks). Apologies if this is already covered by another issue, but I scanned for a while and didn't find one.

The current limitations are described in the docs and a blog post, which also mentions:

We also plan to weaken/remove this limitation in future releases.

I'm creating this issue mostly in hopes of hearing from the devs about the feasibility and/or timeline of implementing this functionality. If you're interested in this as a user, please just +1 instead of comment spamming, unless you have a relevant use case to describe.

Our use case is collecting long time series from many devices outputting observations at various frequencies. The vast majority of data ingest is "current" (now-ish) data, and access patterns are primarily wide aggregations of current conditions or full time series extraction of a single device; this makes using compression on older chunks very desirable.

However, we are also continuously adding new devices with existing historical time series to the data store, and certain devices periodically need older data deleted and replaced with quality-controlled data. Currently this requires manual intervention: either manually decompressing chunks, inserting data, and recompressing (which is complicated and temporarily requires extra disk space), or running the backfill script, which I haven't yet tested but which doesn't seem to be aware of secondary/space partitioning columns (i.e. it would decompress all chunks corresponding to a time range rather than only the specific chunks matching the hash of the secondary partitioning column).
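
For reference, a minimal sketch of the manual workflow described above, assuming a hypothetical hypertable named observations (device_id, tstamp, value) and a placeholder chunk name:

-- find the chunks covering the backfill window
select show_chunks('observations',
                   newer_than => '2020-01-01'::timestamptz,
                   older_than => '2020-02-01'::timestamptz);

-- decompress the affected chunk, insert the historical rows, then recompress
select decompress_chunk('_timescaledb_internal._hyper_1_42_chunk');

insert into observations (device_id, tstamp, value)
values (17, '2020-01-15T12:00:00+00', 42.0);

select compress_chunk('_timescaledb_internal._hyper_1_42_chunk');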

I'll note that our use case can tolerate somewhat sub-optimal performance for INSERTS/UPDATES/DELETES on compressed chunks. In other words, we can wait a bit longer for these operations to complete as long as they work as expected without manual intervention.

@srstsavage
Author

One more note, the backfill script seems to rely on the usage of compression policies, which aren't yet supported on distributed hypertables.

#1900

@mfreed
Member

mfreed commented Jan 28, 2021

Thanks for this tracking issue, @shane-axiom. The team indeed plans to mitigate some of these limitations.

As a start, we've recently merged a PR that allows one to ALTER the schema of a compressed hypertable (by supporting the ability to add a column). This will likely be released in v2.1. #2736

@gtskaushik

Eagerly waiting for this functionality

@q00Dree

q00Dree commented Mar 24, 2021

That would be nice. The current situation with backfilling is kind of uncomfortable.

@mfreed
Member

mfreed commented May 18, 2021

Update: We've been making great progress on this, and should soon release the ability to INSERT rows directly into compressed hypertables.

@wkopec-tt

Do you have any plans for DELETEs?
We would love to move our user event data out to TimescaleDB, and our tests show an amazing compression rate, but unfortunately we need to be able to delete user data to ensure GDPR compliance.
GDPR compliance is critical for any software operating in the European Union, so I'm sure it would be a very important feature for a lot of Timescale users.
Thanks!

@Pinacolada8

Hey guys,
I know it is annoying to ask about release times, but do you have any estimate?
Sorry for my bad English and for the annoying question.

@mgagliardo91

Would also love to see UPSERTS with unique constraints on compressed tables. With how much data we have per compressed chunk, it takes a really long time to use the backfill script - a single compressed chunk can take up to 2 hours to decompress even if we only need to upsert 2K records.

@fergus3000

We would also like to be able to upsert into compressed chunks where we have a uniqueness constraint. So far we have implemented the workaround of decompressing, upserting and re-compressing chunks, but have concerns about performance.
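
A rough sketch of that decompress/upsert/recompress workaround, assuming a hypothetical hypertable readings with a unique key on (device_id, tstamp) and a hypothetical staging table readings_staging holding the rows to upsert:

-- decompress every chunk overlapping the backfill window
select decompress_chunk(c, true)
from show_chunks('readings',
                 newer_than => '2022-06-01'::timestamptz,
                 older_than => '2022-07-01'::timestamptz) c;

-- upsert the corrected rows while the chunks are uncompressed
insert into readings (device_id, tstamp, value)
select device_id, tstamp, value from readings_staging
on conflict (device_id, tstamp) do update set value = excluded.value;

-- recompress the chunks afterwards
select compress_chunk(c, true)
from show_chunks('readings',
                 newer_than => '2022-06-01'::timestamptz,
                 older_than => '2022-07-01'::timestamptz) c;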

@fetchezar

Hi! I'd like to propose a limited form of DELETE (and maybe even UPDATE) that I think is rather viable. Note that I'm speculating here, since this is just what I've derived from looking at the TimescaleDB tables.

We start with a simple schema:

create table signal (
  id int8 not null primary key,
  name text not null unique
);
create table sample (
  id int8 not null,
  tstamp timestamptz not null,
  value float8 not null,
  primary key (id, tstamp),
  constraint fk_sample foreign key(id) references signal(id)
);
select create_hypertable('sample', 'tstamp', chunk_time_interval => interval '5 days');
alter table sample set (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'id'
);
select add_compression_policy('sample', interval '10 days');

So if we have a signal called "oven_temperature" with an id 23, we'd see samples like:

23, '2022-07-04T00:30:00', 120.0
23, '2022-07-04T00:35:00', 130.0
23, '2022-07-04T00:40:00', 155.0

While the normal uncompressed chunks just follow the base table's schema, the compression settings will generate tables in the background named compress_hyper_x_y_chunk in the following form:

create table compress_hyper_x_y_chunk (
  id int8,
  tstamp compressed_data,
  value compressed_data
  -- plus other internal timescaledb stuff
);

Now, eyeballing it, it seems that editing either the timestamp or value data is very complicated, since you'd need to manipulate the binary blobs in each column, which are compressed and so on.

However, it seems possible to delete the data of an entire signal easily, since you could just add a where id = 23 and delete all the signal's history.

In the same vein, it may be somewhat possible to update the table, replacing the id with something else. Conceptually you'd be "redirecting" the history of one signal to another. For instance, if you had "oven1_temperature" with id 23, "oven2_temperature" with id 24, and later you wanted to unify that history on "oven1_temperature", you could replace occurrences of id 24 with id 23.

The foreign key on sample means you can't delete signals at all while the hypertable is compressed, so at the very least "reorganizing" existing history by pointing it at other signal IDs could be useful.
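
Purely to illustrate the kind of operation being proposed (this pokes at TimescaleDB internals and is not a supported interface), reusing the compressed chunk sketched above:

-- hypothetical: drop signal 23's compressed history by deleting its segments
delete from compress_hyper_x_y_chunk where id = 23;

-- hypothetical: "redirect" signal 24's compressed history onto signal 23
update compress_hyper_x_y_chunk set id = 23 where id = 24;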

All of this is a far cry from a generic standard SQL delete or update, but I believe it'd be useful in a bunch of cases, probably exposed via specific TimescaleDB functions.

What do you think?

@xvaara
Contributor

xvaara commented Oct 12, 2022

I created a function to delete from compressed table using the segmentby col:
https://gist.github.com/xvaara/81990e8291019f931387492c1869fe84

It has a lot of debug output; just comment out the notices when running in production.

@Ozzard

Ozzard commented Nov 14, 2022

I created a function to delete from compressed table using the segmentby col: https://gist.github.com/xvaara/81990e8291019f931387492c1869fe84

You just made my life a whole load simpler, as this allows me to meet our data retention requirements by removing segmented data from a single hypertable.

@mgagliardo91

Any ongoing work on being able to issue UPSERTs/UPDATEs to a compressed chunk? We have been having to deal with merging out-of-order and in-order compressed data together for some time now, and we're hoping to see an update on this so we can simplify this aspect of how we use Timescale.

@WalkerWalker

WalkerWalker commented Dec 9, 2022

Looking forward to this feature!!!! I have been fighting with manually decompressing, upserting, and recompressing quite a lot, and dealing with deadlocks along the way. I am sure this feature can make my life a lot easier.

@iroussos

iroussos commented Dec 9, 2022

A quick status update for anyone interested in this feature:

We are actively working on supporting updates, deletes, and upserts on compressed chunks. To do so, we are revisiting the way we perform DML on compressed chunks, which will also speed up inserts.

We expect to start gradually releasing support for the aforementioned commands during Q1 of 2023 🎆

@WalkerWalker

Wonder if this feature covers this one or not?
#4113

@iroussos

iroussos commented Dec 9, 2022

Wonder if this feature covers this one or not? #4113

@WalkerWalker yes ON CONFLICT DO NOTHING will be covered by this update :-)
(it is a much simpler use case of upsert, so we'll start with that either way)

@mgagliardo91

A quick status update for anyone interested in this feature:

We are actively working on supporting updates, deletes, and upserts on compressed chunks. To do so, we are revisiting the way we perform DML on compressed chunks, which will also speed up inserts.

We expect to start gradually releasing support for the aforementioned commands during Q1 of 2023 🎆

@iroussos this is great to hear! We are very excited to simplify our current approach to upserting into compressed chunks. Will it support compressions that contain two segment_by columns?

@jsvisa

jsvisa commented Dec 13, 2022

/ping

@iroussos

Will it support compressions that contain two segment_by columns?

@mgagliardo91 Yes, all compression configurations will be supported

@jvanns

jvanns commented Jan 30, 2023

DELETE support would sure make the experience of managing a hypertable via regular statements a lot more streamlined ;)

@WeeJeWel

WeeJeWel commented Feb 2, 2023

@mgagliardo91 I'm very excited for this feature to arrive in TimescaleDB Cloud.

Can you share a rough ETA?

@mgagliardo91

@WeeJeWel I think you'd want to ask @iroussos

@WeeJeWel

WeeJeWel commented Feb 2, 2023

@WeeJeWel I think you'd want to ask @iroussos

You're right, haha!

@iroussos Hello ^.^

@iroussos

@WeeJeWel sorry for the late response, lots of snow in my tiny end of the world, so I was slow to catch up with my notifications.

We are actively working on this, and progress is looking good on our side. We are working in public, so for the more adventurous among you, you can even track our progress through our PRs (I'll link a few in case anyone is interested in what we have been doing).

With the usual disclaimer that until releases are finalized we can never make promises, here is a rough list to provide some transparency on our plans:

  1. We have completed our initial re-architecture of how compression works (#4926) - This is a required step to enable DML on compressed chunks, but it also provides a ~2x improvement in INSERT rate into compressed chunks. We expect this to be included in 2.10, which is just around the corner (before the end of the month)
  2. We are currently working on enabling unique constraints on compressed chunks (#5252), and on supporting the INSERT .. ON CONFLICT DO NOTHING and then the ON CONFLICT DO UPDATE (a.k.a. upsert) clauses - We expect those to land in the release that will follow 2.10
  3. Finally, support for UPDATES and DELETES will be released - release TBD, but we'll keep our promise and work on those once the previous ones are ready

Let us know what you believe we should be working on after the list above is in the can, by adding new issues with feature requests or by voting and commenting on existing ones :-)
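
As a rough illustration of the clauses mentioned in point 2, reusing the sample table from earlier in this thread (primary key on (id, tstamp)):

-- skip rows that already exist, even if they sit in a compressed chunk
insert into sample (id, tstamp, value)
values (23, '2022-07-04T00:30:00+00', 120.0)
on conflict do nothing;

-- upsert: overwrite the stored value when the row already exists
insert into sample (id, tstamp, value)
values (23, '2022-07-04T00:35:00+00', 131.0)
on conflict (id, tstamp) do update set value = excluded.value;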

@WeeJeWel

Thank you for the transparency @iroussos !

I'm looking forward to seeing this available in TimescaleDB Cloud.

@sb230132
Contributor

sb230132 commented Mar 6, 2023

Refer #5339

@iroussos

iroussos commented Jun 6, 2023

Closing successfully as this request is now supported in TimescaleDB 2.11 🎆

For more info check our release notes
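
As a quick sketch of what 2.11 enables, reusing the sample table from earlier in this thread, plain DML should now work even when the target rows live in compressed chunks:

-- update a value stored in a compressed chunk
update sample set value = 125.0
where id = 23 and tstamp = '2022-07-04T00:30:00+00';

-- delete part of a signal's history from compressed chunks
delete from sample
where id = 23 and tstamp < '2022-07-01T00:00:00+00';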

@iroussos iroussos closed this as completed Jun 6, 2023
@bmillwood

I'm using self-hosted TimescaleDB and just ran ALTER EXTENSION timescaledb UPDATE and confirmed with \dx I'm now on 2.11.0. However, I'm still seeing these messages:

ERROR:  cannot update/delete rows from chunk "_hyper_20_6086_chunk" as it is compressed

(etc.)

Is there anything else I need to do to make this possible? Do I need to restart PostgreSQL, recompress chunks, or anything like that?

@jsvisa

jsvisa commented Jun 28, 2023

@bmillwood try to update the timescaledb_toolkit too and then retry

@bmillwood

@jsvisa thanks, but I don't think I have that installed (it doesn't show up in \dx)

@jsvisa

jsvisa commented Jun 28, 2023

@bmillwood or maybe you need to specify the target database, otherwise the extension may not be updated as expected. ref timescale/docs.timescale.com-content#300

@bmillwood

bmillwood commented Jun 28, 2023

@jsvisa I don't use the Docker installation and don't have timescaledb installed on any other databases.

I looked at the code for the PR and I see this:

	/*
	 * We do not support UPDATE/DELETE operations on compressed hypertables
	 * on PG versions < 14, because Custom Scan (HypertableModify) node is
	 * not generated in the plan for UPDATE/DELETE operations on hypertables
	 */

Indeed, I'm on PostgreSQL 13. Maybe this should have been mentioned in the release notes?
