Refactor INSERT into compressed chunks #4926

Merged
merged 1 commit into from Dec 21, 2022

Conversation

svenklemm
Member

@svenklemm svenklemm commented Nov 5, 2022

This patch changes INSERTs into compressed chunks so that rows are no
longer compressed immediately. Instead, they are stored in the
uncompressed chunk and later merged with the compressed chunk by a
separate job.

This greatly simplifies the INSERT code path, as we no longer have to
rewrite the target of INSERTs and compress on the fly, which leads to a
roughly 2x improvement in INSERT rate into compressed chunks.
Additionally, this improves TRIGGER support for INSERTs into compressed chunks.

This is a necessary refactoring to allow UPSERT/UPDATE/DELETE on
compressed chunks in follow-up patches.

Fixes #4655

@svenklemm svenklemm force-pushed the compression_insert branch 3 times, most recently from ea040be to 9b97537 on November 5, 2022 14:29
@svenklemm
Member Author

Depends on #4925

src/copy.c: review thread outdated and resolved
src/nodes/chunk_insert_state.c: review thread outdated and resolved
@svenklemm svenklemm force-pushed the compression_insert branch 11 times, most recently from 5a0e71f to 13f58ee on November 20, 2022 18:36
@svenklemm svenklemm changed the title from "Refactor compressed INSERT to use staging area" to "Refactor INSERT into compressed chunks" on Nov 20, 2022
@svenklemm svenklemm force-pushed the compression_insert branch 2 times, most recently from 81145d1 to bcb6a97 on November 21, 2022 09:49
@svenklemm svenklemm marked this pull request as ready for review November 21, 2022 09:49
@codecov

codecov bot commented Nov 21, 2022

Codecov Report

Merging #4926 (b455357) into main (1d51672) will decrease coverage by 0.18%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #4926      +/-   ##
==========================================
- Coverage   89.61%   89.42%   -0.19%     
==========================================
  Files         227      227              
  Lines       51744    51617     -127     
==========================================
- Hits        46369    46160     -209     
- Misses       5375     5457      +82     
| Impacted Files | Coverage Δ |
|---|---|
| src/chunk.c | 92.08% <100.00%> (+0.94%) ⬆️ |
| src/copy.c | 90.47% <100.00%> (-0.49%) ⬇️ |
| src/nodes/chunk_dispatch_state.c | 92.85% <100.00%> (-1.15%) ⬇️ |
| src/nodes/chunk_insert_state.c | 97.37% <100.00%> (+0.17%) ⬆️ |
| src/planner/planner.c | 95.90% <100.00%> (+0.05%) ⬆️ |
| tsl/src/nodes/decompress_chunk/decompress_chunk.c | 95.08% <100.00%> (+0.09%) ⬆️ |
| tsl/src/nodes/decompress_chunk/qual_pushdown.c | 93.29% <100.00%> (ø) |
| tsl/src/remote/dist_copy.c | 88.79% <100.00%> (ø) |
| tsl/src/compression/compression.c | 86.87% <0.00%> (-9.17%) ⬇️ |
| src/loader/bgw_message_queue.c | 86.36% <0.00%> (-2.85%) ⬇️ |

... and 4 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bd20afc...b455357.

Contributor

@fabriziomello fabriziomello left a comment

LGTM. Please add compression-15.out before merging it!

@svenklemm svenklemm force-pushed the compression_insert branch 2 times, most recently from dd23a98 to 5e8b23f on November 23, 2022 09:47
sql/policy_internal.sql: review thread outdated and resolved
Comment on lines 68 to 72
-- chunk status bits:
-- 1: compressed
-- 2: compressed unordered
-- 4: frozen
-- 8: compressed partial
Member

We probably need to update the high-level description of our compression logic. Do we have one somewhere?

Member Author

Hmm, I don't think we have what you are asking for here atm.

@@ -65,6 +65,11 @@ BEGIN
lag := _timescaledb_internal.subtract_integer_from_now(htoid, lag::BIGINT);
END IF;

-- chunk status bits:
-- 1: compressed
-- 2: compressed unordered
Member

Unordered chunks are not created by INSERTs anymore, but can still be created by UPDATEs, right?

Member Author

We currently don't support UPDATE on compressed chunks. Unordered chunks won't be created by this INSERT change but can still be created by the merge chunk functionality of compression.
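
For readers following this thread: the status values in the quoted comment (1, 2, 4, 8) are powers of two, so several of them can be combined in one integer bitmask. Below is a minimal sketch in C of how such flags could be tested, set, and cleared. The enum and function names are hypothetical and not taken from the TimescaleDB source; only the numeric values and their meanings come from the comment quoted above. The rule in `status_after_insert`, that an INSERT into an already compressed chunk marks it as partially compressed, is an assumption inferred from the PR description, not confirmed behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values mirror the "chunk status bits" comment. */
enum
{
    CHUNK_FLAG_COMPRESSED = 1,           /* 1: compressed */
    CHUNK_FLAG_COMPRESSED_UNORDERED = 2, /* 2: compressed unordered */
    CHUNK_FLAG_FROZEN = 4,               /* 4: frozen */
    CHUNK_FLAG_COMPRESSED_PARTIAL = 8    /* 8: compressed partial */
};

/* Returns true if the given flag bit is set in status. */
static inline bool
status_has(int32_t status, int32_t flag)
{
    return (status & flag) != 0;
}

/* Returns status with the given flag bit set. */
static inline int32_t
status_set(int32_t status, int32_t flag)
{
    return status | flag;
}

/* Returns status with the given flag bit cleared. */
static inline int32_t
status_clear(int32_t status, int32_t flag)
{
    return status & ~flag;
}

/*
 * Assumption for illustration only: an INSERT into a chunk that is already
 * compressed leaves uncompressed rows next to the compressed data, so the
 * chunk is marked "compressed partial". Other chunks keep their status.
 */
static inline int32_t
status_after_insert(int32_t status)
{
    if (status_has(status, CHUNK_FLAG_COMPRESSED))
        return status_set(status, CHUNK_FLAG_COMPRESSED_PARTIAL);
    return status;
}
```

Under these assumptions, a compressed chunk with status 1 that receives an INSERT would end up with status 9 (bits 1 and 8 set), and a later merge step could clear bit 8 again with `status_clear`.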

Member

@akuzm akuzm left a comment

Nice simplification.

@svenklemm svenklemm force-pushed the compression_insert branch 4 times, most recently from 91bcb49 to 2ace43f on December 21, 2022 10:46
@svenklemm svenklemm enabled auto-merge (rebase) December 21, 2022 11:38
@svenklemm svenklemm merged commit 4527f51 into timescale:main Dec 21, 2022
@akuzm akuzm added the auto-backport-not-done label (Automated backport of this PR has failed non-retriably, e.g. conflicts) on Jan 26, 2023
@timescale-automation

Automated backport to 2.9.x not done: cherry-pick failed.

@mahipv mahipv mentioned this pull request Feb 14, 2023
mahipv added a commit that referenced this pull request Feb 20, 2023
This release contains new features and bug fixes since the 2.9.3
release.

This release is high priority for upgrade. We strongly recommend that
you upgrade as soon as possible.

**Features**
* #5241 Allow RETURNING clause when inserting into compressed chunks
* #5245 Manage life-cycle of connections via memory contexts
* #5246 Make connection establishment interruptible
* #5253 Make data node command execution interruptible
* #5243 Enable real-time aggregation for continuous aggregates with
joins
* #5262 Extend enabling compression on a continuous aggregate with
'compress_segmentby' and 'compress_orderby' parameters

**Bugfixes**
* #4926 Fix corruption when inserting into compressed chunks
* #5218 Add role-level security to job error log
* #5214 Fix use of prepared statement in async module
* #5290 Compression can't be enabled on continuous aggregates when
segmentby/orderby columns need quotation
* #5239 Fix next_start calculation for fixed schedules
mahipv pushed a commit that referenced this pull request Feb 21, 2023
This release contains new features and bug fixes since the 2.9.3 release.
We deem it moderate priority for upgrading.

This release includes these noteworthy features:
* Joins in continuous aggregates
* Re-architecture of how compression works: ~2x improvement on INSERT rate into compressed chunks.
* Full PostgreSQL 15 support for all existing features. Support for the newly introduced MERGE command on hypertables will be introduced in a follow-up release.

**PostgreSQL 12 deprecation announcement**
We will continue supporting PostgreSQL 12 until July 2023. Closer to that time, we will announce the specific version of TimescaleDB in which PostgreSQL 12 support will not be included going forward.

**Old format of Continuous Aggregates deprecation announcement**
TimescaleDB 2.7 introduced a new format for continuous aggregates that improves performance.
All instances with Continuous Aggregates using the old format should [migrate to the new format](https://docs.timescale.com/api/latest/continuous-aggregates/cagg_migrate/) by July 2023,
when support for the old format will be removed.
Closer to that time, we will announce the specific version of TimescaleDB in which support for this feature will not be included going forward.

**Features**
* #4874 Allow joins in continuous aggregates
* #4926 Refactor INSERT into compressed chunks
* #5241 Allow RETURNING clause when inserting into compressed chunks
* #5245 Manage life-cycle of connections via memory contexts
* #5246 Make connection establishment interruptible
* #5253 Make data node command execution interruptible
* #5262 Extend enabling compression on a continuous aggregate with 'compress_segmentby' and 'compress_orderby' parameters

**Bugfixes**
* #5214 Fix use of prepared statement in async module
* #5218 Add role-level security to job error log
* #5239 Fix next_start calculation for fixed schedules
* #5290 Fix enabling compression on continuous aggregates with columns requiring quotation

**Thanks**
* @henriquegelio for reporting the issue on fixed schedules
Development

Successfully merging this pull request may close these issues.

[Bug]: Corruption when inserting into compressed chunk