Fix index matching during DML decompression #6428

Merged
merged 1 commit into from Dec 18, 2023

@antekresic antekresic commented Dec 14, 2023

Prior to version 2.8, compressed chunks used a different format for indexing. This change attempts to find the best index that can actually be used when scanning compressed data, and moves filters around based on the selected index. If no index exists, it should fall back to a sequential scan.

Fixes #6367
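
The selection logic described above can be illustrated with a small standalone sketch. This is not the code from `tsl/src/compression/compression.c`: the `IndexDef` and `Filter` types, the `choose_scan` function, and the "longest matching column prefix" scoring rule are all hypothetical simplifications chosen for illustration. It only shows the general idea of picking a usable index, splitting filters into index keys and remaining filters, and falling back to a sequential scan when nothing matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not TimescaleDB's real structures. */
typedef struct
{
	const char *name;
	const char *const *columns; /* indexed columns, in index order */
	size_t n_columns;
} IndexDef;

typedef struct
{
	const char *column; /* column the predicate applies to */
	bool is_index_key;  /* set by choose_scan: true = evaluated via the index */
} Filter;

/* Does any filter constrain this column? */
static bool
has_filter_on(const Filter *filters, size_t n_filters, const char *column)
{
	for (size_t i = 0; i < n_filters; i++)
		if (strcmp(filters[i].column, column) == 0)
			return true;
	return false;
}

/* Number of leading index columns that are constrained by some filter. */
static size_t
usable_prefix(const IndexDef *index, const Filter *filters, size_t n_filters)
{
	size_t n = 0;

	while (n < index->n_columns && has_filter_on(filters, n_filters, index->columns[n]))
		n++;
	return n;
}

/*
 * Pick the index with the longest usable column prefix (assumed scoring rule).
 * Filters on the matched prefix columns are marked as index keys; all other
 * filters stay as ordinary filters applied to the scanned rows.
 * Returns NULL when no index is usable, meaning: do a sequential scan.
 */
static const IndexDef *
choose_scan(const IndexDef *indexes, size_t n_indexes, Filter *filters, size_t n_filters)
{
	const IndexDef *best = NULL;
	size_t best_prefix = 0;

	assert(filters != NULL || n_filters == 0);

	for (size_t i = 0; i < n_filters; i++)
		filters[i].is_index_key = false;

	for (size_t i = 0; i < n_indexes; i++)
	{
		size_t prefix = usable_prefix(&indexes[i], filters, n_filters);

		if (prefix > best_prefix)
		{
			best = &indexes[i];
			best_prefix = prefix;
		}
	}

	/* Move the matching filters over to the selected index. */
	for (size_t i = 0; i < n_filters; i++)
		for (size_t c = 0; c < best_prefix; c++)
			if (strcmp(filters[i].column, best->columns[c]) == 0)
				filters[i].is_index_key = true;

	return best;
}
```

With an index on (`device_id`, `seq_num`) and filters on `device_id` and `value`, this sketch selects that index, turns the `device_id` filter into an index key, and leaves the `value` filter to be applied afterwards. With no indexes at all it returns `NULL`, and the caller would scan sequentially with every filter applied to the scanned rows.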

codecov bot commented Dec 14, 2023

Codecov Report

Attention: 10 lines in your changes are missing coverage. Please review.

Comparison is base (8f73f95) 87.30% compared to head (c1bc866) 87.36%.

| Files | Patch % | Lines |
|---|---|---|
| tsl/src/compression/compression.c | 90.99% | 1 Missing and 9 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6428      +/-   ##
==========================================
+ Coverage   87.30%   87.36%   +0.06%     
==========================================
  Files         184      184              
  Lines       41615    41544      -71     
  Branches     9232     9219      -13     
==========================================
- Hits        36330    36294      -36     
+ Misses       3612     3583      -29     
+ Partials     1673     1667       -6     


@antekresic antekresic force-pushed the find-index branch 2 times, most recently from 4fe9974 to 7356e7f Compare December 14, 2023 11:20
@antekresic antekresic marked this pull request as ready for review December 14, 2023 11:48
@github-actions github-actions bot requested a review from akuzm December 14, 2023 11:48

@mkindahl, @akuzm: please review this pull request.

Powered by pull-review

@antekresic antekresic force-pushed the find-index branch 2 times, most recently from 019671e to 08e5599 Compare December 18, 2023 10:38
Prior to version 2.8, compressed chunks had a different format
for indexing. This change attempts to find the best index
it can actually use during scanning of compressed data
and moves filters around based on the selected index. If
no index exists, it should fall back to doing a sequential
scan.
@antekresic antekresic merged commit 384bcaa into timescale:main Dec 18, 2023
43 checks passed
@timescale-automation

Automated backport to 2.13.x not done: cherry-pick failed.

Git status

HEAD detached at origin/2.13.x
You are currently cherry-picking commit 384bcaa7a.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	new file:   .unreleased/fix_6428
	modified:   tsl/src/compression/compression.h
	modified:   tsl/test/expected/compression_update_delete.out
	modified:   tsl/test/sql/compression_update_delete.sql

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   tsl/src/compression/compression.c


Job log

@timescale-automation timescale-automation added the auto-backport-not-done Automated backport of this PR has failed non-retriably (e.g. conflicts) label Dec 18, 2023
jnidzwetzki added a commit to jnidzwetzki/timescaledb that referenced this pull request Jan 3, 2024
This release contains bug fixes since the 2.13.0 release.
We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#6365 Use numrows_pre_compression in approximate row count
* timescale#6377 Use processed group clauses in PG16
* timescale#6384 Change bgw_log_level to use PGC_SUSET
* timescale#6393 Disable vectorized sum for expressions.
* timescale#6408 Fix groupby pathkeys for gapfill in PG16
* timescale#6428 Fix index matching during DML decompression
* timescale#6439 Fix compressed chunk permission handling on PG16
* timescale#6443 Fix lost concurrent CAgg updates
* timescale#6454 Fix unique expression indexes on compressed chunks
* timescale#6465 Fix use of freed path in decompression sort logic

**Thanks**
* @MA-MacDonald for reporting an issue with gapfill in PG16
* @aarondglover for reporting an issue with unique expression indexes on compressed chunks
* @adriangb for reporting an issue with security barrier views on PG16
@jnidzwetzki jnidzwetzki mentioned this pull request Jan 3, 2024
jnidzwetzki added a commit that referenced this pull request Jan 9, 2024
This release contains bug fixes since the 2.13.0 release.
We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* #6365 Use numrows_pre_compression in approximate row count
* #6377 Use processed group clauses in PG16
* #6384 Change bgw_log_level to use PGC_SUSET
* #6393 Disable vectorized sum for expressions.
* #6405 Read CAgg watermark from materialized data
* #6408 Fix groupby pathkeys for gapfill in PG16
* #6428 Fix index matching during DML decompression
* #6439 Fix compressed chunk permission handling on PG16
* #6443 Fix lost concurrent CAgg updates
* #6454 Fix unique expression indexes on compressed chunks
* #6465 Fix use of freed path in decompression sort logic

**Thanks**
* @MA-MacDonald for reporting an issue with gapfill in PG16
* @aarondglover for reporting an issue with unique expression indexes on compressed chunks
* @adriangb for reporting an issue with security barrier views on PG16
Labels
auto-backport-not-done (Automated backport of this PR has failed non-retriably, e.g. conflicts), backported-2.13.x, bug, compression
Development

Successfully merging this pull request may close these issues.

[Bug]: Crash in compressed table if segmentby index is dropped
4 participants