Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add table mappings to compression settings #6990

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

erimatnor
Copy link
Contributor

@erimatnor erimatnor commented Jun 3, 2024

Refactor the compression settings metadata table to include a mapping from a chunk's relid to its compressed chunk's relid.

Adding this mapping makes compression settings the main metadata table for compression-related information, while decoupling it from chunk metadata. This simplifies the code that looks up compression metadata as it no longer requires first looking up the corresponding compressed chunk.

The new compression settings is a step towards removing a chunk's compression table from the chunk metadata. In other words, the "compressed chunk" will no longer be a chunk, just a relation associated with the regular chunk via compression settings.

Disable-check: force-changelog-file

@erimatnor erimatnor added compression tech-debt Needs refactoring and improvement tasks related to the source code and its architecture. labels Jun 3, 2024
@erimatnor erimatnor force-pushed the refactor-compression-settings branch 7 times, most recently from a48c167 to f2652ad Compare June 4, 2024 09:22
Copy link

codecov bot commented Jun 4, 2024

Codecov Report

Attention: Patch coverage is 96.51163% with 3 lines in your changes missing coverage. Please review.

Project coverage is 81.78%. Comparing base (59f50f2) to head (b9df9ea).
Report is 569 commits behind head on main.

Files with missing lines Patch % Lines
tsl/src/compression/compression.c 91.30% 0 Missing and 2 partials ⚠️
tsl/src/compression/create.c 83.33% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6990      +/-   ##
==========================================
+ Coverage   80.06%   81.78%   +1.71%     
==========================================
  Files         190      199       +9     
  Lines       37181    37038     -143     
  Branches     9450     9675     +225     
==========================================
+ Hits        29770    30290     +520     
+ Misses       2997     2861     -136     
+ Partials     4414     3887     -527     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Refactor the compression settings metadata table to include a mapping
from a chunk's relid to its compressed chunk's relid.

Adding this mapping makes compression settings the main metadata table
for compression-related information, while decoupling it from chunk
metadata. This simplifies the code that looks up compression metadata
as it no longer requires first looking up the corresponding compressed
chunk.

The new compression settings is a step towards removing a chunk's
compression table from the chunk metadata. In other words, the
"compressed chunk" will no longer be a chunk, just a relation
associated with the regular chunk via compression settings.
@erimatnor erimatnor force-pushed the refactor-compression-settings branch from f2652ad to b9df9ea Compare June 4, 2024 10:21
@erimatnor erimatnor marked this pull request as ready for review June 4, 2024 11:57
Copy link
Member

@svenklemm svenklemm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this PR might need a bit more work. If we're going to store info about compressed and uncompressed assignments in the compression_settings table, we should take it out of the chunk catalog table in this same update. Storing the same thing in two places can lead to confusion and mistakes. We want to keep things tidy and have one source of truth for this information.

@erimatnor
Copy link
Contributor Author

It looks like this PR might need a bit more work. If we're going to store info about compressed and uncompressed assignments in the compression_settings table, we should take it out of the chunk catalog table in this same update. Storing the same thing in two places can lead to confusion and mistakes. We want to keep things tidy and have one source of truth for this information.

While I agree we should get rid of the metadata in the chunk catalog table, I think we should do this in increments. That's a pretty heavy lift and has other consequences on dependent tables, like compression_chunk_size, etc.

This change does not mean we will store the same information in multiple places apart from an extra ID. I think we can live with this in the short term until the other changes are merged too.

@erimatnor erimatnor requested a review from antekresic June 7, 2024 16:08
Copy link
Contributor

@antekresic antekresic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For posterity, lets make a concrete plan for the rest of the refactoring that should follow this PR for clarity and prioritization purposes.

Removing the compressed_chunk_id chunk metadata should be followed up and would clean up the catalog and our code nicely.

@@ -1916,7 +1916,9 @@ process_rename_column(ProcessUtilityArgs *args, Cache *hcache, Oid relid, Rename
* we don't do anything. */
if (ht)
{
ts_compression_settings_rename_column_hypertable(ht, stmt->subname, stmt->newname);
ts_compression_settings_rename_column_recurse(ht->main_table_relid,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ts_compression_settings_rename_column_recurse(ht->main_table_relid,
ts_compression_settings_rename_column_cascade(ht->main_table_relid,

}

TSDLLEXPORT bool
ts_compression_settings_delete_by_compress_relid(Oid relid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should comment that these functions are part of an intermediate step while we transition from using compressed table relid to chunk relid.

}

TSDLLEXPORT CompressionSettings *
ts_compression_settings_get_by_compress_relid(Oid relid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a todo to make clear this will eventually go away since its an intermediate step towards the complete refactor.


process_predicates(chunk, settings, predicates, &heap_filters, &index_filters, &is_null);

chunk_rel = table_open(chunk->table_id, RowExclusiveLock);
comp_chunk = ts_chunk_get_by_id(chunk->fd.compressed_chunk_id, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you get the compressed table relid from the settings instead of getting the compressed chunk metadata here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compression tech-debt Needs refactoring and improvement tasks related to the source code and its architecture.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants