Support for inserts into compressed chunks #3230
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3230 +/- ##
==========================================
+ Coverage 90.26% 90.47% +0.21%
==========================================
Files 216 216
Lines 35720 35986 +266
==========================================
+ Hits 32241 32557 +316
+ Misses 3479 3429 -50
Continue to review full report at Codecov.
typedef enum ChunkCompressionStatus
{
	CHUNK_COMPRESS_NONE = 0,
So we have this enum, and then we have CHUNK_STATUS_UNORDERED-type #defines as well.
The #defines are part of the database protocol and ideally will never change. They describe the chunk status column, and can be bitwise ORed. This is an internal in-memory state for some limited-scope logic here. It's mostly to separate the cases of the return values of ts_chunk_get_compression_status().

My suggestion was to change CHUNK_DROPPED to CHUNK_COMPRESS_DROPPED for this reason.
FWIW, I was also a bit confused by this, and the inconsistent naming between enums and flags. For instance, "unordered" having the "compress" prefix in one case (CHUNK_COMPRESS_UNORDERED) but not the other (CHUNK_STATUS_UNORDERED).
Had a few queries, but overall looks very good to me.
Looks solid in terms of functionality. Have some suggestions, questions, and nits that I think would be good to look at and potentially address.
             tgname              | tgtype | tgenabled |     relname
---------------------------------+--------+-----------+------------------
 compressed_chunk_insert_blocker |      7 | O         | _hyper_1_2_chunk
(1 row)
I guess we should now remove this whole check if the insert blocker is no longer needed?
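As a concrete way to back the test change, the removal of the blocker could be confirmed with a catalog query along these lines (a sketch using the standard PostgreSQL pg_trigger catalog; the trigger name is taken from the test output above):

```sql
-- Verify the old insert blocker trigger no longer exists on any chunk.
-- Expect zero rows once the blocker has been removed.
SELECT t.tgname, c.relname
FROM pg_trigger t
JOIN pg_class c ON c.oid = t.tgrelid
WHERE t.tgname = 'compressed_chunk_insert_blocker';
```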
h.table_name AS hypertable_name,
c.schema_name AS chunk_schema,
c.table_name AS chunk_name,
c.status AS chunk_status,
It would be helpful to convert the status to a human readable format with, e.g., a CASE statement. For instance, "compressed", "not_compressed", etc.
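One possible shape for such a CASE statement, assuming the status column is a bitmask where 1 marks a compressed chunk and 2 marks an unordered one (illustrative values for the sketch, not confirmed internals):

```sql
-- Sketch: map the integer chunk status to a human-readable label
SELECT c.table_name AS chunk_name,
       CASE
           WHEN c.status = 0 THEN 'not_compressed'
           WHEN c.status & 2 > 0 THEN 'compressed_unordered'
           WHEN c.status & 1 > 0 THEN 'compressed'
           ELSE 'unknown'
       END AS chunk_status
FROM _timescaledb_catalog.chunk c;
```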
@@ -223,3 +223,158 @@ SELECT COUNT(*) AS dropped_chunks_count
SELECT add_compression_policy AS job_id
FROM add_compression_policy('conditions', INTERVAL '1 day') \gset
CALL run_job(:job_id);
\i include/recompress_basic.sql
I will just note that most of the added testing here seems to be manual compression/recompression and not related to background jobs. I only found one call to add_compression_policy, so the question is whether this is the appropriate test file to have the majority of these tests.

I have a question about the compression policy. Now that we do either compress or recompress in the policy, together with the fact that the policy only compresses one chunk in each run, is there a risk that we end up in a state where the compression policy makes no real progress and keeps recompressing the same chunk every time it runs? I am thinking of a situation where regular inserts into an already compressed chunk cause it to be recompressed over and over again by the policy.
Suggestion: Add CHANGELOG entry.
This release adds major new features since the 2.2.1 release. We deem it moderate priority for upgrading. This release adds support for inserting data into compressed chunks and improves performance when inserting data into distributed hypertables. It also adds support for triggers and compression policies on distributed hypertables. The bug fixes in this release address issues related to the handling of privileges on compressed hypertables, locking, and triggers with transition tables.

**Features**
* timescale#3116 Add distributed hypertable compression policies
* timescale#3162 Use COPY when executing distributed INSERTs
* timescale#3199 Add GENERATED column support on distributed hypertables
* timescale#3210 Add trigger support on distributed hypertables
* timescale#3230 Support for inserts into compressed chunks

**Bugfixes**
* timescale#3209 Propagate grants to compressed hypertables
* timescale#3229 Use correct lock mode when updating chunk
* timescale#3241 Fix assertion failure in decompress_chunk_plan_create
* timescale#3243 Fix assertion failure in decompress_chunk_plan_create
* timescale#3250 Fix constraint triggers on hypertables
* timescale#3251 Fix segmentation fault due to incorrect call to chunk_scan_internal
* timescale#3252 Fix blocking triggers with transition tables

**Thanks**
* @yyjdelete for reporting a crash with decompress_chunk and identifying the bug in the code
Yes, I think this is possible if inserts keep going back to the same chunk. Sven and I discussed a mitigation strategy for this: basically allow users to separate this into two separate policies, a) a compression policy and b) a recompression policy, so that the recompression does not affect the compression. Plan to make these changes.
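A sketch of what the split might look like from the user's side, using TimescaleDB's add_compression_policy and add_job; the procedure name recompress_chunks_proc and the schedule intervals are hypothetical placeholders:

```sql
-- a) regular compression policy: compresses chunks older than 7 days
SELECT add_compression_policy('conditions', INTERVAL '7 days');

-- b) separate custom job that only recompresses unordered chunks,
--    so recompression load does not stall compression of new chunks
SELECT add_job('recompress_chunks_proc',
               schedule_interval => INTERVAL '1 hour',
               config => '{"hypertable_name": "conditions"}');
```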
-- test if default value for b and sequence value for id is used
INSERT INTO vessels(timec, i, t) VALUES ('2020-01-02 10:16:00-05', 11, 'default');
COPY vessels(timec, i, t) FROM STDIN DELIMITER ',';
Do we also have a test with COPY directly into the chunk when it is compressed?
No. It needs to go through chunk_insert_state to get directed to the correct chunk.
Add CompressRowSingleState. This has functions to compress a single row.
Support defaults, sequences and check constraints with inserts into compressed chunks
Compressed chunks with inserts after being compressed have batches that are not ordered according to compress_orderby. For those chunks we cannot set pathkeys on the DecompressChunk node, and we need an extra sort step if we require ordered output from those chunks.
Remove the chunk_dml_blocker trigger which was used to prevent INSERTs into compressed chunks.
After inserts go into a compressed chunk, the chunk is marked as unordered. This PR adds a new function, recompress_chunk, that compresses the data and sets the status back to compressed. Further optimizations for this function are planned but are not part of this PR. The function can be invoked by calling SELECT recompress_chunk(<chunk_name>). recompress_chunk is automatically invoked by the compression policy job when it sees that a chunk is in the unordered state.
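For example, after out-of-order inserts have marked a chunk unordered (the chunk name here is borrowed from the test output earlier in this conversation):

```sql
-- Recompress a chunk that inserts have marked as unordered,
-- returning its status to compressed
SELECT recompress_chunk('_timescaledb_internal._hyper_1_2_chunk');
```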
Add a test case for COPY on distributed hypertables with compressed chunks. Verifies that recompress_chunk and the compression policy work as expected. Additional changes include: clean up commented code; make use of BulkInsertState optional in the row compressor; add a test for inserts into a compressed chunk by a role other than the owner.
Two insert transactions could potentially try to update the chunk status to unordered. This results in one of the transactions failing with a "tuple concurrently updated" error. Before updating the status, lock the tuple for update, thus forcing the other transaction to wait for the tuple lock; then check the status column value and update it if needed.
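The pattern described is roughly the following, expressed as SQL for illustration only (the actual fix is C code against the catalog; the bit value 2 for "unordered" is an assumed placeholder):

```sql
BEGIN;
-- Lock the chunk's catalog tuple so a concurrent transaction waits on
-- the row lock instead of failing with "tuple concurrently updated"
SELECT status FROM _timescaledb_catalog.chunk
WHERE id = 2 FOR UPDATE;

-- Re-check under the lock and only update if the bit is not already set
UPDATE _timescaledb_catalog.chunk
SET status = status | 2
WHERE id = 2 AND status & 2 = 0;
COMMIT;
```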
This patch adds a recompress procedure that may be used as a custom job when compression and recompression should run as separate background jobs.
This release adds major new features since the 2.2.1 release. We deem it moderate priority for upgrading. This release adds support for inserting data into compressed chunks and improves performance when inserting data into distributed hypertables. Distributed hypertables now also support triggers and compression policies. The bug fixes in this release address issues related to the handling of privileges on compressed hypertables, locking, and triggers with transition tables.

**Features**
* #3116 Add distributed hypertable compression policies
* #3162 Use COPY when executing distributed INSERTs
* #3199 Add GENERATED column support on distributed hypertables
* #3210 Add trigger support on distributed hypertables
* #3230 Support for inserts into compressed chunks

**Bugfixes**
* #3213 Propagate grants to compressed hypertables
* #3229 Use correct lock mode when updating chunk
* #3243 Fix assertion failure in decompress_chunk_plan_create
* #3250 Fix constraint triggers on hypertables
* #3251 Fix segmentation fault due to incorrect call to chunk_scan_internal
* #3252 Fix blocking triggers with transition tables

**Thanks**
* @yyjdelete for reporting a crash with decompress_chunk and identifying the bug in the code
* @fabriziomello for documenting the prerequisites when compiling against PostgreSQL 13
Disable-check: commit-count