Reset baserel cache on invalid hypertable cache #5086

mkindahl · 2022-12-13T08:46:36Z

When popping the hypertable cache stack, it might happen that the hypertable cache was invalidated between the push and the pop. In that case, the baserel cache can contain invalid entries pointing to the now popped hypertable cache, so we reset the baserel cache.

Fixes #4795
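
To illustrate the failure mode described above, here is a minimal, self-contained sketch in C. It is a toy model and not the actual patch: every type and function name in it (`HtCache`, `BaserelEntry`, `stack_push`, `stack_pop`, `baserel_reset`, `run_scenario`) is hypothetical, and the real code may detect the invalidation differently. The sketch only assumes that baserel entries point into a pinned hypertable cache and that an invalidated cache is freed once its last pin is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy model only: every name here is hypothetical, not TimescaleDB's API. */
typedef struct HtCache {
	int refcount; /* pins held by the global pointer and by stack slots */
	bool valid;   /* cleared when an invalidation arrives */
} HtCache;

typedef struct BaserelEntry {
	int relid;
	const HtCache *owner; /* cache that owns the data this entry points at */
} BaserelEntry;

enum { MAX_DEPTH = 8, MAX_RELS = 8 };

static HtCache *current_cache;    /* cache handed out to new pins */
static HtCache *stack[MAX_DEPTH]; /* pinned caches, innermost last */
static int depth;
static BaserelEntry baserel[MAX_RELS];
static int baserel_len;

static HtCache *cache_create(void) {
	HtCache *cache = malloc(sizeof(*cache));
	assert(cache != NULL);
	cache->refcount = 1; /* reference held by current_cache */
	cache->valid = true;
	return cache;
}

static void cache_release(HtCache *cache) {
	if (--cache->refcount == 0)
		free(cache);
}

/* Simulate an invalidation: the old cache survives only while pinned. */
static void cache_invalidate(void) {
	current_cache->valid = false;
	cache_release(current_cache);
	current_cache = cache_create();
}

static void stack_push(void) {
	assert(depth < MAX_DEPTH);
	current_cache->refcount++;
	stack[depth++] = current_cache;
}

/* Record a baserel entry that points into the innermost pinned cache. */
static void baserel_add(int relid) {
	assert(depth > 0 && baserel_len < MAX_RELS);
	baserel[baserel_len].relid = relid;
	baserel[baserel_len].owner = stack[depth - 1];
	baserel_len++;
}

static void baserel_reset(void) {
	baserel_len = 0;
}

/* Pop the innermost pin; drop baserel entries if that cache went stale. */
static void stack_pop(void) {
	assert(depth > 0);
	HtCache *cache = stack[--depth];
	bool was_invalidated = !cache->valid; /* read before a possible free */

	cache_release(cache);
	if (was_invalidated)
		baserel_reset(); /* entries would otherwise dangle */
}

/* Returns how many baserel entries survive one push/pop cycle. */
int run_scenario(bool invalidate_between) {
	current_cache = cache_create();
	depth = 0;
	baserel_len = 0;

	stack_push();
	baserel_add(1001);
	baserel_add(1002);
	if (invalidate_between)
		cache_invalidate();
	stack_pop();

	int remaining = baserel_len;
	cache_release(current_cache); /* drop the global reference */
	current_cache = NULL;
	return remaining;
}
```

In this model, `run_scenario(false)` returns 2 because the cache stayed valid and the entries remain usable, while `run_scenario(true)` returns 0 because the pop detects the stale cache and resets the baserel cache instead of leaving dangling pointers behind.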

codecov · 2022-12-13T08:59:16Z

Codecov Report

Merging #5086 (7920d89) into main (c76dfa0) will decrease coverage by 0.00%.
The diff coverage is 90.00%.

❗ Current head 7920d89 differs from pull request most recent head cf908a0. Consider uploading reports for the commit cf908a0 to get more accurate results

@@            Coverage Diff             @@
##             main    #5086      +/-   ##
==========================================
- Coverage   89.59%   89.59%   -0.01%     
==========================================
  Files         227      227              
  Lines       51586    51619      +33     
==========================================
+ Hits        46219    46246      +27     
- Misses       5367     5373       +6

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/chunk_scan.c | `98.05% <ø> (ø)` | |
| tsl/src/compression/compression.c | `96.32% <ø> (-0.01%)` | ⬇️ |
| src/hypertable_restrict_info.c | `90.75% <77.77%> (-0.39%)` | ⬇️ |
| src/guc.c | `94.11% <100.00%> (+0.11%)` | ⬆️ |
| src/planner/planner.c | `95.85% <100.00%> (+0.03%)` | ⬆️ |
| src/process_utility.c | `90.20% <100.00%> (+<0.01%)` | ⬆️ |
| src/loader/bgw_message_queue.c | `86.36% <0.00%> (-2.28%)` | ⬇️ |
| src/loader/loader.c | `94.38% <0.00%> (-0.26%)` | ⬇️ |
| tsl/src/bgw_policy/job.c | `88.31% <0.00%> (-0.05%)` | ⬇️ |
| src/planner/expand_hypertable.c | `94.24% <0.00%> (+<0.01%)` | ⬆️ |

... and 5 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

akuzm

I think this works as a quick fix. Would be very good to have a repro of this problem.

mkindahl · 2022-12-13T09:12:47Z

> I think this works as a quick fix. Would be very good to have a repro of this problem.

Yes, this is a quick fix. We need to rework this code.

I'm not sure how easy it is to reproduce, but I'm investigating whether it is possible.

jnidzwetzki

The changes look good to me.

The changes in .github/workflows/sanitizer-build-and-test.yaml should be reverted before merging to prevent the OOMs in our regular CI runs.

svenklemm · 2022-12-13T16:37:23Z

Wouldn't it be better to decouple the baserel cache from the hypertable cache?

mkindahl · 2022-12-13T16:38:33Z

> Wouldn't it be better to decouple the baserel cache from the hypertable cache?

Yes, but this is more work and we want to get the 2.9.0 release out. I was planning on refactoring this code to decouple the caches.

@byazici

This release adds major new features since the 2.8.1 release. We deem it moderate priority for upgrading.

This release includes these noteworthy features:
* Hierarchical Continuous Aggregates (aka Continuous Aggregate on top of another Continuous Aggregate)
* Improve `time_bucket_gapfill` function to allow specifying the timezone to bucket
* Introduce fixed schedules for background jobs and the ability to check job errors.
* Use `alter_data_node()` to change the data node configuration. This function introduces the option to configure the availability of the data node.

This release also includes several bug fixes.

**Features**
* #4476 Batch rows on access node for distributed COPY
* #4567 Exponentially backoff when out of background workers
* #4650 Show warnings when not following best practices
* #4664 Introduce fixed schedules for background jobs
* #4668 Hierarchical Continuous Aggregates
* #4670 Add timezone support to time_bucket_gapfill
* #4678 Add interface for troubleshooting job failures
* #4718 Add ability to merge chunks while compressing
* #4786 Extend the now() optimization to also apply to CURRENT_TIMESTAMP
* #4820 Support parameterized data node scans in joins
* #4830 Add function to change configuration of a data nodes
* #4966 Handle DML activity when datanode is not available
* #4971 Add function to drop stale chunks on a datanode

**Bugfixes**
* #4663 Don't error when compression metadata is missing
* #4673 Fix now() constification for VIEWs
* #4681 Fix compression_chunk_size primary key
* #4696 Report warning when enabling compression on hypertable
* #4745 Fix FK constraint violation error while insert into hypertable which references partitioned table
* #4756 Improve compression job IO performance
* #4770 Continue compressing other chunks after an error
* #4794 Fix degraded performance seen on timescaledb_internal.hypertable_local_size() function
* #4807 Fix segmentation fault during INSERT into compressed hypertable
* #4822 Fix missing segmentby compression option in CAGGs
* #4823 Fix a crash that could occur when using nested user-defined functions with hypertables
* #4840 Fix performance regressions in the copy code
* #4860 Block multi-statement DDL command in one query
* #4898 Fix cagg migration failure when trying to resume
* #4904 Remove BitmapScan support in DecompressChunk
* #4906 Fix a performance regression in the query planner by speeding up frozen chunk state checks
* #4910 Fix a typo in process_compressed_data_out
* #4918 Cagg migration orphans cagg policy
* #4941 Restrict usage of the old format (pre 2.7) of continuous aggregates in PostgreSQL 15.
* #4955 Fix cagg migration for hypertables using timestamp without timezone
* #4968 Check for interrupts in gapfill main loop
* #4988 Fix cagg migration crash when refreshing the newly created cagg
* #5054 Fix segfault after second ANALYZE
* #5086 Reset baserel cache on invalid hypertable cache

**Thanks**
* @byazici for reporting a problem with segmentby on compressed caggs
* @jflambert for reporting a crash with nested user-defined functions.
* @jvanns for reporting hypertable FK reference to vanilla PostgreSQL partitioned table doesn't seem to work
* @kou for fixing a typo in process_compressed_data_out
* @kyrias for reporting a crash when ANALYZE is executed on extended query protocol mode with extension loaded.
* @tobiasdirksen for requesting Continuous aggregate on top of another continuous aggregate
* @Xima for reporting a bug in Cagg migration
* @xvaara for helping reproduce a bug with bitmap scans in transparent decompression

mkindahl force-pushed the reset-baserel-cache branch from 59dc971 to 61c209b on December 13, 2022 08:50

akuzm approved these changes Dec 13, 2022

mkindahl self-assigned this Dec 13, 2022

jnidzwetzki approved these changes Dec 13, 2022

mkindahl force-pushed the reset-baserel-cache branch from 7920d89 to f7a8a28 on December 13, 2022 15:44

mkindahl marked this pull request as ready for review December 13, 2022 15:47

mkindahl force-pushed the reset-baserel-cache branch from f7a8a28 to cf908a0 on December 13, 2022 15:50

mkindahl enabled auto-merge (rebase) December 13, 2022 17:05

mkindahl merged commit 558688c into timescale:main Dec 13, 2022

mkindahl deleted the reset-baserel-cache branch December 13, 2022 18:48

mkindahl mentioned this pull request Dec 14, 2022

Set cache pointer to null before destroying #5070

Closed

fabriziomello mentioned this pull request Jan 6, 2023

Assertion IS_VALID_CHUNK(chunk) fails in CI run #4900

Closed

akuzm mentioned this pull request Jan 25, 2023

Assertion failure in ts_hypertable_has_compression_table in CI #5029

Closed
