Fix issue creating dimensional constraints #5459
I have not been able to reproduce this bug in an isolation test, because it only happens under very high-concurrency inserts (e.g., parallel-copy with many workers). Still, the change is valuable purely as a refactor, since it removes some legacy PL/pgSQL code. We should try to get rid of the rest of this upcall code in the future.
@shhnwz, @konskov: please review this pull request.
Codecov Report
Coverage diff, main vs. #5459:
  Coverage   90.73%   90.74%
  Files         229      229
  Lines       53563    53655   +92
  Hits        48602    48689   +87
  Misses       4961     4966    +5
... and 5 files with indirect coverage changes
I have not been able to recreate this bug in an isolation test, because it only happens when running really high-concurrency inserts (e.g., parallel-copy with many workers).
Maybe we should start adding probabilistic tests that just simulate this high-concurrency situation for a couple of minutes. Still better than nothing. Probably can be done with TAP tests.
During chunk creation, the chunk's dimensional CHECK constraints are created via an "upcall" to PL/pgSQL code. However, creating dimensional constraints in PL/pgSQL code sometimes fails, especially during high-concurrency inserts, because PL/pgSQL code scans metadata using a snapshot that might not see the same metadata as the C code. As a result, chunk creation sometimes fails during constraint creation.

To fix this issue, implement dimensional CHECK-constraint creation in C code. Other constraints (FK, PK, etc.) are still created via an upcall, but should probably also be rewritten in C. However, since these constraints don't depend on recently updated metadata, this is left to a future change.

Fixes timescale#5456
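To make the idea of building a dimensional CHECK constraint in C more concrete, here is a minimal, self-contained sketch. It is not the code from this pull request: `SliceRange`, `bound_is_finite`, and `build_dimension_check` are hypothetical names, and the rule that an open-ended bound is simply omitted from the expression is an assumption made for illustration. The finiteness test mirrors the `val > INT64_MIN AND val < INT64_MAX` check quoted in the review of this PR.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a dimension slice: a half-open range
 * [range_start, range_end) along one dimension. */
typedef struct SliceRange
{
	int64_t range_start;
	int64_t range_end;
} SliceRange;

/* A bound at either extreme of the int64 domain is treated as open-ended. */
static bool
bound_is_finite(int64_t val)
{
	return val > INT64_MIN && val < INT64_MAX;
}

/*
 * Write a CHECK expression for `column` into buf.
 * Returns the expression length, 0 if both bounds are open-ended (assumed
 * here to mean that no constraint is needed), or -1 if buf is too small.
 * The column name is not escaped; a real implementation would have to
 * quote identifiers properly.
 */
static int
build_dimension_check(char *buf, size_t buflen, const char *column, const SliceRange *slice)
{
	bool has_start = bound_is_finite(slice->range_start);
	bool has_end = bound_is_finite(slice->range_end);
	int n;

	if (has_start && has_end)
		n = snprintf(buf, buflen,
					 "(\"%s\" >= %" PRId64 ") AND (\"%s\" < %" PRId64 ")",
					 column, slice->range_start, column, slice->range_end);
	else if (has_start)
		n = snprintf(buf, buflen, "(\"%s\" >= %" PRId64 ")", column, slice->range_start);
	else if (has_end)
		n = snprintf(buf, buflen, "(\"%s\" < %" PRId64 ")", column, slice->range_end);
	else
	{
		if (buflen > 0)
			buf[0] = '\0';
		return 0;
	}

	/* snprintf reports the length it wanted; treat truncation as an error. */
	return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}
```

Because the expression is derived directly from the slice the caller already holds in memory, a function of this shape does not need to re-read metadata under a separate snapshot, which is the failure mode the description above attributes to the PL/pgSQL path.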
LGTM overall.
    SELECT val > (-9223372036854775808)::bigint AND val < 9223372036854775807::bigint
$BODY$ SET search_path TO pg_catalog, pg_temp;

CREATE OR REPLACE FUNCTION _timescaledb_internal.dimension_slice_get_constraint_sql(
I am curious: why are we still using SQL functions internally? I am wondering about the performance of those calls if they are on a hot path.
This is just legacy stuff and I think it is way overdue for a refactor into C. It also turned out to not be a great idea from a visibility/snapshot perspective.
Don't think it is that big of a performance issue, though, given that the function is only called when creating a new chunk.
Automated backport to 2.10.x not done: cherry-pick failed.
for (int i = 0; i < chunk->cube->num_slices; i++)
{
	if (chunk->cube->slices[i]->fd.dimension_id == dimension_id)
	{
		chunk->cube->slices[i] = new_slice;
		break;
	}
}
Is it OK to exit this loop without finding a dimension to update? If not, you might want to have a check after the loop to make sure that you actually found a dimension to update. Regardless, you might want to add a comment describing if this is OK or not.
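One way to make the outcome of such a loop explicit is to have the replacement helper report whether it found a match, so the caller can decide if a miss is an error. The sketch below is only an illustration with simplified, hypothetical types (`Slice`, `Cube`, `cube_replace_slice`); it is not the TimescaleDB structures or the code under review.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the chunk's slice metadata. */
typedef struct Slice
{
	int dimension_id;
	long range_start;
	long range_end;
} Slice;

typedef struct Cube
{
	int num_slices;
	Slice *slices[8];
} Cube;

/*
 * Replace the slice belonging to dimension_id with new_slice.
 * Returns true if a slice was replaced and false if no slice in the cube
 * matches, leaving the cube untouched in that case.
 */
static bool
cube_replace_slice(Cube *cube, int dimension_id, Slice *new_slice)
{
	for (int i = 0; i < cube->num_slices; i++)
	{
		if (cube->slices[i]->dimension_id == dimension_id)
		{
			cube->slices[i] = new_slice;
			return true;
		}
	}

	return false;
}
```

A caller that considers a missing dimension to be corrupt metadata can raise an error when the function returns false; a caller for which a miss is legitimate can ignore the result, ideally with a comment saying why.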
const Dimension *dim;
Constraint *constr;

dim = ts_hyperspace_get_dimension_by_id(ht->space, slice->fd.dimension_id);
Variable `slice` can in theory be NULL here if there is something wrong with the metadata. You might want to ensure that it is not NULL before starting to dereference it.
Suggested change:
- dim = ts_hyperspace_get_dimension_by_id(ht->space, slice->fd.dimension_id);
+ Ensure(slice, "dimension slice was missing");
+ dim = ts_hyperspace_get_dimension_by_id(ht->space, slice->fd.dimension_id);
This release includes these noteworthy features:
* compressed hypertable enhancements:
  * UPDATE/DELETE support
  * ON CONFLICT DO UPDATE
* Join support for hierarchical Continuous Aggregates
* performance improvements

**Features**
* timescale#5212 Allow pushdown of reference table joins
* timescale#5221 Improve Realtime Continuous Aggregate performance
* timescale#5252 Improve unique constraint support on compressed hypertables
* timescale#5339 Support UPDATE/DELETE on compressed hypertables
* timescale#5344 Enable JOINS for Hierarchical Continuous Aggregates
* timescale#5361 Add parallel support for partialize_agg()
* timescale#5417 Refactor and optimize distributed COPY
* timescale#5454 Add support for ON CONFLICT DO UPDATE for compressed hypertables
* timescale#5547 Skip Ordered Append when only 1 child node is present
* timescale#5510 Propagate vacuum/analyze to compressed chunks
* timescale#5584 Reduce decompression during constraint checking
* timescale#5530 Optimize compressed chunk resorting

**Bugfixes**
* timescale#5396 Fix SEGMENTBY columns predicates to be pushed down
* timescale#5427 Handle user-defined FDW options properly
* timescale#5442 Decompression may have lost DEFAULT values
* timescale#5459 Fix issue creating dimensional constraints
* timescale#5570 Improve interpolate error message on datatype mismatch
* timescale#5573 Fix unique constraint on compressed tables
* timescale#5615 Add permission checks to run_job()
* timescale#5614 Enable run_job() for telemetry job
* timescale#5578 Fix on-insert decompression after schema changes
* timescale#5613 Quote username identifier appropriately
* timescale#5525 Fix tablespace for compressed hypertable and corresponding toast
* timescale#5642 Fix ALTER TABLE SET with normal tables
* timescale#5666 Reduce memory usage for distributed analyze
* timescale#5668 Fix subtransaction resource owner

**Thanks**
* @kovetskiy and @DZDomi for reporting performance regression in Realtime Continuous Aggregates
* @ollz272 for reporting an issue with interpolate error messages
This release contains new features and bug fixes since the 2.10.3 release. We deem it moderate priority for upgrading.

This release includes these noteworthy features:
* Support for DML operations on compressed chunks:
  * UPDATE/DELETE support
  * Support for unique constraints on compressed chunks
  * Support for `ON CONFLICT DO UPDATE`
  * Support for `ON CONFLICT DO NOTHING`
* Join support for hierarchical Continuous Aggregates

**Features**
* timescale#5212 Allow pushdown of reference table joins
* timescale#5221 Improve Realtime Continuous Aggregate performance
* timescale#5252 Improve unique constraint support on compressed hypertables
* timescale#5339 Support UPDATE/DELETE on compressed hypertables
* timescale#5344 Enable JOINS for Hierarchical Continuous Aggregates
* timescale#5361 Add parallel support for partialize_agg()
* timescale#5417 Refactor and optimize distributed COPY
* timescale#5454 Add support for ON CONFLICT DO UPDATE for compressed hypertables
* timescale#5547 Skip Ordered Append when only 1 child node is present
* timescale#5510 Propagate vacuum/analyze to compressed chunks
* timescale#5584 Reduce decompression during constraint checking
* timescale#5530 Optimize compressed chunk resorting
* timescale#5639 Support sending telemetry event reports

**Bugfixes**
* timescale#5396 Fix SEGMENTBY columns predicates to be pushed down
* timescale#5427 Handle user-defined FDW options properly
* timescale#5442 Decompression may have lost DEFAULT values
* timescale#5459 Fix issue creating dimensional constraints
* timescale#5570 Improve interpolate error message on datatype mismatch
* timescale#5573 Fix unique constraint on compressed tables
* timescale#5615 Add permission checks to run_job()
* timescale#5614 Enable run_job() for telemetry job
* timescale#5578 Fix on-insert decompression after schema changes
* timescale#5613 Quote username identifier appropriately
* timescale#5525 Fix tablespace for compressed hypertable and corresponding toast
* timescale#5642 Fix ALTER TABLE SET with normal tables
* timescale#5666 Reduce memory usage for distributed analyze
* timescale#5668 Fix subtransaction resource owner

**Thanks**
* @kovetskiy and @DZDomi for reporting performance regression in Realtime Continuous Aggregates
* @ollz272 for reporting an issue with interpolate error messages