New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable joins for hierarchical continuous aggregates #5344
Conversation
@nikkhils, @akuzm: please review this pull request.
|
c9c4631
to
377f788
Compare
Codecov Report
@@ Coverage Diff @@
## main #5344 +/- ##
==========================================
- Coverage 90.73% 90.73% -0.01%
==========================================
Files 229 229
Lines 53563 53584 +21
==========================================
+ Hits 48602 48620 +18
- Misses 4961 4964 +3
... and 3 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
88c37cc
to
c5c8109
Compare
tsl/src/continuous_aggs/create.c
Outdated
// if (rte->relkind == RELKIND_VIEW || rte_other->relkind == RELKIND_VIEW) | ||
// ereport(ERROR, | ||
// (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), | ||
// errmsg("joins for hierarchical continuous aggregates are not supported"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover?
CHANGELOG.md
Outdated
@@ -43,6 +43,7 @@ Sooner to that time, we will announce the specific version of TimescaleDB in whi | |||
* #5343 Set PortalContext when starting job | |||
* #5312 Add timeout support to the ping_data_node() | |||
* #5212 Allow pushdown of reference table joins | |||
* #5344 Enable joins for hierarchical continuous aggregates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* #5344 Enable joins for hierarchical continuous aggregates | |
* #5344 Enable JOINS for Hierarchical Continuous Aggregates |
tsl/test/expected/cagg_joins-12.out
Outdated
@@ -310,7 +310,8 @@ SELECT time_bucket(INTERVAL '1 day', day) AS bucket, | |||
FROM conditions, devices_view | |||
WHERE conditions.device_id = devices_view.device_id | |||
GROUP BY name, bucket, devices_view.device_id; | |||
ERROR: joins for hierarchical continuous aggregates are not supported | |||
NOTICE: refreshing continuous aggregate "conditions_summary_daily_view" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that these statements need the mentioned feature - but don't we need real queries/plan-s to track that it works as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @kgyrtkirk and perhaps we should add some JOIN tests in the caggs_on_caggs suite where we extensively test it checking plans, doing different time bucket data type tests... caggs on caggs validation tests... etc. WDYT @RafiaSabih ?
893eba2
to
ad80419
Compare
ad80419
to
131c6b0
Compare
* #5343 Set PortalContext when starting job | ||
* #5312 Add timeout support to the ping_data_node() | ||
* #5212 Allow pushdown of reference table joins | ||
* #5344 Enable JOINS for Hierarchical continuous aggregates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess your CHANGELOG entry is on the wrong place, it should reside on the Unreleased section at the beginning of the file.
\if :IS_JOIN | ||
time_bucket(:BUCKET_WIDTH_1ST, "time") AS bucket, | ||
SUM(temperature) AS temperature, | ||
device_id | ||
\else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Insted of all this conditionals what about something like this in the tests that include cagg_on_cagg_common.sql
-- In the cagg_on_cagg.sql
\set EXTRA_GROUPBY_COLUMN
-- In the cagg_on_cagg_joins.sql
\set EXTRA_GROUPBY_COLUMN ,device_id
And then use it in the query in cagg_on_cagg_common.sql
:
\if :IS_JOIN | |
time_bucket(:BUCKET_WIDTH_1ST, "time") AS bucket, | |
SUM(temperature) AS temperature, | |
device_id | |
\else | |
time_bucket(:BUCKET_WIDTH_1ST, "time") AS bucket, | |
SUM(temperature) AS temperature | |
:EXTRA_GROUPBY_COLUMN |
Doing that you'll reduce the log output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have included in the main schema itself as per the previous comment.
@RafiaSabih you also should add tests for distributed hypertables like we do in |
a279300
to
5ed25ac
Compare
d5db146
to
d9357db
Compare
tsl/src/continuous_aggs/create.c
Outdated
if (rte->lateral || rte_other->lateral || | ||
(rte->relkind == RELKIND_VIEW && ts_is_hypertable(rte_other->relid)) || | ||
(rte_other->relkind == RELKIND_VIEW && ts_is_hypertable(rte->relid)) || | ||
(rte->relkind != RELKIND_VIEW && rte_other->relkind != RELKIND_VIEW && | ||
(ts_is_hypertable(rte->relid) == ts_is_hypertable(rte_other->relid)))) | ||
ereport(ERROR, | ||
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), | ||
errmsg("invalid continuous aggregate view"), | ||
errdetail("from clause can only have one hypertable and one normal table"))); | ||
errdetail("views, lateral, multiple hypertables or normal tables" | ||
" are not supported in FROM clause"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch is a bit hard to follow... isn't better to split it in separated checks and with specific message and test case for each of them??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your review, I have broken it in multiple conditions. Regarding the test, they are already covered.
d9357db
to
608beb7
Compare
The joins could be between a continuous aggregate and hypertable, continuous aggregate and a regular Postgres table, and continuous aggregate and a regular Postgres view.
608beb7
to
c9c4fe6
Compare
looks like there is a spelling error in the commit message (heirarchical instead of hierarchical)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the extensive tests, please remember to fix the commit message before merging
foreach (l, inp->final_userquery->jointree->fromlist) | ||
{ | ||
Node *jtnode = (Node *) lfirst(l); | ||
JoinExpr *join = NULL; | ||
if (IsA(jtnode, JoinExpr)) | ||
{ | ||
join = castNode(JoinExpr, jtnode); | ||
RangeTblEntry *jrte = rt_fetch(join->rtindex, inp->final_userquery->rtable); | ||
rte->joinaliasvars = jrte->joinaliasvars; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this is needed?? Can u add a comment here. I also asked for the same in the other PR https://github.com/timescale/timescaledb/pull/5441/files#r1150565424
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This release includes these noteworthy features: * compressed hypertable enhancements: * UPDATE/DELETE support * ON CONFLICT DO UPDATE * Join support for hierarchical Continougs Aggregates * performance improvements **Features** * timescale#5212 Allow pushdown of reference table joins * timescale#5221 Improve Realtime Continuous Aggregate performance * timescale#5252 Improve unique constraint support on compressed hypertables * timescale#5339 Support UPDATE/DELETE on compressed hypertables * timescale#5344 Enable JOINS for Hierarchical Continuous Aggregates * timescale#5361 Add parallel support for partialize_agg() * timescale#5417 Refactor and optimize distributed COPY * timescale#5454 Add support for ON CONFLICT DO UPDATE for compressed hypertables * timescale#5547 Skip Ordered Append when only 1 child node is present * timescale#5510 Propagate vacuum/analyze to compressed chunks * timescale#5584 Reduce decompression during constraint checking * timescale#5530 Optimize compressed chunk resorting **Bugfixes** * timescale#5396 Fix SEGMENTBY columns predicates to be pushed down * timescale#5427 Handle user-defined FDW options properly * timescale#5442 Decompression may have lost DEFAULT values * timescale#5459 Fix issue creating dimensional constraints * timescale#5570 Improve interpolate error message on datatype mismatch * timescale#5573 Fix unique constraint on compressed tables * timescale#5615 Add permission checks to run_job() * timescale#5614 Enable run_job() for telemetry job * timescale#5578 Fix on-insert decompression after schema changes * timescale#5613 Quote username identifier appropriately * timescale#5525 Fix tablespace for compressed hypertable and corresponding toast * timescale#5642 Fix ALTER TABLE SET with normal tables * timescale#5666 Reduce memory usage for distributed analyze * timescale#5668 Fix subtransaction resource owner **Thanks** * @kovetskiy and @DZDomi for reporting peformance regression in Realtime Continuous Aggregates * @ollz272 for reporting an issue with interpolate error messages
This release contains new features and bug fixes since the 2.10.3 release. We deem it moderate priority for upgrading. This release includes these noteworthy features: * Support for DML operations on compressed chunks: * UPDATE/DELETE support * Support for unique constraints on compressed chunks * Support for `ON CONFLICT DO UPDATE` * Support for `ON CONFLICT DO NOTHING` * Join support for hierarchical Continuous Aggregates **Features** * timescale#5212 Allow pushdown of reference table joins * timescale#5221 Improve Realtime Continuous Aggregate performance * timescale#5252 Improve unique constraint support on compressed hypertables * timescale#5339 Support UPDATE/DELETE on compressed hypertables * timescale#5344 Enable JOINS for Hierarchical Continuous Aggregates * timescale#5361 Add parallel support for partialize_agg() * timescale#5417 Refactor and optimize distributed COPY * timescale#5454 Add support for ON CONFLICT DO UPDATE for compressed hypertables * timescale#5547 Skip Ordered Append when only 1 child node is present * timescale#5510 Propagate vacuum/analyze to compressed chunks * timescale#5584 Reduce decompression during constraint checking * timescale#5530 Optimize compressed chunk resorting * timescale#5639 Support sending telemetry event reports **Bugfixes** * timescale#5396 Fix SEGMENTBY columns predicates to be pushed down * timescale#5427 Handle user-defined FDW options properly * timescale#5442 Decompression may have lost DEFAULT values * timescale#5459 Fix issue creating dimensional constraints * timescale#5570 Improve interpolate error message on datatype mismatch * timescale#5573 Fix unique constraint on compressed tables * timescale#5615 Add permission checks to run_job() * timescale#5614 Enable run_job() for telemetry job * timescale#5578 Fix on-insert decompression after schema changes * timescale#5613 Quote username identifier appropriately * timescale#5525 Fix tablespace for compressed hypertable and corresponding toast * timescale#5642 Fix ALTER TABLE SET with normal tables * timescale#5666 Reduce memory usage for distributed analyze * timescale#5668 Fix subtransaction resource owner **Thanks** * @kovetskiy and @DZDomi for reporting peformance regression in Realtime Continuous Aggregates * @ollz272 for reporting an issue with interpolate error messages
This release contains new features and bug fixes since the 2.10.3 release. We deem it moderate priority for upgrading. This release includes these noteworthy features: * Support for DML operations on compressed chunks: * UPDATE/DELETE support * Support for unique constraints on compressed chunks * Support for `ON CONFLICT DO UPDATE` * Support for `ON CONFLICT DO NOTHING` * Join support for hierarchical Continuous Aggregates **Features** * timescale#5212 Allow pushdown of reference table joins * timescale#5221 Improve Realtime Continuous Aggregate performance * timescale#5252 Improve unique constraint support on compressed hypertables * timescale#5339 Support UPDATE/DELETE on compressed hypertables * timescale#5344 Enable JOINS for Hierarchical Continuous Aggregates * timescale#5361 Add parallel support for partialize_agg() * timescale#5417 Refactor and optimize distributed COPY * timescale#5454 Add support for ON CONFLICT DO UPDATE for compressed hypertables * timescale#5547 Skip Ordered Append when only 1 child node is present * timescale#5510 Propagate vacuum/analyze to compressed chunks * timescale#5584 Reduce decompression during constraint checking * timescale#5530 Optimize compressed chunk resorting * timescale#5639 Support sending telemetry event reports **Bugfixes** * timescale#5396 Fix SEGMENTBY columns predicates to be pushed down * timescale#5427 Handle user-defined FDW options properly * timescale#5442 Decompression may have lost DEFAULT values * timescale#5459 Fix issue creating dimensional constraints * timescale#5570 Improve interpolate error message on datatype mismatch * timescale#5573 Fix unique constraint on compressed tables * timescale#5615 Add permission checks to run_job() * timescale#5614 Enable run_job() for telemetry job * timescale#5578 Fix on-insert decompression after schema changes * timescale#5613 Quote username identifier appropriately * timescale#5525 Fix tablespace for compressed hypertable and corresponding toast * timescale#5642 Fix ALTER TABLE SET with normal tables * timescale#5666 Reduce memory usage for distributed analyze * timescale#5668 Fix subtransaction resource owner **Thanks** * @kovetskiy and @DZDomi for reporting peformance regression in Realtime Continuous Aggregates * @ollz272 for reporting an issue with interpolate error messages
This release contains new features and bug fixes since the 2.10.3 release. We deem it moderate priority for upgrading. This release includes these noteworthy features: * Support for DML operations on compressed chunks: * UPDATE/DELETE support * Support for unique constraints on compressed chunks * Support for `ON CONFLICT DO UPDATE` * Support for `ON CONFLICT DO NOTHING` * Join support for hierarchical Continuous Aggregates **Features** * #5212 Allow pushdown of reference table joins * #5221 Improve Realtime Continuous Aggregate performance * #5252 Improve unique constraint support on compressed hypertables * #5339 Support UPDATE/DELETE on compressed hypertables * #5344 Enable JOINS for Hierarchical Continuous Aggregates * #5361 Add parallel support for partialize_agg() * #5417 Refactor and optimize distributed COPY * #5454 Add support for ON CONFLICT DO UPDATE for compressed hypertables * #5547 Skip Ordered Append when only 1 child node is present * #5510 Propagate vacuum/analyze to compressed chunks * #5584 Reduce decompression during constraint checking * #5530 Optimize compressed chunk resorting * #5639 Support sending telemetry event reports **Bugfixes** * #5396 Fix SEGMENTBY columns predicates to be pushed down * #5427 Handle user-defined FDW options properly * #5442 Decompression may have lost DEFAULT values * #5459 Fix issue creating dimensional constraints * #5570 Improve interpolate error message on datatype mismatch * #5573 Fix unique constraint on compressed tables * #5615 Add permission checks to run_job() * #5614 Enable run_job() for telemetry job * #5578 Fix on-insert decompression after schema changes * #5613 Quote username identifier appropriately * #5525 Fix tablespace for compressed hypertable and corresponding toast * #5642 Fix ALTER TABLE SET with normal tables * #5666 Reduce memory usage for distributed analyze * #5668 Fix subtransaction resource owner **Thanks** * @kovetskiy and @DZDomi for reporting peformance regression in Realtime Continuous Aggregates * @ollz272 for reporting an issue with interpolate error messages
This release contains new features and bug fixes since the 2.10.3 release. We deem it moderate priority for upgrading. This release includes these noteworthy features: * Support for DML operations on compressed chunks: * UPDATE/DELETE support * Support for unique constraints on compressed chunks * Support for `ON CONFLICT DO UPDATE` * Support for `ON CONFLICT DO NOTHING` * Join support for hierarchical Continuous Aggregates **Features** * timescale#5212 Allow pushdown of reference table joins * timescale#5221 Improve Realtime Continuous Aggregate performance * timescale#5252 Improve unique constraint support on compressed hypertables * timescale#5339 Support UPDATE/DELETE on compressed hypertables * timescale#5344 Enable JOINS for Hierarchical Continuous Aggregates * timescale#5361 Add parallel support for partialize_agg() * timescale#5417 Refactor and optimize distributed COPY * timescale#5454 Add support for ON CONFLICT DO UPDATE for compressed hypertables * timescale#5547 Skip Ordered Append when only 1 child node is present * timescale#5510 Propagate vacuum/analyze to compressed chunks * timescale#5584 Reduce decompression during constraint checking * timescale#5530 Optimize compressed chunk resorting * timescale#5639 Support sending telemetry event reports **Bugfixes** * timescale#5396 Fix SEGMENTBY columns predicates to be pushed down * timescale#5427 Handle user-defined FDW options properly * timescale#5442 Decompression may have lost DEFAULT values * timescale#5459 Fix issue creating dimensional constraints * timescale#5570 Improve interpolate error message on datatype mismatch * timescale#5573 Fix unique constraint on compressed tables * timescale#5615 Add permission checks to run_job() * timescale#5614 Enable run_job() for telemetry job * timescale#5578 Fix on-insert decompression after schema changes * timescale#5613 Quote username identifier appropriately * timescale#5525 Fix tablespace for compressed hypertable and corresponding toast * timescale#5642 Fix ALTER TABLE SET with normal tables * timescale#5666 Reduce memory usage for distributed analyze * timescale#5668 Fix subtransaction resource owner **Thanks** * @kovetskiy and @DZDomi for reporting peformance regression in Realtime Continuous Aggregates * @ollz272 for reporting an issue with interpolate error messages
This release contains new features and bug fixes since the 2.10.3 release. We deem it moderate priority for upgrading. This release includes these noteworthy features: * Support for DML operations on compressed chunks: * UPDATE/DELETE support * Support for unique constraints on compressed chunks * Support for `ON CONFLICT DO UPDATE` * Support for `ON CONFLICT DO NOTHING` * Join support for hierarchical Continuous Aggregates **Features** * timescale#5212 Allow pushdown of reference table joins * timescale#5221 Improve Realtime Continuous Aggregate performance * timescale#5252 Improve unique constraint support on compressed hypertables * timescale#5339 Support UPDATE/DELETE on compressed hypertables * timescale#5344 Enable JOINS for Hierarchical Continuous Aggregates * timescale#5361 Add parallel support for partialize_agg() * timescale#5417 Refactor and optimize distributed COPY * timescale#5454 Add support for ON CONFLICT DO UPDATE for compressed hypertables * timescale#5547 Skip Ordered Append when only 1 child node is present * timescale#5510 Propagate vacuum/analyze to compressed chunks * timescale#5584 Reduce decompression during constraint checking * timescale#5530 Optimize compressed chunk resorting * timescale#5639 Support sending telemetry event reports **Bugfixes** * timescale#5396 Fix SEGMENTBY columns predicates to be pushed down * timescale#5427 Handle user-defined FDW options properly * timescale#5442 Decompression may have lost DEFAULT values * timescale#5459 Fix issue creating dimensional constraints * timescale#5570 Improve interpolate error message on datatype mismatch * timescale#5573 Fix unique constraint on compressed tables * timescale#5615 Add permission checks to run_job() * timescale#5614 Enable run_job() for telemetry job * timescale#5578 Fix on-insert decompression after schema changes * timescale#5613 Quote username identifier appropriately * timescale#5525 Fix tablespace for compressed hypertable and corresponding toast * timescale#5642 Fix ALTER TABLE SET with normal tables * timescale#5666 Reduce memory usage for distributed analyze * timescale#5668 Fix subtransaction resource owner **Thanks** * @kovetskiy and @DZDomi for reporting peformance regression in Realtime Continuous Aggregates * @ollz272 for reporting an issue with interpolate error messages
The joins could be between a continuous aggregate and hypertable, continuous aggregate and a regular Postgres table, and continuous aggregate and a regular Postgres view.