Add invalidations for incomplete aggregates #2403
Conversation
@WireBaron Is it possible to test this somehow? Perhaps an update-script test that compares refreshes in a new region on an upgraded instance vs a freshly installed one?
sql/updates/latest-dev.sql
Outdated
@@ -303,6 +303,14 @@ SELECT pg_catalog.pg_extension_config_dump('_timescaledb_config.bgw_job', 'WHERE
GRANT SELECT ON _timescaledb_config.bgw_job TO PUBLIC;
GRANT SELECT ON _timescaledb_config.bgw_job_id_seq TO PUBLIC;

-- capture invalidations for any caggs which are not yet complete
INSERT INTO _timescaledb_catalog.continuous_aggs_materialization_invalidation_log
SELECT materialization_id, BIGINT '-9223372036854775808', complete.watermark, invalid.watermark
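For reference, the BIGINT literals in the hunk above are min/max sentinels standing in for -infinity and +infinity. A minimal hypothetical row illustrating the intent (values and column order assumed from this diff, not taken from the real catalog definition):

```sql
-- Hypothetical example: cagg 2's completed threshold (watermark 1000) lags
-- its invalidation threshold (watermark 5000), so the migration marks
-- [1000, 5000] as invalid and the 2.0 refresh will re-materialize it.
-- BIGINT '-9223372036854775808' is the minimum BIGINT, used as a sentinel.
INSERT INTO _timescaledb_catalog.continuous_aggs_materialization_invalidation_log
VALUES (2, BIGINT '-9223372036854775808', 1000, 5000);
```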
The new invalidated region is [complete.watermark, infinity]?
Also need a test for this in test/sql/updates.
Codecov Report
@@ Coverage Diff @@
## master #2403 +/- ##
==========================================
- Coverage 90.10% 90.10% -0.01%
==========================================
Files 213 213
Lines 34336 34356 +20
==========================================
+ Hits 30940 30956 +16
- Misses 3396 3400 +4
INSERT INTO inval_test
SELECT tm.tval, loc.lval, temp
FROM (SELECT generate_series(now() + '30 seconds'::interval, now() + '4 hours'::interval, '30 seconds') AS tval) tm
CROSS JOIN (select 'loc_' || generate_series(1,3,1) AS lval) loc,
generate_series(70.0, 90.0, 1440) temp;
This query does not look right: the now() will guarantee that this test is flaky, and generate_series(70.0, 90.0, 1440) just returns a single row, so you could replace it with 70.0.
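To illustrate the single-row point: generate_series steps from start toward stop by step, so a step larger than the whole range yields exactly one value.

```sql
-- 70.0 is the only value produced; the next step (70.0 + 1440 = 1510.0)
-- already exceeds the stop value of 90.0.
SELECT generate_series(70.0, 90.0, 1440) AS temp;
--  temp
-- ------
--  70.0
```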
See my comment above about now() being used only to generate a set number of rows above the invalidation threshold. If you still think this will be flaky, I can change it to an arbitrarily far-in-the-future date. Fixed the generate_series.
INSERT INTO inval_test
SELECT tm.tval, loc.lval, temp
FROM (SELECT generate_series(now() - '4 hours'::interval, now(), '30 seconds') AS tval) tm
CROSS JOIN (select 'loc_' || generate_series(1,3,1) AS lval) loc,
generate_series(50.0, 70.0, 1443) temp;
Same problem here: the now() will guarantee that this test is flaky, and generate_series(50.0, 70.0, 1443) just returns a single row, so you could replace it with 50.0.
I'm using now() here to ensure that the data exists within the refresh interval for the continuous agg we're about to commit. I suppose I could just set the refresh window to be near infinite, but in this case it should be safe to use now() since all we care about is the number of rows created. The generate_series was definitely wrong though; fixed it to 20.0 / 1443.
WITH ( timescaledb.continuous, timescaledb.materialized_only=true,
timescaledb.refresh_lag='-2 hour',
timescaledb.ignore_invalidation_older_than = '24 hour',
timescaledb.refresh_interval='20 minutes' )
It would be better to set max_interval_per_job to a big value if the intent here is to get all the data processed by the REFRESH, e.g. a max_interval_per_job of 72 hours.
Added this parameter.
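A sketch of the cagg definition with the suggested option added (view and table names taken from this thread; the aggregate query itself is a hypothetical stand-in):

```sql
CREATE VIEW mat_inval
WITH ( timescaledb.continuous, timescaledb.materialized_only=true,
       timescaledb.refresh_lag='-2 hour',
       timescaledb.max_interval_per_job='72 hours',  -- suggested addition
       timescaledb.ignore_invalidation_older_than = '24 hour',
       timescaledb.refresh_interval='20 minutes' )
AS SELECT time_bucket('10 minutes', tval), lval, avg(temp)  -- hypothetical aggregate
   FROM inval_test
   GROUP BY 1, 2;
```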
REFRESH MATERIALIZED VIEW mat_inval;

INSERT INTO inval_test
This time range is not in the original data set, so I don't think you have any invalidations for the last completed threshold range. The inserts here should be of two kinds:
- generate invalidations: a subset of the range from the insert at line 271.
- new data (if you are testing this case as well) for a new range, like at line 311.
This is actually intended to all be new data. I believe we already have other tests for existing invalidations, what this test is intended to verify is that we generate a new invalidation that covers new data when updating to 2.0.
Is the refresh and query post-upgrade done in test-rerun.v6.sql?
Yes. I needed to move that from the post script to the rerun script, as the post script isn't called on the rerun container (the post script was only run on the upgraded container and on a dump/restore of the upgraded container, both of which had the wrong data and hence no diff).
Approving.
The update test is a little hard to follow, but I assume you have tried the update test without adding the invalidation during the upgrade, and seen it fail?
@@ -263,3 +263,49 @@ BEGIN
END $$;

REFRESH MATERIALIZED VIEW mat_ignoreinval;

-- test proper invalidations after update |
Could you modify the comment here? This test case is for inserts after a REFRESH. (We have been using the term invalidations to refer to changes to the raw data that would invalidate a previously computed materialization). These inserts (lines 309-311) are not logged into the invalidation_log catalog tables.
Right, it's confusing because the terminology has slightly changed in 2.0 (we now also have invalidations for regions that have never been refreshed). I'm going with "test new data beyond the invalidation threshold is properly handled" here; let me know if that works.
sql/updates/latest-dev.sql
Outdated
@@ -303,6 +303,11 @@ SELECT pg_catalog.pg_extension_config_dump('_timescaledb_config.bgw_job', 'WHERE
GRANT SELECT ON _timescaledb_config.bgw_job TO PUBLIC;
GRANT SELECT ON _timescaledb_config.bgw_job_id_seq TO PUBLIC;

-- set up future invalidation for current caggs
-- set up future invalidation for current caggs
-- Add entry to materialization invalidation log to indicate that [watermark, +infinity) is invalid
added
@@ -303,6 +303,11 @@ SELECT pg_catalog.pg_extension_config_dump('_timescaledb_config.bgw_job', 'WHERE
GRANT SELECT ON _timescaledb_config.bgw_job TO PUBLIC;
GRANT SELECT ON _timescaledb_config.bgw_job_id_seq TO PUBLIC;

-- set up future invalidation for current caggs
INSERT INTO _timescaledb_catalog.continuous_aggs_materialization_invalidation_log
SELECT materialization_id, BIGINT '-9223372036854775808', watermark, 9223372036854775807
If we have a new cagg that has never been refreshed, do we need an entry in the cont agg materialization log that says [-infinity, +infinity) is invalid? (cc @erimatnor)
It seems like that's a pretty tight window. I believe the background job gets launched as soon as we create the materialized view in 1.7.
I did add a new invalidation for (-inf, inf) when updating 1.7 caggs that have no completion_threshold. Unfortunately, I was unable to add a test that reliably demonstrates the correctness of this change in the update test.
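A hedged sketch of what that additional migration step could look like (the mat_hypertable_id column and the NOT IN filter are assumptions; only the sentinel values are taken from the diff above):

```sql
-- Assumed sketch: caggs with no completed threshold recorded in 1.7 were
-- never refreshed, so mark their entire range, (-inf, +inf), as invalid.
INSERT INTO _timescaledb_catalog.continuous_aggs_materialization_invalidation_log
SELECT mat_hypertable_id,
       BIGINT '-9223372036854775808',
       BIGINT '-9223372036854775808',  -- -infinity sentinel
       BIGINT '9223372036854775807'    -- +infinity sentinel
FROM _timescaledb_catalog.continuous_agg
WHERE mat_hypertable_id NOT IN
      (SELECT materialization_id
       FROM _timescaledb_catalog.continuous_aggs_completed_threshold);
```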
One question was whether we should add an invalidation for [-Inf, ignore_invalidation_older_than] to allow refreshing of the region where invalidations previously had been ignored. Need not be in this PR, though.
Given how long this has been in review, I'm going to go ahead and push this now that it's finally cleared to go. I'll add a new issue for this older region, though.
As part of the 2.0 continuous aggregate changes, we are removing the continuous_aggs_completed_threshold table. However, this may result in currently running aggregates being considered complete even if their completed threshold hadn't reached the invalidation threshold. This change fixes this by adding an entry to the invalidation log for any such aggregates. Fixes timescale#2314