Merge adjacent invalidations during refresh #2440
Codecov Report
@@            Coverage Diff            @@
##           master    #2440    +/-   ##
=========================================
  Coverage   90.10%   90.11%
=========================================
  Files         213      213
  Lines       34336    34356    +20
=========================================
+ Hits        30940    30961    +21
+ Misses       3396     3395     -1
static bool
invalidations_can_be_merged(const Invalidation *a, const Invalidation *b)
{
	/* To account for adjacency, expand one window 1 step in each
	 * direction. This makes adjacent invalidations overlapping. */
	int64 a_start = int64_saturating_sub(a->lowest_modified_value, 1);
	int64 a_end = int64_saturating_add(a->greatest_modified_value, 1);

	return a_end >= b->lowest_modified_value && a_start <= b->greatest_modified_value;
}
Is it guaranteed that the invalidations are ordered so that a << b? If that is the case, an assertion would be good; if not, testing both orders is probably good.
Not sure I understand the question. This function checks for overlap of two ranges and order shouldn't matter. Either they overlap or they don't.
Never mind, I misread the code. Now I see what you did.
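To illustrate why argument order does not matter here, the following is a minimal standalone sketch of the same check. It is not the project's code: the Invalidation struct is reduced to the two fields used above, and sat_add/sat_sub are hypothetical stand-ins for int64_saturating_add/int64_saturating_sub, whose real definitions are not shown in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for the real struct: an inclusive range [lowest, greatest]. */
typedef struct Invalidation
{
	int64_t lowest_modified_value;
	int64_t greatest_modified_value;
} Invalidation;

/* Hypothetical saturating helpers: clamp at the int64 limits instead of overflowing. */
static int64_t
sat_add(int64_t a, int64_t b)
{
	if (b > 0 && a > INT64_MAX - b)
		return INT64_MAX;
	if (b < 0 && a < INT64_MIN - b)
		return INT64_MIN;
	return a + b;
}

static int64_t
sat_sub(int64_t a, int64_t b)
{
	if (b > 0 && a < INT64_MIN + b)
		return INT64_MIN;
	if (b < 0 && a > INT64_MAX + b)
		return INT64_MAX;
	return a - b;
}

/* Widen a by one step on each side, then run an ordinary overlap test.
 * The overlap test is symmetric, so swapping a and b gives the same answer. */
static bool
invalidations_can_be_merged(const Invalidation *a, const Invalidation *b)
{
	assert(a->lowest_modified_value <= a->greatest_modified_value);
	assert(b->lowest_modified_value <= b->greatest_modified_value);

	int64_t a_start = sat_sub(a->lowest_modified_value, 1);
	int64_t a_end = sat_add(a->greatest_modified_value, 1);

	return a_end >= b->lowest_modified_value && a_start <= b->greatest_modified_value;
}
```

With these definitions, the ranges [1, 2] and [3, 3] are reported as mergeable in either argument order, while [3, 3] and [5, 9] are not, because the value 4 lies between them.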
LGTM
I have two nits. See the comments.
 * matches a window, and, optionally, adds the invalidation segments covered
 * by the window to the invalidation store in the passed in state. These
I don't understand this at all. What is the invalidation store? I know about the hypertable invalidation log and the cagg (or materialized hypertable) invalidation log. Also, how is "optionally" specified, i.e., when is it done and when not?
What does "the passed in state" mean?
Given my current knowledge of invalidations and my experience working with them, I am not able to understand this. I guess anybody who needs to deal with it later will have a hard time understanding it too.
The function has a parameter called "state", which is of type CaggInvalidationState. The caller passes in this state as an argument to the function and it contains the invalidation store. The invalidation store is of type InvalidationStore (actually just a Tuplestorestate with a tuple descriptor), which is defined in this file/header and explained more extensively in the large comment at the top of this file.
I am happy to make this clearer, but I am not sure how to do it at this point. There is only one "state" passed into this function with said name. There is only one InvalidationStore, so IMO the reference is unambiguous.
Thank you for the explanation. I still don't understand why "optionally". Do you mean if the invalidation store is not null in state? If so, then I suggest omitting "optionally", as it is confusing.
Now I understand what you mean by "in the passed in state". I think there is some problem with the grammar, which kept me from understanding. Maybe:
-  * matches a window, and, optionally, adds the invalidation segments covered
-  * by the window to the invalidation store in the passed in state. These
+  * matches a window, and, optionally, adds the invalidation segments covered
+  * by the window to the invalidation store, which is given in the argument state. These
Removed optionally and clarified the other parts.
@@ -1163,3 +1163,23 @@ ORDER BY 1,2;
 10 | 1 | 10
 (1 row)

 -- Test that adjacent invalidations are merged
 INSERT INTO conditions VALUES(1, 1, 1.0), (2, 1, 2.0);
It would be easier to read if this were on a separate line like the others; otherwise it feels like a hole between 1 and 3.
This is intentional in order to create a range invalidation between 1-2. Each statement is its own transaction and thus records a separate invalidation. I want to test a mix of both non-trivial (length > 1) and trivial (length=1) ranges.
Thank you for clarifying!
Maybe:
- INSERT INTO conditions VALUES(1, 1, 1.0), (2, 1, 2.0);
+ INSERT INTO conditions VALUES (1, 1, 1.0),
+                               (2, 1, 2.0);
It's not important.
When setting up an index scan for invalidations, a table attribute number was used instead of the corresponding index attribute number. While the attribute numbers happened to be the same, it isn't future-proof to use the wrong attribute reference.
Since invalidations are inclusive at both ends, adjacent invalidations can be merged. However, adjacency wasn't accounted for when merging invalidations, which meant that a refresh could leave more invalidations in the log than strictly necessary. Note that this didn't otherwise affect the correctness of a refresh.
This change adds views for invalidation tables to simplify queries in the test.
Since invalidations are inclusive at both ends, adjacent invalidations
can be merged. However, adjacency wasn't accounted for when merging
invalidations, which meant that a refresh could leave more
invalidations in the log than strictly necessary. Note that this
didn't otherwise affect the correctness of a refresh.
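As a rough illustration of the effect on the number of log entries, the sketch below merges an array of inclusive ranges in one pass. It is only a sketch under stated assumptions: the real refresh code works on invalidation log tuples rather than a C array, the input is assumed to be sorted by lowest_modified_value, and the names Range, plus_one, and merge_sorted_ranges are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an invalidation: an inclusive range. */
typedef struct Range
{
	int64_t lowest_modified_value;
	int64_t greatest_modified_value;
} Range;

/* Saturating +1, so that a range ending at INT64_MAX does not overflow. */
static int64_t
plus_one(int64_t v)
{
	return v == INT64_MAX ? v : v + 1;
}

/* Merge overlapping or adjacent ranges in place.
 * Assumes entries are sorted by lowest_modified_value.
 * Returns the number of ranges left after merging. */
static size_t
merge_sorted_ranges(Range *entries, size_t count)
{
	size_t last = 0;

	if (count == 0)
		return 0;

	for (size_t i = 1; i < count; i++)
	{
		Range *merged = &entries[last];
		const Range *next = &entries[i];

		assert(merged->lowest_modified_value <= next->lowest_modified_value);

		if (next->lowest_modified_value <= plus_one(merged->greatest_modified_value))
		{
			/* Overlapping or adjacent: extend the current range if needed. */
			if (next->greatest_modified_value > merged->greatest_modified_value)
				merged->greatest_modified_value = next->greatest_modified_value;
		}
		else
		{
			/* A real gap: start a new output range. */
			entries[++last] = *next;
		}
	}

	return last + 1;
}
```

Given the ranges [1, 2], [3, 3], [4, 6], and [10, 12], this pass leaves two entries, [1, 6] and [10, 12]. Without the adjacency step (the plus_one call), the first three ranges would stay separate even though together they cover every value from 1 to 6.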
The PR also includes the following fix:
Fix index attribute in invalidation scan
When setting up an index scan for invalidations, a table attribute
number was used instead of the corresponding index attribute
number. While the attribute numbers happened to be the same, it isn't
future-proof to use the wrong attribute reference.