
Cap invalidation threshold at last data bucket #2338

Merged

Conversation


@erimatnor erimatnor commented Sep 8, 2020

When refreshing with an "infinite" refresh window going forward in
time, the invalidation threshold is also moved forward to the end of
the valid time range. This effectively renders the invalidation
threshold useless, leading to unnecessary write amplification.

To handle infinite refreshes better, this change caps the refresh
window at the end of the last bucket of data in the underlying
hypertable, so as not to move the invalidation threshold further than
necessary. For instance, if the max time value in the hypertable is
11, a refresh command such as:

CALL refresh_continuous_aggregate(NULL, NULL);

would be turned into

CALL refresh_continuous_aggregate(NULL, 20);

assuming that a bucket starts at 10 and ends at 20 (exclusive). Thus
the invalidation threshold would at most move to 20, allowing the
threshold to still do its work once time again moves forward and
beyond it.

Note that one must never process invalidations beyond the invalidation
threshold without also moving it, as that would clear that area from
invalidations and thus prohibit refreshing that region once the
invalidation threshold is moved forward. Therefore, if we do not move
the threshold further than a certain point, we cannot refresh beyond
it either. An alternative, and perhaps safer, approach would be to
always invalidate the region over which the invalidation threshold is
moved (i.e., new_threshold - old_threshold). However, that is left for
a future change.

It would be possible to also cap non-infinite refreshes, e.g.,
refreshes that end at a higher time value than the max time value in
the hypertable. However, when an explicit end is specified, it might
be on purpose so optimizing this case is also left for the future.

Closes #2333

@erimatnor erimatnor added this to the 2.0.0 milestone Sep 8, 2020

codecov bot commented Sep 8, 2020

Codecov Report

Merging #2338 into master will increase coverage by 0.25%.
The diff coverage is 94.26%.


@@            Coverage Diff             @@
##           master    #2338      +/-   ##
==========================================
+ Coverage   90.13%   90.39%   +0.25%     
==========================================
  Files         212      213       +1     
  Lines       34391    34898     +507     
==========================================
+ Hits        31000    31545     +545     
+ Misses       3391     3353      -38     
Impacted Files Coverage Δ
src/catalog.h 100.00% <ø> (ø)
src/compat.h 100.00% <ø> (ø)
src/plan_expand_hypertable.c 94.13% <ø> (ø)
src/utils.h 100.00% <ø> (ø)
tsl/src/continuous_aggs/job.c 0.00% <ø> (-100.00%) ⬇️
tsl/src/fdw/data_node_scan_plan.c 97.15% <ø> (ø)
tsl/src/init.c 88.88% <ø> (ø)
tsl/test/src/test_chunk_stats.c 100.00% <ø> (ø)
tsl/src/remote/txn.c 87.63% <46.66%> (-1.17%) ⬇️
src/cross_module_fn.c 56.81% <50.00%> (-0.67%) ⬇️
... and 53 more


Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f32439...f0e3bfb. Read the comment docs.

@erimatnor erimatnor marked this pull request as ready for review September 8, 2020 08:47
@erimatnor erimatnor requested a review from a team as a code owner September 8, 2020 08:47
@erimatnor erimatnor requested review from pmwkaa, svenklemm, gayyappan, k-rus, mkindahl and WireBaron and removed request for a team and gayyappan September 8, 2020 08:47

@pmwkaa pmwkaa left a comment


Looks good

0 /*count*/);
if (res < 0)
ereport(ERROR,
(errcode(ERRCODE_INTERNAL_ERROR),

@pmwkaa pmwkaa Sep 8, 2020


I wonder if this case is possible and whether we should test for it.

Contributor Author


I don't know a good way to make SPI fail. Maybe inject a bad query; but it seems odd to deliberately do that. I think we have to trust that SPI works the way it should.


k-rus commented Sep 8, 2020

@erimatnor Is it applicable only in the case of refresh policy or with manual refresh too? From the commit message it sounds that both ways are covered.

@erimatnor
Contributor Author

@erimatnor Is it applicable only in the case of refresh policy or with manual refresh too? From the commit message it sounds that both ways are covered.

Both are covered. The policy uses the same "manual" refresh API under the hood.


@WireBaron WireBaron left a comment


Just some minor questions/comments.

src/hypertable.c (resolved)
src/hypertable.c (outdated)
if (SPI_connect() != SPI_OK_CONNECT)
elog(ERROR, "could not connect to SPI");

res = SPI_execute_with_args(command->data,


Why not just SPI_execute here?

Contributor Author


Yes, that's a good question. This was code that existed previously, which I moved here for code-reuse purposes. I will see if a simple SPI_execute works (it should).

*
* The new invalidation threshold returned is the end of the given refresh
* window, unless it ends at "infinity" in which case the threshold is capped
* at the end of the last bucket materialized.


Looks like this will return infinity when that's the end of the refresh window and there's no materialized data yet. Is this correct? Should you mention this case in the comment here?

Contributor Author


Good catch. Not exactly intended, not wrong either, but certainly non-optimal. I fixed this so that it returns the min time value in that case and we avoid moving the threshold unnecessarily. I added extra test cases to cover this corner-case.


k-rus commented Sep 9, 2020

@erimatnor A suggestion to improve the commit message, the second paragraph:

To handle infinite refreshes better, this change caps the invalidation
threshold forward to the end of the last bucket of data in the
underlying hypertable. Subsequently the refresh window is capped
backward from +infinity to the end of the last bucket of data in the
underlying hypertable. Capping the refresh window is necessary, since
one cannot refresh and process invalidations beyond the invalidation
threshold, as that would clear that area from invalidations and thus
prohibit refreshing that region once the invalidation threshold is
moved forward. An alternative, and perhaps safer, approach would be to
always invalidate the region over which the invalidation threshold is
moved (i.e., new_threshold - old_threshold). However, that is left for
a future change.

@erimatnor erimatnor force-pushed the caggs-cap-invalidation-threshold branch 5 times, most recently from 34cf22a to 726cf06 Compare September 9, 2020 10:59
@erimatnor
Contributor Author

@erimatnor A suggestion to improve the commit message, the second paragraph:

To handle infinite refreshes better, this change caps the invalidation
threshold forward to the end of the last bucket of data in the
underlying hypertable. Subsequently the refresh window is capped
backward from +infinity to the end of the last bucket of data in the
underlying hypertable. Capping the refresh window is necessary, since
one cannot refresh and process invalidations beyond the invalidation
threshold, as that would clear that area from invalidations and thus
prohibit refreshing that region once the invalidation threshold is
moved forward. An alternative, and perhaps safer, approach would be to
always invalidate the region over which the invalidation threshold is
moved (i.e., new_threshold - old_threshold). However, that is left for
a future change.

I changed the commit message, hopefully to your liking. I didn't really understand your suggested change, however, in particular "Subsequently the refresh window is capped backward from +infinity to the end of the last bucket...".


@k-rus k-rus left a comment


LGTM
Few nits.

src/hypertable.c (outdated, resolved)
- bool
- continuous_agg_invalidation_threshold_set(int32 raw_hypertable_id, int64 invalidation_threshold)
+ int64
+ invalidation_threshold_set_or_get(int32 raw_hypertable_id, int64 invalidation_threshold)


set is confusing, since it only updates the value if applicable. I was already confused by the original function name when reviewing another PR. Maybe update?
set_or_get is confusing because of the or, since it always gets the up-to-date threshold value. I think or_get can be omitted.
So the suggestion:

Suggested change
- invalidation_threshold_set_or_get(int32 raw_hypertable_id, int64 invalidation_threshold)
+ invalidation_threshold_update(int32 raw_hypertable_id, int64 invalidation_threshold)


@erimatnor erimatnor Sep 9, 2020


I wanted to point out that the existing threshold is returned, which is not clear if it is just called update. Update, like set, implies the value is always updated, but it is only set if the new value is greater than the old one.

set_or_get is literally what it is doing; it either sets the threshold to the new value or returns the old one. If the returned value is equal to or higher than the input value, one knows that the threshold was not updated.

Comment on lines +284 to +288
if (TS_TIME_IS_INTEGER_TIME(refresh_window->type))
max_refresh = TS_TIME_IS_MAX(refresh_window->end, refresh_window->type);
else
max_refresh = TS_TIME_IS_END(refresh_window->end, refresh_window->type) ||
TS_TIME_IS_NOEND(refresh_window->end, refresh_window->type);


It might be useful to have this in a function, so it can be utilised in other places when needed.

Contributor Author


I think it makes sense to do that when, and if, the need arises.

tsl/src/continuous_aggs/refresh.c (outdated, resolved)
tsl/src/continuous_aggs/refresh.c (outdated, resolved)
tsl/src/continuous_aggs/refresh.c (resolved)

k-rus commented Sep 9, 2020

I changed the commit message, hopefully to your liking. I didn't really understand your suggested change, however, in particular "Subsequently the refresh window is capped backward from +infinity to the end of the last bucket...".

The new commit message doesn't improve the most confusing sentence. It says:

this change caps the invalidation
threshold, and subsequently the refresh window, at the end of the last
bucket of data in the underlying hypertable.

The refresh window is capped from +infinity, which is what the user requested. The invalidation threshold, on the other hand, cannot be capped, but should be moved forward. The confusion comes from the fact that the candidate invalidation threshold is capped, not the invalidation threshold itself. Would this be better?:

this change caps the refresh window at the end of the last bucket of data in the underlying hypertable, and the invalidation threshold is moved forward to the capped value.

@erimatnor erimatnor force-pushed the caggs-cap-invalidation-threshold branch from 726cf06 to b5b30ae Compare September 9, 2020 16:04
@erimatnor erimatnor force-pushed the caggs-cap-invalidation-threshold branch from b5b30ae to f0e3bfb Compare September 9, 2020 17:20
@erimatnor erimatnor merged commit f49492b into timescale:master Sep 9, 2020
@erimatnor erimatnor deleted the caggs-cap-invalidation-threshold branch September 9, 2020 17:46
Successfully merging this pull request may close these issues.

Do not move invalidation threshold further than necessary