Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Hierarchical Continuous Aggregates validations #5013

Merged

Conversation

fabriziomello
Copy link
Contributor

@fabriziomello fabriziomello commented Nov 22, 2022

Commit 3749953 introduce Hierarchical Continuous Aggregates (aka
Continuous Aggregate on top of another Continuous Aggregate) but it
lacks of some basic validations.

Validations added during the creation of a Hierarchical Continuous
Aggregate:

  • Forbid create a continuous aggregate with fixed-width bucket on top of
    a continuous aggregate with variable-width bucket.

  • Forbid incompatible bucket widths:

    • should not be equal;
    • bucket width of the new continuous aggregate should be greater than
      the source continuous aggregate;
    • bucket width of the new continuous aggregate should be multiple of
      the source continuous aggregate.

@codecov
Copy link

codecov bot commented Nov 22, 2022

Codecov Report

Merging #5013 (3c6d16a) into main (26e3be1) will increase coverage by 0.05%.
The diff coverage is 85.47%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5013      +/-   ##
==========================================
+ Coverage   89.58%   89.64%   +0.05%     
==========================================
  Files         226      227       +1     
  Lines       51347    51599     +252     
==========================================
+ Hits        46001    46256     +255     
+ Misses       5346     5343       -3     
Impacted Files Coverage Δ
src/chunk.h 100.00% <ø> (ø)
src/cross_module_fn.c 67.55% <0.00%> (-1.10%) ⬇️
src/process_utility.c 90.19% <0.00%> (-0.09%) ⬇️
src/ts_catalog/continuous_agg.h 100.00% <ø> (ø)
tsl/src/chunk_api.c 95.67% <ø> (ø)
tsl/src/init.c 96.00% <ø> (ø)
tsl/src/nodes/frozen_chunk_dml/frozen_chunk_dml.c 64.44% <64.44%> (ø)
tsl/src/fdw/modify_exec.c 85.22% <82.45%> (+0.25%) ⬆️
tsl/src/fdw/modify_plan.c 91.25% <83.33%> (-0.75%) ⬇️
tsl/src/continuous_aggs/create.c 87.95% <85.91%> (-0.14%) ⬇️
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 83b13cf...3c6d16a. Read the comment docs.

@fabriziomello fabriziomello changed the title Add caggs on caggs validations Add Hierarchical Continuous Aggregates validations Nov 24, 2022
@fabriziomello fabriziomello self-assigned this Nov 24, 2022
@fabriziomello fabriziomello added continuous_aggregate enhancement An enhancement to an existing feature for functionality labels Nov 24, 2022
@fabriziomello fabriziomello added this to the TimescaleDB 2.9 milestone Nov 24, 2022
@fabriziomello fabriziomello marked this pull request as ready for review November 24, 2022 16:31
@maxhertrampf
Copy link

maxhertrampf commented Nov 24, 2022

Thanks for adding nested caggs!

I have a quick question about this sentence:

bucket width of the new continuous aggregate should be multiple of
the source continuous aggregate.

What exactly does "multiple" mean? Can I only build 24h on top of 1h (i.e. always the same multiple)? Or can I also build 1 day on top of 1 hour or 1 month on top of 1 day (i.e. variable multiple (most days have 24 hours, some 23/25; months can have 28-31 days) - of course respecting boundaries in the specified time zone)?

We would need both to work, if that's not too much effort :)

@fabriziomello fabriziomello force-pushed the cagg_on_cagg_validations branch 3 times, most recently from 4b499bf to 62077b4 Compare November 24, 2022 18:48
Copy link
Contributor

@mkindahl mkindahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this looks quite good, but I would consider the following:

  • Using errors for cases that we think are not sensible might block users from interesting use-cases. I think a warning would be sufficient for these cases. Errors should be reserved for situations that will just not work.
  • I think you should avoid enabling verbose message for entire tests and just increase and decrease the message level based on what is necessary. Maintaining tests with a lot of extraneous false mismatches takes extra time and risk missing actual failures.
  • I think you're missing one test case.

Comment on lines 1281 to 1293
if ((bucket_info_source.bucket_width == BUCKET_WIDTH_VARIABLE &&
bucket_info.bucket_width != BUCKET_WIDTH_VARIABLE))
{
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot create continuous aggregate with fixed-width bucket on top of "
"one using variable-width bucket"),
errdetail("Continuous aggregate with a fixed time bucket width (e.g. 61 days) "
"cannot be created on top of one using variable time bucket width "
"(e.g. 1 month).\n"
"The variance can lead to the fixed width one not being a multiple "
"of the variable width one.")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this includes the case where you build one week (that is, 7 days) on top of days with DST. This should be conceptually OK, but not sure if you count such as a week as a variable width bucket or a fixed width bucket.

The same applies building a year (12 months) from months (which varies), but creating a continuous aggregate seems to mark a month bucket as variable, and also a year (using 12 months for a year), so this does not seem to be an issue.

This also have the risk of blocking the user if we have a bug in our code, so unless proceeding with execution when you have fixed-size buckets on top of variable width buckets would break the system, I think it would be sufficient with a warning message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed a lot already about it (see the issue https://github.com/timescale/product/issues/337) and we decided to be more restrictive in this first version and perhaps relax it (or make it configurable) in the future.

tsl/src/continuous_aggs/create.c Outdated Show resolved Hide resolved
tsl/src/continuous_aggs/create.c Outdated Show resolved Hide resolved
tsl/src/continuous_aggs/create.c Show resolved Hide resolved
tsl/test/expected/cagg_on_cagg_integer.out Outdated Show resolved Hide resolved
tsl/test/expected/cagg_on_cagg_integer_dist_ht.out Outdated Show resolved Hide resolved
tsl/test/expected/cagg_on_cagg_integer_dist_ht.out Outdated Show resolved Hide resolved
tsl/test/sql/include/cagg_on_cagg_common.sql Outdated Show resolved Hide resolved
@fabriziomello fabriziomello force-pushed the cagg_on_cagg_validations branch 3 times, most recently from 0701169 to f954c2b Compare November 25, 2022 15:36
@fabriziomello
Copy link
Contributor Author

What exactly does "multiple" mean? Can I only build 24h on top of 1h (i.e. always the same multiple)? Or can I also build 1 day on top of 1 hour or 1 month on top of 1 day (i.e. variable multiple (most days have 24 hours, some 23/25; months can have 28-31 days) - of course respecting boundaries in the specified time zone)?

All use cases you mentioned are supported, but the following:

  • 90 days (fixed-width bucket) on top of 1 month (variable-width bucket) will not work
  • 7 days on top of 3 days (are not multiple) will not work, but 9 days on top of 3 days will work

@fabriziomello fabriziomello force-pushed the cagg_on_cagg_validations branch 2 times, most recently from 3d54fe6 to e06f27b Compare November 25, 2022 19:37
@fabriziomello fabriziomello enabled auto-merge (rebase) November 25, 2022 20:19
Commit 3749953 introduce Hierarchical Continuous Aggregates (aka
Continuous Aggregate on top of another Continuous Aggregate) but it
lacks of some basic validations.

Validations added during the creation of a Hierarchical Continuous
Aggregate:

* Forbid create a continuous aggregate with fixed-width bucket on top of
  a continuous aggregate with variable-width bucket.

* Forbid incompatible bucket widths:
  - should not be equal;
  - bucket width of the new continuous aggregate should be greater than
    the source continuous aggregate;
  - bucket width of the new continuous aggregate should be multiple of
    the source continuous aggregate.
@fabriziomello fabriziomello merged commit 35c9120 into timescale:main Nov 25, 2022
@maxhertrampf
Copy link

All use cases you mentioned are supported

Thank you, @fabriziomello !

The not-working cases you listed make sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
continuous_aggregate enhancement An enhancement to an existing feature for functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants