TimescaleDB 2.x Continuous Aggregation long recalculation #2867
It would be great to solve this problem somehow. I've tried a workaround with insert triggers on …
Thanks for reporting this issue. It does indeed look like an issue with the invalidation logic.
I am seeing similar issues with … The aggregations make per-minute, per-hour, and per-day buckets from 2 months of data (hundreds of millions of rows). Some lighter (in number of rows per bucket) caggs have been able to complete (very slowly), but the heavier ones are not completing (they have been running for 24h), even when restricting the …
I also have some issues I think may be related to this. We upgraded to 2.0 a couple of days ago, but since then continuous aggregate refresh jobs have "stopped working". What happens now is that they just run forever. I've tried everything I could think of, and even removed all jobs and created one single continuous aggregate policy. That job has now been running for more than 24 hours, constantly using 100% of one CPU core. The job is materializing 1-minute data into hourly buckets with start_offset => INTERVAL '30 days', end_offset => INTERVAL '1 hour'. Like @aelg, running refresh_continuous_aggregate manually on smaller intervals also takes a very long time. Some background info: we are running 400 nodes caching data, and each node writes to the TimescaleDB every 30 minutes. These nodes may have unstable connections, so the write into TimescaleDB may also be hours or even days late. In other words, data are being backfilled constantly. Another thing maybe worth mentioning (not sure if this is relevant or not): the hypertable was created with the default chunk size, so chunks are apparently created weekly, with a total size of up to 74GB per chunk. I have lowered it to a 1-day chunk size now; maybe I should go even lower? (The server has 90 GB RAM.)
We are using a workaround to mitigate this issue in production now.
This query will return a record for each continuous aggregate view.
It will probably hang on a lock because background jobs are running, so in another console execute:
These two values are …
Seems it's a bit messed up (I'm not sure about all the values, but some of them make sense). In the TSDB source code I saw that they are using … I would suggest just clearing all records in this table and re-adding a "special" record for each view you have. Save all unique continuous aggregate view ids:
For us it returns … Then delete all this mess with:
And re-add just a few records (change to your ids!):
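A minimal sketch of the workaround described above, assuming the TimescaleDB 2.0 catalog layout (columns `materialization_id`, `lowest_modified_value`, `greatest_modified_value`) and placeholder view ids 2 and 3:

```sql
-- Save the distinct continuous aggregate view ids before deleting anything:
SELECT DISTINCT materialization_id
FROM _timescaledb_catalog.continuous_aggs_materialization_invalidation_log;

-- Clear all the accumulated invalidation records:
DELETE FROM _timescaledb_catalog.continuous_aggs_materialization_invalidation_log;

-- Re-add one "special" record per view, covering the whole time range
-- (the ids 2 and 3 are placeholders -- use your own view ids):
INSERT INTO _timescaledb_catalog.continuous_aggs_materialization_invalidation_log
       (materialization_id, lowest_modified_value, greatest_modified_value)
VALUES (2, -9223372036854775808, 9223372036854775807),
       (3, -9223372036854775808, 9223372036854775807);
```

The min/max BIGINT sentinels mark the entire range as invalidated, so the next refresh re-materializes everything in one pass instead of once per tiny interval.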
Thanks! So far this seems to solve several issues. Will continue testing tomorrow.
Currently looking into this issue, but would need some help to understand the underlying cause. Do people generally experience these issues after upgrading their continuous aggregates from a previous version? Specifically, I am wondering if the invalidation logs already had a lot of entries prior to updating, or whether lots of entries appeared after the update? If anyone has a script to reproduce these issues on a fresh installation, that would be tremendously helpful. I am trying to reproduce it myself in the meantime.
Sure.
Everything is fine up to this moment, and all data in the datapoints table are materialized. Then, in a bash shell:
Now there are 10,000 records in the hypertable invalidation log.
When calling refresh_continuous_aggregate, all records from continuous_aggs_hypertable_invalidation_log are copied to continuous_aggs_materialization_invalidation_log, and the materialization process runs again and again. The expected behavior would be to invalidate and materialize all these little changes in one pass.
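The reproduction might look roughly like this (the table and view names are illustrative, not the reporter's exact script):

```sql
-- Hypertable plus a 10-minute continuous aggregate:
CREATE TABLE datapoints (time TIMESTAMPTZ NOT NULL, value DOUBLE PRECISION);
SELECT create_hypertable('datapoints', 'time');

CREATE MATERIALIZED VIEW datapoints_10m
WITH (timescaledb.continuous) AS
SELECT time_bucket('10 minutes', time) AS bucket, avg(value)
FROM datapoints
GROUP BY bucket
WITH NO DATA;

-- Materialize everything once:
CALL refresh_continuous_aggregate('datapoints_10m', NULL, NULL);

-- Then backfill 10,000 single-row inserts, e.g. from a bash loop around psql.
-- Each insert lands as a separate zero-length invalidation record:
-- for i in $(seq 1 10000); do
--   psql -c "INSERT INTO datapoints VALUES (now() - interval '${i} minutes', random());"
-- done
```

After the loop, a second `refresh_continuous_aggregate` call has to process thousands of tiny, non-adjacent invalidation entries, which is where the slowdown shows up.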
@dimonzozo Thank you. Can you clarify the last bit: "And when calling refresh_continuous_aggregate all records from continuous_aggs_hypertable_invalidation_log will be copied to continuous_aggs_materialization_invalidation_log, and materialization process will run again and again." How do you run …?
For tests I run the same command. With test data, this call takes a lot more time than the first call. On production, it never completes, because with our data the invalidation of each 10m interval takes up to 40 seconds.
My thought process was the following. I looked through the source code and found that the first step of executing …
@dimonzozo Thanks for the additional information. I tested your reproduction case and, while the second refresh was indeed slower, it did complete without too much delay. Obviously, the refresh time is somewhat proportional to the number of invalidations it needs to process and the range to materialize, and maybe there is something we can do to handle a huge number of invalidations better. A couple of observations, though. If you have many, many invalidation records due to single-row (non-batched) inserts, then the processing of those invalidations will also take longer (as evidenced by the example). We do merge invalidations, but only if ranges are adjacent or overlap, and only for the cagg being processed by the current command. One workaround, until we can optimize for single-row inserts, might be to manually materialize smaller ranges of backfill in a single refresh, and then do several of them instead. Another option might be to provide a "hard refresh" option where we clear all the invalidations in the refreshed range without further processing, and then proceed with refreshing the whole range instead of smaller bits within the refresh window.
Yeah. I saw this logic in the source code. Great work, BTW! The source code is very clear, has lots of comments, and is easy to read. A "hard refresh" is what we're doing in our case, and that's what I suggested to @slasktrat in the comments above. As a workaround this is totally fine. I also thought I could write a script to replace lots of small invalidation ranges with a single record, but running it as a TSDB action fails due to locks on the table. Other workaround ideas would be very helpful.
Can confirm that the workaround suggested by @dimonzozo seems to fix things for me as well, that is, clearing out the …
@erimatnor FYI: I created the database in 2.0.0 and then filled it with data. The instance was upgraded from 1.7.4, but the databases were dropped and recreated, then filled with 2 months of data. This still seems to have created a lot of rows in the invalidation log.
Also, I'm using BIGINT with nanoseconds since epoch as the time column, in case that matters.
I've been testing some more, and when using the built-in add_continuous_aggregate_policy, continuous_aggs_materialization_invalidation_log is filled with an extreme number of entries, and the result is that the initial job runs forever (I killed the job after 28 hours). As long as the schedule interval is not too big, it so far seems sufficient for me to do some cleanup in the continuous_aggs_materialization_invalidation_log immediately after creating the aggregate policy. The same job that had been running for 28 hours without completing now had an initial run duration of 12 minutes, and subsequent runs complete in seconds, when the aggregate policy is created using my custom function below.
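A hypothetical shape for such a helper (`add_policy_and_compact` is an invented name, not the author's exact function): create the policy as usual, then squash the per-view entries in the materialization invalidation log into one covering range each:

```sql
CREATE OR REPLACE FUNCTION add_policy_and_compact(
    cagg              REGCLASS,
    start_offset      INTERVAL,
    end_offset        INTERVAL,
    schedule_interval INTERVAL)
RETURNS INTEGER AS
$$
DECLARE
    job_id INTEGER;
BEGIN
    job_id := add_continuous_aggregate_policy(cagg, start_offset,
                                              end_offset, schedule_interval);

    -- Replace the many small log entries with a single min/max range
    -- per continuous aggregate id:
    WITH old AS (
        DELETE FROM _timescaledb_catalog.continuous_aggs_materialization_invalidation_log
        RETURNING materialization_id, lowest_modified_value, greatest_modified_value)
    INSERT INTO _timescaledb_catalog.continuous_aggs_materialization_invalidation_log
    SELECT materialization_id,
           min(lowest_modified_value),
           max(greatest_modified_value)
    FROM old
    GROUP BY materialization_id;

    RETURN job_id;
END;
$$ LANGUAGE plpgsql;
```

Note that this sketch compacts the log for all continuous aggregates, not just the one the policy was created for; restricting it to one view would require joining against the catalog for its id.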
The joy only lasted a few hours; already, the continuous_aggs_materialization_invalidation_log is polluted with more entries than the refresh job is able to manage. Seems we have to go all manual like @dimonzozo after all. :/
I combined some ideas and have another possible workaround (not properly tested!). This procedure will properly handle invalidation intervals and can be used instead of …
Possible usage is:
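One way such a procedure might look (a sketch, assuming `refresh_continuous_aggregate` may be CALLed from a procedure with intermediate COMMITs, as in TimescaleDB user-defined actions; `refresh_in_steps` is an invented name): walk a large window in smaller steps so each refresh only touches a manageable slice of invalidations.

```sql
CREATE OR REPLACE PROCEDURE refresh_in_steps(
    cagg         REGCLASS,
    window_start TIMESTAMPTZ,
    window_end   TIMESTAMPTZ,
    step         INTERVAL)
AS
$$
DECLARE
    cur TIMESTAMPTZ := window_start;
BEGIN
    WHILE cur < window_end LOOP
        -- Refresh one sub-interval at a time; COMMIT between steps because
        -- refresh_continuous_aggregate cannot run inside a transaction block.
        CALL refresh_continuous_aggregate(cagg, cur, LEAST(cur + step, window_end));
        COMMIT;
        cur := cur + step;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Possible usage, with illustrative names and values:

```sql
CALL refresh_in_steps('datapoints_10m', now() - INTERVAL '30 days', now(), INTERVAL '1 day');
```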
Funny! I'm testing almost exactly the same thing.
Wow! Great way to avoid calling …
I see the problem here. This procedure joins all records which are already in … And another difficulty is that all records in hypertable_invalidation_log should be processed (squashed) and then copied to materialization_invalidation_log in multiple copies (one for each continuous aggregate on the target hypertable). I tried to solve this in my procedure, but got stuck with a transactions issue.
Yeah, I also tried to chunk the job up into smaller pieces and run the refresh synchronously, but also got stuck with a transaction issue. I'll let this method run overnight and see how it performs. So far I get a much higher success rate than with the built-in logic, but not 100%. This workaround can be improved in many ways, but hopefully the TimescaleDB folks will fix the root issue. It should not be necessary to use these workarounds.
It's far from perfect, and some values are set to fit our case, but I modified my custom job to the following, and now we have an automated refresh with all jobs having at least some success rate, unlike with the built-in functionality. But I hope this issue will be prioritized: upgrading from 1.7 to 2.0 in practice broke our service and caused several days of "downtime" until this workaround was up and running. 😢
Update: The latest workaround has been running for a few days now, and it actually works very well. An additional benefit is that all caggs are now continuously updated, with increased control of concurrent jobs, without the log being spammed with "out of background workers" messages.
@slasktrat Great to hear that you were able to work around the issue. A quick update on our end. I believe we have a solution to optimize our invalidation handling for lots of small invalidations. Essentially, what we are testing is a way to expand each invalidation to the closest bucket boundaries. This should be safe since we always materialize full buckets (except for some corner cases when you drop chunks, but in that case we might at worst invalidate more data than necessary, which shouldn't be an issue either). Thus, if you insert a value every minute and you have a 10-minute continuous aggregate bucket, each one-minute invalidation expands to the full 10-minute bucket, which in turn merges with the next bucket if that one was invalidated too, and so on. Still, I think there are some corner cases where this might not be optimal. For instance, say you have 1-minute buckets and you insert a value every 2 minutes. Then you only invalidate every other bucket, which still leads to lots of invalidations if you, e.g., refresh 1 week's worth of data. Internally, we actually materialize each invalidated range separately, which is why materialization is slow for these corner cases where we cannot merge ranges into bigger ones. Obviously, the situation we want to avoid at the other end of the spectrum is having to re-materialize too much when you've, e.g., only invalidated a couple of buckets across a refresh window of, e.g., a year. Then it is better to do a number of smaller materializations instead of re-materializing the whole year's worth of data. There might be some additional heuristics we can implement to optimize further for these worst-case scenarios where you have backfill across, e.g., every other bucket.
For instance, we could try to set a limit on how many materializations we do in a refresh window and try to expand invalidations across the N adjacent buckets, or simply fall back to a brute-force refresh of the whole refresh window. I think we might take an incremental approach here to see what works best, and tweak this further if necessary across multiple releases. Sometimes, the approach that works well for one use case does not work well for another, so we want to be cautious about making too many assumptions.
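The proposed expansion can be illustrated with plain `time_bucket` arithmetic (an illustrative query, not the internal implementation): a point-in-time invalidation widens to the boundaries of the bucket that contains it, so adjacent expanded entries become mergeable.

```sql
-- For a 10-minute bucket width, a single-row invalidation at 12:03:07
-- expands to the half-open bucket range [12:00:00, 12:10:00):
SELECT time_bucket('10 minutes', TIMESTAMPTZ '2021-01-25 12:03:07')
           AS expanded_lowest,
       time_bucket('10 minutes', TIMESTAMPTZ '2021-01-25 12:03:07')
           + INTERVAL '10 minutes'
           AS expanded_greatest;
```

Two single-row invalidations at 12:03 and 12:08 both expand to the same bucket and collapse to one entry, which is exactly the merge opportunity the optimization exploits.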
Just for the benefit of anyone else who runs into this, here's an example of how to implement the solution @dimonzozo and @slasktrat discussed above. Thank you both for your work on this. Prerequisites:
Steps:
Step 1: Create supporting tables for the User Defined Action:
CREATE TABLE custom_invalidation_log( …
Step 2: Create the procedure for the User Defined Action using the TimescaleDB automation framework:
Note that there are two default variables: max_concurrent_jobs (integer) and max_job_runtime (interval). You might want to change these variables as needed. Also note that this script/action runs for all continuous aggregates, which might or might not be what you want; to run it for specific jobs on specific schedules, you will need to alter the script.
Step 3: Register the procedure run_all_continuous_aggregates to run every hour (or on whatever schedule is needed):
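The registration in Step 3 might look like this, assuming the procedure name from the comment above and TimescaleDB 2.x's `add_job` automation API:

```sql
-- Register the user-defined action to run every hour:
SELECT add_job('run_all_continuous_aggregates', INTERVAL '1 hour');
```

`add_job` returns the job id, which can later be passed to `alter_job` or `delete_job` to reschedule or remove the action.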
The refreshing of a continuous aggregate is slow when many small invalidations are generated by frequent single row insert backfills. This change adds an optimization that merges small invalidations by first expanding invalidations to full bucket boundaries. There is really no reason to maintain invalidations that aren't covering full buckets since refresh windows are already aligned to buckets anyway. Fixes timescale#2867
When there are many small (e.g., single timestamp) invalidations that cannot be merged despite expanding invalidations to full buckets (e.g., invalidations are spread across every second bucket in the worst case), it might no longer be beneficial to materialize every invalidation separately. Instead, this change adds a threshold for the number of invalidations used by the refresh (currently 10 by default) above which invalidations are merged into one range based on the lowest and greatest invalidated time value. The limit can be controlled by an anonymous session variable for debugging and tweaking purposes. It might be considered for promotion to an official GUC in the future. Fixes timescale#2867
Heads up: The fix for this will appear in TimescaleDB 2.0.2, which has just been tagged and will be released shortly.
We have an issue with the Continuous Aggregation feature after upgrading to TimescaleDB 2.x.
We store data with 1-second granularity in a hypertable and roll it up to 10m and 1h intervals with continuous aggregate views.
Before 2.0 we had no issues with refreshes, but with 2.0 the rollups get stuck after inserting a bunch of historical data.
We've looked through the code and were able to tell that the issue arises in the invalidation logic.
Our system does separate inserts of data points rounded to a 1-second grid (shown in red on the diagram), and this leads to invalidation records being created in the
_timescaledb_catalog.continuous_aggs_hypertable_invalidation_log
table. Because all of these records are zero-length intervals, the aggregation process cannot merge them into bigger intervals. This leads to adding dozens of records into
_timescaledb_catalog.continuous_aggs_materialization_invalidation_log
and the continuous aggregation job invalidates the same 10-minute interval again and again for each changed record. Maybe the invalidation process could be more clever and join intervals that fall into the same bucket, to avoid recalculating the same intervals over and over? Another thought is to recalculate the bucket once, taking into account all intervals currently invalidating it, rather than once per invalidation log record.
I hope this problem is not ours alone. Please help us understand and investigate it properly.