
[Bug]: Continuous aggregate is not refreshed automatically after insert into compressed chunks #5455

Closed
riyazkittur opened this issue Mar 16, 2023 · 4 comments


@riyazkittur

What type of bug is this?

Incorrect result

What subsystems and features are affected?

Continuous aggregate

What happened?

After the merge of pull request #5252, I am able to run INSERT ... ON CONFLICT DO NOTHING against compressed chunks. However, the continuous aggregate is not refreshed on the next scheduled run.

TimescaleDB version affected

latest

PostgreSQL version used

14.7

What operating system did you use?

Windows 10 x64

What installation method did you use?

Source

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

No response

How can we reproduce the bug?

Prerequisite: build the latest TimescaleDB extension from source after the merge of pull request https://github.com/timescale/timescaledb/pull/5252.

1. Create a hypertable and a materialized view (continuous aggregate) on it.
2. Add a continuous aggregate refresh policy and a compression policy to the hypertable.
3. Insert data into the hypertable such that the chunk gets compressed.
4. Insert data into the compressed chunk using INSERT ... ON CONFLICT DO NOTHING (supported since pull request #5252); a minimal example is sketched after these steps.
5. Wait for the next scheduled run of the continuous aggregate refresh.
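
A minimal sketch of step 4, assuming the schema shared later in this thread; the timestamp and values are illustrative and must land in an already-compressed chunk:

-- Insert a row whose event_time falls inside a compressed chunk.
INSERT INTO events_data (event_time, stream_id, event_data)
VALUES ('2023-02-01 00:00:00+00', 1, 42.0)
ON CONFLICT DO NOTHING;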

Expected:
The aggregate reflects the rows inserted into the compressed chunks.
Actual:
No change in the aggregate until refresh_continuous_aggregate is called manually.
@lkshminarayanan
Contributor

Hi @riyazkittur,

Thank you for filing the issue!

I tried to reproduce the problem with all the details from your issue, but in my attempts the continuous aggregates were always updated on the defined schedule. There is a chance that the bug is triggered only by certain queries. Could you share the definitions of the hypertable and continuous aggregate you used, along with the related compression and refresh policies?

@riyazkittur
Author

CREATE TABLE events_data
(
    event_time timestamp with time zone NOT NULL,
    stream_id  integer NOT NULL,
    event_data double precision NOT NULL,
    CONSTRAINT vector_events_pkey PRIMARY KEY (stream_id, event_time)
)
TABLESPACE pg_default;

ALTER TABLE public.events_data
    OWNER TO postgres;

SELECT create_hypertable('events_data', 'event_time', 'stream_id', 1,
    chunk_time_interval    => INTERVAL '1 day',
    create_default_indexes => FALSE,
    migrate_data           => TRUE);

CREATE MATERIALIZED VIEW events_data_agg_5m
WITH (timescaledb.continuous) AS
SELECT ev.stream_id,
       time_bucket('5 minute', ev.event_time) AS agg_interval,
       SUM(ev.event_data::decimal) AS sum0,
       MIN(ev.event_data) AS min0,
       MAX(ev.event_data) AS max0,
       COUNT(ev.event_data) AS count
FROM events_data ev
WHERE ev.event_data != 'NaN'
GROUP BY stream_id, agg_interval;

ALTER MATERIALIZED VIEW events_data_agg_5m SET (timescaledb.materialized_only = true);

SELECT add_continuous_aggregate_policy('events_data_agg_5m',
    start_offset      => INTERVAL '32 days',
    end_offset        => INTERVAL '10 minute',
    schedule_interval => INTERVAL '10 minute');

-- Compression must be enabled on the hypertable before a compression policy can be added.
ALTER TABLE events_data SET (timescaledb.compress);

SELECT add_compression_policy('events_data', INTERVAL '1 month');
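
For completeness, one way to confirm which chunks are actually compressed before the insert step, using the standard chunk_compression_stats function:

-- List each chunk of the hypertable with its compression status.
SELECT chunk_name, compression_status
FROM chunk_compression_stats('events_data');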

@riyazkittur
Author

riyazkittur commented Mar 21, 2023

Please make sure the chunks are already compressed, then try to insert into the compressed chunks using INSERT ... ON CONFLICT DO NOTHING. Data inserted into compressed chunks is not reflected in the continuous aggregates.
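
For reference, a sketch of how the chunks can be compressed up front so that the insert is guaranteed to hit a compressed chunk; compress_chunk and show_chunks are standard TimescaleDB APIs, and the interval here is arbitrary:

-- Compress every chunk older than a day; skip chunks that are already compressed.
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('events_data', older_than => INTERVAL '1 day') AS c;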

@lkshminarayanan
Contributor

Hi @riyazkittur,

I tried reproducing the issue with the definitions you gave, but unfortunately I'm still unable to reproduce it in my setup.

Is it possible that the event_time values of the new rows you inserted into the compressed chunks are older than the start_offset of the continuous aggregate refresh policy?

I see that you have a policy to compress chunks older than a month, which means your compressed chunks are at least a month old. On the other hand, your refresh policy only considers data that is at most 32 days old when the job runs. The new rows you inserted into the compressed chunks will be materialized in the continuous aggregate during a scheduled refresh only if they fall within the roughly two-day window where the compression policy and the refresh policy overlap. Hence the question above.
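
If the new rows do fall outside that window, they would only show up after a manual refresh over the older range. A minimal sketch using the standard refresh_continuous_aggregate procedure, with illustrative window bounds:

-- Materialize buckets older than the policy's 32-day start_offset.
CALL refresh_continuous_aggregate('events_data_agg_5m',
     now() - INTERVAL '90 days',
     now() - INTERVAL '32 days');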
