Database continuous aggregation with time zone #1248

Closed

Speedkore opened this issue May 28, 2019 · 23 comments

Comments

@Speedkore
Speedkore commented May 28, 2019


CREATE TABLE hypertable (
	"whatever" float8 NOT NULL,
	"timestamp" timestamptz NOT NULL
);

SELECT create_hypertable('hypertable', 'timestamp');

INSERT INTO hypertable
  VALUES 
  (9.6,'2019-01-01 13:00'),
  (8.8,'2019-01-01 13:15');

CREATE VIEW aggregatedView WITH
(timescaledb.continuous, timescaledb.refresh_interval = '5m', timescaledb.refresh_lag = '5m')
AS
SELECT 
	sum("whatever")::BIGINT as "whatever",
	time_bucket('1 day', "timestamp" AT TIME ZONE 'Europe/Madrid') as "day"
FROM hypertable
GROUP BY time_bucket('1 day', "timestamp" AT TIME ZONE 'Europe/Madrid');

SQL Error [XX000]: ERROR: time_bucket function for continuous aggregate query should be called on the dimension column of the hypertable

I'm getting this error, maybe because what I want to do is not possible; the SELECT alone works fine.

So is it possible to aggregate by day in a specific time zone with a continuous aggregation?

@Eedredondo

I have the same problem. Can't convert to local time. The original table's fecha_hora is a TIMESTAMPTZ:

CREATE VIEW va_ftp_pxd_datos
WITH ( timescaledb.continuous )
AS
SELECT time_bucket('1 day', fecha_hora::TIMESTAMP)
AS dia,
avg(ftp_pxd_datos.activa_i) AS activa,
avg(ftp_pxd_datos.reactiva_i) AS reactiva,
ftp_pxd_datos.id_cups
FROM ftp_pxd_datos
GROUP BY ftp_pxd_datos.id_cups, dia;

@cevian
Contributor

cevian commented Jun 6, 2019

We currently don't support timezone conversions inside time_bucket with continuous aggregates. There are two possible workarounds:

  1. Do the timestamptz -> timestamp conversion when querying out of the view, or
  2. Store the time in the underlying table as a timestamp instead of timestamptz.

This turns out to be a very complex issue to implement because of daylight saving time, so we would like to hear from the community whether the above workarounds are sufficient or you need a fuller solution. Please upvote and leave comments with your use cases.
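For illustration, workaround 1 applied to the schema from the original report might look like the sketch below (the view and column names daily_utc/total are illustrative). The aggregate is bucketed in UTC and the zone conversion happens only at read time, so the bucket boundaries remain UTC midnight:

CREATE VIEW daily_utc WITH (timescaledb.continuous)
AS
SELECT
    time_bucket('1 day', "timestamp") AS bucket,
    sum("whatever") AS total
FROM hypertable
GROUP BY time_bucket('1 day', "timestamp");

-- convert only when querying out of the view
SELECT bucket AT TIME ZONE 'Europe/Madrid' AS local_day, total
FROM daily_utc;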

@Speedkore
Author

#1 would give wrong results because the UTC start and end of the day differ per time zone, so the aggregate for each time zone has different results.
#2 can be a solution: store a timestamp without time zone for each time zone you are interested in and calculate a separate aggregate from each column.
A #3 solution could be to calculate an hourly aggregation and let the client aggregate outside the DB, manually slicing the data into days for each time zone of interest (although some countries have 15- or 30-minute-shifted time zones).

A fuller solution would come too late for me anyway, so I guess I'm fine for now knowing that it is not supported.
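A minimal sketch of workaround 2 in its single-zone form, where the hypertable's time dimension itself is a local (zone-less) timestamp; the table, column, and view names are illustrative:

CREATE TABLE hypertable_local (
    "whatever" float8 NOT NULL,
    ts_local timestamp NOT NULL  -- wall-clock time in Europe/Madrid
);

SELECT create_hypertable('hypertable_local', 'ts_local');

CREATE VIEW daily_local WITH (timescaledb.continuous)
AS
SELECT
    time_bucket('1 day', ts_local) AS day,
    sum("whatever") AS total
FROM hypertable_local
GROUP BY time_bucket('1 day', ts_local);

Note that, per the error message above, time_bucket in a continuous aggregate must be called on the hypertable's dimension column, so several per-zone columns on one table cannot each feed their own continuous aggregate.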

@tobiasdirksen

tobiasdirksen commented Aug 14, 2019

This is a must-have for me.
#1: When aggregating by day, it doesn't make sense to convert when querying out of the view. The aggregation must be done from 00:00 to 00:00 in local time.
#2: Local timestamps are not unique around daylight saving time changes, so this solution is not very good either.
I can't find a good workaround for this. Any suggestions?

@boromisp

I could not find a case where the non-unique local time would cause a problem for daily aggregates, since in that case only the date matters.

The only case I could find without a workaround is places with sub-hour-shifted time zones and daylight saving time:

SET TIMEZONE TO 'Australia/Adelaide';

SELECT time_bucket(
    '1 hour',
    ts AT TIME ZONE 'Australia/Adelaide'
) AT TIME ZONE 'Australia/Adelaide' AS ts_hour, ts
FROM generate_series(
    '2019-04-07 01:00:00+10:30'::timestamptz,
    '2019-04-07 03:59:59+09:30'::timestamptz,
    '00:15:00'::interval
) AS ts;

          ts_hour          |            ts
---------------------------+---------------------------
 2019-04-07 01:00:00+10:30 | 2019-04-07 01:00:00+10:30
 2019-04-07 01:00:00+10:30 | 2019-04-07 01:15:00+10:30
 2019-04-07 01:00:00+10:30 | 2019-04-07 01:30:00+10:30
 2019-04-07 01:00:00+10:30 | 2019-04-07 01:45:00+10:30
 2019-04-07 02:00:00+09:30 | 2019-04-07 02:00:00+10:30
 2019-04-07 02:00:00+09:30 | 2019-04-07 02:15:00+10:30
 2019-04-07 02:00:00+09:30 | 2019-04-07 02:30:00+10:30
 2019-04-07 02:00:00+09:30 | 2019-04-07 02:45:00+10:30
 2019-04-07 02:00:00+09:30 | 2019-04-07 02:00:00+09:30
 2019-04-07 02:00:00+09:30 | 2019-04-07 02:15:00+09:30
 2019-04-07 02:00:00+09:30 | 2019-04-07 02:30:00+09:30
 2019-04-07 02:00:00+09:30 | 2019-04-07 02:45:00+09:30
 2019-04-07 03:00:00+09:30 | 2019-04-07 03:00:00+09:30
 2019-04-07 03:00:00+09:30 | 2019-04-07 03:15:00+09:30
 2019-04-07 03:00:00+09:30 | 2019-04-07 03:30:00+09:30
 2019-04-07 03:00:00+09:30 | 2019-04-07 03:45:00+09:30
(16 rows)

Here rows 5-8 are obviously wrong.

There are not many places like this.

I'm not sure how time_bucket is implemented, but maybe the new time-zone-aware date_trunc could be used as inspiration.
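For reference, the time-zone-aware date_trunc referred to here (added in PostgreSQL 12) takes the zone as a third argument and performs the truncation in that zone, e.g.:

-- truncate to the hour as observed in Adelaide, regardless of the session time zone
SELECT date_trunc('hour', '2019-04-07 02:15:00+10:30'::timestamptz, 'Australia/Adelaide');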

@cressman

cressman commented Feb 6, 2020

The non-unique local time is always an issue. I can't store Nov. 6, 2 am and then store a second record with the same local time without overwriting the first.
I hope to see a timezone-aware time_bucket function.

@lely475

lely475 commented Aug 2, 2020

As Timescale is specifically targeted at time-series data, with the intent of simplifying overhead and data analysis, I was surprised to find out that this feature is not supported. I agree that a timezone-aware time_bucket function would be very helpful and necessary.

In my use case I'm receiving sensor data from farms around the world, so for each sensor I want to store the daily average from 0:00 to 23:59 in its local time zone. However, as time_bucket calculates the daily average based on UTC, it is off by the sensor's UTC offset, which in the worst case could be +/- 12 h.
None of the workarounds seems really satisfying:
#1 is not useful, for the reasons stated above.
#2 forces an extra column for all sensors and all systems, which (in the long run) creates constantly growing, unnecessary overhead and just makes life more complicated in general.
#3 works well if I'm analyzing a time frame on the month scale, but creates larger and larger overhead for longer (yearly) periods: instead of extracting 1 value per day I have to aggregate 24 hourly values per day, slowing down the SELECT query (a sketch follows this comment).

So all in all I would be really happy to see this feature added in the future.
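A minimal sketch of workaround 3, assuming an hourly continuous aggregate (using the older CREATE VIEW cagg syntax, as elsewhere in this thread) that stores sums and counts so the exact daily average can be recomposed at query time; the table, view, column, and zone names are illustrative:

CREATE VIEW hourly_agg WITH (timescaledb.continuous)
AS
SELECT
    time_bucket('1 hour', "time") AS hour,
    sum(value) AS total,
    count(*) AS n
FROM sensor_data
GROUP BY time_bucket('1 hour', "time");

-- local-day average, derived at query time; summing totals and counts
-- avoids the bias of averaging hourly averages with unequal row counts
SELECT
    time_bucket('1 day', hour AT TIME ZONE 'Europe/Berlin') AS local_day,
    sum(total) / sum(n) AS daily_avg
FROM hourly_agg
GROUP BY 1
ORDER BY 1;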

@gjeusel

gjeusel commented Oct 11, 2020

Any news on this issue, @cevian?

I believe adding an optional time_zone argument to time_bucket, mimicking the signature of the current PostgreSQL date_trunc, would be a good plan.

What are your thoughts?

@sachaventura

Just adding my vote for this feature.
This would be game-changing for our infrastructure.
We currently perform aggregation (15-minute to monthly) on 5-minute data on the fly, and it is killing our database. We had to heavily rate-limit our API for those granularity requests, otherwise our database just dies... but it's preventing us from scaling up.

@mdespriee

Also voting for this feature. We have a use case very similar to the one described by @lely475.

@Greenbones

We discovered this limitation today and it's also an issue for us, so adding my vote as well.

@arnevanacker

I'd also like to chime in and request a timezone-aware way to use continuous aggregates.
The proposed solutions really create a lot of overhead and make it significantly harder to use an otherwise great product.

@erimatnor
Contributor

Possibly related to #414

@emanzx

emanzx commented Apr 1, 2021

I also +1 this feature. It really is a game-changing feature. I tried to create a cagg for a table so I can get a one-day sum of specific metrics, but it always starts at 0:00 UTC, and this introduces incorrect calculations.

@aadomingos

I think it would also be immensely valuable to have a timezone feature. It really expands the universe of possibilities of what this tool can do when dealing with global sensors.

@brki

brki commented May 19, 2021

Another vote from me.

For me, the issue is using the INTERVAL '1 day' continuous aggregate view from Grafana.

As far as I can tell, Grafana assumes that the database has dates stored in UTC. That is, this query expression

WHERE $__timeFilter(event_time)

is transformed by Grafana into this SQL:

WHERE event_time BETWEEN '2021-04-12T22:00:00Z' AND '2021-05-18T21:59:59.999Z'

Grafana works fine for queries against the raw (non-aggregated) hypertable, when the event_time column has the type timestamp with time zone.

However, for the 1 day aggregate, I want data to reflect the day in a specific time zone - my time zone.

If I change the hypertable to use a timestamp without time zone column, then $__timeFilter(event_time) queries from Grafana that use the raw data hypertable are off (for my local time zone).

If the data is stored without time zone, I suppose I could rewrite all the queries from Grafana that use the raw data table to use something like $__timeFilter(event_time AT TIME ZONE 'Europe/Zurich'), but ... yuck.

@MA-MacDonald

@brki I agree with you. The ability to bucket one day in local time on continuous aggregates makes the most sense to me. In my experience, daily reports from midnight to midnight local time are one of the most common reporting intervals.

This ability would sell me on Timescale as most of my work involves generating daily, weekly, monthly reports on our IT/Data Center infrastructure.

@pgwhalen

pgwhalen commented Oct 5, 2021

+1. We understand the complexities, but unfortunately the workarounds are pretty limited.

My team's mental model seems to match the proposed solution of time_bucket taking a timezone argument. For what it's worth, our use case is financial data that is mostly written during business days, so the intricacies around daylight saving time aren't too relevant to us.

@msdrigg

msdrigg commented Dec 15, 2021

It is good to see progress on this issue with time_bucket_ng, and I appreciate the devs working on this. I do want to offer a +1: full support for time zones in continuous aggregates would be very helpful for my use case.

@yangm97

yangm97 commented Feb 21, 2022

Bumped into this recently. I wonder why nobody brought up [MATERIALIZED] VIEWs as a workaround, letting time_bucket work on the local_timestamp column. For instance:

CREATE OR REPLACE VIEW "local_foobar" AS
SELECT
    "timestamp" AT TIME ZONE (
        SELECT "value" FROM "environment" WHERE "key" = 'LOCAL_TIMEZONE'
    ) AS local_timestamp,
    *
FROM "foobar";

@narendracode

@yangm97 this would work for simple cases, but if you're working with continuous aggregation on a hypertable, it would not.

@svenklemm
Member

Continuous aggregates with time zones are possible since 2.8.0.
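For example, with TimescaleDB >= 2.8 (a sketch on the schema from the original report; continuous aggregates now use the CREATE MATERIALIZED VIEW syntax, and time_bucket accepts the zone as a third argument):

CREATE MATERIALIZED VIEW daily_madrid
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', "timestamp", 'Europe/Madrid') AS day,
    sum("whatever") AS total
FROM hypertable
GROUP BY time_bucket('1 day', "timestamp", 'Europe/Madrid');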

@Himani2000

@svenklemm I have one question: if we do timezone-based continuous aggregation, do we have to create the aggregate for each time zone? (Please see the example below.)


SELECT
    time_bucket('1 day', "time", 'Asia/Kolkata') AS day,
    user_id,
    sum(value) AS total,
    avg(value) AS average
FROM test
GROUP BY time_bucket('1 day', "time", 'Asia/Kolkata'), user_id;

Is there no way to get local-time-zone-based aggregates dynamically?
