Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time-based constraint exclusion doesn't work with NOW() or intervals over timezone types #118

Closed
mfreed opened this issue Jul 7, 2017 · 4 comments
Assignees
Milestone

Comments

@mfreed
Copy link
Member

mfreed commented Jul 7, 2017

Timescale's planner should eliminate many chunks via constraint exclusion based on their time intervals. This works correctly when the WHERE conditional provides a timestamp, but not a date or interval type.

In the following hypertable, I have three chunks covering different time intervals.

The following works correctly:

# EXPLAIN SELECT * from conditions WHERE time < timestamptz '2017-03-07 18:18:03.776258+00';
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Append  (cost=0.00..20.69 rows=324 width=56)
   ->  Seq Scan on conditions  (cost=0.00..0.00 rows=1 width=56)
         Filter: ("time" < '2017-03-07 18:18:03.776258+00'::timestamp with time zone)
   ->  Bitmap Heap Scan on _hyper_1_8_chunk  (cost=6.65..20.69 rows=323 width=56)
         Recheck Cond: ("time" < '2017-03-07 18:18:03.776258+00'::timestamp with time zone)
         ->  Bitmap Index Scan on "11-conditions_time_idx"  (cost=0.00..6.57 rows=323 width=0)
               Index Cond: ("time" < '2017-03-07 18:18:03.776258+00'::timestamp with time zone)
(7 rows)

The following is incorrect:

# EXPLAIN SELECT * from conditions WHERE time < date '2017-03-07 18:18:03.776258+00';
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Append  (cost=0.00..62.07 rows=970 width=56)
   ->  Seq Scan on conditions  (cost=0.00..0.00 rows=1 width=56)
         Filter: ("time" < '2017-03-07'::date)
   ->  Bitmap Heap Scan on _hyper_1_2_chunk  (cost=6.65..20.69 rows=323 width=56)
         Recheck Cond: ("time" < '2017-03-07'::date)
         ->  Bitmap Index Scan on "2-conditions_time_idx"  (cost=0.00..6.57 rows=323 width=0)
               Index Cond: ("time" < '2017-03-07'::date)
   ->  Bitmap Heap Scan on _hyper_1_7_chunk  (cost=6.65..20.69 rows=323 width=56)
         Recheck Cond: ("time" < '2017-03-07'::date)
         ->  Bitmap Index Scan on "10-conditions_time_idx"  (cost=0.00..6.57 rows=323 width=0)
               Index Cond: ("time" < '2017-03-07'::date)
   ->  Bitmap Heap Scan on _hyper_1_8_chunk  (cost=6.65..20.69 rows=323 width=56)
         Recheck Cond: ("time" < '2017-03-07'::date)
         ->  Bitmap Index Scan on "11-conditions_time_idx"  (cost=0.00..6.57 rows=323 width=0)
               Index Cond: ("time" < '2017-03-07'::date)
(15 rows)

I see similar incorrect behavior when performing the following:

# EXPLAIN SELECT * from conditions WHERE time < timestamptz '2017-03-07 18:18:03.776258+00' - interval '30 minutes';

# EXPLAIN SELECT * from conditions WHERE time < (timestamptz '2017-03-07 18:18:03.776258+00' - interval '30 minutes')::timestamptz;

# EXPLAIN SELECT * from conditions WHERE time < (date '2017-03-07 18:18:03.776258+00')::timestamptz;

We expected such behavior when including the conditional NOW() in the where clause, as the planner might not be able to evaluate that as a constant when generating its plan (e.g., due to prepared statements). But that same issue doesn't obviously apply to the above?

@mfreed
Copy link
Member Author

mfreed commented Jul 7, 2017

Follow-up: The interval issue at least seems closely related to the fact that we're dealing with timestamps: timestamptz_mi_interval() is marked as NOT immutable, because something like interval '1 month' depends on the timezones.

So if you define your time colume as "timestamp without time zone", the interval issue doesn't occur:

# \d test
               Table "public.test"
 Column |            Type             | Modifiers 
--------+-----------------------------+-----------
 time   | timestamp without time zone | not null

# EXPLAIN select * from test where time < date '2017-07-07 19:40:05.134715' - interval '3 months';
                                       QUERY PLAN                                        
-----------------------------------------------------------------------------------------
 Append  (cost=0.00..26.65 rows=618 width=16)
   ->  Seq Scan on test  (cost=0.00..0.00 rows=1 width=16)
         Filter: ("time" < '2017-04-07 00:00:00'::timestamp without time zone)
   ->  Bitmap Heap Scan on _hyper_4_10_chunk  (cost=8.93..26.65 rows=617 width=16)
         Recheck Cond: ("time" < '2017-04-07 00:00:00'::timestamp without time zone)
         ->  Bitmap Index Scan on "13-test_time_idx"  (cost=0.00..8.78 rows=617 width=0)
               Index Cond: ("time" < '2017-04-07 00:00:00'::timestamp without time zone)
(7 rows)

Notice that the planner only touched 1 of the chunks, rather than all three.

You still can't use the NOW() function here, so if you execute the same with the initial offset as now rather than a specific time, it's going to again scan them all. This is likely due to the fact that planner doesn't treat turn now immediately into a constant (e.g., due to prepared statements?), so doesn't do constraint exclusion by default.

# EXPLAIN select * from test where time < now() - interval '3 months';
                                       QUERY PLAN                                        
-----------------------------------------------------------------------------------------
 Append  (cost=0.00..89.21 rows=1852 width=16)
   ->  Seq Scan on test  (cost=0.00..0.00 rows=1 width=16)
         Filter: ("time" < (now() - '3 mons'::interval))
   ->  Bitmap Heap Scan on _hyper_4_9_chunk  (cost=8.94..29.74 rows=617 width=16)
         Recheck Cond: ("time" < (now() - '3 mons'::interval))
         ->  Bitmap Index Scan on "12-test_time_idx"  (cost=0.00..8.79 rows=617 width=0)
               Index Cond: ("time" < (now() - '3 mons'::interval))
   ->  Bitmap Heap Scan on _hyper_4_10_chunk  (cost=8.94..29.74 rows=617 width=16)
         Recheck Cond: ("time" < (now() - '3 mons'::interval))
         ->  Bitmap Index Scan on "13-test_time_idx"  (cost=0.00..8.79 rows=617 width=0)
               Index Cond: ("time" < (now() - '3 mons'::interval))
   ->  Bitmap Heap Scan on _hyper_4_11_chunk  (cost=8.94..29.74 rows=617 width=16)
         Recheck Cond: ("time" < (now() - '3 mons'::interval))
         ->  Bitmap Index Scan on "14-test_time_idx"  (cost=0.00..8.79 rows=617 width=0)
               Index Cond: ("time" < (now() - '3 mons'::interval))

@mfreed mfreed changed the title Constraint exclusion doesn't work on date/interval types Time-based constraint exclusion doesn't work with NOW() or intervals over timezone types Jul 7, 2017
@mfreed
Copy link
Member Author

mfreed commented Aug 3, 2017

#145

@mfreed
Copy link
Member Author

mfreed commented Aug 21, 2017

Addressed in 0.4.0 release: https://github.com/timescale/timescaledb/releases/tag/0.4.0

@mfreed
Copy link
Member Author

mfreed commented Aug 21, 2017

The above examples actually conflate two things: In the one example

# EXPLAIN SELECT * from conditions WHERE time < date '2017-03-07 18:18:03.776258+00';

the problem was that it was actually mixing types (timestamptz and date) in the query, and a certain time (e.g., '2017-03-07') actually means something different with a timezone and not.

So the queries given above with NOW(), timestamptz - interval, etc. all do the correct thing, but if you mix types, you can still trigger scans over all chunks. For those, just query (or cast) to the proper type (e.g., timestamp, timestamptz, int, etc.)

@mfreed mfreed closed this as completed Aug 21, 2017
@mfreed mfreed added this to the 0.4.0 milestone Aug 22, 2017
phemmer added a commit to phemmer/telegraf that referenced this issue Sep 22, 2021
major performance hit in timescaledb for some queries when timestamp with time zone is used:
timescale/timescaledb#118 (comment)
syvb pushed a commit to syvb/timescaledb that referenced this issue Sep 8, 2022
118: Adding fuzzing test for tdigest r=WireBaron a=WireBaron

This also includes fixes for numerous found issues.

Co-authored-by: Brian Rowe <brian@timescale.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants