Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Different time format in filter causes time partitioning to not work #29533

Closed
2 of 3 tasks
IanMeta opened this issue Jan 4, 2024 · 1 comment
Closed
2 of 3 tasks

Comments

@IanMeta
Copy link

IanMeta commented Jan 4, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.0.0

What's Wrong?

When querying a table with time partitions, using time format like 'YYYY-mm-dd hh:mm:ss' would scan the correct partitions in query plan, but using format like 'YYYY/mm/dd hh:mm:ss' or 'YYYY-mm-ddThh:mm:ss' would perform a whole table scan but still yield the correct results.

What You Expected?

The query plan should scan the correct partition based on the filter regardless of the format.

How to Reproduce?

  1. create the table
CREATE TABLE test_table (
  AreaId bigint(20) NULL,
  BranchId bigint(20) NULL,
  CompanyId bigint(20) NULL,
  __time datetime NULL,
  value bigint(20) SUM NULL,
) ENGINE=OLAP
AGGREGATE KEY(AreaId, BranchId, CompanyId, __time)
COMMENT 'OLAP'
PARTITION BY RANGE(__time)
(
PARTITION p202310 VALUES [('2023-10-01 00:00:00'), ('2023-11-01 00:00:00')),
PARTITION p202311 VALUES [('2023-11-01 00:00:00'), ('2023-12-01 00:00:00')),
PARTITION p202312 VALUES [('2023-12-01 00:00:00'), ('2024-01-01 00:00:00')),
PARTITION p202401 VALUES [('2024-01-01 00:00:00'), ('2024-02-01 00:00:00')),
PARTITION p202402 VALUES [('2024-02-01 00:00:00'), ('2024-03-01 00:00:00')),
PARTITION p202403 VALUES [('2024-03-01 00:00:00'), ('2024-04-01 00:00:00')),
PARTITION p202404 VALUES [('2024-04-01 00:00:00'), ('2024-05-01 00:00:00'))
)
DISTRIBUTED BY HASH(AreaId, BranchId, CompanyId) BUCKETS 32
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"is_being_synced" = "false",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "MONTH",
"dynamic_partition.time_zone" = "Etc/UTC",
"dynamic_partition.start" = "-2147483648",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_allocation" = "tag.location.default: 1",
"dynamic_partition.buckets" = "32",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.history_partition_num" = "450",
"dynamic_partition.hot_partition_num" = "0",
"dynamic_partition.reserved_history_periods" = "NULL",
"dynamic_partition.storage_policy" = "",
"dynamic_partition.storage_medium" = "HDD",
"dynamic_partition.start_day_of_month" = "1",
"storage_format" = "V2",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);
  1. Insert some data
INSERT INTO test_table (AreaId, BranchId, CompanyId, __time, value)
VALUES (1, 1, 1, '2023-10-15 10:00:00', 1000),
(2, 2, 2, '2023-10-15 12:00:00', 2000),
(3, 3, 3, '2023-10-15 14:00:00', 3000);
INSERT INTO test_table (AreaId, BranchId, CompanyId, __time, value)
VALUES (1, 1, 1, '2023-11-15 10:00:00', 1500),
(2, 2, 2, '2023-11-15 12:00:00', 2500),
(3, 3, 3, '2023-11-15 14:00:00', 3500);
INSERT INTO test_table (AreaId, BranchId, CompanyId, __time, value)
VALUES (1, 1, 1, '2023-12-15 10:00:00', 1200),
(2, 2, 2, '2023-12-15 12:00:00', 2200),
(3, 3, 3, '2023-12-15 14:00:00', 3200);
INSERT INTO test_table (AreaId, BranchId, CompanyId, __time, value)
VALUES (1, 1, 1, '2024-01-15 10:00:00', 1800),
(2, 2, 2, '2024-01-15 12:00:00', 2800),
(3, 3, 3, '2024-01-15 14:00:00', 3800);
INSERT INTO test_table (AreaId, BranchId, CompanyId, __time, value)
VALUES (1, 1, 1, '2024-02-15 10:00:00', 1400),
(2, 2, 2, '2024-02-15 12:00:00', 2400),
(3, 3, 3, '2024-02-15 14:00:00', 3400);
INSERT INTO test_table (AreaId, BranchId, CompanyId, __time, value)
VALUES (1, 1, 1, '2024-03-15 10:00:00', 1600),
(2, 2, 2, '2024-03-15 12:00:00', 2600),
(3, 3, 3, '2024-03-15 14:00:00', 3600);
INSERT INTO test_table (AreaId, BranchId, CompanyId, __time, value)
VALUES (1, 1, 1, '2024-04-15 10:00:00', 1300),
(2, 2, 2, '2024-04-15 12:00:00', 2300),
(3, 3, 3, '2024-04-15 14:00:00', 3300);
  1. The query below partitions correctly:
explain select * from test_table 
where __time between '2024-03-01 00:00:00' and '2024-03-30 23:59:59';
     partitions=1/454, tablets=32/32, tabletList=20735387,20735389,20735391 ...
  1. The query below does not:
explain select * from test_table 
where __time between '2024-03-01T00:00:00' and '2024-03-30T23:59:59';
     partitions=7/454, tablets=224/224, tabletList=20735451,20735453,20735455 ...
  1. same for the queries below:
explain select * from test_table 
where __time between convert_tz('2024-03-01 00:00:00', '+08:00', 'America/Los_Angeles') and convert_tz('2024-03-01 00:00:00', '+08:00', 'America/Los_Angeles');
explain select * from test_table 
where __time between CAST('2024-03-01T00:00:00' as datetime) and CAST('2024-03-30T23:59:59' as datetime);

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@LemonLiTree
Copy link
Collaborator

please upgrade and then
set enable_fold_constant_by_be = true;
All correct

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants