Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] doris 1.0 LAST_VALUE function returns wrong result #9227

Closed
2 of 3 tasks
hf200012 opened this issue Apr 26, 2022 · 2 comments · Fixed by #9247
Closed
2 of 3 tasks

[Bug] doris 1.0 LAST_VALUE function returns wrong result #9227

hf200012 opened this issue Apr 26, 2022 · 2 comments · Fixed by #9247

Comments

@hf200012
Copy link
Contributor

hf200012 commented Apr 26, 2022

Search before asking

  • I had searched in the issues and found no similar issues.

This is not a bug, But, This window function may have multiple problems with the same value because of order by

Version

1.0

What's Wrong?

The result data returned by the LAST_VALUE function is not as expected
User table creation statement:

CREATE TABLE `vehicle_state_record` (
  `vehicle_id` bigint(20) NOT NULL,
  `time` datetime NOT NULL,
  `fleet_id` bigint(20) NULL,
  `fleet_name` varchar(300) NULL ,
  `tenant_id` bigint(20) NOT NULL ,
  `app_id` bigint(20) NOT NULL ,
  `record` tinyint(4) NOT NULL DEFAULT "1" ,
  `vehicle_no` varchar(16) NULL ,
  `state` tinyint(4) NOT NULL ,
  `snap_id` bigint(20) NOT NULL ,
  `insert_time` datetime NULL
) ENGINE=OLAP
UNIQUE KEY(`vehicle_id`, `time`, `fleet_id`)
PARTITION BY RANGE(`time`)
(PARTITION p202203 VALUES [('2022-03-01 00:00:00'), ('2022-04-01 00:00:00')),
PARTITION p202204 VALUES [('2022-04-01 00:00:00'), ('2022-05-01 00:00:00')),
PARTITION p202205 VALUES [('2022-05-01 00:00:00'), ('2022-06-01 00:00:00')),
PARTITION p202206 VALUES [('2022-06-01 00:00:00'), ('2022-07-01 00:00:00')),
PARTITION p202207 VALUES [('2022-07-01 00:00:00'), ('2022-08-01 00:00:00')))
DISTRIBUTED BY HASH(`vehicle_id`, `time`, `fleet_id`) BUCKETS 4
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "MONTH",
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-2147483648",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_allocation" = "tag.location.default: 1",
"dynamic_partition.buckets" = "4",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.history_partition_num" = "1",
"dynamic_partition.hot_partition_num" = "0",
"dynamic_partition.reserved_history_periods" = "NULL",
"dynamic_partition.start_day_of_month" = "1",
"in_memory" = "false",
"storage_format" = "V2"
);

User data:

INSERT INTO test.vehicle_state_record (vehicle_id,`time`,fleet_id,fleet_name,tenant_id,app_id,record,vehicle_no,state,snap_id,insert_time) VALUES
  (307814477030522499,'2022-04-21 11:18:11',307814443205600326,'JLP_05_01_02',867,10001,1,'E012730',1,84295,'2022-04-24 16:32:51')
  (307814477030522499,'2022-04-23 10:21:24',307814443205600326,'JLP_05_01_02',867,10001,1,'E012730',1,84295,'2022-04-25 10:22:59')
  (307814477030522499,'2022-04-24 10:21:24',307814443205600326,'JLP_05_01_02',867,10001,1,'E012730',1,84295,'2022-04-25 10:21:59')
  (307814477030522499,'2022-04-22 10:05:24',307814443205600326,'JLP_05_01_02',867,10001,1,'E012730',1,84295,'2022-04-25 10:30:56')
  (307814477030522499,'2022-04-22 10:21:24',307814443205600326,'JLP_05_01_02',867,10001,1,'E012730',0,84295,'2022-04-25 10:23:21')

check sentence:

 SELECT vehicle_id,`time`,state ,day(`time`) myday,LAST_VALUE(state) over(PARTITION by day (`time`) order by `time` ASC ) as fegg
from vehicle_state_record

What You Expected?

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@linzch3
Copy link

linzch3 commented May 10, 2022

Hi guys, I don't think it's a bug. Instead, the correct result is returned by doris.

OVER aggregates compute an aggregated value for every input row over a range of ordered rows. In contrast to GROUP BY aggregates, OVER aggregates do not reduce the number of result rows to a single row for every group. Instead OVER aggregates produce an aggregated value for every input row.
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/queries/over-agg/

Have you double-checked whether this issue is right? @morningman @HappenLee

@linzch3
Copy link

linzch3 commented May 11, 2022

this issue should be closed, it is not an issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants