@BiteTheDDDDt BiteTheDDDDt commented Mar 5, 2024

Proposed changes

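Fix wrong data in the materialized view (MV) when a routine load uses a function in its column mapping. In the reproduction below, the routine load maps `time_stamp=from_unixtime(time_stamp)`; the base table gets the converted value, but the MV row gets a NULL key.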

CREATE TABLE `test` (
  `event_id` varchar(50) NULL COMMENT '',
  `time_stamp` datetime NULL COMMENT '',
  `device_id` varchar(150) NULL DEFAULT "" COMMENT ''
) ENGINE=OLAP
DUPLICATE KEY(`event_id`)
DISTRIBUTED BY HASH(`device_id`) BUCKETS AUTO
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
); 

insert into test(event_id,time_stamp,device_id) values('ad_sdk_request','2024-03-04 00:00:00','a');

create materialized view m_view as select time_stamp, count(device_id) from test group by time_stamp;

CREATE ROUTINE LOAD test.test ON test
COLUMNS(event_id,time_stamp,device_id,time_stamp=from_unixtime(`time_stamp`))
PROPERTIES
(
    "desired_concurrent_number" = "10",
    "max_error_number" = "200000",
    "max_filter_ratio" = "1.0",
    "max_batch_interval" = "10",
    "max_batch_rows" = "200000",
    "max_batch_size" = "104857600",
    "format" = "json",
    "strip_outer_array" = "false",
    "num_as_string" = "false",
    "fuzzy_parse" = "false",
    "strict_mode" = "false",
    "timezone" = "Asia/Shanghai",
    "exec_mem_limit" = "2147483648"
)
FROM KAFKA
(
    "kafka_broker_list" = "localhost:9092",
    "kafka_topic" = "test"
);


mysql [test]>select * from test;
+----------------+---------------------+-----------+
| event_id       | time_stamp          | device_id |
+----------------+---------------------+-----------+
| ad_sdk_request | 2024-03-04 00:00:00 | a         |
| ad_sdk_request | 2007-12-01 00:30:19 | a         |
+----------------+---------------------+-----------+
2 rows in set (0.02 sec)

mysql [test]>select * from test index m_view;
+---------------------+----------------------------------------------------------+
| mv_time_stamp       | mva_SUM__CASE WHEN `device_id` IS NULL THEN 0 ELSE 1 END |
+---------------------+----------------------------------------------------------+
| NULL                |                                                        1 |
| 2024-03-04 00:00:00 |                                                        1 |
+---------------------+----------------------------------------------------------+
2 rows in set (0.02 sec)
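
To make the mismatch in the two result sets explicit, here is a minimal sketch in plain Python (not Doris code) that recomputes what `m_view` should hold from the base-table rows printed above and compares it with the MV rows printed above. The helper name `expected_mv` is hypothetical; the row values are copied from the query output in this description.

```python
from collections import Counter

# Base-table rows copied from the `select * from test` output above.
base_rows = [
    ("ad_sdk_request", "2024-03-04 00:00:00", "a"),
    ("ad_sdk_request", "2007-12-01 00:30:19", "a"),
]

# Rows copied from the `select * from test index m_view` output above.
# None stands in for the SQL NULL key.
observed_mv = {None: 1, "2024-03-04 00:00:00": 1}


def expected_mv(rows):
    """Recompute `count(device_id) ... group by time_stamp` from base rows."""
    counts = Counter()
    for _event_id, time_stamp, device_id in rows:
        # count(col) skips NULLs, mirroring the CASE WHEN in the MV column name.
        counts[time_stamp] += 0 if device_id is None else 1
    return dict(counts)


expected = expected_mv(base_rows)
missing = sorted(k for k in expected if k not in observed_mv)
unexpected = [k for k in observed_mv if k not in expected]

print("missing keys:", missing)
print("unexpected keys:", unexpected)
```

With the rows shown here, the key `2007-12-01 00:30:19` is missing from the MV and a NULL key appears in its place, which is the wrong data this PR addresses.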


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

@BiteTheDDDDt

run buildall

@doris-robot

TPC-H: Total hot run time: 37743 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 17df853e10256a4add62cbed770eef2b0162749f, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17649	4029	4056	4029
q2	2021	150	153	150
q3	10652	942	931	931
q4	4676	936	951	936
q5	7621	2919	2978	2919
q6	180	122	126	122
q7	1311	834	795	795
q8	9520	2035	2031	2031
q9	7278	6398	6393	6393
q10	8233	2591	2611	2591
q11	421	221	217	217
q12	733	323	307	307
q13	17980	2929	2878	2878
q14	287	248	249	248
q15	490	446	443	443
q16	503	395	393	393
q17	938	863	826	826
q18	6675	5984	5759	5759
q19	1554	1508	1513	1508
q20	545	295	277	277
q21	7500	3713	3695	3695
q22	795	295	299	295
Total cold run time: 107562 ms
Total hot run time: 37743 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	3999	3966	3968	3966
q2	313	223	227	223
q3	2929	2946	2896	2896
q4	1842	1810	1838	1810
q5	5225	5205	5207	5205
q6	215	114	122	114
q7	2271	1829	1833	1829
q8	3191	3245	3245	3245
q9	8469	8489	8534	8489
q10	6094	3725	3696	3696
q11	528	447	439	439
q12	683	547	558	547
q13	13503	2788	2774	2774
q14	280	249	256	249
q15	468	433	453	433
q16	451	393	406	393
q17	1678	1656	1667	1656
q18	7897	7371	7286	7286
q19	1740	1596	1591	1591
q20	1958	1719	1743	1719
q21	4920	4818	4783	4783
q22	549	473	477	473
Total cold run time: 69203 ms
Total hot run time: 53816 ms

@doris-robot

TPC-DS: Total hot run time: 178059 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 17df853e10256a4add62cbed770eef2b0162749f, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	919	358	339	339
query2	7379	2074	2121	2074
query3	6719	218	206	206
query4	25237	20950	20973	20950
query5	4387	467	489	467
query6	277	181	182	181
query7	4625	312	298	298
query8	251	166	173	166
query9	8450	2195	2198	2195
query10	447	235	236	235
query11	15149	14252	14338	14252
query12	142	89	85	85
query13	1683	440	421	421
query14	9074	6942	6940	6940
query15	247	190	193	190
query16	7306	268	274	268
query17	966	602	585	585
query18	1939	295	284	284
query19	202	166	161	161
query20	101	92	90	90
query21	208	134	131	131
query22	4669	4507	4431	4431
query23	31285	30667	30836	30667
query24	12361	3077	3123	3077
query25	703	402	393	393
query26	1924	165	163	163
query27	3015	357	386	357
query28	6607	1837	1799	1799
query29	1302	634	606	606
query30	311	147	154	147
query31	915	750	726	726
query32	107	62	57	57
query33	765	268	265	265
query34	975	461	480	461
query35	946	793	820	793
query36	916	843	817	817
query37	289	72	69	69
query38	3193	3117	3135	3117
query39	1418	1356	1466	1356
query40	301	121	123	121
query41	60	55	56	55
query42	108	105	111	105
query43	458	410	386	386
query44	1081	722	715	715
query45	206	196	193	193
query46	1043	773	777	773
query47	1642	1576	1571	1571
query48	439	345	344	344
query49	1213	363	348	348
query50	781	381	381	381
query51	6728	6597	6677	6597
query52	109	91	98	91
query53	354	280	284	280
query54	342	251	265	251
query55	86	90	91	90
query56	256	246	267	246
query57	1090	997	1000	997
query58	250	228	230	228
query59	2455	2336	2252	2252
query60	279	311	252	252
query61	123	118	117	117
query62	656	406	431	406
query63	303	279	288	279
query64	6466	3266	3523	3266
query65	3014	3014	3017	3014
query66	1474	352	325	325
query67	15810	14656	14481	14481
query68	13731	561	568	561
query69	729	389	388	388
query70	1378	1129	1098	1098
query71	643	279	272	272
query72	10048	2614	2507	2507
query73	3403	337	335	335
query74	7084	6932	7016	6932
query75	6676	2689	2681	2681
query76	7918	1086	1192	1086
query77	1104	268	250	250
query78	10282	9784	9529	9529
query79	10080	520	515	515
query80	826	420	378	378
query81	480	212	209	209
query82	282	90	90	90
query83	277	145	141	141
query84	280	74	77	74
query85	1132	346	341	341
query86	375	280	296	280
query87	3390	3227	3234	3227
query88	3046	2269	2262	2262
query89	525	395	377	377
query90	2334	188	184	184
query91	162	132	129	129
query92	64	50	49	49
query93	3059	539	520	520
query94	1525	200	187	187
query95	456	337	351	337
query96	602	272	259	259
query97	4016	3922	3920	3920
query98	234	216	216	216
query99	1035	770	742	742
Total cold run time: 311712 ms
Total hot run time: 178059 ms

@doris-robot

ClickBench: Total hot run time: 30.85 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 17df853e10256a4add62cbed770eef2b0162749f, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.03	0.04
query2	0.06	0.03	0.03
query3	0.23	0.06	0.07
query4	1.64	0.10	0.10
query5	0.50	0.48	0.50
query6	1.27	0.66	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.57	0.53	0.52
query10	0.58	0.56	0.56
query11	0.14	0.10	0.10
query12	0.12	0.10	0.11
query13	0.57	0.57	0.57
query14	0.73	0.76	0.74
query15	0.81	0.82	0.81
query16	0.37	0.37	0.37
query17	0.99	1.01	0.95
query18	0.27	0.26	0.26
query19	1.82	1.74	1.68
query20	0.02	0.01	0.01
query21	15.43	0.66	0.62
query22	3.53	3.71	2.01
query23	17.30	1.13	1.05
query24	2.29	0.61	0.58
query25	0.24	0.03	0.05
query26	0.23	0.14	0.14
query27	0.04	0.03	0.03
query28	11.75	0.84	0.83
query29	12.70	3.44	3.45
query30	0.59	0.56	0.53
query31	2.80	0.34	0.35
query32	3.35	0.43	0.44
query33	2.93	2.93	2.88
query34	15.50	4.30	4.28
query35	4.29	4.30	4.31
query36	1.09	1.01	1.02
query37	0.07	0.05	0.05
query38	0.04	0.03	0.03
query39	0.02	0.02	0.02
query40	0.18	0.13	0.13
query41	0.08	0.02	0.02
query42	0.02	0.01	0.02
query43	0.03	0.02	0.03
Total cold run time: 105.3 s
Total hot run time: 30.85 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 17df853e10256a4add62cbed770eef2b0162749f with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       19.0 seconds inserted 10000000 Rows, about 526K ops/s

starocean999 previously approved these changes Mar 5, 2024
@github-actions

github-actions bot commented Mar 5, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels Mar 5, 2024
@github-actions

github-actions bot commented Mar 5, 2024

PR approved by anyone and no changes requested.

@xiaokang xiaokang added the usercase and dev/2.0.x labels Mar 5, 2024

@morrySnow morrySnow left a comment


add test cases

@BiteTheDDDDt

run buildall

@github-actions github-actions bot removed the approved label Mar 6, 2024
@github-actions github-actions bot added the approved label Mar 6, 2024
@github-actions

github-actions bot commented Mar 6, 2024

PR approved by at least one committer and no changes requested.

mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024
@BiteTheDDDDt BiteTheDDDDt deleted the dev_0305 branch January 20, 2025 06:52

Labels

approved, dev/2.0.6-merged, reviewed, usercase
