Closed
2 changes: 2 additions & 0 deletions starrocks/_m6i.8xlarge_bluesky_1000m.count
@@ -0,0 +1,2 @@
count()
997999662
300 changes: 300 additions & 0 deletions starrocks/_m6i.8xlarge_bluesky_1000m.physical_query_plans
@@ -0,0 +1,300 @@
------------------------------------------------------------------------------------------------------------------------
Physical query plan for query Q1:

Explain String
PLAN FRAGMENT 0
OUTPUT EXPRS:4: get_json_string | 5: count
PARTITION: UNPARTITIONED

RESULT SINK

6:MERGING-EXCHANGE

PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 4: get_json_string

STREAM DATA SINK
EXCHANGE ID: 06
UNPARTITIONED

5:SORT
| order by: <slot 5> 5: count DESC
| offset: 0
|
4:AGGREGATE (merge finalize)
| output: count(5: count)
| group by: 4: get_json_string
|
3:EXCHANGE

PLAN FRAGMENT 2
OUTPUT EXPRS:
colocate exec groups: ExecGroup{groupId=3, nodeIds=[0, 1, 2]}
PARTITION: RANDOM

STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: 4: get_json_string

2:AGGREGATE (update serialize)
| STREAMING
| output: count(*)
| group by: 4: get_json_string
|
1:Project
| <slot 4> : 6: data.commit.collection
|
0:OlapScanNode
TABLE: bluesky
PREAGGREGATION: ON
partitions=1/1
rollup: bluesky
tabletRatio=188/188
tabletList=11523,10757,12293,11272,11784,12553,11019,12043,11534,12304 ...
cardinality=982999671
avgRowSize=2.0
------------------------------------------------------------------------------------------------------------------------
Physical query plan for query Q2:

Explain String
PLAN FRAGMENT 0
OUTPUT EXPRS:4: get_json_string | 6: count | 7: count
PARTITION: UNPARTITIONED

RESULT SINK

6:MERGING-EXCHANGE

PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 4: get_json_string

STREAM DATA SINK
EXCHANGE ID: 06
UNPARTITIONED

5:SORT
| order by: <slot 6> 6: count DESC
| offset: 0
|
4:AGGREGATE (merge finalize)
| output: count(6: count), multi_distinct_count(7: count)
| group by: 4: get_json_string
|
3:EXCHANGE

PLAN FRAGMENT 2
OUTPUT EXPRS:
colocate exec groups: ExecGroup{groupId=3, nodeIds=[0, 1, 2]}
PARTITION: RANDOM

STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: 4: get_json_string

2:AGGREGATE (update serialize)
| STREAMING
| output: count(*), multi_distinct_count(5: get_json_string)
| group by: 4: get_json_string
|
1:Project
| <slot 4> : 10: data.commit.collection
| <slot 5> : 11: data.did
|
0:OlapScanNode
TABLE: bluesky
PREAGGREGATION: ON
PREDICATES: 8: data.kind = 'commit', 9: data.commit.operation = 'create'
partitions=1/1
rollup: bluesky
tabletRatio=188/188
tabletList=11523,10757,12293,11272,11784,12553,11019,12043,11534,12304 ...
cardinality=982999671
avgRowSize=6.0
------------------------------------------------------------------------------------------------------------------------
Physical query plan for query Q3:

Explain String
PLAN FRAGMENT 0
OUTPUT EXPRS:4: get_json_string | 5: hour_from_unixtime | 6: count
PARTITION: UNPARTITIONED

RESULT SINK

6:MERGING-EXCHANGE

PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 4: get_json_string, 5: hour_from_unixtime

STREAM DATA SINK
EXCHANGE ID: 06
UNPARTITIONED

5:SORT
| order by: <slot 5> 5: hour_from_unixtime ASC, <slot 4> 4: get_json_string ASC
| offset: 0
|
4:AGGREGATE (merge finalize)
| output: count(6: count)
| group by: 4: get_json_string, 5: hour_from_unixtime
|
3:EXCHANGE

PLAN FRAGMENT 2
OUTPUT EXPRS:
colocate exec groups: ExecGroup{groupId=3, nodeIds=[0, 1, 2]}
PARTITION: RANDOM

STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: 4: get_json_string, 5: hour_from_unixtime

2:AGGREGATE (update serialize)
| STREAMING
| output: count(*)
| group by: 4: get_json_string, 5: hour_from_unixtime
|
1:Project
| <slot 4> : 9: data.commit.collection
| <slot 5> : hour_from_unixtime(CAST(CAST(10: data.time_us AS DOUBLE) / 1000000.0 AS BIGINT))
|
0:OlapScanNode
TABLE: bluesky
PREAGGREGATION: ON
PREDICATES: 7: data.kind = 'commit', 8: data.commit.operation = 'create', array_contains(['app.bsky.feed.post','app.bsky.feed.repost','app.bsky.feed.like'], 9: data.commit.collection)
partitions=1/1
rollup: bluesky
tabletRatio=188/188
tabletList=11523,10757,12293,11272,11784,12553,11019,12043,11534,12304 ...
cardinality=245749918
avgRowSize=6.0
------------------------------------------------------------------------------------------------------------------------
Physical query plan for query Q4:

Explain String
PLAN FRAGMENT 0
OUTPUT EXPRS:4: get_json_string | 7: to_datetime
PARTITION: UNPARTITIONED

RESULT SINK

7:MERGING-EXCHANGE
limit: 3

PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 4: get_json_string

STREAM DATA SINK
EXCHANGE ID: 07
UNPARTITIONED

6:TOP-N
| order by: <slot 7> 7: to_datetime ASC
| offset: 0
| limit: 3
|
5:Project
| <slot 4> : 4: get_json_string
| <slot 7> : to_datetime(6: min, 6)
|
4:AGGREGATE (merge finalize)
| output: min(6: min)
| group by: 4: get_json_string
|
3:EXCHANGE

PLAN FRAGMENT 2
OUTPUT EXPRS:
colocate exec groups: ExecGroup{groupId=3, nodeIds=[0, 1, 2]}
PARTITION: RANDOM

STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: 4: get_json_string

2:AGGREGATE (update serialize)
| STREAMING
| output: min(5: get_json_int)
| group by: 4: get_json_string
|
1:Project
| <slot 4> : 11: data.did
| <slot 5> : 12: data.time_us
|
0:OlapScanNode
TABLE: bluesky
PREAGGREGATION: ON
PREDICATES: 8: data.kind = 'commit', 9: data.commit.operation = 'create', 10: data.commit.collection = 'app.bsky.feed.post'
partitions=1/1
rollup: bluesky
tabletRatio=188/188
tabletList=11523,10757,12293,11272,11784,12553,11019,12043,11534,12304 ...
cardinality=982999671
avgRowSize=7.0
------------------------------------------------------------------------------------------------------------------------
Physical query plan for query Q5:

Explain String
PLAN FRAGMENT 0
OUTPUT EXPRS:4: get_json_string | 8: date_diff
PARTITION: UNPARTITIONED

RESULT SINK

7:MERGING-EXCHANGE
limit: 3

PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 4: get_json_string

STREAM DATA SINK
EXCHANGE ID: 07
UNPARTITIONED

6:TOP-N
| order by: <slot 8> 8: date_diff DESC
| offset: 0
| limit: 3
|
5:Project
| <slot 4> : 4: get_json_string
| <slot 8> : date_diff('millisecond', to_datetime(6: min, 6), to_datetime(7: max, 6))
|
4:AGGREGATE (merge finalize)
| output: min(6: min), max(7: max)
| group by: 4: get_json_string
|
3:EXCHANGE

PLAN FRAGMENT 2
OUTPUT EXPRS:
colocate exec groups: ExecGroup{groupId=3, nodeIds=[0, 1, 2]}
PARTITION: RANDOM

STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: 4: get_json_string

2:AGGREGATE (update serialize)
| STREAMING
| output: min(5: get_json_int), max(5: get_json_int)
| group by: 4: get_json_string
|
1:Project
| <slot 4> : 12: data.did
| <slot 5> : 13: data.time_us
|
0:OlapScanNode
TABLE: bluesky
PREAGGREGATION: ON
PREDICATES: 9: data.kind = 'commit', 10: data.commit.operation = 'create', 11: data.commit.collection = 'app.bsky.feed.post'
partitions=1/1
rollup: bluesky
tabletRatio=188/188
tabletList=11523,10757,12293,11272,11784,12553,11019,12043,11534,12304 ...
cardinality=982999671
avgRowSize=7.0
5 changes: 5 additions & 0 deletions starrocks/_m6i.8xlarge_bluesky_1000m.results_runtime
@@ -0,0 +1,5 @@
[1.06,0.85,0.84],
[9.15,9.14,20.52],
[2.09,1.96,14.79],
[1.42,1.34,1.40],
[1.47,1.51,
3 changes: 3 additions & 0 deletions starrocks/_m6i.8xlarge_bluesky_1000m.total_size
@@ -0,0 +1,3 @@
TableName IndexName Size ReplicaCount RowCount
bluesky bluesky 189.606 GB 188 982999671
Total 189.606 GB 188
2 changes: 2 additions & 0 deletions starrocks/_m6i.8xlarge_bluesky_100m.count
@@ -0,0 +1,2 @@
count()
99999984