@Jibing-Li Jibing-Li commented Jul 12, 2023

Add more profile information for external table plan time, including init and finalize scan node time, getSplits time, create scan range time, the time to get all partitions, and the time to get all files for all partitions. Also adjust the indentation of the profile summary to make it easier to read.

This is an example output of the new profile summary.

    Execution  Summary:
          -  Analysis  Time:  3ms
          -  Plan  Time:  26s885ms
              -  JoinReorder  Time:  N/A
              -  CreateSingleNode  Time:  N/A
              -  QueryDistributed  Time:  N/A
              -  Init  Scan  Node  Time:  1ms
              -  Finalize  Scan  Node  Time:  26s868ms
                  -  Get  Splits  Time:  26s554ms
                      -  Get  PARTITIONS  Time:  20s189ms
                      -  Get  PARTITION  FILES  Time:  6s289ms
                  -  Create  Scan  Range  Time:  314ms
          -  Schedule  Time:  1s67ms
          -  Fetch  Result  Time:  56ms
          -  Write  Result  Time:  0ms
          -  Wait  and  Fetch  Result  Time:  57ms
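
As a reading aid, the nested timings in the example can be checked with a little arithmetic. The sketch below is not part of this PR: `parse_duration_ms` and the `summary` dictionary are hypothetical helpers written for illustration, the values are copied from the example output, and only the `s` and `ms` units that appear there are handled. It assumes, based on the indentation, that Get Splits and Create Scan Range are children of Finalize Scan Node, and that the two Get PARTITION(S) entries are children of Get Splits.

```python
import re

# Only the "s" and "ms" units seen in the example are handled;
# anything else is rejected rather than guessed.
_DURATION = re.compile(r"(?:(\d+)s)?(?:(\d+)ms)?")


def parse_duration_ms(text):
    """Convert a profile duration such as '26s885ms' to milliseconds.

    Returns None for 'N/A'. Hypothetical helper, not a Doris API.
    """
    text = text.strip()
    if text == "N/A":
        return None
    match = _DURATION.fullmatch(text)
    if not text or match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    seconds, millis = match.groups()
    return int(seconds or 0) * 1000 + int(millis or 0)


# Values copied from the example summary above.
summary = {
    "Plan": "26s885ms",
    "Init Scan Node": "1ms",
    "Finalize Scan Node": "26s868ms",
    "Get Splits": "26s554ms",
    "Get PARTITIONS": "20s189ms",
    "Get PARTITION FILES": "6s289ms",
    "Create Scan Range": "314ms",
}
ms = {name: parse_duration_ms(value) for name, value in summary.items()}

# Children of Finalize Scan Node add up to the parent exactly in this example.
finalize_children = ms["Get Splits"] + ms["Create Scan Range"]

# Time inside Get Splits that neither child entry accounts for.
get_splits_remainder = (
    ms["Get Splits"] - ms["Get PARTITIONS"] - ms["Get PARTITION FILES"]
)

# Fraction of total plan time spent fetching partitions.
partitions_share = ms["Get PARTITIONS"] / ms["Plan"]

print(finalize_children, ms["Finalize Scan Node"])  # 26868 26868
print(get_splits_remainder)                         # 76
print(f"{partitions_share:.0%}")                    # 75%
```

In this sample, fetching partitions accounts for roughly three quarters of the plan time, which is the kind of breakdown the new entries are meant to expose.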

@github-actions github-actions bot added the area/planner label (Issues or PRs related to the query planner) Jul 12, 2023
@hello-stephen

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 50.76 seconds
stream load tsv: 509 seconds loaded 74807831229 Bytes, about 140 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.1 seconds inserted 10000000 Rows, about 343K ops/s
storage size: 17168961503 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230712150335_clickbench_pr_176969.html
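
The "about N MB/s" and "K ops/s" figures in these reports can be reproduced from the byte counts, row counts, and durations shown. The sketch below is an inference from the numbers, not a description of how the TeamCity pipeline actually computes them: it assumes "MB" means 1024 * 1024 bytes and that results are truncated rather than rounded. The function names are hypothetical.

```python
MIB = 1024 * 1024  # assumption: "MB" in the report means 2**20 bytes


def load_throughput_mb_s(loaded_bytes, seconds):
    """Whole MB/s, truncated, for an integer number of seconds."""
    return loaded_bytes // (seconds * MIB)


def insert_rate_k_ops(rows, seconds):
    """Thousands of rows per second, truncated."""
    return int(rows / seconds / 1000)


# (bytes loaded, seconds) copied from the report above.
stream_loads = {
    "tsv": (74807831229, 509),
    "json": (2358488459, 18),
    "orc": (1101869774, 65),
    "parquet": (861443392, 32),
}

for name, (loaded_bytes, seconds) in stream_loads.items():
    rate = load_throughput_mb_s(loaded_bytes, seconds)
    print(f"stream load {name}: about {rate} MB/s")

print(f"insert into select: about {insert_rate_k_ops(10_000_000, 29.1)}K ops/s")
```

Under these assumptions the script prints 140, 124, 16, and 25 MB/s and 343K ops/s, matching the report.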

@hello-stephen

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 50.87 seconds
stream load tsv: 506 seconds loaded 74807831229 Bytes, about 140 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.3 seconds inserted 10000000 Rows, about 341K ops/s
storage size: 17166207912 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230712153054_clickbench_pr_176984.html

@Jibing-Li Jibing-Li marked this pull request as ready for review July 12, 2023 07:38
@hello-stephen

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 51.12 seconds
stream load tsv: 504 seconds loaded 74807831229 Bytes, about 141 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162147730 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230712155307_clickbench_pr_177000.html

@hello-stephen

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 52.82 seconds
stream load tsv: 505 seconds loaded 74807831229 Bytes, about 141 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17170127097 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230712161145_clickbench_pr_177008.html

@Jibing-Li

run buildall

@hello-stephen

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 53.85 seconds
stream load tsv: 507 seconds loaded 74807831229 Bytes, about 140 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.1 seconds inserted 10000000 Rows, about 343K ops/s
storage size: 17169313510 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230712183029_clickbench_pr_177170.html

@hello-stephen

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 49.74 seconds
stream load tsv: 452 seconds loaded 74807831229 Bytes, about 157 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 57 seconds loaded 1101869774 Bytes, about 18 MB/s
stream load parquet: 28 seconds loaded 861443392 Bytes, about 29 MB/s
insert into select: 26.3 seconds inserted 10000000 Rows, about 380K ops/s
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230712123952_clickbench_pr_177169.html

@Jibing-Li

run buildall

@hello-stephen

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 51.16 seconds
stream load tsv: 506 seconds loaded 74807831229 Bytes, about 140 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.1 seconds inserted 10000000 Rows, about 343K ops/s
storage size: 17161731114 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230712225936_clickbench_pr_177438.html

morningman pushed a commit to morningman/doris that referenced this pull request Jul 13, 2023
@Jibing-Li

run buildall

@hello-stephen

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 51.88 seconds
stream load tsv: 500 seconds loaded 74807831229 Bytes, about 142 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17161673913 Bytes

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Jul 19, 2023
@github-actions

PR approved by at least one committer and no changes requested.

@github-actions

PR approved by anyone and no changes requested.

@morningman morningman merged commit e4ac52b into apache:master Jul 20, 2023
@morningman morningman added the dev/2.0.0 label (2.0.0 release) Jul 20, 2023
@xiaokang xiaokang added the dev/2.0.0-merged label and removed the dev/2.0.0 label (2.0.0 release) Jul 20, 2023
xiaokang pushed a commit that referenced this pull request Jul 20, 2023
…n profile (#21749)
@Jibing-Li Jibing-Li deleted the profile branch July 21, 2023 03:54