Skip to content

[VL] Add timing for velox shuffle reader#3254

Merged
zuochunwei merged 5 commits into
apache:mainfrom
zuochunwei:sr
Sep 25, 2023
Merged

[VL] Add timing for velox shuffle reader#3254
zuochunwei merged 5 commits into
apache:mainfrom
zuochunwei:sr

Conversation

@zuochunwei
Copy link
Copy Markdown
Contributor

@zuochunwei zuochunwei commented Sep 22, 2023

What changes were proposed in this pull request?

Some jobs of ours consume most of their time for the operator of ColumnarExchange, and the metrics of VeloxShuffleReader don't contain IO(IPC read) and deserialize steps.

So I proposed this PR to complement these missing metrics.

(Please fill in changes proposed in this fix)

(Fixes: #ISSUE-ID)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@github-actions
Copy link
Copy Markdown

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@zuochunwei
Copy link
Copy Markdown
Contributor Author

@kecookier Could you finish this PR according to the comment of TODO zhaokuo, please!

zuochunwei and others added 2 commits September 25, 2023 10:15
Co-authored-by: zhaokuo03 <zhaokuo03@meituan.com>
return ipcTime;
}

public void SetDeserializeTime(long ipcTime) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetDeserializeTime -> setDeserializeTime

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetDeserializeTime -> setDeserializeTime

changed

Comment on lines +248 to +250
"decompressTime" -> SQLMetrics.createNanoTimingMetric(sparkContext, "totaltime_decompress"),
"ipcTime" -> SQLMetrics.createNanoTimingMetric(sparkContext, "totaltime_ipc"),
"deserializeTime" -> SQLMetrics.createNanoTimingMetric(sparkContext, "totaltime_deserialize"),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use the consistent metric name format?
e.g. totaltime_XXX or totaltime XXX

cc @rui-mo

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's good to rename to totaltime XXX consistently.

Copy link
Copy Markdown
Contributor

@rui-mo rui-mo Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we'd better keep them aligned.

kecookier and others added 2 commits September 25, 2023 10:59
Co-authored-by: zhaokuo03 <zhaokuo03@meituan.com>
@zhztheplayer
Copy link
Copy Markdown
Member

@kecookier @zuochunwei Would you remove "WIP" from PR title?

@zuochunwei
Copy link
Copy Markdown
Contributor Author

@kecookier @zuochunwei Would you remove "WIP" from PR title?

Done

@zuochunwei zuochunwei changed the title [WIP] add timing for velox shuffle reader [VL] add timing for velox shuffle reader Sep 25, 2023
Copy link
Copy Markdown
Member

@zhztheplayer zhztheplayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After CI passes

@zhztheplayer zhztheplayer changed the title [VL] add timing for velox shuffle reader [VL] Add timing for velox shuffle reader Sep 25, 2023
@zuochunwei zuochunwei merged commit 6115a38 into apache:main Sep 25, 2023
@GlutenPerfBot
Copy link
Copy Markdown
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3254_time.csv log/native_master_09_24_2023_e7152a05a_time.csv difference percentage
q1 43.56 43.08 -0.483 98.89%
q2 24.59 24.37 -0.222 99.10%
q3 38.30 36.77 -1.527 96.01%
q4 41.02 41.00 -0.020 99.95%
q5 67.98 70.15 2.173 103.20%
q6 6.25 5.11 -1.147 81.67%
q7 86.50 86.15 -0.349 99.60%
q8 80.44 78.81 -1.625 97.98%
q9 116.90 115.81 -1.087 99.07%
q10 47.54 46.15 -1.389 97.08%
q11 20.72 19.57 -1.146 94.47%
q12 26.15 25.39 -0.755 97.11%
q13 48.76 50.31 1.554 103.19%
q14 18.36 19.95 1.588 108.65%
q15 34.41 30.64 -3.765 89.06%
q16 15.76 16.03 0.269 101.71%
q17 120.04 120.09 0.050 100.04%
q18 160.83 164.22 3.390 102.11%
q19 12.13 12.51 0.381 103.14%
q20 27.15 30.33 3.183 111.73%
q21 233.77 238.20 4.436 101.90%
q22 15.80 15.33 -0.474 97.00%
total 1286.95 1289.98 3.035 100.24%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants