Fix perf regression in ShuffleRows #5417

Merged Oct 6, 2020 (1 commit)

@eerhardt eerhardt commented Oct 5, 2020

RowShufflingTransformer uses ChannelReader incorrectly. It needs to block until items are available to read, and it was calling Thread.Sleep to wait so that it would not spin the current core. This caused a major perf regression.

The fix is to block synchronously in the correct way: call AsTask() on the ValueTask returned from the ChannelReader, then block on the resulting Task.
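
For readers unfamiliar with the pattern, here is a minimal sketch of the two approaches. It is not the actual diff from this PR: the class and method names are hypothetical, the 1 ms sleep is an assumed value, and using `ReadAsync` (rather than, say, `WaitToReadAsync`) is also an assumption. Converting with `AsTask()` first matters because a `ValueTask` that has not completed yet is not meant to be blocked on directly.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical illustration; names are not taken from RowShufflingTransformer.
public static class ChannelBlockingSketch
{
    // Polling approach: try to read, and sleep whenever nothing is available.
    // Each miss costs at least one sleep interval, which adds up quickly
    // when many items are read one at a time.
    public static int ReadBySleeping(ChannelReader<int> reader)
    {
        int item;
        while (!reader.TryRead(out item))
        {
            Thread.Sleep(1); // assumed delay, for illustration only
        }
        return item;
    }

    // Blocking approach: ask the reader for a ValueTask, convert it to a
    // Task with AsTask(), and block on that Task until an item arrives.
    public static int ReadByBlocking(ChannelReader<int> reader)
    {
        return reader.ReadAsync().AsTask().GetAwaiter().GetResult();
    }

    // Small demo: a producer writes one item after a short delay while the
    // caller blocks waiting for it.
    public static int Demo()
    {
        var channel = Channel.CreateUnbounded<int>();
        Task.Run(async () =>
        {
            await Task.Delay(50);
            channel.Writer.TryWrite(42);
        });
        return ReadByBlocking(channel.Reader);
    }
}
```

With the blocking version, the waiting thread is woken as soon as the producer writes, instead of at the next sleep boundary.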

Fix #5416

Results of the added benchmark:

| | Method | Mean | Error | StdDev | Extra Metric |
|---|---|---|---|---|---|
| master | ShuffleRows | 2.911 s | 0.0379 s | 0.0354 s | - |
| PR | ShuffleRows | 2.736 ms | 0.0530 ms | 0.0470 ms | - |
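
Note the unit change between the two rows (seconds versus milliseconds). As a rough ratio of the two reported means, not an additional measurement:

$$\frac{2.911\ \text{s}}{2.736\ \text{ms}} = \frac{2911\ \text{ms}}{2.736\ \text{ms}} \approx 1064$$

So the mean time for this benchmark drops by roughly three orders of magnitude.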

cc @jwood803 @ogo-adp @stephentoub

@eerhardt eerhardt requested a review from a team as a code owner October 5, 2020 17:35
codecov bot commented Oct 5, 2020

Codecov Report

Merging #5417 into master will decrease coverage by 0.07%.
The diff coverage is 32.00%.

@@            Coverage Diff             @@
##           master    #5417      +/-   ##
==========================================
- Coverage   74.09%   74.01%   -0.08%     
==========================================
  Files        1019     1020       +1     
  Lines      190363   190375      +12     
  Branches    20469    20471       +2     
==========================================
- Hits       141047   140913     -134     
- Misses      43788    43916     +128     
- Partials     5528     5546      +18     
| Flag | Coverage Δ | |
|---|---|---|
| #Debug | 74.01% <32.00%> (-0.08%) | ⬇️ |
| #production | 69.77% <80.00%> (-0.10%) | ⬇️ |
| #test | 87.69% <0.00%> (-0.03%) | ⬇️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Microsoft.ML.Data/Prediction/Calibrator.cs | 81.29% <ø> (+0.08%) | ⬆️ |
| test/Microsoft.ML.Benchmarks/ShuffleRowsBench.cs | 0.00% <0.00%> (ø) | |
| ...soft.ML.Data/Transforms/RowShufflingTransformer.cs | 73.81% <80.00%> (-0.25%) | ⬇️ |
| ...osoft.ML.KMeansClustering/KMeansPlusPlusTrainer.cs | 83.60% <0.00%> (-7.27%) | ⬇️ |
| src/Microsoft.ML.FastTree/Training/StepSearch.cs | 57.42% <0.00%> (-4.96%) | ⬇️ |
| src/Microsoft.ML.Data/Training/TrainerUtils.cs | 66.86% <0.00%> (-3.82%) | ⬇️ |
| ...crosoft.ML.StandardTrainers/Standard/SdcaBinary.cs | 85.23% <0.00%> (-3.33%) | ⬇️ |
| src/Microsoft.ML.Sweeper/AsyncSweeper.cs | 71.42% <0.00%> (-1.37%) | ⬇️ |
| ...crosoft.ML.StandardTrainers/Optimizer/Optimizer.cs | 71.96% <0.00%> (-1.16%) | ⬇️ |
| ...oft.ML.StandardTrainers/Standard/SdcaMulticlass.cs | 91.46% <0.00%> (-1.03%) | ⬇️ |
| ... and 5 more | | |

@frank-dong-ms-zz frank-dong-ms-zz left a comment

:shipit:

@harishsk harishsk merged commit 35d5a47 into dotnet:master Oct 6, 2020
harishsk commented Oct 6, 2020

@eerhardt Thanks for helping with this issue.

@eerhardt eerhardt deleted the Fix5416 branch October 6, 2020 12:02
frank-dong-ms-zz added a commit that referenced this pull request Oct 8, 2020
* Update to Onnxruntime 1.5.1 (#5406)

* Added variables to tests to control Gpu settings

* Added dependency to prerelease

* Updated to 1.5.1

* Remove prerelease feed

* Nit on GPU variables

* Change the _maxCalibrationExamples default on CalibratorUtils (#5415)

* Change the _maxCalibrationExamples default

* Improving comments

* Fix perf regression in ShuffleRows (#5417)

Co-authored-by: Antonio Velázquez <38739674+antoniovs1029@users.noreply.github.com>
Co-authored-by: Eric Erhardt <eric.erhardt@microsoft.com>
mstfbl pushed a commit to mstfbl/machinelearning that referenced this pull request Nov 11, 2020
mstfbl pushed a commit that referenced this pull request Nov 11, 2020
mstfbl pushed a commit to mstfbl/machinelearning that referenced this pull request Nov 12, 2020
mstfbl pushed a commit that referenced this pull request Nov 12, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 17, 2022
Successfully merging this pull request may close this issue: ShuffleRows is broken (again) in 1.5.2
4 participants