perf: reduce network communication overhead cost on reduce step for LightGBM learners #869

Merged


imatiach-msft
Contributor

Reduce network communication overhead in the reduce step for the LightGBM learners (LightGBMClassifier, LightGBMRegressor, and LightGBMRanker) by returning the model from only one node, chosen by the order in which the nodes were initialized. This is a small optimization that should slightly reduce overall training time and make the code more robust, since less data crosses the network during the reduce step.
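
To illustrate the idea, here is a minimal sketch in plain Python rather than the actual Scala/Spark code in `TrainUtils.scala`. It assumes every worker holds an identical model once distributed training finishes, and that the designated worker is simply the first one in initialization order. The names `train_partition`, `collect_model`, and `return_from_all` are hypothetical and used only for illustration.

```python
from typing import Iterator, List, Tuple


def train_partition(worker_id: int, main_worker_id: int, model: str,
                    return_from_all: bool) -> Iterator[str]:
    """Hypothetical stand-in for per-partition training.

    Yields the serialized model only if this worker is the designated one,
    or if the old behavior (every worker returns its copy) is requested.
    """
    if return_from_all or worker_id == main_worker_id:
        yield model


def collect_model(worker_ids: List[int], model: str,
                  return_from_all: bool) -> Tuple[str, int]:
    """Simulate the reduce step and count the bytes sent to the driver."""
    # Assumption: the designated worker is the first one initialized.
    main_worker_id = worker_ids[0]
    returned: List[str] = []
    for worker_id in worker_ids:
        returned.extend(
            train_partition(worker_id, main_worker_id, model, return_from_all))
    bytes_sent = sum(len(m.encode("utf-8")) for m in returned)
    # All returned copies are identical, so the reduce keeps any one of them.
    return returned[0], bytes_sent


workers = [3, 1, 2]              # worker ids in initialization order
model = "serialized-booster"     # placeholder for a serialized model string

old_model, old_bytes = collect_model(workers, model, return_from_all=True)
new_model, new_bytes = collect_model(workers, model, return_from_all=False)
```

In this simulation both variants yield the same model, but the second sends one copy to the driver instead of one per worker, so the saving grows with the number of workers and the model size.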

@imatiach-msft changed the title from "reduce network communication overhead cost on reduce step for LightGBM learners" to "perf: reduce network communication overhead cost on reduce step for LightGBM learners" on Jun 2, 2020
@imatiach-msft
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@codecov

codecov bot commented Jun 2, 2020

Codecov Report

Merging #869 into master will increase coverage by 2.91%.
The diff coverage is 89.47%.


@@            Coverage Diff             @@
##           master     #869      +/-   ##
==========================================
+ Coverage   82.35%   85.26%   +2.91%     
==========================================
  Files         189      189              
  Lines        8693     8708      +15     
  Branches      517      518       +1     
==========================================
+ Hits         7159     7425     +266     
+ Misses       1534     1283     -251     
Impacted Files Coverage Δ
...com/microsoft/ml/spark/lightgbm/LightGBMBase.scala 90.62% <ø> (ø)
...a/com/microsoft/ml/spark/lightgbm/TrainUtils.scala 87.36% <89.47%> (-0.04%) ⬇️
...icrosoft/ml/spark/downloader/ModelDownloader.scala 85.88% <0.00%> (-1.18%) ⬇️
.../microsoft/ml/spark/vw/VowpalWabbitRegressor.scala 70.00% <0.00%> (+10.00%) ⬆️
...a/com/microsoft/ml/spark/vw/VowpalWabbitBase.scala 89.04% <0.00%> (+15.06%) ⬆️
...ala/com/microsoft/ml/spark/image/UnrollImage.scala 77.65% <0.00%> (+17.02%) ⬆️
...icrosoft/ml/spark/lime/SuperpixelTransformer.scala 93.33% <0.00%> (+20.00%) ⬆️
...a/com/microsoft/ml/spark/vw/HasSumcollisions.scala 100.00% <0.00%> (+25.00%) ⬆️
...rosoft/ml/spark/core/schema/BinaryFileSchema.scala 100.00% <0.00%> (+25.00%) ⬆️
.../scala/com/microsoft/ml/spark/vw/VectorUtils.scala 96.42% <0.00%> (+28.57%) ⬆️
... and 13 more


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 840781a...a081ca5.

@mhamilton723 mhamilton723 merged commit 4ae0fe8 into microsoft:master Jun 3, 2020