fix: Fix validation data creation for useSingleDataset mode #1527
LGTM!
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
@@            Coverage Diff             @@
##           master    #1527      +/-   ##
==========================================
+ Coverage   82.84%   84.27%   +1.42%
==========================================
  Files         297      290       -7
  Lines       14942    14819     -123
  Branches      728      719       -9
==========================================
+ Hits        12379    12489     +110
+ Misses       2563     2330     -233
lightgbm/src/main/scala/com/microsoft/azure/synapse/ml/lightgbm/SharedState.scala
lightgbm/src/main/scala/com/microsoft/azure/synapse/ml/lightgbm/dataset/DatasetAggregator.scala
Summary
Fix validation Dataset creation in useSingleDataset mode. Because the code path is shared with the regular training Dataset, every partition tries to merge its data into the "single" executor Dataset. For validation data, however, there is only 1 array of data, so each merge adds the same array again and the data ends up duplicated. This causes 2 problems:
Tests
The existing validation Dataset tests still pass.
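
For reviewers who want a feel for the duplication described in the Summary, here is a minimal, self-contained sketch. It is a toy model only, not the SynapseML code: `SharedRows`, `mergeFromEveryPartition`, `naiveValidationRowCount`, and `guardedValidationRowCount` are hypothetical names invented for illustration, and the merge-once guard is just one possible way to avoid the duplication, not necessarily the approach taken in this PR.

```scala
// Toy model of the behaviour described in the Summary. All names here are
// hypothetical and do not come from the SynapseML code base.
object ValidationMergeSketch {
  // Stand-in for the per-executor "single" Dataset that partitions merge rows into.
  final class SharedRows {
    private val buffer = scala.collection.mutable.ArrayBuffer.empty[Array[Double]]
    def merge(rows: Seq[Array[Double]]): Unit = synchronized { buffer ++= rows }
    def rowCount: Int = synchronized { buffer.length }
  }

  // Training-style path: every partition merges its own, distinct rows.
  def mergeFromEveryPartition(partitions: Seq[Seq[Array[Double]]]): Int = {
    val shared = new SharedRows
    partitions.foreach(p => shared.merge(p))
    shared.rowCount
  }

  // Validation reusing the same path: every partition sees the same single
  // array, so each merge adds the same rows again.
  def naiveValidationRowCount(validation: Seq[Array[Double]], numPartitions: Int): Int =
    mergeFromEveryPartition(Seq.fill(numPartitions)(validation))

  // One possible guard (an assumption, not the PR's fix): only the first
  // partition contributes the validation rows.
  def guardedValidationRowCount(validation: Seq[Array[Double]], numPartitions: Int): Int = {
    val shared = new SharedRows
    (0 until numPartitions).foreach { i =>
      if (i == 0) shared.merge(validation)
    }
    shared.rowCount
  }
}
```

In this model, a 2-row validation array merged by 3 partitions yields 6 rows on the naive path and 2 rows with the guard, which is the kind of row-count check a regression test for this fix could assert.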
Dependency changes
If you needed to make any changes to dependencies of this project, please describe them here.
AB#1828018