Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fix multiclass training with initial scores #1526

Merged
merged 6 commits into from
Jun 14, 2022

Conversation

svotaw
Copy link
Collaborator

@svotaw svotaw commented Jun 12, 2022

Summary

The use of initial scores with multiclass training was not working. There was 1 test, but it was marked as "ignore" due to being flaky. In actuality, it only passed sometimes out of pure luck since there were several bugs.

  1. The test did not use the "multiclass" objective, so LightGBM was actually using binary under the covers
  2. We were only sending 1 column of initial score data for multiclass, so when marked as true multiclass, the test always failed due to LightGBM reporting the error.
  3. LightGBM represents multiclass initial scores in column format, and we were sending in row format. The data needed to be transposed to be correct.

Tests

I turned the test back on, although it is still flaky for some reason. Could use some advice on this. However, having traced through all the code underneath, we are now appropriately sending multiclass initial scores to LightGBM. One column is sent for every class, and it is transposed from row to column format.

Dependency changes

If you needed to make any changes to dependencies of this project, please describe them here.

AB#1827802

@svotaw svotaw changed the title Fix multiclass training with initial scores fix: Fix multiclass training with initial scores Jun 12, 2022
@svotaw
Copy link
Collaborator Author

svotaw commented Jun 12, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@codecov-commenter
Copy link

codecov-commenter commented Jun 12, 2022

Codecov Report

Merging #1526 (715a2bc) into master (0bd3e07) will increase coverage by 0.02%.
The diff coverage is 92.00%.

@@            Coverage Diff             @@
##           master    #1526      +/-   ##
==========================================
+ Coverage   84.36%   84.39%   +0.02%     
==========================================
  Files         297      297              
  Lines       14912    14942      +30     
  Branches      718      728      +10     
==========================================
+ Hits        12581    12610      +29     
- Misses       2331     2332       +1     
Impacted Files Coverage Δ
...oft/azure/synapse/ml/lightgbm/swig/SwigUtils.scala 84.90% <40.00%> (-4.68%) ⬇️
...ynapse/ml/lightgbm/dataset/DatasetAggregator.scala 96.18% <97.77%> (+0.45%) ⬆️
.../execution/streaming/continuous/HTTPSourceV2.scala 92.80% <0.00%> (+0.71%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0bd3e07...715a2bc. Read the comment docs.

imatiach-msft
imatiach-msft previously approved these changes Jun 13, 2022
@svotaw
Copy link
Collaborator Author

svotaw commented Jun 13, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@svotaw
Copy link
Collaborator Author

svotaw commented Jun 14, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@svotaw
Copy link
Collaborator Author

svotaw commented Jun 14, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@svotaw
Copy link
Collaborator Author

svotaw commented Jun 14, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@svotaw svotaw enabled auto-merge (squash) June 14, 2022 14:38
})
}

test("Verify chunked array transpose complex") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice tests!

@svotaw svotaw merged commit 0a6a728 into microsoft:master Jun 14, 2022
@svotaw svotaw deleted the multiclass-init-score branch June 14, 2022 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants