feat: add matrix type parameter and improve auto logic #1052

Merged
imatiach-msft merged 1 commit into microsoft:master from the ilmat/matrix-type branch
May 17, 2021


imatiach-msft
Contributor

  • Add a matrix type parameter that specifies whether the matrix is sparse, dense, or auto-inferred.
  • Allow the user to override the default auto logic and force a sparse or dense matrix to be constructed in LightGBM native code.
  • Improve the auto logic to use the first 10 sampled rows, instead of only the head row, when determining whether the constructed matrix should be sparse or dense.
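
To illustrate the behavior described above, here is a minimal sketch in plain Scala. It is not the code from this PR: the types `FeatureVector`, `DenseFeatures`, `SparseFeatures`, the object `MatrixTypeSketch`, and the function `useSparse` are hypothetical stand-ins, and the majority-vote rule over the sampled rows is an assumption, since the description only says that the first 10 sampled rows are inspected.

```scala
// Hypothetical stand-ins for dense and sparse feature vectors
// (the real implementation works with Spark ML vectors).
sealed trait FeatureVector
final case class DenseFeatures(values: Array[Double]) extends FeatureVector
final case class SparseFeatures(size: Int, indices: Array[Int], values: Array[Double])
  extends FeatureVector

object MatrixTypeSketch {
  // Number of rows inspected in "auto" mode, per the PR description.
  val SampleSize = 10

  /** Returns true when a sparse native matrix should be constructed.
    *
    * "sparse" and "dense" override the inference entirely; "auto" looks at
    * up to the first SampleSize rows of the iterator.
    */
  def useSparse(matrixType: String, rows: Iterator[FeatureVector]): Boolean =
    matrixType match {
      case "sparse" => true
      case "dense"  => false
      case "auto" =>
        val sample = rows.take(SampleSize).toSeq
        val sparseCount = sample.count(_.isInstanceOf[SparseFeatures])
        // Assumed rule: choose sparse only if most sampled rows are sparse.
        // An empty sample therefore falls back to dense.
        sparseCount * 2 > sample.size
      case other =>
        throw new IllegalArgumentException(s"Unknown matrix type: $other")
    }
}
```

Sampling several rows rather than only the head row makes the decision less sensitive to a single unrepresentative first row, and the explicit "sparse" and "dense" values let the user bypass the inference when it would guess wrong.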

@imatiach-msft
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@imatiach-msft imatiach-msft force-pushed the ilmat/matrix-type branch 2 times, most recently from eeab0b2 to 5beeffe Compare May 17, 2021 05:33

@codecov

codecov bot commented May 17, 2021

Codecov Report

Merging #1052 (912b6e8) into master (03b8b7d) will increase coverage by 0.02%.
The diff coverage is 95.45%.


@@            Coverage Diff             @@
##           master    #1052      +/-   ##
==========================================
+ Coverage   84.93%   84.96%   +0.02%     
==========================================
  Files         203      203              
  Lines        9677     9689      +12     
  Branches      588      558      -30     
==========================================
+ Hits         8219     8232      +13     
+ Misses       1458     1457       -1     

| Impacted Files | Coverage Δ |
|---|---|
| .../com/microsoft/ml/spark/lightgbm/TrainParams.scala | 100.00% <ø> (ø) |
| ...m/microsoft/ml/spark/lightgbm/LightGBMParams.scala | 90.24% <80.00%> (-0.22%) ⬇️ |
| ...com/microsoft/ml/spark/lightgbm/LightGBMBase.scala | 92.77% <100.00%> (+0.08%) ⬆️ |
| ...crosoft/ml/spark/lightgbm/LightGBMClassifier.scala | 91.01% <100.00%> (ø) |
| ...m/microsoft/ml/spark/lightgbm/LightGBMRanker.scala | 63.07% <100.00%> (ø) |
| ...icrosoft/ml/spark/lightgbm/LightGBMRegressor.scala | 72.22% <100.00%> (ø) |
| ...a/com/microsoft/ml/spark/lightgbm/TrainUtils.scala | 86.64% <100.00%> (+0.24%) ⬆️ |
| ...a/com/microsoft/ml/spark/io/http/HTTPClients.scala | 86.66% <0.00%> (+3.33%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 03b8b7d...912b6e8.

@@ -320,12 +320,15 @@ class VerifyLightGBMClassifier extends Benchmarks with EstimatorFuzzing[LightGBM
 test("Verify LightGBM Classifier with dart mode parameters") {
 // Assert the dart parameters work without failing and setting them to tuned values improves performance
 val Array(train, test) = pimaDF.randomSplit(Array(0.8, 0.2), seed)
-val scoredDF1 = baseModel.setBoostingType("dart").fit(train).transform(test)
+val scoredDF1 = baseModel.setBoostingType("dart").
Contributor Author

Note: this change only fixes a flaky test in the build and is unrelated to this PR.

@imatiach-msft imatiach-msft changed the title add matrix type parameter and improve auto logic feat: add matrix type parameter and improve auto logic May 17, 2021
@imatiach-msft imatiach-msft merged commit 12cea2d into microsoft:master May 17, 2021