feat: add chunk size parameter for copying java data to native #1041

imatiach-msft · 2021-05-03T05:42:31Z

Add a chunk size parameter for copying Java data to native memory.
Fix a minor memory leak.
Increase the default chunk size from 1000 to 10000. For most real scenarios this is probably still on the small side: the ideal value is the number of rows in the dataset, but we don't want to do a count beforehand since it is expensive.
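
To illustrate the trade-off behind the chunk size, here is a minimal standalone sketch. It is not the code in this PR: `ChunkedBuffer`, `ChunkDemo`, and their methods are hypothetical names, and a plain `Array[Double]` stands in for the native buffer. It shows how rows from an iterator of unknown length can be accumulated in fixed-size chunks, so that a larger chunk size means fewer allocations but potentially more unused slots in the last chunk.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper for illustration only, not the library's implementation.
// Accumulates values from a source of unknown length into fixed-size chunks.
final class ChunkedBuffer(chunkSize: Int) {
  require(chunkSize > 0, "chunkSize must be positive")

  private val chunks = ArrayBuffer.empty[Array[Double]]
  private var posInLast = chunkSize // forces an allocation on the first add
  private var count = 0L

  def add(value: Double): Unit = {
    if (posInLast == chunkSize) {
      // Current chunk is full (or none exists yet): allocate a new one.
      chunks += new Array[Double](chunkSize)
      posInLast = 0
    }
    chunks.last(posInLast) = value
    posInLast += 1
    count += 1
  }

  def size: Long = count
  def chunkCount: Int = chunks.length

  // Slots that were allocated but never filled.
  def wastedSlots: Long = chunkCount.toLong * chunkSize - count

  // Copy all chunks into one contiguous array (stand-in for the native buffer).
  def coalesce(): Array[Double] = {
    val out = new Array[Double](count.toInt)
    var offset = 0
    chunks.zipWithIndex.foreach { case (chunk, i) =>
      val n = if (i == chunks.length - 1) posInLast else chunkSize
      System.arraycopy(chunk, 0, out, offset, n)
      offset += n
    }
    out
  }
}

object ChunkDemo {
  // Fill a buffer from an iterator without counting the rows beforehand.
  def fill(rows: Iterator[Double], chunkSize: Int): ChunkedBuffer = {
    val buffer = new ChunkedBuffer(chunkSize)
    rows.foreach(buffer.add)
    buffer
  }
}
```

With 25000 rows, `ChunkDemo.fill(rows, 1000)` allocates 25 chunks and wastes no slots, while a chunk size of 10000 allocates only 3 chunks but leaves 5000 slots unused. Setting the chunk size to the exact row count would give a single allocation with no waste, which is why the row count is the ideal value when it is known.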

imatiach-msft · 2021-05-03T05:43:16Z

/azp run

azure-pipelines · 2021-05-03T05:43:25Z

Azure Pipelines successfully started running 1 pipeline(s).

codecov · 2021-05-03T05:49:10Z

Codecov Report

Merging #1041 (0f87a2c) into master (aad223e) will decrease coverage by 0.03%.
The diff coverage is 94.73%.

@@            Coverage Diff             @@
##           master    #1041      +/-   ##
==========================================
- Coverage   84.86%   84.82%   -0.04%     
==========================================
  Files         203      203              
  Lines        9640     9648       +8     
  Branches      559      548      -11     
==========================================
+ Hits         8181     8184       +3     
- Misses       1459     1464       +5

Impacted Files	Coverage Δ
.../com/microsoft/ml/spark/lightgbm/TrainParams.scala	`100.00% <ø> (ø)`
...m/microsoft/ml/spark/lightgbm/LightGBMParams.scala	`89.35% <80.00%> (-0.23%)`	⬇️
...crosoft/ml/spark/lightgbm/LightGBMClassifier.scala	`91.01% <100.00%> (+0.10%)`	⬆️
...m/microsoft/ml/spark/lightgbm/LightGBMRanker.scala	`63.07% <100.00%> (ø)`
...icrosoft/ml/spark/lightgbm/LightGBMRegressor.scala	`72.22% <100.00%> (ø)`
...om/microsoft/ml/spark/lightgbm/LightGBMUtils.scala	`89.58% <100.00%> (+0.22%)`	⬆️
...a/com/microsoft/ml/spark/lightgbm/TrainUtils.scala	`86.40% <100.00%> (ø)`
...a/com/microsoft/ml/spark/io/http/HTTPClients.scala	`73.33% <0.00%> (-10.00%)`	⬇️
...microsoft/ml/spark/cognitive/SpeechToTextSDK.scala	`90.62% <0.00%> (+0.78%)`	⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

imatiach-msft · 2021-05-06T05:19:18Z

/azp run

azure-pipelines · 2021-05-06T05:19:32Z

Azure Pipelines successfully started running 1 pipeline(s).

imatiach-msft force-pushed the ilmat/chunking-param branch from 2dde42d to ad10270 on May 3, 2021 05:42

imatiach-msft changed the title ~~add chunk size parameter for copying java data to native~~ feat: add chunk size parameter for copying java data to native May 3, 2021

feat: added chunk size parameter for copying java data to native

0f87a2c

imatiach-msft force-pushed the ilmat/chunking-param branch from ad10270 to 0f87a2c on May 6, 2021 05:19

mhamilton723 approved these changes May 6, 2021

imatiach-msft merged commit b7f29e8 into microsoft:master May 6, 2021
