New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typed SDCA binary trainers #2506

Merged
merged 10 commits into from Feb 14, 2019

Conversation

Projects
None yet
4 participants
@wschin
Copy link
Member

wschin commented Feb 11, 2019

Make SdcaBinaryTrainer strongly-typed according to the type it produces. The existing SdcaBinaryTrainer can produce either calibrated linear model or linear model as mentioned in #2469. To fix #2469, we move common functionalities used in both cases to SdcaBinaryTrainerBase<T> where T is CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator for calibrated case and LinearBinaryModelParameters otherwise. Top-level APIs are changed accordingly. For dynamic/static APIs, we have 4 SDCA binary trainers for

  1. Calibrated linear model with simple arguments.
  2. Calibrated linear model with advanced options.
  3. Uncalibrated linear model with simple arguments.
  4. Uncalibrated linear model with advanced options.

In addition, because we don't like auto-calibration, the output schema in uncalibrated cases should not contain a probability column. This PR also fixes this in the two derived trainer classes' ComputeSdcaBinaryClassifierSchemaShape; the only case with a probability column generated is training logistic regression with Sdca. We also have two separated SdcaCalibratedBinaryTrainer.Options and SdcaBinaryTrainer.Options derived from BinaryArgumentBase. Legacy command-line tool and entry points are not affected by having LegacySdcaBinaryTrainer.

@wschin wschin changed the title Typed SDCA binary trainers [WIP] Typed SDCA binary trainers Feb 11, 2019

@wschin wschin force-pushed the wschin:typed-linear-trainers branch from 028d090 to 27536c3 Feb 11, 2019

@wschin wschin changed the title [WIP] Typed SDCA binary trainers Typed SDCA binary trainers Feb 12, 2019

@wschin wschin force-pushed the wschin:typed-linear-trainers branch 2 times, most recently from cc7766d to b6135a9 Feb 12, 2019

@codecov

This comment has been minimized.

Copy link

codecov bot commented Feb 12, 2019

Codecov Report

Merging #2506 into master will decrease coverage by 0.01%.
The diff coverage is 93%.

@@            Coverage Diff             @@
##           master    #2506      +/-   ##
==========================================
- Coverage   71.26%   71.24%   -0.02%     
==========================================
  Files         798      798              
  Lines      141231   141236       +5     
  Branches    16112    16108       -4     
==========================================
- Hits       100642   100627      -15     
- Misses      36119    36143      +24     
+ Partials     4470     4466       -4
Flag Coverage Δ
#Debug 71.24% <93%> (-0.02%) ⬇️
#production 67.57% <90.74%> (-0.02%) ⬇️
#test 85.35% <100%> (ø) ⬆️
#Resolved
@codecov

This comment has been minimized.

Copy link

codecov bot commented Feb 12, 2019

Codecov Report

❗️ No coverage uploaded for pull request base (master@fd30559). Click here to learn what that means.
The diff coverage is 93%.

@@            Coverage Diff            @@
##             master    #2506   +/-   ##
=========================================
  Coverage          ?   71.24%           
=========================================
  Files             ?      798           
  Lines             ?   141236           
  Branches          ?    16108           
=========================================
  Hits              ?   100627           
  Misses            ?    36143           
  Partials          ?     4466
Flag Coverage Δ
#Debug 71.24% <93%> (?)
#production 67.57% <90.74%> (?)
#test 85.35% <100%> (?)

@wschin wschin self-assigned this Feb 12, 2019

@wschin wschin requested review from abgoswam , TomFinley and sfilipi Feb 12, 2019

@wschin wschin added the api label Feb 12, 2019

wschin added some commits Feb 11, 2019

Typed SDCA binary trainers
Fix some tests failed because of duplicated load names
Sdca entry points also strongly-typed based on trained models.
1. Remove arguments to set up calibrator in all SDCA trainer
2. Clean up entry point base on argument's changes.
3. Update test files

@wschin wschin force-pushed the wschin:typed-linear-trainers branch from 407bdde to 267c387 Feb 14, 2019

wschin added some commits Feb 14, 2019


namespace Microsoft.ML.Samples.Dynamic
{
public class SDCASupportVectorMachine

This comment has been minimized.

@TomFinley

TomFinley Feb 14, 2019

Contributor

SDCASupportVectorMachine [](start = 17, length = 24)

This is a static class right? (Only if you happen to post another commit after this.)

This comment has been minimized.

@wschin

wschin Feb 14, 2019

Author Member

Hopefully #2548 can fix it.

@TomFinley
Copy link
Contributor

TomFinley left a comment

Looks good @wschin, this is definitely a strong step in the right direction. (And incidentally one of the last things I need to actually get rid of IPredictorProducing. 😄)

@wschin wschin merged commit 37acb6c into dotnet:master Feb 14, 2019

2 of 3 checks passed

MachineLearning-CodeCoverage #20190213.47 failed
Details
MachineLearning-CI #20190213.47 succeeded
Details
license/cla All CLA requirements met.
Details

@wschin wschin deleted the wschin:typed-linear-trainers branch Feb 14, 2019

@wschin

This comment has been minimized.

Copy link
Member Author

wschin commented Feb 14, 2019

Thank you @TomFinley.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment