Add multi class model and Platt scaling by bytesnake · Pull Request #112 · rust-ml/linfa

bytesnake · 2021-03-30T11:44:33Z

Composing model for binary to multi-class transformation, WIP

finish MultiClassModel
add Platt scaling for proper SVM's probability values
add example
improve documentation

Example

// we have to specify that we want to predict probabilities
// `Svm::<_, bool>` would also be possible and avoid Platt scaling.
let params = Svm::<_, Pr>::params()
    .gaussian_kernel(30.0);

let model = train.one_vs_all()?
    .into_iter()
    .map(|(l, x)| (l, params.fit(&x).unwrap()))
    .collect::<MultiClassModel<_, _>>();

// predict with validation dataset, the prediction has type `usize`
let pred = model.predict(&valid);

// create and print a confusion matrix against the validation targets
let cm = pred.confusion_matrix(&valid)?;
println!("{:?}", cm);
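
The selection step of a one-vs-all composition can be sketched in plain Rust, independent of linfa. This is a minimal illustration that assumes each binary model has already produced a probability for its class and that the composed prediction is the label with the highest probability. The function name `pick_label` and the `(label, probability)` pair layout are hypothetical and are not taken from `MultiClassModel`.

```rust
use std::cmp::Ordering;

/// Picks the class label with the highest probability among the
/// `(label, probability)` pairs produced by the per-class binary models.
/// Returns `None` when no scores are given.
/// NOTE: hypothetical helper for illustration, not part of linfa.
fn pick_label(scores: &[(usize, f64)]) -> Option<usize> {
    scores
        .iter()
        // Compare by probability; incomparable values (NaN) are treated as equal.
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal))
        .map(|&(label, _)| label)
}
```

For example, given the scores `[(3, 0.10), (5, 0.72), (7, 0.31)]`, the sketch returns `Some(5)`. Comparing raw SVM decision values across independently trained models is less meaningful than comparing probabilities, which is presumably why the example above requests `Pr` outputs.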

Sauro98

Good work, nice to have a unified multi-class model 🚀

I would just add a note about the difference in the SVM params with bool or Pr also in the parameters' documentation so that the user doesn't have to go look in the example.

quietlychris · 2021-04-15T01:43:16Z

I saw that you linked the paper at https://www.csie.ntu.edu.tw/~cjlin/papers/plattprob.pdf in your explanation. It's totally possible that I missed it, but I couldn't find the first half of the

    // avoid numerical problems for large f_apb
    if f_apb >= 0.0 {
        Pr((-f_apb).exp() / (1.0 + (-f_apb).exp()))
    } else {
        Pr(1.0 / (1.0 + f_apb.exp()))
    }

loop that you use for numerical stability with f_apb in the numerator in either that article or on the Wikipedia link. From first principles, it makes sense that it would work, but just wasn't sure if that was something you found in another paper that you might want to link as well. It's probably not necessary either way, but just got me curious. Otherwise, everything else looks great :)

bytesnake · 2021-04-20T14:54:15Z

> Good work, nice to have a unified multi-class model 🚀
>
> I would just add a note about the difference in the SVM params with bool or Pr also in the parameters' documentation so that the user doesn't have to go look in the example.

good point 👍 I'm not 100% sure why the compiler can't figure out when to use bool vs Pr though

> I saw that you linked the paper at https://www.csie.ntu.edu.tw/~cjlin/papers/plattprob.pdf in your explanation. It's totally possible that I missed it, but I couldn't find the first half of the
>
>     // avoid numerical problems for large f_apb
>     if f_apb >= 0.0 {
>         Pr((-f_apb).exp() / (1.0 + (-f_apb).exp()))
>     } else {
>         Pr(1.0 / (1.0 + f_apb.exp()))
>     }
>
> loop that you use for numerical stability with f_apb in the numerator in either that article or on the Wikipedia link. From first principles, it makes sense that it would work, but just wasn't sure if that was something you found in another paper that you might want to link as well. It's probably not necessary either way, but just got me curious. Otherwise, everything else looks great :)

Yes, you just have to multiply the numerator and the denominator of 1.0 / (1.0 + f_apb.exp()) by (-f_apb).exp(). I don't have a source right now, but it is probably similar to stable sigmoid implementations. Will find something :)
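
The two branches can be checked with a standalone sketch. It assumes the usual Platt form p = 1 / (1 + exp(A·f + B)), where `f_apb` stands for A·f + B. The function name `platt_probability` and the parameters `a` and `b` are hypothetical and need not match the PR's code. Each branch only calls `exp` on a non-positive argument, so the intermediate value never exceeds 1.

```rust
/// Maps a decision value to a probability, p = 1 / (1 + exp(a * decision + b)).
/// NOTE: hypothetical signature for illustration; `a` and `b` play the role
/// of the fitted Platt parameters.
fn platt_probability(decision: f64, a: f64, b: f64) -> f64 {
    let f_apb = decision * a + b;
    if f_apb >= 0.0 {
        // Multiply numerator and denominator of 1 / (1 + exp(f_apb)) by
        // exp(-f_apb); exp(-f_apb) is at most 1 in this branch.
        let e = (-f_apb).exp();
        e / (1.0 + e)
    } else {
        // exp(f_apb) is below 1 in this branch, so the direct form is used.
        1.0 / (1.0 + f_apb.exp())
    }
}
```

Both branches evaluate the same function, so `platt_probability(0.0, -1.0, 0.0)` returns `0.5` and the outputs for `f_apb` and `-f_apb` sum to 1 up to rounding.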

thank you both, I will complete the PR now and then merge

codecov-commenter · 2021-04-25T10:15:08Z

Codecov Report

Merging #112 (d4dd77f) into master (908efde) will decrease coverage by 0.17%.
The diff coverage is 55.41%.

❗ Current head d4dd77f differs from pull request most recent head 7a6cc20. Consider uploading reports for the commit 7a6cc20 to get more accurate results

@@            Coverage Diff             @@
##           master     #112      +/-   ##
==========================================
- Coverage   58.32%   58.15%   -0.18%     
==========================================
  Files          75       77       +2     
  Lines        6695     6813     +118     
==========================================
+ Hits         3905     3962      +57     
- Misses       2790     2851      +61

| Impacted Files | Coverage Δ | |
|---|---|---|
| algorithms/linfa-svm/src/lib.rs | `22.89% <0.00%> (-2.43%)` | ⬇️ |
| algorithms/linfa-svm/src/regression.rs | `75.80% <ø> (ø)` | |
| algorithms/linfa-svm/src/solver_smo.rs | `37.23% <ø> (ø)` | |
| src/composing/multi_target_model.rs | `73.91% <ø> (ø)` | |
| algorithms/linfa-svm/src/classification.rs | `79.68% <51.72%> (-5.17%)` | ⬇️ |
| src/composing/multi_class_model.rs | `52.77% <52.77%> (ø)` | |
| src/composing/platt_scaling.rs | `58.02% <58.02%> (ø)` | |
| src/dataset/impl_dataset.rs | `44.23% <100.00%> (+1.37%)` | ⬆️ |
| src/dataset/mod.rs | `86.20% <100.00%> (-2.10%)` | ⬇️ |
| ... and 13 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 908efde...7a6cc20.

bytesnake added 13 commits March 30, 2021 13:42

Add multi class model dummy file

d2bec8d

Merge branch 'master' into multi_class_model

0ef67fc

Add tests for multi class composing model

7f15a7c

Add composing models

1540616

Merge branch 'master' into multi_class_model

8df805e

Add platt solver with newton method

9cfb405

Add initial tests to newton solver

b709a98

Fix some errors in newton optimizer

8ef8ef9

Fix error in data generation, tests are working now

f38c8a6

Implement platt scaling for SV classification

f249cde

Add example for multi class SVM composition

df1a621

Improve documentation and run rustfmt

6fe1596

Merge branch 'master' into multi_class_model

82e4d9f

bytesnake marked this pull request as ready for review April 14, 2021 10:02

bytesnake changed the title ~~[WIP] Add multi class model and Platt scaling~~ Add multi class model and Platt scaling Apr 14, 2021

bytesnake requested review from Sauro98 and quietlychris April 14, 2021 10:02

Sauro98 approved these changes Apr 14, 2021

View reviewed changes

Comment thread algorithms/linfa-svm/src/classification.rs Outdated

Comment thread algorithms/linfa-svm/src/classification.rs Outdated

Comment thread algorithms/linfa-svm/examples/winequality_multi.rs

Comment thread algorithms/linfa-svm/examples/winequality_multi.rs Outdated

bytesnake added 2 commits April 14, 2021 16:45

Address review

b7599e4

Merge branch 'master' into multi_class_model

ed7e138

quietlychris approved these changes Apr 15, 2021

View reviewed changes

bytesnake added 3 commits April 25, 2021 11:59

Merge branch 'master' into multi_class_model

aa183fa

Merge branch 'multi_class_model' of github.com:bytesnake/linfa into m…

a090c78

…ulti_class_model

Make clippy happy

6a43837

bytesnake added 3 commits April 25, 2021 12:42

Remove use of reduce for 1.42.0 compability

d4dd77f

Run rustfmt

7a6cc20

Make clippy happy

46a5e83

bytesnake added 2 commits April 25, 2021 16:46

Add comments about types

0d2b672

Merge branch 'master' into multi_class_model

392de05

bytesnake force-pushed the multi_class_model branch from 629914a to b696a25 Compare April 25, 2021 16:37

Pin ndarray-csv to 0.5.0

53a173f

bytesnake force-pushed the multi_class_model branch from b696a25 to 53a173f Compare April 25, 2021 16:47

Pin ndarray-csv to 0.5.0 take two

0f0c4e4

bytesnake merged commit b7c31c5 into rust-ml:master Apr 25, 2021

Add multi class model and Platt scaling #112
bytesnake merged 25 commits into rust-ml:master from bytesnake:multi_class_model
