Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow kernel to be callable, fix some sloppy implementation of the MLJ interface; add docstrings #13

Merged
merged 19 commits into from
Mar 31, 2022

Conversation

ablaom
Copy link
Member

@ablaom ablaom commented Mar 17, 2022

In this PR we:

  • update the compatibility requirement of LIBSVM.jl to the latest version

  • (enhancement) Allow the kernel hyper-parameter to be a function or other callable

  • (breaking) Overload fitted_params of all models so as to be more user-friendly

  • (breaking) Remove a number of superfluous hyper-parameters (hyper-parameters whose values are ignored internally):

    • p from LinearSVC (alias for ϵ in the cited paper) which is only defined for regression models
    • cost from NuSVC (alias for C in cited paper) which is redundant as nu is a re-parameterization of it
    • cost from OneClassSVM (again, nu is used instead)
  • (breaking) Re-implement OneClassSVM as a UnsupervisedDetector satisfying MLJ's API for such models. In particular, transform(mach, Xnew) now returns raw outlier scores. To extract normalised probabilities or for threshold outlier classification, refer to the new document string.

  • (breaking) Re-implement SVC and LinearSVC so that optional class weight dictionaries are specified along with the training data, and fix bugs in how these were interpreted internally. Remove weight support for NuSVC as LIBSVM.jl does not appear to support it.

  • (enhancement) Add MLJ-compliant document strings to all models, with detailed examples

My apologies for combining so much in a single PR. The original intention was just to update document strings. The issues were revealed only progressively as I worked through these.

@codecov-commenter
Copy link

codecov-commenter commented Mar 17, 2022

Codecov Report

Merging #13 (7b17864) into master (8a3693d) will increase coverage by 2.67%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #13      +/-   ##
==========================================
+ Coverage   88.23%   90.90%   +2.67%     
==========================================
  Files           1        1              
  Lines         153      176      +23     
==========================================
+ Hits          135      160      +25     
+ Misses         18       16       -2     
Impacted Files Coverage Δ
src/MLJLIBSVMInterface.jl 90.90% <100.00%> (+2.67%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8a3693d...7b17864. Read the comment docs.

@ablaom ablaom marked this pull request as draft March 18, 2022 03:24
@ablaom ablaom changed the title Allow kernel to be callable Allow kernel to be callable and fix some sloppy implementation of the MLJ interface; add docstrings Mar 25, 2022
@ablaom ablaom changed the title Allow kernel to be callable and fix some sloppy implementation of the MLJ interface; add docstrings Allow kernel to be callable, fix some sloppy implementation of the MLJ interface; add docstrings Mar 25, 2022
@ablaom ablaom marked this pull request as ready for review March 28, 2022 02:07
@ablaom ablaom marked this pull request as draft March 28, 2022 02:14
@ablaom ablaom marked this pull request as ready for review March 28, 2022 03:11
@ablaom
Copy link
Member Author

ablaom commented Mar 28, 2022

Erstwhile maintainers of LIBSVM.jl: @iblis17 @barucden @till-m @mpastell.

This PR adds detailed document strings to the models constituting an MLJ interface to LIBSVM.jl . I wonder if any of you have time in the next week or so to look over these strings, which appear at the end of the file MLJLIBSVMInterface.jl . Familiarity with MLJ would be helpful, but hardly essential. I'm mainly interested in low-level editorial comments and that the descriptions of model hyper-parameters (taken mostly from scikit-learn) are correct. In return, I would be happy to update the LIBSVM.jl readme with guidance on using the MLJ interface. The MLJ doc-string standard is here.

@ablaom
Copy link
Member Author

ablaom commented Mar 28, 2022

@OkonSamuel Perhaps, some time over the next week you can look over the code (don't worry about the doc-strings).

Copy link

@barucden barucden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides that one missing parenthesis, I couldn't spot any apparent problem.

Consider adding a note regarding the serialization of the trained model when a user-defined kernel was used (see, e.g., my comment that shows the limitation). I believe this has not been targeted in LIBSVM.jl yet, so if it's not already covered in MLJ on the general level somehow, I think there should be a note for it.

src/MLJLIBSVMInterface.jl Show resolved Hide resolved
Copy link

@till-m till-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't really find anything noteworthy either. Comments are just small things that caught my eye.

I also noticed that attempting to use Kernel.Precomputed now throws an error. Is there a particular reason for not supporting pre-computed kernels?

src/MLJLIBSVMInterface.jl Outdated Show resolved Hide resolved
src/MLJLIBSVMInterface.jl Show resolved Hide resolved
src/MLJLIBSVMInterface.jl Outdated Show resolved Hide resolved
@ablaom
Copy link
Member Author

ablaom commented Mar 31, 2022

@barucden says:

Consider adding a note regarding the serialization of the trained model when a user-defined kernel was used

Done, thanks!

@ablaom
Copy link
Member Author

ablaom commented Mar 31, 2022

@till-m says:

I also noticed that attempting to use Kernel.Precomputed now throws an error. Is there a particular reason for not supporting pre-computed kernels?

No, it is unlikely MLJ will support this soon as it requires adjustments to the API that would be special to SVM (at present).

@ablaom ablaom merged commit 5ed884a into master Mar 31, 2022
@ablaom
Copy link
Member Author

ablaom commented Mar 31, 2022

@till-m @barucden Thanks for your prompt review of the doc-strings!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants