[WIP] Add svdd implementation #3201

Open
sklef wants to merge 8 commits into scikit-learn:master from sklef:SVDD

4 participants

@sklef
sklef commented May 26, 2014

Added an implementation of support vector data description (SVDD), based on the one from http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/

@jnothman
scikit-learn member

Thanks! This looks interesting, but requires tests. I'm marking it as a WIP (work in progress) pending the addition of tests and documentation.

@jnothman jnothman changed the title from Add svdd implementation to [WIP] Add svdd implementation May 26, 2014
@coveralls

Coverage Status

Coverage decreased (-0.03%) when pulling 5dffb62 on sklef:SVDD into c1e5f94 on scikit-learn:master.

@jnothman
scikit-learn member

In terms of documentation, I assume this should be discussed under Outlier Detection, and be included in plot_outlier_detection_housing.

The class docstring should refer to the Tax and Duin paper.

@coveralls

Coverage Status

Coverage decreased (-0.03%) when pulling c517d18 on sklef:SVDD into c1e5f94 on scikit-learn:master.

@coveralls

Coverage Status

Coverage decreased (-0.03%) when pulling d885373 on sklef:SVDD into c1e5f94 on scikit-learn:master.

@coveralls

Coverage Status

Coverage increased (+0.1%) when pulling 78b854a on sklef:SVDD into c1e5f94 on scikit-learn:master.

@coveralls

Coverage Status

Coverage increased (+0.1%) when pulling a90ec2f on sklef:SVDD into c1e5f94 on scikit-learn:master.

@coveralls

Coverage Status

Coverage increased (+0.12%) when pulling fc4b417 on sklef:SVDD into c1e5f94 on scikit-learn:master.

@sklef
sklef commented Jul 7, 2014

I've added some tests and documentation. I hope that's OK. Please let me know what should be added or changed.

@GaelVaroquaux GaelVaroquaux commented on an outdated diff Jul 15, 2014
doc/modules/svm.rst
@@ -617,6 +617,38 @@ bound of the fraction of support vectors.
It can be shown that the `\nu`-SVC formulation is a reparametrization
of the `C`-SVC and therefore mathematically equivalent.
+SVDD
+----
+
+Given vectors :math:`x_1, \cdots, x_l`, :class:`SVDD` build the smallest sphere
+around them. Solvng the problem:
@GaelVaroquaux
scikit-learn member

Typo: solving

@GaelVaroquaux GaelVaroquaux commented on an outdated diff Jul 15, 2014
doc/modules/svm.rst
@@ -617,6 +617,38 @@ bound of the fraction of support vectors.
It can be shown that the `\nu`-SVC formulation is a reparametrization
of the `C`-SVC and therefore mathematically equivalent.
+SVDD
+----
+
+Given vectors :math:`x_1, \cdots, x_l`, :class:`SVDD` build the smallest sphere
+around them. Solvng the problem:
+
+.. math::
+
+ \min R^2 + C\sum_{i = 1}^l\xi_i
+
+ \textrm {subject to } & \|x_i - a\| \leq R^2 + \xi_i\\
+ & \xi_i \geq 0, i=1, ..., n
+
+This problem isnot convex, but it can be refolmulated as convex one:
@GaelVaroquaux
scikit-learn member

Typo: is not
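
For reference, the SVDD problem of Tax and Duin is usually written with the squared distance in the constraint and a single index range. The block below is a sketch in the notation of the quoted hunk (center `a`, radius `R`, slacks `\xi_i`, `l` training vectors). The dual variables `\alpha_i` and the kernel `K` are the standard ones from the literature and are an assumption here, since they do not appear in the quoted lines.

```latex
% Primal: smallest sphere (center a, radius R) with slack \xi_i per point
\min_{R,\, a,\, \xi} \; R^2 + C \sum_{i=1}^{l} \xi_i
\quad \text{subject to} \quad
\|x_i - a\|^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, l

% Dual: a convex quadratic program in \alpha
\max_{\alpha} \; \sum_{i=1}^{l} \alpha_i K(x_i, x_i)
  - \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j K(x_i, x_j)
\quad \text{subject to} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{l} \alpha_i = 1
```

With a linear kernel the center is recovered as `a = \sum_i \alpha_i x_i`, and only points with `\alpha_i > 0` (the support vectors) contribute to it.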

@GaelVaroquaux GaelVaroquaux commented on an outdated diff Jul 15, 2014
examples/applications/plot_outlier_detection_housing.py
@@ -19,7 +19,7 @@
able to focus on the main mode of the data distribution, it sticks to the
assumption that the data should be Gaussian distributed, yielding some biased
estimation of the data structure, but yet accurate to some extent.
-The One-Class SVM algorithm
+The One-Class SVM algorithm and Support Vector Data Discription
@GaelVaroquaux
scikit-learn member

I think that there is a typo here too.

@sklef sklef Fixed typos in documentation
b5e6bfc
@coveralls

Coverage Status

Coverage decreased (-0.14%) when pulling b5e6bfc on sklef:SVDD into b65e4c8 on scikit-learn:master.

@sklef
sklef commented Jul 15, 2014

Ok, typos are fixed. Anything else?

@sklef
sklef commented Sep 2, 2014

There are some conflicts with the new version. What is the right way to deal with them? Should I take the new version of sklearn and modify it, or is there a more elegant way?

@GaelVaroquaux GaelVaroquaux commented on the diff Sep 2, 2014
doc/modules/svm.rst
@@ -617,6 +617,38 @@ bound of the fraction of support vectors.
It can be shown that the `\nu`-SVC formulation is a reparametrization
of the `C`-SVC and therefore mathematically equivalent.
+SVDD
+----
+
+Given vectors :math:`x_1, \cdots, x_l`, :class:`SVDD` build the smallest sphere
@GaelVaroquaux
scikit-learn member
GaelVaroquaux added a note Sep 2, 2014

Less maths and more intuition please: why is this useful? Why is it more useful, or useful in a different way, than a standard SVM? What's the intuition behind it?

We also need a figure, generated from the example (don't check the generated figure into git; rely on the fact that it is generated automatically by 'make html').
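
One way to picture the intuition requested here: with a linear kernel and `C` large enough that no slack is used, SVDD reduces to finding the smallest ball that encloses the training points, and a new point is flagged as an outlier when it falls outside that ball. The sketch below is not the libsvm-based code in this PR. It approximates the hard-margin ball with the Badoiu-Clarkson iteration using only the standard library, and `enclosing_ball` and `is_outlier` are hypothetical helper names chosen for illustration.

```python
import math
import random


def enclosing_ball(points, n_iter=1000):
    """Approximate the smallest ball enclosing `points`.

    Badoiu-Clarkson iteration: repeatedly step toward the farthest
    point with a shrinking step size 1 / (k + 1).
    """
    center = list(points[0])
    for k in range(1, n_iter + 1):
        # Point currently farthest from the center estimate.
        farthest = max(points, key=lambda p: math.dist(p, center))
        center = [c + (f - c) / (k + 1) for c, f in zip(center, farthest)]
    # Radius that is guaranteed to cover every training point.
    radius = max(math.dist(p, center) for p in points)
    return center, radius


def is_outlier(x, center, radius):
    """A point is an outlier if it lies strictly outside the ball."""
    return math.dist(x, center) > radius


# Toy data: a 2-D Gaussian blob as the "normal" class.
random.seed(0)
train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
center, radius = enclosing_ball(train)

print("center:", center, "radius:", radius)
print("origin flagged:", is_outlier((0.0, 0.0), center, radius))
print("far point flagged:", is_outlier((10.0, 10.0), center, radius))
```

The soft-margin version in the PR differs in that the penalty `C` lets some training points sit outside the sphere, and a kernel replaces the Euclidean distance, so the boundary need not be a ball in input space.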

@GaelVaroquaux
scikit-learn member

To merge you can either do a 'git merge master' after updating master and then resolve the conflicts manually, or reapply the changes. Given that there are fairly beefy changes to the core libSVM organization, it might be worth reapplying the changes.
