ENH: Add LinearSVR #1867

Closed

Conversation

@luoq

luoq commented Apr 18, 2013

Fix #1473

sklearn/svm/classes.py
@@ -385,6 +385,104 @@ def __init__(self, nu=0.5, kernel='rbf', degree=3, gamma=0.0,
'nu_svc', kernel, degree, gamma, coef0, tol, 0., nu, 0., shrinking,
probability, cache_size, None, verbose, max_iter)
+
+class LinearSVR(BaseLibLinear,LinearRegressorMixin,SelectorMixin,SparseCoefMixin):

@jnothman

jnothman Apr 18, 2013

Member

Style: Insert spaces between all of the parent classes.

sklearn/svm/classes.py
+ def __init__(self, C=1.0, loss="l2", penalty="l2", epsilon=0.1, dual=True, tol=1e-1,
+ fit_intercept=True, intercept_scaling=1,
+ verbose=0, random_state=None):
+ super(LinearSVR,self).__init__(

@jnothman

jnothman Apr 18, 2013

Member

Style: Insert a space before self

sklearn/svm/base.py
@@ -680,7 +692,7 @@ def fit(self, X, y):
# LibLinear wants targets as doubles, even for classification
y = np.asarray(y, dtype=np.float64).ravel()
- self.raw_coef_ = train(X, y, self._get_solver_type(), self.tol,
+ self.raw_coef_ = train(X, y, self._get_solver_type(), self.tol, self.epsilon,

@jnothman

jnothman Apr 18, 2013

Member

Is it just me that finds it strange that tol and epsilon here become eps and p in the Cython and C code without any comment?

@amueller

amueller Apr 20, 2013

Member

Unfortunately, naming between the Cython and Python code, and between the dense and sparse versions, is inconsistent :-/ I fix what I see.

sklearn/svm/tests/test_svm.py
+ for clf in (svm.LinearSVR(C=1.0,epsilon=0.1),
+ svm.LinearSVR(C=10.,epsilon=0.1),
+ svm.LinearSVR(loss="l1",C=10.,epsilon=0.1),
+ svm.LinearSVR(loss="l1",C=100.,epsilon=0.1),

@jnothman

jnothman Apr 18, 2013

Member

Is it worth varying epsilon to ensure it is transmitted to liblinear?
Also, is it appropriate to check that the solutions/parameters found are close to those produced by SVR with a linear kernel? (or is that not guaranteed? or is that liblinear's problem, not ours?)
(There should also be spaces after commas here.)
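
A minimal sketch of the kind of epsilon-variation check being suggested, assuming the LinearSVR(C, epsilon) API proposed in this PR (the particular epsilon values are placeholders):

import numpy as np
from sklearn import datasets, svm

diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

# If epsilon actually reaches liblinear, widening the insensitive tube
# should change the fitted coefficients.
coef_small = svm.LinearSVR(C=1.0, epsilon=0.1).fit(X, y).coef_
coef_large = svm.LinearSVR(C=1.0, epsilon=100.0).fit(X, y).coef_
assert not np.allclose(coef_small, coef_large)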

@luoq

luoq Apr 18, 2013

I am not quite sure what to test here; I just tweaked the test for SVR.

The results of SVR with a linear kernel and LinearSVR are different: there is randomness in the liblinear solution, whereas I guess SVR is deterministic.

I have checked that the results are the same as those of the original liblinear for some examples. Maybe I should check this in the test.


@jnothman

jnothman Apr 18, 2013

Member

I don't know whether such a test is relevant. But if it were relevant, it should be done, so I thought it worth suggesting.


@amueller

amueller Apr 20, 2013

Member

Are loss and penalty the same? I would have guessed that there is an l2 slack penalty in LinearSVR and an l1 slack penalty in SVR.

Maybe a quick sanity check of the results would be nice, but we can't really expect the results to be the same as SVR.

+ """
+ X = atleast2d_or_csr(X)
+ scores = safe_sparse_dot(X, self.coef_.T, dense_output=True) + self.intercept_
+ return scores.ravel()

@jnothman

jnothman Apr 18, 2013

Member

The only differences between this class and LinearModel seem to be the ravel and constraining the X validation a bit more. I don't understand why this case needs to be distinguished, what cases require ravel, or why this class cannot merely wrap LinearModel to make the minor differences more apparent. Or is it more to do with this being a Mixin and that an ABC?

@luoq

luoq Apr 18, 2013

In fact I am not clear about the difference between a mixin and an ABC. I mimicked LogisticRegression because both are based on liblinear; I want LinearRegressorMixin to act as a mixin, like LinearClassifierMixin does for LogisticRegression.

The predict method of LinearModel would work, but _center_data and _set_intercept seem useless here.

scores.ravel() is not necessary here; I will remove it.


@luoq

luoq Apr 18, 2013

Maybe LinearModel should inherit from LinearRegressorMixin.

@jnothman

jnothman Apr 18, 2013

Member

We'll probably need the opinion of someone more knowledgeable about the inheritance hierarchy...


@jnothman

jnothman Apr 18, 2013

Member

@larsmans, what's the go on differentiating ABCs and Mixins?

@jnothman

Member

jnothman commented Apr 18, 2013

Note you're failing a Travis test in the doctest for sklearn.metrics.metrics.hinge_loss, which doesn't expect the epsilon parameter to be printed; perhaps using # doctest: +ELLIPSIS there is advisable.

@luoq

luoq commented Apr 18, 2013

+ELLIPSIS cannot remove the error. I will change the docstring to include epsilon

@jnothman

Member

jnothman commented Apr 18, 2013

Well, the other way to remove it is to use >>> est = est.fit(X, y) instead of >>> est.fit(X, y). This will silence the output.
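
For reference, a minimal illustration of that trick (the estimator and toy data below are placeholders, not the hinge_loss doctest itself):

>>> from sklearn.svm import LinearSVC
>>> X, y = [[0.0], [1.0]], [0, 1]
>>> est = LinearSVC().fit(X, y)  # assigning the result suppresses the printed repr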


sklearn/metrics/metrics.py
- random_state=0, tol=0.0001, verbose=0)
+ LinearSVC(C=1.0, class_weight=None, dual=True, epsilon=0.1,
+ fit_intercept=True, intercept_scaling=1, loss='l2', multi_class='ovr',
+ penalty='l2', random_state=0, svr=False, tol=0.0001, verbose=0)

@larsmans

larsmans Apr 18, 2013

Member

It doesn't make sense to have a public svr attribute on a LinearSVC.

@amueller

amueller Apr 20, 2013

Member

+1

sklearn/svm/base.py
- if len(self.classes_) < 2:
- raise ValueError("The number of classes has to be greater than"
- " one.")
+ if not self.svr:

@larsmans

larsmans Apr 18, 2013

Member

This violates OO design principles. SVR logic should go in the LinearSVR class, classifier logic in LinearSVC. (Use the template method pattern.)
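
To make that concrete, a rough sketch of the template-method split being suggested (the class names, hooks and placeholder solver strings below are illustrative, not the PR's code):

import numpy as np

class BaseLibLinearSketch(object):
    def fit(self, X, y):
        # shared logic: convert the data, pick a solver, call liblinear
        X = np.asarray(X, dtype=np.float64)
        y = np.asarray(y, dtype=np.float64).ravel()
        self._check_targets(y)                     # hook: subclass-specific validation
        self.solver_type_ = self._solver_type()    # hook: subclass-specific solver choice
        # ... the real class would call liblinear's train() here ...
        return self

    def _check_targets(self, y):
        raise NotImplementedError

    def _solver_type(self):
        raise NotImplementedError

class LinearSVCSketch(BaseLibLinearSketch):
    def _check_targets(self, y):
        if np.unique(y).shape[0] < 2:
            raise ValueError("The number of classes has to be greater than one.")

    def _solver_type(self):
        return "a classification solver"   # placeholder

class LinearSVRSketch(BaseLibLinearSketch):
    def _check_targets(self, y):
        pass                               # no class-count check for regression

    def _solver_type(self):
        return "a regression solver"       # placeholder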

@jnothman

jnothman Apr 18, 2013

Member

linear_model.LogisticRegression also inherits from svm.BaseLibLinear so perhaps you're actually asking for the logic to be split between LinearSVR and a hypothetical svm.BaseLibLinearClassifier...

@larsmans

larsmans Apr 18, 2013

Member

Good point. I'd call it BaseLibLinearClf for short, but that is the way to go.

@luoq

luoq Apr 18, 2013

I don't think this is necessary. Currently BaseLibLinear handles both regression and classification, like the liblinear C library; we can view it as a direct Python representation of the liblinear library. LogisticRegression, LinearSVC and LinearSVR just inherit from it and hide some parameters.

@jnothman

jnothman Apr 18, 2013

Member

Maybe this is why @larsmans singled out the data validation: it's not part of LibLinear.

Even so, the current design is unfortunate in storing attributes that are irrelevant (epsilon, class_weight, multi_class, perhaps loss). One solution to this -- cutting back BaseLibLinear's __init__ parameters to precisely the base parameters -- might involve the call to train looking more like:

train(X, y, eps=self.tol, bias=self._get_bias(), C=self.C,
        random_seed=rnd.randint(np.iinfo('i').max),
        **self._get_train_kwargs())

but I'm not convinced that's especially neat, and the real mess is in _get_solver_type and its shared error handling.

@jnothman

jnothman Apr 18, 2013

Member

In any case, self.class_weight_ should not be set on LinearSVR.

@larsmans

larsmans Apr 18, 2013

Member

We shouldn't simulate the LibLinear authors' architectural mistakes in our Python code. I know the LibSVM wrappers do that, but only because I never got round to refactoring that out.

@luoq

luoq Apr 19, 2013

Is it not acceptable to set an attribute that does not correspond to an __init__ argument? I'm not sure of this from reading the coding guidelines. It looks like setting a redundant attribute (e.g. solver_type in BaseLibLinear) or an unused one (class_weight_ in LinearSVR, epsilon in LinearSVC) will not cause trouble for the methods of BaseEstimator.

@amueller

amueller Apr 20, 2013

Member

Parameters have to be init parameters for set_params and get_params to work.
I am not sure what you are trying to say, though.
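
For context, BaseEstimator.get_params and set_params introspect the __init__ signature, which is why every tunable parameter must be an __init__ argument stored under the same name. A tiny illustration (the Toy class is made up):

from sklearn.base import BaseEstimator

class Toy(BaseEstimator):
    def __init__(self, C=1.0):
        self.C = C               # stored under the same name as the argument

est = Toy(C=10.0)
print(est.get_params())          # {'C': 10.0} -- only __init__ arguments are reported
est.set_params(C=0.5)            # grid search relies on this round trip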

@luoq

luoq commented Apr 18, 2013

LogisticRegression, LinearSVC and LinearSVR are the top-level classes. The parameters loss, epsilon and class_weight are each unused in at least one class:

LogisticRegression : -loss -epsilon
LinearSVC : -epsilon
LinearSVR : -class_weight

sklearn/svm/base.py
@@ -612,8 +617,12 @@ def _get_solver_type(self):
if self.multi_class != 'ovr':
raise ValueError("`multi_class` must be one of `ovr`, "
"`crammer_singer`")
- solver_type = "P%s_L%s_D%d" % (
- self.penalty.upper(), self.loss.upper(), int(self.dual))
+ if self.svr:

@larsmans

larsmans Apr 18, 2013

Member

Same remark as below. This should be handled in subclasses; a base class should not be aware of this kind of details.

@luoq

luoq Apr 19, 2013

The name BaseLibLinear is misleading because currently it's an estimator in itself. In fact, we could expose it at the top level like LinearSVR. Maybe the name LibLinear is more appropriate.

@luoq

luoq Apr 19, 2013

If we just want BaseLibLinear to be a base class, we need to move _get_solver_type and the data-validation code to the subclasses and leave only fit(). We also need to handle epsilon and class_weight specially in the different subclasses. Maybe we can build a dict containing all the parameters in the subclass's fit() and pass X, y, solver_type and that dict to fit() in BaseLibLinear.

@jnothman

jnothman Apr 19, 2013

Member

Something that might help this is moving the default train_wrap argument values (for parameters that not all solvers need to set) into the pyx.

sklearn/svm/base.py
+ self.penalty.upper(), self.loss.upper(), int(self.dual))
+ else:
+ solver_type = "P%s_L%s_D%d" % (
+ self.penalty.upper(), self.loss.upper(), int(self.dual))
if not solver_type in self._solver_type_dict:

@luoq

luoq Apr 19, 2013

Is it OK to add an attribute self.solver_type_ here?

@jnothman

jnothman Apr 19, 2013

Member

Yes, perhaps it's a useful attribute for users who might want a reference to the liblinear API...? In any case, I would think it's better to add it within fit().

@luoq

luoq Apr 19, 2013

solver_type is determined by loss, penalty, dual and multi_class, and is independent of the data. Besides, we want to check at __init__ time whether the solver is implemented in liblinear.

@jnothman

jnothman Apr 19, 2013

Member

Even if so, you still need to check if it is implemented in fit: sklearn assumes parameters can change between construction and fit.

@luoq

luoq Apr 19, 2013

Is this done in BaseEstimator.set_params? If so, setting solver_type in __init__ is inappropriate because set_params will not update it. Checking solver_type in __init__ is also inappropriate because no check is done when parameters change. Maybe we should only check the solver type in fit.

@larsmans

larsmans Apr 19, 2013

Member

Yes, validation should be done in fit (sorry, hadn't seen this yet).

@luoq

luoq commented Apr 19, 2013

@jnothman How would I incorporate your Pull Request #1873? I'm not clear about the workflow.

@jnothman

Member

jnothman commented Apr 19, 2013

Well, I'm not certain it'll get accepted. It may be best just to ignore it for the moment, and when it (or this PR) gets accepted, the later-accepted change will have to run git rebase master and sort out any conflicts introduced in the meantime.

@larsmans

Member

larsmans commented Apr 19, 2013

@luoq Do you keep force-pushing commits? I get remarks by email but I can't find them in the GitHub web ui.

@luoq

luoq commented Apr 19, 2013

No, I haven't used push -f.

@larsmans

Member

larsmans commented Apr 19, 2013

Ah, I see now, the comments were on the complete diff.

@amueller

Member

amueller commented Apr 20, 2013

@luoq yes, changing the tests for input validation seems the right thing to do. The reason is that when using grid search, the parameter is not passed through __init__.
I'm not sure what GitHub messed up this time, but I can't see any comments on the diff :-/

@luoq

luoq commented Apr 20, 2013

@amueller From the e-mail notifications, you have posted 5 comments. Only one is displayed in the GitHub web UI; the others seem to refer to an old commit that has since been updated.

penalty and loss are different. From the lasso example we have an l2 loss and an l1 penalty; penalty means regularization here.

sklearn/svm/base.py
- " one.")
-
- X = atleast2d_or_csr(X, dtype=np.float64, order="C")
+ X, y, class_weight_ = self._transform_data(X, y)

@jnothman

jnothman Apr 20, 2013

Member

This is relevant to the classifier subclasses only, and class_weight doesn't need an underscore on the end. The underscore on class attributes is used to indicate output from the fit process (and thus, you should still be setting self.class_weight_ for users to access).

sklearn/svm/base.py
self.raw_coef_ = train(X, y, self._get_solver_type(), self.tol,
+ # LinearSVC, LogisticRegression have no epsilon attribute
+ self.epsilon if hasattr(self, 'epsilon') else 0.1,

@jnothman

jnothman Apr 20, 2013

Member

You can use getattr(self, 'epsilon', 0.1).

sklearn/svm/base.py
+
+class LibLinearRegressorMixin(LibLinearMixin):
+ def _transform_data(self, X, y):
+ X = atleast2d_or_csr(X, dtype=np.float64, order="C")

@jnothman

jnothman Apr 20, 2013

Member

This is also validation. Why not just one _validate_data if this is the way to go about it?

Personally, I think this is not a good use of the "template method" pattern (@larsmans?): the subclasses should just pass their superclass X and y validated for their purposes.

- fit_intercept=fit_intercept, intercept_scaling=intercept_scaling,
- class_weight=class_weight, random_state=None)
+ self.penalty = penalty
+ self.dual = dual

@jnothman

jnothman Apr 20, 2013

Member

I'm not sure what I think of this. It's a bit too much the opposite of what you first proposed: all the liblinear estimators have parameters tol, C, fit_intercept, intercept_scaling, random_state. This implementation would suggest they share nothing, and then when you come to LibLinearMixin.fit it assumes the presence of all of these attributes, but you have to look in a completely different place to see that they were set. I think that's also an indicator of bad OOP design.

But I think @larsmans should comment on design patterns.

@luoq

luoq Apr 21, 2013

LibLinearMixin is used just to reuse the code of the fit method for LinearSVR, LinearSVC and LogisticRegression. We could set the common attributes in LibLinearMixin and then add the other attributes in the subclasses, but I think that's quite unclear, because we would need to work out manually what is common and what needs to be added.

sklearn/svm/base.py
print('[LibLinear]', end='')
- # LibLinear wants targets as doubles, even for classification
- y = np.asarray(y, dtype=np.float64).ravel()
+ # epsilon, tol are named p, eps respectively in struct parameter of C source code
self.raw_coef_ = train(X, y, self._get_solver_type(), self.tol,

@jnothman

jnothman Apr 20, 2013

Member

I think using something like self._train_kwargs() to get a dict of the estimator-specific arguments would be neater; you would then put all the default values in the pyx functions.
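
A rough sketch of what that hook could look like (the class and method names are hypothetical, not the PR's code; liblinear's p/eps naming is taken from the discussion above):

class _LibLinearParamsSketch(object):
    def _common_kwargs(self):
        # arguments shared by every liblinear-backed estimator
        return {'eps': self.tol, 'C': self.C}

    def _train_kwargs(self):
        # estimator-specific extras; default values would live in the pyx
        return {}

    def _all_train_kwargs(self):
        kwargs = self._common_kwargs()
        kwargs.update(self._train_kwargs())
        return kwargs

class _LinearSVRParamsSketch(_LibLinearParamsSketch):
    def __init__(self, C=1.0, tol=1e-4, epsilon=0.1):
        self.C, self.tol, self.epsilon = C, tol, epsilon

    def _train_kwargs(self):
        return {'p': self.epsilon}   # liblinear calls the SVR tube width "p"

# _LinearSVRParamsSketch()._all_train_kwargs() -> {'eps': 0.0001, 'C': 1.0, 'p': 0.1}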

@luoq

luoq Apr 21, 2013

If we return all the liblinear parameters in _train_kwargs, which would be neat, then there will be a lot of duplicated code in the subclasses, e.g. _get_bias gets called in every subclass.

We could also return (or not return) only the parameters that need special treatment, which I think is what you suggest. But I think it is dirty to treat the parameters unequally in the first place.

Of course, the current implementation is also dirty in the same sense, and returning class_weight_ from _transform_data is dirty too.

@amueller

Member

amueller commented Apr 20, 2013

@luoq That is why I said slack penalty ;) Maybe a bad wording in the context of the liblinear parameters. Anyhow, what I meant was that liblinear uses the squared hinge loss by default and libsvm the standard hinge loss. I was wondering whether something similar holds true for the epsilon-insensitive case.

@luoq

luoq commented Apr 21, 2013

liblinear does not have a separate default for regression; the default is just for SVC. It looks like l1-loss SVR in liblinear and epsilon-SVR have the same formulation.

@luoq

luoq commented Apr 21, 2013

For LinearSVR, LinearSVC and LogisticRegression, there should be no irrelevant (epsilon for LinearSVC) or redundant (loss in LogisticRegression) __init__ parameters. Currently the __init__ parameters almost satisfy this condition, except that in LinearSVC, when multi_class is crammer_singer, the penalty, loss and dual parameters are irrelevant.

At the other end, to call the underlying liblinear library we must:

  1. map some attributes to solver_type;
  2. convert some attributes to the parameters needed by liblinear, which is complicated by several factors:
    1. different names (epsilon -> p) and different representations (fit_intercept=False -> bias < 0);
    2. handling irrelevant parameters (adding p for LinearSVC);
    3. for classification, computing class_weight_ from class_weight and y;
  3. convert X and y to double.

I think the sole point of using inheritance here is to put the common code in the base class and the conversions listed above in the subclasses. As for OOP design, I don't have much practice with it; I suspect something is always dirty when you want to write less code in this case.
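
As a concrete illustration of the conversions in point 2, a hedged sketch (the helper name is made up; the renamings and the bias encoding follow the conventions described in this thread):

def liblinear_call_params(est):
    # Translate estimator attributes into the names liblinear expects.
    return {
        'eps': est.tol,                          # tol -> eps
        'C': est.C,
        'p': getattr(est, 'epsilon', 0.1),       # epsilon -> p (only used by the SVR solvers)
        # fit_intercept=False is encoded as a negative bias
        'bias': est.intercept_scaling if est.fit_intercept else -1.0,
    }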

@jnothman

Member

jnothman commented Apr 29, 2013

You can rebase on master now if you want to take advantage of the pyx refactor.

luoq added some commits Apr 19, 2013

solver_type check for liblinear based estimators
1. solver_type check is done only in fit
2. more informative error message
3. add test for parameter combinations for LinearSVR
4. clean parameter combination test for LinearSVC and LogisticRegression
mv _solver_type_dict to subclass
Before this, you can set loss = "lr", "l1r", "lr2" in LinearSVC to do
logistic regression and linear svr.
clean preparation for fitting
1. combine _transfrom_data and _validate_data to _prepare_fit
2. rely on self.class_weight_ for class weight
@jnothman

Member

jnothman commented May 12, 2013

You can simplify the call to liblinear.train_wrap just a little by moving the default values into the pyx... This should make it possible for the Python side to be more neatly object-oriented.

@luoq

luoq commented May 12, 2013

I am not clear on your point here.
If the call to train_wrap is made in LibLinearMixin, we will always need to check whether the object has epsilon and class_weight or not, and the default values set in getattr are just placeholders; you can change them to any value of the same type. The other default values are all set in LinearSVR, LinearSVC and LogisticRegression; currently there are no default values in train_wrap in liblinear.pyx.

Maybe you can provide a patch to explain your point.

@larsmans

Member

larsmans commented Dec 3, 2014

Closing in favor of #3877; sorry @luoq!

@larsmans larsmans closed this Dec 3, 2014

@luoq luoq deleted the luoq:liblinear branch Dec 4, 2014
