Better sample_weight support in Ridge #1190

Closed
mblondel opened this Issue Sep 29, 2012 · 2 comments

@mblondel
Owner

Currently, only the dense_cholesky solver in Ridge supports sample_weight. To support it consistently in all solvers, one can use the following trick (excerpted from my post on the mailing list):

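We want to minimize \sum_i mu_i (w^T x_i - y_i)^2, where mu_i is the weight of sample i. For mu_i >= 0, each term satisfies mu_i (w^T x_i - y_i)^2 = (sqrt(mu_i) w^T x_i - sqrt(mu_i) y_i)^2, so this is equivalent to minimizing \sum_i (sqrt(mu_i) w^T x_i - sqrt(mu_i) y_i)^2. So we obtain the same result by multiplying each x_i and y_i by sqrt(mu_i) and running the unweighted solver.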
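As a quick sanity check of the trick above, here is a minimal NumPy sketch. It assumes no intercept and solves the ridge normal equations directly; the function names `ridge_weighted` and `ridge_rescaled` and the penalty `alpha` are illustrative only and are not part of the Ridge code.

```python
import numpy as np


def ridge_weighted(X, y, mu, alpha):
    """Minimize sum_i mu_i (w^T x_i - y_i)^2 + alpha ||w||^2 directly."""
    A = X.T @ (mu[:, None] * X) + alpha * np.eye(X.shape[1])
    b = X.T @ (mu * y)
    return np.linalg.solve(A, b)


def ridge_rescaled(X, y, mu, alpha):
    """Solve the same problem as unweighted ridge on rescaled data."""
    s = np.sqrt(mu)
    Xs = X * s[:, None]  # multiply each row x_i by sqrt(mu_i)
    ys = y * s           # multiply each target y_i by sqrt(mu_i)
    A = Xs.T @ Xs + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, Xs.T @ ys)


rng = np.random.RandomState(0)
X = rng.randn(20, 3)
y = rng.randn(20)
mu = rng.uniform(0.5, 2.0, size=20)  # hypothetical sample weights

w_direct = ridge_weighted(X, y, mu, alpha=1.0)
w_scaled = ridge_rescaled(X, y, mu, alpha=1.0)
print(np.allclose(w_direct, w_scaled))
```

The penalty term does not involve the samples, so rescaling leaves it untouched. Fitting an intercept is not covered by this sketch.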

In the dense case this is trivial to implement, but the sparse case takes a bit more work, as scipy sparse matrices do not support element-by-element multiplication with a vector (here the vector has length n_samples). One should add an inplace_csr_row_scale utility to sparsefuncs.pyx.
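To illustrate what such a row-scaling utility would have to do, here is a rough pure-NumPy sketch operating on the raw CSR arrays. The name `scale_csr_rows_inplace` is hypothetical and this is not the proposed Cython code; for a `scipy.sparse.csr_matrix` `X`, the `data` and `indptr` arguments would correspond to `X.data` and `X.indptr`.

```python
import numpy as np


def scale_csr_rows_inplace(data, indptr, scale):
    """Multiply row i of a CSR matrix by scale[i], modifying data in place.

    In CSR format, the stored values of row i are data[indptr[i]:indptr[i + 1]],
    so each scale factor is repeated once per stored entry of its row.
    """
    row_nnz = np.diff(indptr)  # number of stored entries in each row
    data *= np.repeat(scale, row_nnz)


# CSR arrays for the 3x3 matrix [[1, 0, 2], [0, 0, 0], [3, 4, 0]]
data = np.array([1.0, 2.0, 3.0, 4.0])
indices = np.array([0, 2, 0, 1])  # column indices, unchanged by row scaling
indptr = np.array([0, 2, 2, 4])

mu = np.array([4.0, 9.0, 16.0])  # hypothetical sample weights
scale_csr_rows_inplace(data, indptr, np.sqrt(mu))
print(data)  # rows scaled by 2, 3 and 4: [ 2.  4. 12. 16.]
```

Only the `data` array changes; the sparsity pattern (`indices`, `indptr`) is left as is, which is why the operation can be done in place.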

The test coverage of sample_weight needs to be greatly improved too.

Owner
arjoly commented Jul 18, 2014

@mblondel is this done in #3034?

Owner

Fixed by #4116.

@mblondel mblondel closed this Jan 24, 2015