Compute X given Y #52

cstjean · 2016-04-20T19:02:17Z

Once a model has been fit to a matrix A, is there any way to fit it to another matrix holding Y constant? For example, if factor analysis is part of a pipeline that ends with an SVM classifier, the cross-validation code should learn the feature matrix Y on the training set, and compute the data matrix X on the test set, given Y.

madeleineudell · 2016-04-20T20:08:37Z

Yes, you can use the FixedLatentFeaturesConstraint (https://github.com/madeleineudell/LowRankModels.jl/blob/master/src/regularizers.jl#L157) as your regularizer on Y when fitting the second model:

ry = [FixedLatentFeaturesConstraint(Y[:, i]) for i = 1:n]  # one constraint per column of Y

Sorry that's not yet documented!

cstjean · 2016-04-21T12:27:29Z

That worked, thanks! For reference:

ry_B = [LowRankModels.FixedLatentFeaturesConstraint(glrm.Y[:, i]) for i=1:size(glrm.Y, 2)]
glrm_B = GLRM(B,losses,rx,ry_B,k);
X_B, Y_B, ch = fit!(glrm_B);
Y_B == glrm.Y  # true
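
For intuition about what the fixed-Y fit computes: in the special case where every loss is quadratic and X is unregularized, solving for X with Y held constant is an ordinary least-squares problem. The following is a minimal sketch of that special case using only the standard library. It is not LowRankModels' solver; `solve_X_given_Y` is a hypothetical helper, and the shapes assume the convention that B is m×n, X is k×m, Y is k×n and B ≈ X'Y.

```julia
using LinearAlgebra, Random

# Hypothetical helper (not part of LowRankModels): with Y held fixed, solve
#   minimize ‖B - X'Y‖_F²  over X
# Assumed shapes: B is m×n, X is k×m, Y is k×n.
# Transposing gives the least-squares system Y' * X ≈ B'.
solve_X_given_Y(B::AbstractMatrix, Y::AbstractMatrix) = Matrix(Y') \ Matrix(B')

Random.seed!(0)
k, m, n = 2, 5, 8
Y = randn(k, n)        # stands in for the feature matrix learned on the training set
X_true = randn(k, m)
B = X_true' * Y        # noiseless "test" matrix of exact rank k

X_B = solve_X_given_Y(B, Y)   # k×m representation of B in terms of the fixed Y
```

With other losses, or with a regularizer on X, there is generally no closed form like this, which is where fitting a second GLRM with FixedLatentFeaturesConstraint on Y comes in.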

I'm looking for libraries to add to ScikitLearn.jl. Are you interested in supporting the scikit-learn interface? If so, I would make a PR like this one: davidavdav/GaussianMixtures.jl#18.

madeleineudell · 2016-04-22T17:55:57Z

Yes, I'd be very happy to have LowRankModels included in ScikitLearn.jl.
I'm not sure what the best interface would be; some people will want to be
able to ask for, say, NMF or PCA or Robust PCA by name, whereas others may
want to specify a more nuanced model.

If you want to go ahead and wrap it, I suggest starting your PR from the
dataframe-ux branch, which will be merged into master in the next few
weeks. There are a few (small) breaking changes to the interface.

cstjean · 2016-04-23T11:49:13Z

ScikitLearn.jl needs all hyperparameters to be stored in the type in order to support clone, and GLRM is missing the parameters of fit!. I see two options:

1. Add a fit_params field to the GLRM type definition, with a default value that maintains the current behaviour. Then I'll define ScikitLearnBase.fit!(::GLRM, ::Matrix), transform(::GLRM, ::Matrix), etc. I'll also need to add some pure-kwargs constructors, like scikit-learn does (http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html).
2. Create some brand new types, SkGLRM, PCA, NNMF, etc., that each contain a GLRM object.

Option 2 is less intrusive, but it's more types to maintain and tell users about. Any preference? I like option 1 in general, but it's not a great match for your codebase.

madeleineudell · 2016-04-26T04:15:45Z

I aesthetically prefer keeping the model separate from the algorithmic
parameters. So I would prefer making a new type if Scikitlearn needs to
store all the hyperparameters in the type. The simplest option is probably
to make a new type SkGLRM <: AbstractGLRM. I don't think the code would
require too much extra tooling to make all of the GLRM functionality
accessible to SkGLRM in that case.

PCA and NNMF need not be extra types; but there could be specialized
functions to instantiate SkGLRMs corresponding to those specialized models.
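
A rough sketch of what that could look like, to make the proposal concrete. Everything here is hypothetical: `AbstractGLRM` is redefined as a stand-in so the snippet runs on its own, the fields use symbols as placeholders for real loss and regularizer objects, and `clone` only illustrates why the hyperparameters need to live in the type. It is not the interface that LowRankModels or ScikitLearn.jl actually defines.

```julia
# Stand-in for LowRankModels.AbstractGLRM so the sketch is self-contained.
abstract type AbstractGLRM end

# Hypothetical wrapper: the model description and the parameters for fit!
# are stored together, so everything needed to rebuild the model is in the type.
Base.@kwdef mutable struct SkGLRM <: AbstractGLRM
    k::Int = 2                     # rank of the factorization
    loss::Symbol = :quadratic      # placeholder for a loss object
    rx::Symbol = :none             # placeholder for the regularizer on X
    ry::Symbol = :none             # placeholder for the regularizer on Y
    fit_params::Dict{Symbol,Any} = Dict{Symbol,Any}()  # forwarded to fit!
end

# Specialized constructor functions rather than extra types.
PCA(; k::Int = 2) = SkGLRM(k = k)
NNMF(; k::Int = 2) = SkGLRM(k = k, rx = :nonneg, ry = :nonneg)

# Copy the hyperparameters only; no fitted state is carried over.
clone(m::SkGLRM) = SkGLRM(k = m.k, loss = m.loss, rx = m.rx, ry = m.ry,
                          fit_params = deepcopy(m.fit_params))

model = NNMF(k = 3)
copy_of_model = clone(model)
```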

Are there other problems with this approach?

Madeleine

cstjean closed this as completed Apr 27, 2016
