Compute X given Y #52

Closed
cstjean opened this issue Apr 20, 2016 · 5 comments

@cstjean
Collaborator

cstjean commented Apr 20, 2016

Once a model has been fit to a matrix A, is there any way to fit it to another matrix holding Y constant? For example, if factor analysis is part of a pipeline that ends with an SVM classifier, the cross-validation code should learn the feature matrix Y on the training set, and compute the data matrix X on the test set, given Y.

@madeleineudell
Owner

Yes, you can use the FixedLatentFeaturesConstraint
(https://github.com/madeleineudell/LowRankModels.jl/blob/master/src/regularizers.jl#L157)
as your regularizer on Y when fitting the second model, with one constraint per column of Y:

ry = [FixedLatentFeaturesConstraint(Y[:, i]) for i=1:n]

Sorry that's not yet documented!


Madeleine Udell
Postdoctoral Fellow at the Center for the Mathematics of Information
California Institute of Technology
https://courses2.cit.cornell.edu/mru8

(415) 729-4115

@cstjean
Collaborator Author

cstjean commented Apr 21, 2016

That worked, thanks! For reference:

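# Pin each column of the previously fitted Y as a fixed latent feature vector.
ry_B = [LowRankModels.FixedLatentFeaturesConstraint(glrm.Y[:, i]) for i=1:size(glrm.Y, 2)]
# Build a model for the new matrix B, reusing the losses and the X regularizer.
glrm_B = GLRM(B, losses, rx, ry_B, k);
X_B, Y_B, ch = fit!(glrm_B);
Y_B == glrm.Y  # true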
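
For intuition, in the special case of a quadratic loss with no regularization on X, holding Y fixed reduces the fit to an ordinary least-squares problem. The sketch below is plain Julia and does not use the LowRankModels API. It assumes the convention A ≈ X'Y with Y of size k×n, and the names `solve_X_given_Y`, `X_true`, and the toy dimensions are illustrative only.

```julia
using LinearAlgebra
using Random

# Hypothetical helper (not part of LowRankModels): given a fixed k×n feature
# matrix Y, return the k×m matrix X minimizing ‖B - X'Y‖ (Frobenius norm).
# Assumes Y has full row rank.
function solve_X_given_Y(B::AbstractMatrix, Y::AbstractMatrix)
    # B / Y solves Z * Y ≈ B in the least-squares sense, so Z plays the role of X'.
    return permutedims(B / Y)
end

Random.seed!(0)
k, m, n = 2, 5, 4
Y = randn(k, n)        # stands in for the Y learned on the training set
X_true = randn(k, m)
B = X_true' * Y        # new data lying exactly in the row space of Y

X_B = solve_X_given_Y(B, Y)
```

Because B is constructed to be exactly representable with this Y, `X_B` should match `X_true` up to floating-point error. Other losses or regularizers have no closed form like this, which is where the constraint-based fit above comes in.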

I'm looking for libraries to add to ScikitLearn.jl. Are you interested in supporting the scikit-learn interface? If so, I would make a PR like davidavdav/GaussianMixtures.jl#18.

@madeleineudell
Owner

Yes, I'd be very happy to have LowRankModels included in ScikitLearn.jl.
I'm not sure what the best interface would be; some people will want to be
able to ask for, say, NMF or PCA or Robust PCA by name, whereas others may
want to specify a more nuanced model.

If you want to go ahead and wrap it, I suggest starting your PR from the
dataframe-ux branch, which will be merged into master in the next few
weeks. There are a few (small) breaking changes to the interface.


@cstjean
Collaborator Author

cstjean commented Apr 23, 2016

ScikitLearn.jl needs to store all hyperparameters in the type to support clone, and the GLRM type does not store the parameters that are passed to fit!.

  • I can add a fit_params field to the type definition, with a default value that maintains the current behaviour. Then I'll define ScikitLearnBase.fit!(::GLRM, ::Matrix), transform(::GLRM, ::Matrix) etc. I'll also need to add some pure-kwargs constructors, like scikit-learn does (http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html).
  • Or I can create some brand new types, SkGLRM, PCA, NNMF, etc. that each contain a GLRM object.

Option 2 is less intrusive, but it's more types to maintain and tell users about. Any preference? I like option 1 in general, but it's not a great match for your codebase.

@madeleineudell
Owner

I aesthetically prefer keeping the model separate from the algorithmic
parameters. So I would prefer making a new type if Scikitlearn needs to
store all the hyperparameters in the type. The simplest option is probably
to make a new type SkGLRM <: AbstractGLRM. I don't think the code would
require too much extra tooling to make all of the GLRM functionality
accessible to SkGLRM in that case.

PCA and NNMF need not be extra types; but there could be specialized
functions to instantiate SkGLRMs corresponding to those specialized models.

Are there other problems with this approach?

Madeleine
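
The wrapper idea discussed above could look roughly like the following sketch, written in plain Julia with no dependency on LowRankModels or ScikitLearnBase. Everything in it is hypothetical: `SkGLRMSketch`, its fields `k` and `max_iter`, the `fitted` slot, `clone_sketch`, and `pca_sketch` only illustrate how keeping the algorithmic parameters in a separate type lets an unfitted copy be rebuilt from hyperparameters alone.

```julia
# Hypothetical wrapper: hyperparameters live in the wrapper, while the fitted
# model (a GLRM in the real package) would be stored in `fitted`.
mutable struct SkGLRMSketch
    k::Int          # rank of the factorization
    max_iter::Int   # stands in for a parameter normally passed to fit!
    fitted::Any     # `nothing` until the model has been fit
end

# Pure-keyword constructor, in the style of scikit-learn estimators.
SkGLRMSketch(; k::Int=2, max_iter::Int=100) = SkGLRMSketch(k, max_iter, nothing)

# Rebuild an unfitted copy from the stored hyperparameters only.
clone_sketch(m::SkGLRMSketch) = SkGLRMSketch(k=m.k, max_iter=m.max_iter)

# A specialized model can be a plain function returning a configured wrapper
# rather than an extra type; the max_iter value here is arbitrary.
pca_sketch(; k::Int=2) = SkGLRMSketch(k=k, max_iter=50)

model = SkGLRMSketch(k=3, max_iter=10)
model.fitted = "placeholder for a fitted model"
copy_model = clone_sketch(model)
```

The clone carries over `k` and `max_iter` but not the fitted state, which is the behaviour that cross-validation code relies on when it refits an estimator on each fold.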


@cstjean cstjean closed this as completed Apr 27, 2016