
Refactor laplacian #2212

Closed
wants to merge 2 commits into from

Conversation

yorkerlin
Member

@karlnapf take a look at this.
I will send the link for the notebook tomorrow.

Note that the original implementation of LaplacianInferenceMethod in Shogun used log(lu.determinant()) to compute the log-determinant, which is not numerically stable. (In fact, this implementation does not follow the GPML code.)
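For context, a minimal Eigen-free sketch (not Shogun code; all names are illustrative) of the stable approach GPML uses: factor the SPD matrix once with Cholesky and sum the logs of the diagonal, instead of taking log of a determinant that can easily overflow or underflow a double.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Plain Cholesky factorization, A = L * L^T with L lower triangular.
// Sketch only; Shogun/GPML use Eigen's LLT instead.
std::vector<double> cholesky(const std::vector<double>& a, int n) {
    std::vector<double> l(n * n, 0.0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j <= i; ++j) {
            double s = a[i * n + j];
            for (int k = 0; k < j; ++k)
                s -= l[i * n + k] * l[j * n + k];
            l[i * n + j] = (i == j) ? std::sqrt(s) : s / l[j * n + j];
        }
    }
    return l;
}

// log|A| = 2 * sum_i log(L_ii): summing logs never over/underflows,
// whereas log(lu.determinant()) first forms the product of n pivots,
// which overflows to inf (or underflows to 0) for modest n already.
double log_det_cholesky(const std::vector<double>& a, int n) {
    std::vector<double> l = cholesky(a, n);
    double ld = 0.0;
    for (int i = 0; i < n; ++i)
        ld += std::log(l[i * n + i]);
    return 2.0 * ld;
}
```

For example, a 400x400 SPD matrix with diagonal entries 1e10 has determinant 1e4000, far beyond double range, yet its log-determinant (about 9210.3) is computed exactly by the routine above.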

Maybe MatrixOperations.h will be merged into Math.h.
However, I think in that case Math.h would need to include the Eigen3 header.

Another issue is that I currently use MatrixXd and VectorXd to pass variables in MatrixOperations.h.
Maybe SGVector and SGMatrix would be better. (Should I use "SGVector &" or "SGVector"?)
I do not know whether passing an SGVector to a function copies the elements of the SGVector.

@karlnapf
Member

karlnapf commented May 6, 2014

To answer your questions:

  • SGVector/SGMatrix can be passed around by value. The objects share the same memory and are reference-counted automatically.
  • Did you encounter problems with the LU determinant? We had a discussion on this with @votjakovr; maybe he can comment on why it is done this way.
  • Math.h cannot include Eigen3 headers. We want to move towards a Shogun-internal linear algebra interface, see Implement heterogeneous (GPU+CPU) dot product computation routines (Deep learning project) #1973. Feel free to participate in the discussion; I have the feeling you could add valuable comments there.
  • It is fine to pass Eigen3 objects around, as long as these methods are not exposed to the outside world, i.e. as private/protected helper methods.

@@ -1 +1 @@
-Subproject commit 06d38c0751ec6450c33a85fbe15ceb8543c6cc65
+Subproject commit 3165600ed1d43ad630b367311e648716125ab686
Member

what is this again for?

@karlnapf
Member

karlnapf commented May 6, 2014

The matrix operations class is a great idea. We are working on this. It should be a bit more general than your ideas here, but you are totally right to pull things like log-determinants out.
Please discuss with @lambday in #1973.

@lambday it would be great if you could also think about adding the things that @yorkerlin needs for the GPs. We then cover many many things at once.

@lambday @yorkerlin this is a great example of synergy effects of GSoC and is perfect for the pre-GSoC time. Having those problems solved in a general way will massively benefit the rest of Shogun

@lambday
Member

lambday commented May 6, 2014

@yorkerlin I'm open for discussion :) We're aiming at separating Shogun's linear algebra frontend from any particular backend dependency. In linalg/internal we can provide implementations of the most commonly used linear algebra operations in Shogun with different backends, and through a common interface Shogun classes can choose any of them for those tasks. We'll always have a global setting (some default backend, say Eigen3), and if we want we can also have module-specific settings. All of this can be done via cmake options. If a user wants to use a particular backend for his algorithm, that's also possible. I have made a prototype implementation here. Please check the README.

Also, could you please let me know whether your requirements fall under the modules I mentioned in the above README? What exactly are your requirements (what are the inputs/outputs of the operations you're trying to do with Eigen3)? This will help a lot in polishing the plan further and discovering faults in it.

@karlnapf yeah, I am quite excited about this :D Let's hope we get the basics integrated within the next week (I'll add Eigen3 sum and dot first). I also have to check some cmake stuff.

@yorkerlin
Member Author

@lambday
Hi, you can check the src/shogun/machine/gp/MatrixOperations.cpp file.
In fact, I use several features of Eigen3, and I am not sure whether the library supports these features or not.
For example:
(eigen_s2.replicate(1,eigen_l.rows()).array().transpose().colwise() + eigen_l.array().pow(2)).matrix();

MatrixXd eigen_V = eigen_L.triangularView<Upper>().adjoint().solve(MatrixXd::Identity(eigen_L.rows(),eigen_L.cols()));

eigen_v.block(0, 1, n-1, n-1).diagonal() = (0.5*ArrayXd::LinSpaced(n-1,1,n-1)).sqrt();

EigenSolver<MatrixXd> eig(eigen_v);

@yorkerlin
Member Author

@karlnapf
I have written another variational class for logit, and I have renamed the original variational class to get a better name.
I am testing the result (compiling is time-consuming). Once it passes the unit tests, I will send another PR.
Please merge that one (the renaming PR) first, so that I can send more than three PRs that depend on the renaming one.

@yorkerlin
Member Author

@karlnapf
Do you know how to disable most of the unit tests and enable only selected unit tests at compile time?

@yorkerlin yorkerlin closed this May 8, 2014
@yorkerlin
Member Author

@karlnapf
Do you know how to disable non-selected unit tests and compile only the selected unit tests on my laptop?

@yorkerlin yorkerlin reopened this May 8, 2014
@yorkerlin
Member Author

Sorry for closing the PR accidentally.

@yorkerlin
Member Author

@karlnapf it seems Travis fails due to the Python module.
Do you know why it fails?

@lambday
Member

lambday commented May 8, 2014

@yorkerlin alright, it fits nicely into the internal linalg library I was planning. I think it would be best to have it like

template <class Scalar, class Vector, class Matrix, Backend backend>
struct get_cholesky
{
    // maybe use better names for the variables here? like W and sW?
    static Matrix compute(Vector W, Vector sW, Matrix Kernel, Scalar scale)
    {
        // something default
    }
};

and then partial specialization for your Eigen3 implementation like

template <class Scalar>
struct get_cholesky<Scalar, Matrix<Scalar, Dynamic, 1>, Matrix<Scalar, Dynamic, Dynamic>, Backend::Eigen3>
{
    typedef Matrix<Scalar, Dynamic, 1> VectorXt;
    typedef Matrix<Scalar, Dynamic, Dynamic> MatrixXt;
    static MatrixXt compute(VectorXt W, VectorXt sW, MatrixXt Kernel, Scalar scale)
    {
        // add your implementation that you have in MatrixOperations.h
    }
};

please check out this and this. This way you can directly work with Eigen3 vectors and matrices as per your need, since it's all internal. Also, we can have a backend-independent implementation like (see this)

template <class Scalar, class Vector, class Matrix>
Matrix get_cholesky(Vector W, Vector sW, Matrix Kernel, Scalar scale)
{
    return impl::get_cholesky<Scalar, Vector, Matrix, linalg_traits<Factorization>::backend>::compute(W, sW, Kernel, scale);
}

Then the use case would be as simple as (check this) with your default Eigen3 backend

// W, sW, kernel are Eigen3 objects
linalg::get_cholesky<float64_t, VectorXd, MatrixXd>(W, sW, kernel, scale);

In a similar way you can add other methods. I'll add the basics to Shogun as soon as the design gets approved :)
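The dispatch pattern sketched above can be reduced to a compilable, dependency-free form. This is a minimal illustration only: the operation bodies are placeholder stubs, and the frontend hardcodes the default backend instead of consulting the linalg_traits machinery from the real design.

```cpp
#include <cassert>

enum class Backend { Eigen3, ViennaCL };

// Primary template: one struct per operation, parameterized on the backend.
template <class Scalar, Backend backend>
struct get_op_impl
{
    static Scalar compute(Scalar) { return Scalar(0); } // default stub
};

// Partial specialization: the Eigen3 code path, selected at compile time.
template <class Scalar>
struct get_op_impl<Scalar, Backend::Eigen3>
{
    static Scalar compute(Scalar x) { return x; } // real Eigen3 code would live here
};

// Thin frontend: callers never name the backend explicitly.
template <class Scalar>
Scalar get_op(Scalar x)
{
    return get_op_impl<Scalar, Backend::Eigen3>::compute(x);
}
```

Because the backend is a non-type template parameter, an unused backend's code path is never instantiated, which is what keeps the per-backend code small and manageable.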

@lambday
Member

lambday commented May 8, 2014

Please ping me on IRC if you have any questions or doubts regarding this :)

@karlnapf
Member

karlnapf commented May 8, 2014

@yorkerlin

  • Travis had problems, now fixed
  • let's just start with putting a few very fundamental operations into the linalg framework, like the triangular solve, factorisation, etc. Complicated linear algebra operations can stay in Eigen3 for now. The GPs are strongly coupled with Eigen anyway, so that's fine.

@lambday could you push this hard this week? Then @yorkerlin can use at least the Cholesky solver and log-determinants.

It's looking good, guys! :)

@lambday
Member

lambday commented May 8, 2014

@karlnapf absolutely. So the latest design is finalized, right? I was checking some cmake things regarding how to use this and just figured it out. It would work like

cmake -DSetLinalgBackend=Eigen3/ViennaCL ..

which sets the USE_EIGEN3/USE_VIENNACL flags. For module-specific settings I am not finding better variable names than

cmake -DSet<ModuleName>LinalgBackend=Eigen3 ..

@karlnapf
Member

karlnapf commented May 8, 2014

That sounds good to me, but it is outside my expertise. I guess @vigsterkr has a comment on this too.

@vigsterkr
Member

@lambday do we really want to do this at compile time?
I mean, it would be more desirable for me to be able to switch the backend at runtime, at least when one calls init_shogun_with_defaults().

@yorkerlin
Member Author

@karlnapf @lambday and @vigsterkr
I think we can focus on some critical operations (e.g., matrix-vector product, matrix product, Cholesky, LU, eigendecomposition, LDLT solver on the GPU) for now.

@yorkerlin
Member Author

@lambday
I will try to separate some critical matrix operations in GP into the MatrixOperations class.

@karlnapf
Member

karlnapf commented May 8, 2014

@yorkerlin no, please do not add stuff to the matrix operations class. This class should only be used for very GP-specific operations (which I don't think exist). Methods like the ones you mentioned are supposed to go into the linear algebra framework.

@lambday
Member

lambday commented May 8, 2014

@yorkerlin yeah, as @karlnapf said, we should aim at doing these things in a better way. I think your methods already fit nicely in the linalg framework that we have planned (thanks to @lisitsyn for his further suggestions; we're trying to make the API super simplified). You just keep this method, and then as soon as I add the basics, you can add these methods in shogun/mathematics/linalg/internal/ (which doesn't exist right now).

@lambday
Member

lambday commented May 8, 2014

@vigsterkr as per our discussion on IRC, the runtime alternative is far more painful to maintain, in my opinion. We can, however, choose to use any backend irrespective of the global backend setting, even as Shogun users. This compile-time option leads to much smaller and more manageable code for these tasks, I believe.

@yorkerlin
Member Author

@karlnapf
The CMatrixOperations class in GP will only include GP-specific matrix methods used by the inference methods (e.g., Laplace inference and variational inference).

@lambday
Please let me know once your implementation is done. I plan to use the CMatrixOperations class as a wrapper to bridge the gap between your linear algebra interface and the existing GP-specific matrix operations.

@lambday
Member

lambday commented May 9, 2014

@yorkerlin are you sure that the methods you want to add are so specific that they won't be used anywhere else but GP? I think methods like log_det (we already have a naive version in CStatistics::log_det()) and get_cholesky can be used in other places as well. In that case they can go straight into linalg/. But maybe you can tell better, because I don't understand what all the other params are for.

I'll surely let you know when the basics are added. Trying to finish within this week.

@yorkerlin
Member Author

@lambday
The Laplace class and the variational class use some common GP-specific helper methods.
My plan is to pull these methods out into some class(es), such as the CMatrixOperations class.

For log_det

  • I will use your log_det function once it is available.

For get_cholesky

  • It is a GP-specific function, because I have to do some transformations before I call the Cholesky method.
  • The method name may be confusing. Maybe I can come up with a better name for it.

Let me know your thoughts.
Let's work together to make the GPs GPU-accelerated.
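Judging from the get_cholesky(W, sW, Kernel, scale) signature earlier in the thread, the "transformations before the Cholesky" presumably follow GPML's Laplace step, which factors B = I + sW*K*sW rather than K itself. A hypothetical Eigen-free sketch (form_B is an illustrative name, not Shogun API):

```cpp
#include <cassert>
#include <vector>

// Form B = I + diag(sW) * K * diag(sW) for a row-major n x n kernel matrix K.
// GPML then takes L = chol(B); for PSD K and real sW, B's eigenvalues are
// all >= 1, so the factorization is well-conditioned.
std::vector<double> form_B(const std::vector<double>& K,
                           const std::vector<double>& sW, int n)
{
    std::vector<double> B(n * n);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            B[i * n + j] = sW[i] * K[i * n + j] * sW[j] + (i == j ? 1.0 : 0.0);
    return B;
}
```

Wrapping this transformation in the GP helper class and then handing B to a generic Cholesky routine is exactly the split between GP-specific code and the linalg framework discussed above.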

@karlnapf
Member

@yorkerlin that sounds good. You do your GP-specific transformations inside the helper class, and then call the linalg framework from within, once you have reduced your tasks to standard problems.

BTW, have a look at the googletest documentation for how to select certain tests.

@yorkerlin
Member Author

Working on extending the Laplace method to multi-class classification.
Currently I have created a new Laplace class for multi-class, since some of the underlying structure differs from the binary Laplace. However, I think it is possible to extend the binary Laplace class to do multi-class classification.
Do you think I should create another class for multi-class classification or just use the same class?

@karlnapf
Member

I think it should be the same; a user doesn't care, he just wants to use one class.
However, in the class itself you should structure things properly.

@yorkerlin
Member Author

@karlnapf
Currently I have created a new class for the Laplace method for multi-class classification.
This Laplace class ONLY works for the soft-max likelihood, because the special structure of the soft-max Hessian matrix is used to make the inference more efficient.
What is more, the multinomial probit does NOT have the same Hessian structure.
For now, I have just created a new class for the Laplace method with the soft-max likelihood.

@yorkerlin
Member Author

However, the variational method can work for the multinomial probit with the help of a variational bound.
see http://eprints.gla.ac.uk/3813/.
This may be a future feature for GPC.

@karlnapf
Member

@yorkerlin I agree with you, things are different under the hood. However, a user should have the possibility to just say "Laplace" and then the corresponding Laplace method is used internally. I think this can be solved by introducing a wrapper class CLaplacianInference that checks the likelihood and then instantiates the corresponding inference method object internally. Then we could even hide the other classes from the modular interfaces, and things might be cleaner. This is particularly interesting for users who are not familiar with too many details about these things.

@karlnapf
Member

Yeah, the Girolami thing would be neat to have. He is my former supervisor, and we in fact already talked about having this in Shogun. Though @emtiyaz had some not-so-promising results with this, I believe.

@emtiyaz

emtiyaz commented Jul 31, 2014

Girolami's method works reasonably well for prediction accuracy, but not for marginal likelihood approximation. I have the results in Fig. 2 of the following paper:
http://www.cs.ubc.ca/~emtiyaz/papers/paper-AISTATS2012.pdf
I recommend not implementing it, because its implementation is going to differ from the others, and it might take you longer than one week. Better to invest the time in the things that you already have.



@yorkerlin
Member Author

@emtiyaz @karlnapf
Thanks for your feedback!
Since stick-breaking is a powerful tool in Bayesian nonparametrics (e.g., the DP), I will first implement @emtiyaz's stick-breaking method for multi-class classification in the post-GSoC period.

@emtiyaz

emtiyaz commented Jul 31, 2014

Hi Wu,
I would highly recommend not doing that. Stick-breaking also has the problem that it depends on the ordering of the categories. It may not work well in general.

Thanks,
emt


@yorkerlin
Member Author

@emtiyaz
OK. I know the stick-breaking approach has an order bias for the Dirichlet process, and MCMC samplers in general also have the label-switching issue for the Dirichlet process. It seems the issue occurs in GPs as well.
Do you have any suggestions for implementing other method(s) for GP multi-class classification? (a fast dual method?)
BTW, the Laplace method for multi-class classification is done now. I am writing some documentation for it.

@yorkerlin
Member Author

@emtiyaz
For large-scale GPC inference, do you have any suggestions about which method(s) to implement?
Currently I know of at least three methods for GPC that I may implement in Shogun in the post-GSoC period.
Could you tell me which one(s) are best in terms of speed, accuracy, or robustness? (I need a priority order.)

  • Sparse Gaussian Processes using Pseudo-inputs (FITC)
  • Sparse On-Line Gaussian Processes
  • Gaussian processes for big data, http://auai.org/uai2013/prints/papers/244.pdf

@karlnapf
Member

karlnapf commented Aug 1, 2014

We should focus on the write-up now.
@yorkerlin the notebook really has to be improved: we want nice intuition, some text, cool examples, pictures. This should be possible to understand for people who have no idea about variational methods. I will give some more feedback soon.

Could you please send a pull request with the notebook only?

@yorkerlin
Member Author

@karlnapf
Got it.
I will send a PR for the notebook and another PR for the Laplace method today.

@emtiyaz

emtiyaz commented Aug 1, 2014

I have a few suggestions on that, but I am busy with a paper submission until Aug. 4. When does GSoC end?

emt


@karlnapf
Member

karlnapf commented Aug 4, 2014

August 11, but the last week is reserved for other things. We want to finish implementing/writing within the next few days.

@karlnapf
Member

@yorkerlin what's the state of this one?

@iglesias
Collaborator

@lambday, @karlnapf, @yorkerlin, what should be done about this one?

@lambday
Member

lambday commented Oct 24, 2014

The code is too tightly coupled with Eigen3. Even if Cholesky were available in linalg, we'd have to use the specific Eigen3 backend for this, so I think it's okay for now to keep it this way. Many of these operations are specific to the GPs, and I'm afraid there is no better way to manage all of them with generic linalg without depending on Eigen3. Even in the future, linalg won't be (and is not intended to be) able to generalize everything that Eigen3 does!

Just a few things that I'd do differently in this PR:

  • Try to make use of the SGMatrix/SGVector Eigen3 constructor/cast operator that Khaled added. They are neater.
  • Maybe I wouldn't keep Eigen matrices and vectors as members, and would use SG* types in the function arguments instead (although this is not very important, since it's all internal).
  • Put MatrixOperations inside a nested internal namespace (shogun::gp::internal::MatrixOperations or so) and remove the C prefix (since it's not a CSGObject implementation).
  • Move its implementation from the header to a cpp file.

@yorkerlin could you please take a couple of minutes to review this once more? If you think it's ready, please let us know :)

@karlnapf
Member

@lambday I second your thoughts on the generality of linalg, actually. However, it would still be cool to have expensive and simple operations in linalg, like Cholesky, linear solve, etc. These are also used everywhere in Shogun, so we gain a lot from generalising them.

@yorkerlin could you address the points that @lambday mentioned? I think they are really good.

@yorkerlin
Member Author

I will work on it at the beginning of next week.
Currently I am working on a new feature.


@iglesias
Collaborator

Cool, @yorkerlin. However, before adding new stuff, it is more relevant to take care of the existing features. At least, that is my opinion ;-)

@yorkerlin
Member Author

Working on it


@yorkerlin
Member Author

I will first clean up the existing GP code in order to use the Shogun matrix operations while implementing the FITC Laplace method for binary classification.

@karlnapf
I will refactor the SingleLaplace class and the SingleLaplaceWithBFGS class tomorrow, and clean up the MatrixOperations class.

@lambday
Yes, the GPs use a lot of Eigen3 features. I will try to use Shogun's linear algebra classes instead of Eigen3 when implementing new features. Do you have any suggestions on using Shogun's linear algebra classes?


@karlnapf
Member

karlnapf commented Nov 1, 2014

@yorkerlin @lambday first step: Cholesky factorisation and linear solves. Maybe the matrix-matrix product, but only if the same matrix has to be multiplied many times (otherwise it makes no sense to use the GPU).

@iglesias
Collaborator

@karlnapf, @yorkerlin, ping :-)

@yorkerlin
Member Author

Further clean-up work will be done once I complete the FITC stuff.
For now, I will close this.

@yorkerlin yorkerlin closed this Feb 19, 2015
@karlnapf
Member

ok!
