
Refactor laplacian #2212

Closed
wants to merge 2 commits into from

Conversation

yorkerlin
Member

@karlnapf take a look at this.
I will send the link for the notebook tomorrow.

Note that the original implementation of LaplacianInferenceMethod in Shogun used log(lu.determinant()) to compute the log-determinant, which is not numerically stable. (In fact, this implementation does not follow the GPML code.)
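For context, a minimal Eigen-free sketch (not Shogun code; all names are illustrative) of the stable approach GPML uses: factor the SPD matrix once with Cholesky and sum the logs of the diagonal, instead of taking log of a determinant that can easily overflow or underflow a double.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Plain Cholesky factorization, A = L * L^T with L lower triangular.
// Sketch only; Shogun/GPML use Eigen's LLT instead.
std::vector<double> cholesky(const std::vector<double>& a, int n) {
    std::vector<double> l(n * n, 0.0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j <= i; ++j) {
            double s = a[i * n + j];
            for (int k = 0; k < j; ++k)
                s -= l[i * n + k] * l[j * n + k];
            l[i * n + j] = (i == j) ? std::sqrt(s) : s / l[j * n + j];
        }
    }
    return l;
}

// log|A| = 2 * sum_i log(L_ii): summing logs never over/underflows,
// whereas log(lu.determinant()) first forms the product of n pivots,
// which overflows to inf (or underflows to 0) for modest n already.
double log_det_cholesky(const std::vector<double>& a, int n) {
    std::vector<double> l = cholesky(a, n);
    double ld = 0.0;
    for (int i = 0; i < n; ++i)
        ld += std::log(l[i * n + i]);
    return 2.0 * ld;
}
```

For example, a 400x400 SPD matrix with diagonal entries 1e10 has determinant 1e4000, far beyond double range, yet its log-determinant (about 9210.3) is computed exactly by the routine above.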

Maybe MatrixOperations.h will be merged into Math.h.
However, I think in that case Math.h would need to include the Eigen3 header.

Another issue is that I currently use MatrixXd and VectorXd to pass variables in MatrixOperations.h.
Maybe SGVector and SGMatrix would be better. (Should I use "SGVector &" or "SGVector"?)
I do not know whether passing an SGVector to a function copies the elements of the SGVector.

@karlnapf
Member

karlnapf commented May 6, 2014

To answer your questions:

  • SGVector/SGMatrix can be passed around by value. The objects share the same memory and are reference-counted automatically.
  • Did you encounter problems with the LU determinant? We had a discussion on this with @votjakovr; maybe he can comment on why it is done this way.
  • Math.h cannot include Eigen3 headers. We want to move towards a Shogun-internal linear algebra interface, see Implement heterogeneous (GPU+CPU) dot product computation routines (Deep learning project) #1973. Feel free to participate in the discussion; I have the feeling you could add valuable comments there.
  • It is fine to pass Eigen3 objects around, as long as these methods are not exposed to the outside world, i.e. as private/protected helper methods.

@@ -1 +1 @@
-Subproject commit 06d38c0751ec6450c33a85fbe15ceb8543c6cc65
+Subproject commit 3165600ed1d43ad630b367311e648716125ab686
Member

what is this again for?

@karlnapf
Member

karlnapf commented May 6, 2014

The matrix operations class is a great idea. We are working on this. It should be a bit more general than your ideas here, but you are totally right to pull things like log-determinants out.
Please discuss with @lambday in #1973.

@lambday it would be great if you could also think about adding the things that @yorkerlin needs for the GPs. We then cover many many things at once.

@lambday @yorkerlin this is a great example of synergy effects of GSoC and is perfect for the pre-GSoC time. Having those problems solved in a general way will massively benefit the rest of Shogun

@lambday
Member

lambday commented May 6, 2014

@yorkerlin I'm open for discussion :) We're aiming at separating Shogun's linear algebra frontend from any particular backend dependency. In linalg/internal we can provide implementations of the most commonly used linear algebra operations in Shogun with different backends, and through a common interface Shogun classes can choose any of them for those tasks. We'll always have a global setting (some default backend, say Eigen3), and if we want we can also have module-specific settings. All of this can be done via cmake options. If a user wants to use a particular backend for his algorithm, that's also possible. I have made a prototype implementation here. Please check the README.

Also, could you please let me know whether your requirements fall under the modules I mentioned in the above README? What exactly are your requirements (what are the inputs/outputs of the operations you're trying to do with Eigen3)? This will help a lot in polishing the plan further and discovering faults in it.

@karlnapf yeah, I am quite excited about this :D Let's hope we get the basics integrated within the next week (I'll add Eigen3 sum and dot first). I also have to check some cmake stuff.

@yorkerlin
Member Author

@lambday
Hi, you can check the src/shogun/machine/gp/MatrixOperations.cpp file.
In fact, I use several features of Eigen3, and I am not sure whether the library supports these features or not.
For example:
(eigen_s2.replicate(1,eigen_l.rows()).array().transpose().colwise() + eigen_l.array().pow(2)).matrix();

MatrixXd eigen_V = eigen_L.triangularView<Upper>().adjoint().solve(MatrixXd::Identity(eigen_L.rows(),eigen_L.cols()));

eigen_v.block(0, 1, n-1, n-1).diagonal() = (0.5*ArrayXd::LinSpaced(n-1,1,n-1)).sqrt();

EigenSolver<MatrixXd> eig(eigen_v);

@yorkerlin
Member Author

@karlnapf
I have written another variational class for logit, and I have renamed the original variational class to get a better name.
I am testing the result (compiling is time-consuming). Once it passes the unit tests, I will send another PR.
Please merge that one (the renaming PR) first, so that I can send more than three PRs that depend on the renaming one.

@yorkerlin
Member Author

@karlnapf
Do you know how to disable most of the unit tests and enable only selected unit tests at compile time?

@yorkerlin yorkerlin closed this May 8, 2014
@yorkerlin
Member Author

@karlnapf
Do you know how to disable non-selected unit tests and compile only the selected unit tests on my laptop?

@yorkerlin yorkerlin reopened this May 8, 2014
@yorkerlin
Member Author

Sorry for closing the PR accidentally.

@yorkerlin
Member Author

@karlnapf it seems Travis fails due to the Python module.
Do you know why it fails?

@lambday
Member

lambday commented May 8, 2014

@yorkerlin alright, it fits nicely into the internal linalg library I was planning. I think it would be best to have it like

template <class Scalar, class Vector, class Matrix, Backend backend>
struct get_cholesky
{
    // maybe use better names for the variables here? like W and sW?
    static Matrix compute(Vector W, Vector sW, Matrix Kernel, Scalar scale)
    {
        // something default
    }
};

and then partial specialization for your Eigen3 implementation like

template <class Scalar>
struct get_cholesky<Scalar, Matrix<Scalar, Dynamic, 1>, Matrix<Scalar, Dynamic, Dynamic>, Backend::Eigen3>
{
    typedef Matrix<Scalar, Dynamic, 1> VectorXt;
    typedef Matrix<Scalar, Dynamic, Dynamic> MatrixXt;
    static MatrixXt compute(VectorXt W, VectorXt sW, MatrixXt Kernel, Scalar scale)
    {
        // add your implementation that you have in MatrixOperations.h
    }
};

please check out this and this. This way you can directly work with Eigen3 vectors and matrices as per your need, since it's all internal. Also, we can have a backend-independent implementation like (see this)

template <class Scalar, class Vector, class Matrix>
Matrix get_cholesky(Vector W, Vector sW, Matrix Kernel, Scalar scale)
{
    return impl::get_cholesky<Scalar, Vector, Matrix, linalg_traits<Factorization>::backend>::compute(W, sW, Kernel, scale);
}

Then the use case would be as simple as (check this) with your default Eigen3 backend

// W, sW, kernel are Eigen3 objects
linalg::get_cholesky<float64_t, VectorXd, MatrixXd>(W, sW, kernel, scale);

In a similar way you can add other methods. I'll add the basics to Shogun as soon as the design gets approved :)
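The dispatch pattern sketched above can be reduced to a compilable, dependency-free form. This is a minimal illustration only: the operation bodies are placeholder stubs, and the frontend hardcodes the default backend instead of consulting the linalg_traits machinery from the real design.

```cpp
#include <cassert>

enum class Backend { Eigen3, ViennaCL };

// Primary template: one struct per operation, parameterized on the backend.
template <class Scalar, Backend backend>
struct get_op_impl
{
    static Scalar compute(Scalar) { return Scalar(0); } // default stub
};

// Partial specialization: the Eigen3 code path, selected at compile time.
template <class Scalar>
struct get_op_impl<Scalar, Backend::Eigen3>
{
    static Scalar compute(Scalar x) { return x; } // real Eigen3 code would live here
};

// Thin frontend: callers never name the backend explicitly.
template <class Scalar>
Scalar get_op(Scalar x)
{
    return get_op_impl<Scalar, Backend::Eigen3>::compute(x);
}
```

Because the backend is a non-type template parameter, an unused backend's code path is never instantiated, which is what keeps the per-backend code small and manageable.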

@lambday
Member

lambday commented May 8, 2014

Please ping me on IRC if you have any questions or doubts regarding this :)

@karlnapf
Member

karlnapf commented May 8, 2014

@yorkerlin

  • Travis had problems, now fixed
  • let's just start with putting a few very fundamental operations into the linalg framework, like the triangular solve, factorisation, etc. Complicated linear algebra operations can stay in Eigen3 for now. The GPs are strongly coupled with Eigen anyway, so that's fine.

@lambday could you push this hard this week? Then @yorkerlin can use at least the Cholesky solver and log-determinants.

It's looking good, guys! :)

@lambday
Member

lambday commented May 8, 2014

@karlnapf absolutely. So the latest design is finalized, right? I was checking some cmake things regarding how to use this and just figured it out. It would work like

cmake -DSetLinalgBackend=Eigen3/ViennaCL ..

which sets the USE_EIGEN3/USE_VIENNACL flags. For module-specific settings I am not finding better variable names than

cmake -DSet<ModuleName>LinalgBackend=Eigen3 ..

@karlnapf
Member

karlnapf commented May 8, 2014

That sounds good to me, but it is outside my expertise. I guess @vigsterkr has a comment on this too.

@vigsterkr
Member

@lambday do we really want to do this at compile time?
I mean, it would be more desirable for me to be able to switch the backend at runtime, at least when one calls init_shogun_with_defaults().

@yorkerlin
Member Author

@karlnapf @lambday and @vigsterkr
I think we can focus on some critical operations (e.g., matrix-vector product, matrix product, Cholesky, LU, eigendecomposition, LDLT solver on the GPU) for now.

@yorkerlin
Member Author

@lambday
I will try to separate some critical matrix operations in GP into the MatrixOperations class.

@karlnapf
Member

karlnapf commented May 8, 2014

@yorkerlin no, please do not add stuff to the matrix operations class. This class should only be used for very GP-specific operations (which I don't think exist). Methods like the ones you mentioned are supposed to go into the linear algebra framework.

@lambday
Member

lambday commented May 8, 2014

@yorkerlin yeah, as @karlnapf said, we should aim at doing these things in a better way. I think your methods already fit nicely in the linalg framework that we have planned (thanks to @lisitsyn for his further suggestions; we're trying to make the API super simplified). You just keep this method, and then as soon as I add the basics, you can add these methods in shogun/mathematics/linalg/internal/ (which doesn't exist right now).

@lambday
Member

lambday commented May 8, 2014

@vigsterkr as per our discussion on IRC, the runtime alternative is far more painful to maintain, in my opinion. We can, however, choose to use any backend irrespective of the global backend setting, even as Shogun users. This compile-time option leads to much smaller and more manageable code for these tasks, I believe.

@yorkerlin
Member Author

@karlnapf
The CMatrixOperations class in GP will only include GP-specific matrix methods used by the inference methods (e.g., Laplace inference and variational inference).

@lambday
Please let me know once your implementation is done. I plan to use the CMatrixOperations class as a wrapper to bridge the gap between your linear algebra interface and the existing GP-specific matrix operations.

@lambday
Member

lambday commented May 9, 2014

@yorkerlin are you sure that the methods you want to add are so specific that they won't be used anywhere else but GP? I think methods like log_det (we already have a naive version in CStatistics::log_det()) and get_cholesky can be used in other places as well. In that case they can go straight into linalg/. But maybe you can tell better, because I don't understand what all the other params are for.

I'll surely let you know when the basics are added. Trying to finish within this week.

@yorkerlin
Member Author

@lambday
The Laplace class and the variational class use some common GP-specific helper methods.
My plan is to pull these methods out into some class(es), such as the CMatrixOperations class.

For log_det

  • I will use your log_det function once it is available.

For get_cholesky

  • It is a GP-specific function, because I have to do some transformations before I call the Cholesky method.
  • The method name may be confusing. Maybe I can come up with a better name for it.

Let me know your thoughts.
Let's work together to make the GPs GPU-accelerated.
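Judging from the get_cholesky(W, sW, Kernel, scale) signature earlier in the thread, the "transformations before the Cholesky" presumably follow GPML's Laplace step, which factors B = I + sW*K*sW rather than K itself. A hypothetical Eigen-free sketch (form_B is an illustrative name, not Shogun API):

```cpp
#include <cassert>
#include <vector>

// Form B = I + diag(sW) * K * diag(sW) for a row-major n x n kernel matrix K.
// GPML then takes L = chol(B); for PSD K and real sW, B's eigenvalues are
// all >= 1, so the factorization is well-conditioned.
std::vector<double> form_B(const std::vector<double>& K,
                           const std::vector<double>& sW, int n)
{
    std::vector<double> B(n * n);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            B[i * n + j] = sW[i] * K[i * n + j] * sW[j] + (i == j ? 1.0 : 0.0);
    return B;
}
```

Wrapping this transformation in the GP helper class and then handing B to a generic Cholesky routine is exactly the split between GP-specific code and the linalg framework discussed above.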

@karlnapf
Member

@yorkerlin that sounds good. You do your GP-specific transformations inside the helper class, and then call the linalg framework from within, once you have reduced your tasks to standard problems.

BTW, have a look at the googletest documentation for how to select certain tests.

@yorkerlin
Member Author

Working on extending the Laplace method to multi-class classification.
Currently I have created a new Laplace class for multi-class, since some of the underlying structure differs from the binary Laplace. However, I think it is possible to extend the binary Laplace class to do multi-class classification.
Do you think I should create another class for multi-class classification or just use the same class?

@karlnapf
Member

I think it should be the same; a user doesn't care, he just wants to use one class.
However, in the class itself you should structure things properly.

@yorkerlin
Member Author

@karlnapf
Currently I have created a new class for the Laplace method for multi-class classification.
This Laplace class ONLY works for the soft-max likelihood, because the special structure of the soft-max Hessian matrix is used to make the inference more efficient.
What is more, the multinomial probit does NOT have the same Hessian structure.
For now, I have just created a new class for the Laplace method with the soft-max likelihood.

@yorkerlin
Member Author

However, the variational method can work for the multinomial probit with the help of a variational bound.
see http://eprints.gla.ac.uk/3813/.
This may be a future feature for GPC.

@karlnapf
Member

@yorkerlin I agree with you, things are different under the hood. However, a user should have the possibility to just say "Laplace" and then the corresponding Laplace method is used internally. I think this can be solved by introducing a wrapper class CLaplacianInference that checks the likelihood and then instantiates the corresponding inference method object internally. Then we could even hide the other classes from the modular interfaces, and things might be cleaner. This is particularly interesting for users who are not familiar with too many details about these things.

@karlnapf
Member

Yeah, the Girolami thing would be neat to have. He is my former supervisor, and we in fact already talked about having this in Shogun. Though @emtiyaz had some not-so-promising results with this, I believe.

@emtiyaz

emtiyaz commented Jul 31, 2014

Girolami's method works reasonably well for prediction accuracy, but not for marginal likelihood approximation. I have the results in Fig. 2 of the following paper:
http://www.cs.ubc.ca/~emtiyaz/papers/paper-AISTATS2012.pdf
I recommend not implementing it, because its implementation is going to differ from the others, and it might take you longer than one week. Better to invest the time in the things that you already have.



@yorkerlin
Member Author

@emtiyaz @karlnapf
Thanks for your feedback!
Since stick-breaking is a powerful tool in Bayesian nonparametrics (e.g., the DP), I will first implement @emtiyaz's stick-breaking method for multi-class classification in the post-GSoC period.

@emtiyaz

emtiyaz commented Jul 31, 2014

Hi Wu,
I would highly recommend not doing that. Stick-breaking also has the problem that it depends on the ordering of the categories. It may not work well in general.

Thanks,
emt


@yorkerlin
Member Author

@emtiyaz
OK. I know the stick-breaking approach has an order bias for the Dirichlet process, and MCMC samplers in general also have the label-switching issue for the Dirichlet process. It seems the issue occurs in GPs as well.
Do you have any suggestions for implementing other method(s) for GP multi-class classification? (a fast dual method?)
BTW, the Laplace method for multi-class classification is done now. I am writing some documentation for it.

@yorkerlin
Member Author

@emtiyaz
For large-scale GPC inference, do you have any suggestions about which method(s) to implement?
Currently I know of at least three methods for GPC that I may implement in Shogun in the post-GSoC period.
Could you tell me which one(s) are best in terms of speed, accuracy, or robustness? (I need a priority order.)

  • Sparse Gaussian Processes using Pseudo-inputs (FITC)
  • Sparse On-Line Gaussian Processes
  • Gaussian processes for big data, http://auai.org/uai2013/prints/papers/244.pdf

@karlnapf
Member

karlnapf commented Aug 1, 2014

We should focus on the write-up now.
@yorkerlin the notebook really has to be improved: we want nice intuition, some text, cool examples, pictures. This should be possible to understand for people who have no idea about variational methods. I will give some more feedback soon.

Could you please send a pull request with the notebook only?

@yorkerlin
Member Author

@karlnapf
Got it.
I will send a PR for the notebook and another PR for the Laplace method today.

@emtiyaz

emtiyaz commented Aug 1, 2014

I have a few suggestions on that, but I am busy with a paper submission until Aug. 4. When does GSoC end?

emt


@karlnapf
Member

karlnapf commented Aug 4, 2014

August 11, but the last week is reserved for other things. We want to finish implementing/writing within the next few days.

@karlnapf
Member

@yorkerlin what's the state of this one?

@iglesias
Collaborator

@lambday, @karlnapf, @yorkerlin, what should be done about this one?

@lambday
Member

lambday commented Oct 24, 2014

The code is too tightly coupled with Eigen3. Even if Cholesky were available in linalg, we'd have to use the specific Eigen3 backend for this, so I think it's okay for now to keep it this way. Many of these operations are specific to the GPs, and I'm afraid there is no better way to manage all of them with generic linalg without depending on Eigen3. Even in the future, linalg won't be (and is not intended to be) able to generalize everything that Eigen3 does!

Just a few things that I'd do differently in this PR:

  • Try to make use of the SGMatrix/SGVector Eigen3 constructor/cast operator that Khaled added. They are neater.
  • Maybe I wouldn't keep Eigen matrices and vectors as members, and would use SG* types in the function arguments instead (although this is not very important, since it's all internal).
  • Put MatrixOperations inside a nested internal namespace (shogun::gp::internal::MatrixOperations or so) and remove the C prefix (since it's not a CSGObject implementation).
  • Move its implementation from the header to a cpp file.

@yorkerlin could you please take a couple of minutes to review this once more? If you think it's ready, please let us know :)

@karlnapf
Member

@lambday I second your thoughts on the generality of linalg, actually. However, it would still be cool to have expensive and simple operations in linalg, like Cholesky, linear solve, etc. These are also used everywhere in Shogun, so we gain a lot from generalising them.

@yorkerlin could you address the points that @lambday mentioned? I think they are really good.

@yorkerlin
Member Author

I will work on it at the beginning of next week.
Currently I am working on a new feature.


@iglesias
Collaborator

Cool, @yorkerlin. However, before adding new stuff, it is more relevant to take care of the existing features. At least, that is my opinion ;-)

@yorkerlin
Member Author

Working on it


@yorkerlin
Member Author

I will first clean up the existing GP code in order to use the Shogun matrix operations while implementing the FITC Laplace method for binary classification.

@karlnapf
I will refactor the SingleLaplace class and the SingleLaplaceWithBFGS class tomorrow, and clean up the MatrixOperations class.

@lambday
Yes, the GPs use a lot of Eigen3 features. I will try to use Shogun's linear algebra classes instead of Eigen3 when implementing new features. Do you have any suggestions on using Shogun's linear algebra classes?


@karlnapf
Member

karlnapf commented Nov 1, 2014

@yorkerlin @lambday first step: Cholesky factorisation and linear solves. Maybe the matrix-matrix product, but only if the same matrix has to be multiplied many times (otherwise it makes no sense to use the GPU).

@iglesias
Collaborator

@karlnapf, @yorkerlin, ping :-)

@yorkerlin
Member Author

Further clean-up work will be done once I complete the FITC stuff.
For now, I will close this.

@yorkerlin yorkerlin closed this Feb 19, 2015
@karlnapf
Member

ok!
