
Implement Nystrom approximation for approximate kernel methods #3058

Closed
karlnapf opened this issue Mar 10, 2016 · 20 comments

Comments

@karlnapf
Member

This is another important way to approximate kernel methods.

See here for a reference implementation.

This paper describes the original method.

This task is closely related to its random Fourier features counterpart, #2768,
and the incomplete Cholesky one, #3057.

@fredhallgren
Contributor

Hi, I started looking at this. I have a pretty good idea of how to proceed. I can check in on IRC if I have any questions.

@karlnapf
Member Author

Great. Let's start with kernel ridge regression and uniform sub-sampling.

We can add leverage scores for non-uniform sub-sampling later.

For now I guess we have to think a bit about how best to implement Nyström + KRR. This probably needs some changes to the kernel machine. Maybe we can add a Nyström kernel machine class that deals with the matrix inversions, which are simpler for the approximated models. It would be good to do this high up in the hierarchy... but I'm not 100% sure how to do it myself yet. Looking forward to your input :)

The same is actually true for the random Fourier features, #2768.

@karlnapf
Member Author

Both this and Nyström would start with KRR. KRR should put the code that solves the $(K + \lambda I)^{-1} y$ system into its own method. A subclass would then override this method and implement the cheaper matrix inversion for Nyström or incomplete Cholesky.
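
A minimal sketch of that refactoring pattern, assuming a standalone Eigen mock-up rather than the actual Shogun class hierarchy (KRRSolver, NystromKRRSolver and solve_system are illustrative names, not existing Shogun API):

```cpp
// Sketch only: the base solver exposes a virtual solve step which a Nystrom
// subclass overrides with a cheaper computation.
#include <Eigen/Dense>

using Eigen::MatrixXd;
using Eigen::VectorXd;

class KRRSolver
{
public:
    explicit KRRSolver(double tau) : m_tau(tau) {}
    virtual ~KRRSolver() = default;

    // Exact KRR: solve (K + tau*I) alpha = y with a dense Cholesky factorization.
    virtual VectorXd solve_system(const MatrixXd& K, const VectorXd& y) const
    {
        MatrixXd A = K + m_tau * MatrixXd::Identity(K.rows(), K.cols());
        return A.llt().solve(y);
    }

protected:
    double m_tau;
};

class NystromKRRSolver : public KRRSolver
{
public:
    NystromKRRSolver(double tau, int m) : KRRSolver(tau), m_num_subsamples(m) {}

    // The override is where the cheaper Nystrom solve would go; only the
    // shape of the refactoring is shown here, the approximation itself is
    // sketched later in the thread.
    VectorXd solve_system(const MatrixXd& K, const VectorXd& y) const override
    {
        // ... sub-sample m_num_subsamples columns, build K_nm and K_mm, and
        // solve the reduced system instead of the full n x n one ...
        return KRRSolver::solve_system(K, y); // placeholder fallback
    }

private:
    int m_num_subsamples;
};
```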

@fredhallgren
Contributor

Sounds like a good plan. I'll get back in a few days with hopefully some constructive input :)

@fredhallgren
Contributor

You mean we refactor the current 'train_machine_pinv' method to put the matrix inversion in another method, which either calls LAPACK directly or carries out the Nyström approximation in our subclass?

I wrote a short example implementing KRR. Getting somewhat more familiar with the code base.

Unfortunately I'll miss the reading group tomorrow since I'm moving house :'( Hoping to attend next time :)

@karlnapf
Member Author

There is currently a PR changing the KRR internals a bit; the Nyström version would just override a single method in there... #3098
Let me know if you need help, and good luck with the move!

@karlnapf
Member Author

This is a very good reference for the Nyström method.
We should aim to reproduce the experiments (the uniform Nyström part for now) on the same datasets as in the paper. In addition, we should offer to compute the regularization path over the number of sub-samples, as in Algorithm 1 in the paper. This requires Cholesky up- and downdates, see here.
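
As a minimal illustration of the up/downdate building block (a sketch assuming Eigen's in-place LLT::rankUpdate; this only shows the generic rank-one update, not the full path computation of Algorithm 1):

```cpp
// Sketch: rank-one Cholesky up/downdate with Eigen, the basic building block
// for maintaining a factorization along a path instead of refactorizing.
#include <Eigen/Dense>
#include <iostream>

int main()
{
    const int m = 5;
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(m, m);
    A = A * A.transpose() + m * Eigen::MatrixXd::Identity(m, m); // make it SPD

    Eigen::LLT<Eigen::MatrixXd> llt(A); // A = L * L^T

    Eigen::VectorXd v = Eigen::VectorXd::Random(m);
    llt.rankUpdate(v, 1.0);  // update:   factor of A + v*v^T, O(m^2)
    llt.rankUpdate(v, -1.0); // downdate: back to a factor of A

    // The maintained factor solves systems without refactorizing from scratch.
    Eigen::VectorXd b = Eigen::VectorXd::Random(m);
    std::cout << (A * llt.solve(b) - b).norm() << std::endl; // ~0
    return 0;
}
```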

@fredhallgren
Contributor

Thanks, I'm starting to settle into my new place. I hope you had a nice Easter break.

Great, thanks for the pointer. I'll override the new "solve_krr_system" method in my code.

@karlnapf
Member Author

karlnapf commented Apr 5, 2016

Some inspiration https://github.com/LCSL/NystromCoRe

@fredhallgren
Contributor

Thanks :) I have implemented most of the class (I called it 'CKrrNystrom'; we can change that if you come up with a better name). Now I'm going to override the 'solve_krr_system' method. I'll also read in the data from the article and run some examples comparing against the original KRR implementation.

@karlnapf
Member Author

karlnapf commented Apr 8, 2016

A few test cases that are useful:

  • m = n should reproduce the exact KRR result (see the short note after this list)
  • m = 0.9n should lead to only a very small deviation from the exact result
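
A short note on why the first check should hold exactly (assuming the m sub-sampled columns are distinct, so that m = n means every column is selected):

```latex
\tilde K \;=\; K_{n,m}\, K_{m,m}^{+}\, K_{m,n}
\;\;\xrightarrow{\;m\,=\,n\;}\;\;
\tilde K \;=\; K K^{+} K \;=\; K,
\qquad\text{hence}\qquad
(\tilde K + \tau I)^{-1} y \;=\; (K + \tau I)^{-1} y .
```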

@fredhallgren
Contributor

Good idea. I could add the first one (or both) to the unit tests.

Should the example I'm writing, where I run both KRR and Nyström, be added to the 'examples' directory? I read that the example setup is about to change soon, though.

@karlnapf
Member Author

karlnapf commented Apr 9, 2016

Both, in separate unit tests, is best.
You can add an example; a meta example would be best, I think. Check the cookbook README.
Or you can modify the IPython notebook (or start a new one).

@fredhallgren
Contributor

Interesting seminar at Gatsby yesterday.

I have finished a first implementation of the solve_krr_system method. It compiles and runs.

My code does the following (a rough sketch follows after this list):

  • Add tau to the original kernel matrix (I had to make m_tau protected instead of private in the base class)
  • Create a random index vector via SGVector::random(0, n-1)
  • Sort the vector
  • Pick out these columns (and rows) from the original kernel matrix to create the K_{n,m} and K_{m,m} matrices (adjusting for the fact that the sampling is done with replacement)
  • Calculate the eigenvalues and eigenvectors of K_{m,m} using Eigen3::SelfAdjointEigenSolver
  • Adjust the eigenvectors to obtain the approximate eigendecomposition of K_{n,n}, as explained in Williams and Seeger
  • Invert the diagonal eigenvalue matrix to solve for alpha
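
A rough, self-contained sketch of these steps, assuming plain Eigen types rather than Shogun's SGMatrix/SGVector, and solving with the Woodbury identity on the approximate low-rank factors instead of directly inverting the diagonal eigenvalue matrix (all names are illustrative, not the actual CKrrNystrom code):

```cpp
// Sketch of a Nystrom-approximated KRR solve (Williams & Seeger style):
// sub-sample m columns, eigendecompose K_mm, scale to an approximate
// eigendecomposition of K, and solve (K_approx + tau*I) alpha = y via the
// Woodbury identity so that only a small system is factorized.
#include <Eigen/Dense>
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

using Eigen::MatrixXd;
using Eigen::VectorXd;

VectorXd nystrom_krr_alpha(const MatrixXd& K, const VectorXd& y,
                           double tau, int m, std::mt19937& rng)
{
    const int n = static_cast<int>(K.rows());

    // Uniformly sample m column indices (with replacement, mirroring
    // SGVector::random(0, n-1) in the thread) and sort them.
    std::uniform_int_distribution<int> dist(0, n - 1);
    std::vector<int> idx(m);
    for (int& i : idx) i = dist(rng);
    std::sort(idx.begin(), idx.end());

    // K_nm: sampled columns of K; K_mm: the corresponding principal block.
    MatrixXd K_nm(n, m), K_mm(m, m);
    for (int j = 0; j < m; ++j)
    {
        K_nm.col(j) = K.col(idx[j]);
        for (int i = 0; i < m; ++i)
            K_mm(i, j) = K(idx[i], idx[j]);
    }

    // Eigendecomposition of the small matrix: K_mm = U diag(lambda) U^T.
    Eigen::SelfAdjointEigenSolver<MatrixXd> eig(K_mm);
    const VectorXd lambda = eig.eigenvalues();
    const MatrixXd U = eig.eigenvectors();

    // Williams & Seeger scaling: eigenvalues of K are approximated by
    // (n/m) * lambda_i, eigenvectors by sqrt(m/n) * K_nm * u_i / lambda_i.
    // Near-zero eigenvalues are dropped for numerical stability.
    const double eps = 1e-10 * lambda.cwiseAbs().maxCoeff();
    std::vector<int> keep;
    for (int i = 0; i < m; ++i)
        if (lambda(i) > eps) keep.push_back(i);

    const int r = static_cast<int>(keep.size());
    MatrixXd U_tilde(n, r);  // approximate eigenvectors of K
    VectorXd lam_tilde(r);   // approximate eigenvalues of K
    for (int j = 0; j < r; ++j)
    {
        const int i = keep[j];
        U_tilde.col(j) = std::sqrt(double(m) / n) / lambda(i) * (K_nm * U.col(i));
        lam_tilde(j) = (double(n) / m) * lambda(i);
    }

    // Solve (U_tilde diag(lam_tilde) U_tilde^T + tau*I) alpha = y via the
    // Woodbury identity: only an r x r system needs a Cholesky factorization.
    MatrixXd S = lam_tilde.cwiseInverse().asDiagonal();
    S += U_tilde.transpose() * U_tilde / tau;
    VectorXd z = S.llt().solve(U_tilde.transpose() * y) / tau;
    return (y - U_tilde * z) / tau;
}
```

With m = n and all sampled indices distinct, the approximation is exact and this reduces to the plain KRR solution, which matches the first test case suggested earlier in the thread.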

Some pending items:

  • run examples on data from the article (add the data to the shogun-data repository?)
  • various cleanup and check of code
  • write unit tests
  • write documentation
  • something else I haven't thought of

Other comments:

  • Shouldn't CKernelRidgeRegression::solve_krr_system() be virtual?

Happy to come by at some point :)

@karlnapf
Member Author

Good news!

Yes, the solve_krr_system method should be virtual! :)

Could you send a PR? I am not sure I 100% get what you are doing there. I think we can do this without an (expensive) eigenvalue decomposition, just via an LLT solve involving K_mm.
What dimension does alpha have in your code? Nyström basically restricts the optimization over the regressor to the space of functions $\sum_{i=1}^{m} a_i k(x_i, \cdot)$ -- i.e. only the sub-sampled elements. See http://arxiv.org/abs/1507.04717
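
For reference, a sketch of the restricted-basis formulation from that paper, with $\tilde x_1, \dots, \tilde x_m$ the sub-sampled points and $\lambda$ the regularizer (notation is mine):

```latex
\hat f(\cdot) \;=\; \sum_{i=1}^{m} \alpha_i\, k(\tilde x_i, \cdot),
\qquad
\alpha \;=\; \bigl( K_{nm}^{\top} K_{nm} + n\lambda\, K_{mm} \bigr)^{\dagger} K_{nm}^{\top} y \;\in\; \mathbb{R}^{m}.
```

In this formulation alpha has dimension m rather than n, and the m x m system can be handled with an LLT factorization instead of an eigendecomposition.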

The apply methods also have to be adjusted, as the support vectors (the data that spans the regression function) change, etc. But we can start from the PR...

Looking forward to it!

@fredhallgren
Contributor

Thanks for your comments.

Will do some minor cleanup then send a PR :)

My alpha vector currently has dimension n.

@fredhallgren
Contributor

Hey, I made a PR (I don't know if I should connect it to this issue somehow).

To comment on your previous comment: I am attempting to create an approximation of the entire original kernel matrix through an adjusted eigendecomposition of the reduced matrix, and then solve the system with all the y's.

I haven't looked in detail yet at the other article by Rudi et al., but was mostly using Williams & Seeger.

@karlnapf
Member Author

Thanks for the patch, #3172.
Looks good; don't get scared by the number of iterations we have to do -- that is totally normal :)

@fredhallgren
Contributor

Thanks for the comments :) I'll look at it tonight.

@karlnapf
Member Author

Done for regression
