Clone this wiki locally
Large-scale Approximate Kernel Methods
Focussing on one of Shogun's main strengths, this project is to polish and unify existing methods, and add a few new players.
Difficulty & Requirements
Easy to difficult - depends on what we aim for. Let's discuss this if you're interested. You need knowledge of
- The kernel trick (any of SVM, KRR, K-PCA, etc)
- The relation between the dual and the primal formulation when using kernels
- desirable: knowledge of approximate kernel methods (random features, etc)
- Basic ML (what are the interfaces)
- Basic Shogun (again, what are the interfaces)
- Basic Linear Algebra
- Optional: Knowledge of Shogun's
Approximate kernel methods are becoming more and more popular, because they can "scale up" traditional kernel-based algorithms. The idea is to perform computations in an explicit approximate feature space, which has a simpler structure and an explicit (data independent) basis. Mathematically, this means that the underlying optimization problem is solved in its primal representation, rather than the dual one. Large-scale kernel methods are already a main focus of Shogun and therefor this project aims to polish, unify, and extend Shogun's kernel method implementations. The project is divided into two parts: improving our existing algorithms, and adding some state-of-the-art methods.
Part I: Unify and speed-up existing implementations
- Re-factor the interface of CKernelMachine to support both primal and dual formulations.
- Re-work the framework for approximate features to work well with the primal formulation above.
- Put the (existing) random Fourier features code into that framework and polish it.
- Work on the numerical code: Benchmark / speed up / parallelise / Use
- Improve the documentation, write an IPython notebook that covers all implemented methods.
- Algorithms to consider: KRR, K-PCA, SVM.
Part II: Adding new methods
- Implement random binning features
- Low rank approximations: Incomplete Cholesky factorisation of kernel matrices, and its embedding into the primal interface above
- Sparse approximations: Nystrom -- random sub-sampling of entries of the kernel matrix. Leverage scores for choosing sampling weights.
- more to come ...
Why this is cool
Kernel methods are sexy, but don't scale up well. This project collects tools to solve this problem. It allows you to learn about kernel basics and how to scale them up mathematically / from an implementation point of view. In addition, there are some interesting software-design questions to answer.
- Benchmark and improve KRR issue
- Start an Ipython notebook about random fourier features
- Implement incomplete Cholesky algorithm (see below for Python reference)
- Draft a more detailed schedule:
- A good point to start would be to collect information about all implemented approximate kernel methods in Shogun.
- Another good point is to look at scikit's implementation of the random fourier features and Nystrom methods. See how can we learn from their design.
- Original paper on random features (random Fourier & random binning)
- Nystrom approximation with fast leverage scores
- Slides on incomplete Cholesky