Skip to content

timnugent/kpca-eigen

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Kernel PCA using the Eigen linear algebra library
-------------------------------------------------
Tim Nugent 2014

Uses the covariance/eigendecomposition method:

http://en.wikipedia.org/wiki/Principal_component_analysis#Computing_PCA_using_the_covariance_method

Current kernels are RBF and polynomial.

Build
-----

Install Eigen, e.g.:

sudo apt-get install libeigen3-dev

or download from http://eigen.tuxfamily.org/.

Make sure the include path is set right in the Makefile.

Compile with 'make'.

Run
---

Run with 'make test', or:

bin/kpca

Input file format is a comma delimited n X p data matrix (n observations, p variables), e.g.:

32.0,26.0,51.0,12.0
17.0,13.0,34.0,35.0
10.0,94.0,83.0,45.0
3.0,72.0,72.0,67.0
10.0,63.0,35.0,34.0

Output
------

If R is installed, see the following plots:

wikipedia_data.png
wikipedia_RBF_transformed.png
wikipedia_Polynomial_transformed.png

stdout:

Regular PCA (data/test.data):
Input data:
32 26 51 12
17 13 34 35
10 94 83 45
 3 72 72 67
10 63 35 34

Centered data:
 17.6 -27.6    -4 -26.6
  2.6 -40.6   -21  -3.6
 -4.4  40.4    28   6.4
-11.4  18.4    17  28.4
 -4.4   9.4   -20  -4.6

Covariance matrix:
  97.04 -204.04   -70.8 -161.84
-204.04  893.84   443.8  323.64
  -70.8   443.8     386   187.2
-161.84  323.64   187.2  317.84

Eigenvalues:
1347.01
211.581
135.402
0.721744

Eigenvectors:
 0.201491 -0.426243  0.157741 -0.867662
-0.790673 -0.243348 -0.537975 -0.161871
-0.450549 -0.390018  0.770297  0.227011
-0.362276  0.779092   0.30388 -0.411616

Sorted eigenvalues:
PC 1: Eigenvalue: 1347.01	(0.795 of variance, cumulative =  0.795)
PC 2: Eigenvalue: 211.581	(0.125 of variance, cumulative =  0.920)
PC 3: Eigenvalue: 135.402	(0.080 of variance, cumulative =  1.000)
PC 4: Eigenvalue: 0.721744	(0.000 of variance, cumulative =  1.000)

Sorted eigenvectors:
 0.201491 -0.426243  0.157741 -0.867662
-0.790673 -0.243348 -0.537975 -0.161871
-0.450549 -0.390018  0.770297  0.227011
-0.362276  0.779092   0.30388 -0.411616

Transformed data:
-41.4351 -30.5086  33.9921 -25.3357
-34.8517  3.59798  32.5138 -23.5428
-126.006 -24.4494  28.6171 -23.5733
-113.036  5.31814  37.5604 -25.4912
-75.8841 -6.75481  4.97735  -24.924

Kernel PCA (data/wikipedia.data) - RBF kernel, gamma = 0.001:
Sorted eigenvalues:
PC 1: Eigenvalue: 145.3	(0.931 of variance, cumulative =  0.931)
PC 2: Eigenvalue: 6.68255	(0.043 of variance, cumulative =  0.974)
PC 3: Eigenvalue: 3.66329	(0.023 of variance, cumulative =  0.998)
PC 4: Eigenvalue: 0.141679	(0.001 of variance, cumulative =  0.999)
PC 5: Eigenvalue: 0.133171	(0.001 of variance, cumulative =  0.999)
PC 6: Eigenvalue: 0.0729222	(0.000 of variance, cumulative =  1.000)
...
PC 99: Eigenvalue: 8.36113e-18	(0.000 of variance, cumulative =  1.000)
PC 100: Eigenvalue: 8.36113e-18	(0.000 of variance, cumulative =  1.000)

Written file data/eigenvectors_RBF_data.csv
Written file data/transformed_RBF_data.csv

Kernel PCA (data/wikipedia.data) - Polynomial kernel, order = 2, constant = 1:
Sorted eigenvalues:
PC 1: Eigenvalue: 176542	(0.548 of variance, cumulative =  0.548)
PC 2: Eigenvalue: 79802.3	(0.248 of variance, cumulative =  0.795)
PC 3: Eigenvalue: 54358.5	(0.169 of variance, cumulative =  0.964)
PC 4: Eigenvalue: 7483.66	(0.023 of variance, cumulative =  0.987)
PC 5: Eigenvalue: 4058.28	(0.013 of variance, cumulative =  1.000)
...
PC 77: Eigenvalue: 9.0432e-17	(0.000 of variance, cumulative =  1.000)
PC 78: Eigenvalue: 9.0432e-17	(0.000 of variance, cumulative =  1.000)

Written file data/eigenvectors_Polynomial_data.csv
Written file data/transformed_Polynomial_data.csv

Plotting data using R...
src/plot.R

Bugs
----

timnugent@gmail.com

About

Kernel principal component analysis using the Eigen linear algebra library [machine learning]

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published