
Updated K-means clustering for Nystroem #3126

Open
nateyoder wants to merge 4 commits into scikit-learn:master from nateyoder:kmeans-nystroem

6 participants

@nateyoder

Because I wanted to try K-means clustering as the basis for the Nystroem approximation, and it appeared that pull request #2591 might be stalled, I created a slightly modified version. I also tried to address @amueller's comment about the effectiveness of the method by including it in the plot_kernel_approximation example, and @dougalsutherland's comment concerning the possible singularity of the sub-sampled kernel matrix by using the same approach as scipy does in pinv2.

Since this is my first contribution to the project (hopefully the first of many), any feedback or suggestions you have would be appreciated.
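For anyone who wants to try it, here is a minimal sketch of how the new option is meant to be used, mirroring the plot_kernel_approximation example in this PR (the `basis_sampling` parameter only exists on this branch; the gamma value and pipeline are just illustrative):

```python
from sklearn import datasets, pipeline, svm
from sklearn.kernel_approximation import Nystroem

digits = datasets.load_digits()
X, y = digits.data / 16., digits.target

# basis_sampling="kmeans" selects the Nystroem basis via k-means cluster
# centers instead of randomly sampled training points ("random").
feature_map = Nystroem(kernel="rbf", gamma=0.2, n_components=100,
                       random_state=0, basis_sampling="kmeans")
clf = pipeline.Pipeline([("feature_map", feature_map),
                         ("svm", svm.LinearSVC())])
clf.fit(X, y)
print(clf.score(X, y))
```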

@coveralls

Coverage Status

Coverage remained the same when pulling 0b139b4 on nateyoder:kmeans-nystroem into 48e2b13 on scikit-learn:master.

@nateyoder nateyoder changed the title from Implemented to Updated K-means clustering for Nystroem
@amueller
Owner

Hi @nateyoder.
Thanks for tackling this. Could you maybe post the plot from the example?
Have you experimented with some datasets and seen an improvement?

Cheers,
Andy

doc/modules/kernel_approximation.rst
@@ -35,9 +35,15 @@ Nystroem Method for Kernel Approximation
The Nystroem method, as implemented in :class:`Nystroem` is a general method
for low-rank approximations of kernels. It achieves this by essentially subsampling
the data on which the kernel is evaluated.
+The subsampling methodology used to generate the approximate kernel is specified by
+the parameter ``basis_method`` which can either be ``random`` or ``clustered``.
@amueller Owner
amueller added a note

I would call it kmeans instead of clustered, to be more specific.

@amueller Owner
amueller added a note

Maybe also basis_sampling or basis_selection?

Great suggestions. They are incorporated in the new version.

@amueller amueller commented on the diff
examples/plot_kernel_approximation.py
@@ -149,7 +167,7 @@
[kernel_svm_time, kernel_svm_time], '--', label='rbf svm')
# vertical line for dataset dimensionality = 64
-accuracy.plot([64, 64], [0.7, 1], label="n_features")
+accuracy.plot([64, 64], accuracy.get_ylim(), label="n_features")
@amueller Owner
amueller added a note

nice :)

@nateyoder

As far as performance goes, it seems to help a bit, but not quite as much as I had hoped. I think the difference would be bigger if the random selection method happened to select an outlier as part of the basis sampling set, but I didn't try different random seeds to make that occur.

[Figure: accuracy and training time for kernel approximation methods]

@coveralls

Coverage Status

Coverage remained the same when pulling 5f313f8 on nateyoder:kmeans-nystroem into 48e2b13 on scikit-learn:master.

@nateyoder nateyoder closed this
@nateyoder nateyoder deleted the nateyoder:kmeans-nystroem branch
@nateyoder nateyoder restored the nateyoder:kmeans-nystroem branch
@nateyoder

Sorry, I accidentally deleted the branch and I think doing so closed the issue. Sorry!!

@nateyoder nateyoder reopened this
@amueller
Owner

Have you tried it on a different dataset? This above is digits, right? Maybe try MNIST? Or is there some other dataset where RBF works well?

@amueller
Owner

I think this should help but I also think we should make sure that it actually does ;)

@ogrisel
Owner

> Have you tried it on a different dataset? This above is digits, right? Maybe try MNIST? Or is there some other dataset where RBF works well?

You could also try on Olivetti faces with RandomizedPCA preprocessing: http://scikit-learn.org/stable/auto_examples/applications/face_recognition.html

To try on a bigger dataset you can use LFW instead of Olivetti.

@nateyoder

Sounds great guys thanks for the suggestions. I'll give them a shot this week and post the results.

Also, I noticed my build failed, but it failed because of errors in OrthogonalMatchingPursuitCV. Do you guys know if this is an intermittent test or something I should look into?

@ogrisel
Owner

The travis failure is unrelated, you can ignore it.

@nateyoder

Sorry for the long layoff guys.

Finally got a chance to run amueller's MNIST example with k-means and random. As the graph shows, k-means does show some minor improvement, but nothing big. However, since it seems to almost always be a little better in the examples I tried, it might still be worth adding?

I briefly tried Olivetti, but I think because of the limited number of faces I saw a lot of variance in the output and didn't really get anything useful, other than that k-means definitely isn't a silver bullet. I didn't have time to look into LFW.

[Figure: MNIST example results]

@kastnerkyle
Owner

It seems consistent from the little I have seen thus far - I will try to run some tests as well. Looks pretty nice!

@ogrisel
Owner
@dougalsutherland

At first these results seemed to me at odds with the MNIST line in Table 2 of Kumar, Mohri and Talwalkar, "Sampling Methods for the Nyström Method", JMLR 2012. But actually, that table is showing the kernel reconstruction "accuracy" 100 * ||K - K_k||_F / ||K - \tilde{K}_k||_F, where K_k is the optimal rank-k reconstruction (the truncated SVD) and \tilde{K}_k is the rank-k Nyström approximation. I guess the kernel isn't as well-approximated by the uniform reconstruction, but it's still good enough to do classification with. Might be good to make sure that's the case.

Also, it might be better to use kmeans++ initialization rather than random; did you try that?
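For concreteness, here is a rough sketch of the reconstruction-accuracy metric from that table (not part of the PR; it uses the stock Nystroem transformer with an RBF kernel, and the gamma value is arbitrary):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem
from sklearn.metrics.pairwise import rbf_kernel

X = load_digits().data
k = 100
K = rbf_kernel(X, gamma=0.2)

# Optimal rank-k reconstruction K_k from the truncated SVD of K.
U, S, Vt = np.linalg.svd(K)
K_k = (U[:, :k] * S[:k]).dot(Vt[:k])

# Rank-k Nystroem reconstruction: \tilde{K}_k = Phi Phi^T.
Phi = Nystroem(gamma=0.2, n_components=k, random_state=0).fit_transform(X)
K_tilde = Phi.dot(Phi.T)

# Kumar et al. report 100 * ||K - K_k||_F / ||K - \tilde{K}_k||_F;
# the closer to 100, the closer the Nystroem approximation is to optimal.
print(100 * np.linalg.norm(K - K_k) / np.linalg.norm(K - K_tilde))
```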

@nateyoder

Brief update. I ran MNIST again to compare "better" clustering with k-means [k-means++ initialization, max_iter=300, and n_init=10] vs. k-means as suggested in the literature ['random' initialization, max_iter=5, n_init=1] vs. random Nystroem. As shown below, the much more time-intensive clustering has almost no impact on the classification performance while significantly increasing the time needed to train the model.

[Figure: k-means settings comparison vs. random Nystroem on MNIST]

I also did the same on LFW and the results are below. In this case k-means appears to show little to no consistent improvement over random selection. If you are interested, other than doing my own RBF grid search to find the optimal RBF parameters, I used the parameters found in http://nbviewer.ipython.org/github/jakevdp/sklearn_scipy2013/blob/master/rendered_notebooks/05.1_application_to_face_recognition.ipynb

[Figure: LFW results, k-means vs. random Nystroem]

I'll try to do the covertype test later this week if I get time and you guys think it is still needed.
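For reference, the two k-means configurations compared above correspond roughly to the following calls (just a sketch; the PR itself hard-codes the cheap settings inside `Nystroem.fit`):

```python
from sklearn.cluster import k_means
from sklearn.datasets import load_digits

X = load_digits().data
n_components = 100

# "Better" clustering: k-means++ seeding, full convergence, several restarts.
centers_pp, _, _ = k_means(X, n_components, init='k-means++',
                           max_iter=300, n_init=10, random_state=0)

# Cheap clustering as suggested by Zhang & Kwok (what the PR uses).
centers_cheap, _, _ = k_means(X, n_components, init='random',
                              max_iter=5, n_init=1, random_state=0)
```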

@ogrisel
Owner

Can you please rebase your branch on master and try with MiniBatchKMeans? This might be faster to converge while giving good enough centroids.
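Something along these lines could stand in for the `k_means` call inside `Nystroem.fit` for that experiment (a sketch, not part of the PR; the data and sizes are made up):

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rnd = np.random.RandomState(0)
X = rnd.uniform(size=(5000, 64))
n_components = 100

# MiniBatchKMeans converges much faster than full k-means on large
# datasets while usually yielding good enough centroids for the basis.
mbk = MiniBatchKMeans(n_clusters=n_components, batch_size=1000,
                      random_state=rnd)
basis = mbk.fit(X).cluster_centers_  # would take the place of `basis` in fit
basis_inds = None                    # centers are not rows of X
```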

doc/modules/kernel_approximation.rst (11 changed lines)
@@ -35,9 +35,15 @@ Nystroem Method for Kernel Approximation
The Nystroem method, as implemented in :class:`Nystroem` is a general method
for low-rank approximations of kernels. It achieves this by essentially subsampling
the data on which the kernel is evaluated.
+The subsampling methodology used to generate the approximate kernel is specified by
+the parameter ``basis_sampling`` which can either be ``random`` or ``kmeans``.
+If the ``random`` method is specified randomly selected data will be utilized in
+the approximation while the ``kmeans`` method uses the cluster centers found via
+k-means clustering. Further details concerning the subsampling methods can be found
+in [ZK2010]_.
By default :class:`Nystroem` uses the ``rbf`` kernel, but it can use any
kernel function or a precomputed kernel matrix.
-The number of samples used - which is also the dimensionality of the features computed -
+The number of bases used - which is also the dimensionality of the features computed -
is given by the parameter ``n_components``.
@@ -197,3 +203,6 @@ or store training examples.
.. [VVZ2010] `"Generalized RBF feature maps for Efficient Detection"
<http://eprints.pascal-network.org/archive/00007024/01/inproceedings.pdf.8a865c2a5421e40d.537265656b616e7468313047656e6572616c697a65642e706466.pdf>`_
Vempati, S. and Vedaldi, A. and Zisserman, A. and Jawahar, CV - 2010
+ .. [ZK2010] `"Clustered Nystroem method for large scale manifold learning and dimension reduction"
+ <http://www.cs.ust.hk/~jamesk/papers/tnn10b.pdf>`_
+ Zhang, K. and Kwok, J.T. - Neural Networks, IEEE Transactions on 21, no. 10 2010
examples/plot_kernel_approximation.py (106 changed lines)
@@ -48,7 +48,7 @@
# License: BSD 3 clause
# Standard scientific Python imports
-import pylab as pl
+import matplotlib.pyplot as plt
import numpy as np
from time import time
@@ -75,22 +75,29 @@
data_test, targets_test = data[n_samples / 2:], digits.target[n_samples / 2:]
#data_test = scaler.transform(data_test)
+
+kernel_gamma = 0.2
+
# Create a classifier: a support vector classifier
-kernel_svm = svm.SVC(gamma=.2)
+kernel_svm = svm.SVC(gamma=kernel_gamma)
linear_svm = svm.LinearSVC()
# create pipeline from kernel approximation
# and linear svm
-feature_map_fourier = RBFSampler(gamma=.2, random_state=1)
-feature_map_nystroem = Nystroem(gamma=.2, random_state=1)
+feature_map_fourier = RBFSampler(gamma=kernel_gamma, random_state=0)
+feature_map_random_nystroem = Nystroem(gamma=kernel_gamma, random_state=0, basis_sampling='random')
+feature_map_clusted_nystroem_ = Nystroem(gamma=kernel_gamma, random_state=0, basis_sampling='kmeans')
+
fourier_approx_svm = pipeline.Pipeline([("feature_map", feature_map_fourier),
("svm", svm.LinearSVC())])
-nystroem_approx_svm = pipeline.Pipeline([("feature_map", feature_map_nystroem),
- ("svm", svm.LinearSVC())])
+random_nystroem_svm = pipeline.Pipeline([("feature_map", feature_map_random_nystroem),
+ ("svm", svm.LinearSVC())])
-# fit and predict using linear and kernel svm:
+clustered_nystroem_svm = pipeline.Pipeline([("feature_map", feature_map_clusted_nystroem_),
+ ("svm", svm.LinearSVC())])
+# fit and predict using linear and kernel svm:
kernel_svm_time = time()
kernel_svm.fit(data_train, targets_train)
kernel_svm_score = kernel_svm.score(data_test, targets_test)
@@ -101,37 +108,48 @@
linear_svm_score = linear_svm.score(data_test, targets_test)
linear_svm_time = time() - linear_svm_time
-sample_sizes = 30 * np.arange(1, 10)
+sample_sizes = 10 * np.arange(1, 30)
fourier_scores = []
-nystroem_scores = []
+random_scores = []
+clustered_scores = []
fourier_times = []
-nystroem_times = []
+random_times = []
+clustered_times = []
for D in sample_sizes:
fourier_approx_svm.set_params(feature_map__n_components=D)
- nystroem_approx_svm.set_params(feature_map__n_components=D)
+ random_nystroem_svm.set_params(feature_map__n_components=D)
+ clustered_nystroem_svm.set_params(feature_map__n_components=D)
+
+ start = time()
+ random_nystroem_svm.fit(data_train, targets_train)
+ random_times.append(time() - start)
+
start = time()
- nystroem_approx_svm.fit(data_train, targets_train)
- nystroem_times.append(time() - start)
+ clustered_nystroem_svm.fit(data_train, targets_train)
+ clustered_times.append(time() - start)
start = time()
fourier_approx_svm.fit(data_train, targets_train)
fourier_times.append(time() - start)
- fourier_score = fourier_approx_svm.score(data_test, targets_test)
- nystroem_score = nystroem_approx_svm.score(data_test, targets_test)
- nystroem_scores.append(nystroem_score)
- fourier_scores.append(fourier_score)
+ fourier_scores.append(fourier_approx_svm.score(data_test, targets_test))
+ random_scores.append(random_nystroem_svm.score(data_test, targets_test))
+ clustered_scores.append(clustered_nystroem_svm.score(data_test, targets_test))
# plot the results:
-pl.figure(figsize=(8, 8))
-accuracy = pl.subplot(211)
-# second y axis for timeings
-timescale = pl.subplot(212)
+plt.figure(figsize=(8, 8))
+accuracy = plt.subplot(211)
+# second y axis for timings
+timescale = plt.subplot(212)
-accuracy.plot(sample_sizes, nystroem_scores, label="Nystroem approx. kernel")
-timescale.plot(sample_sizes, nystroem_times, '--',
- label='Nystroem approx. kernel')
+accuracy.plot(sample_sizes, random_scores, label="Random Nystroem approx. kernel")
+timescale.plot(sample_sizes, random_times, '--',
+ label='Random Nystroem approx. kernel')
+
+accuracy.plot(sample_sizes, clustered_scores, label="K-means Nystroem approx. kernel")
+timescale.plot(sample_sizes, clustered_times, '--',
+ label='K-means Nystroem approx. kernel')
accuracy.plot(sample_sizes, fourier_scores, label="Fourier approx. kernel")
timescale.plot(sample_sizes, fourier_times, '--',
@@ -149,7 +167,7 @@
[kernel_svm_time, kernel_svm_time], '--', label='rbf svm')
# vertical line for dataset dimensionality = 64
-accuracy.plot([64, 64], [0.7, 1], label="n_features")
+accuracy.plot([64, 64], accuracy.get_ylim(), label="n_features")
# legends and labels
accuracy.set_title("Classification accuracy")
@@ -165,7 +183,7 @@
# visualize the decision surface, projected down to the first
# two principal components of the dataset
-pca = PCA(n_components=8).fit(data_train)
+pca = PCA(n_components=2).fit(data_train)
X = pca.transform(data_train)
@@ -179,32 +197,40 @@
grid = first[np.newaxis, :, :] + second[:, np.newaxis, :]
flat_grid = grid.reshape(-1, data.shape[1])
+n_components_to_plot = 100
+
# title for the plots
titles = ['SVC with rbf kernel',
- 'SVC (linear kernel)\n with Fourier rbf feature map\n'
- 'n_components=100',
- 'SVC (linear kernel)\n with Nystroem rbf feature map\n'
- 'n_components=100']
+ 'SVC with linear kernel',
+ 'SVC (linear kernel)\n with Fourier rbf approx\n'
+ 'n_components={}'.format(n_components_to_plot),
+ 'SVC (linear kernel)\n with K-means Nystroem rbf approx\n'
+ 'n_components={}'.format(n_components_to_plot)]
+
+plt.tight_layout()
+plt.figure(figsize=(14, 4))
-pl.tight_layout()
-pl.figure(figsize=(12, 5))
+clustered_nystroem_svm.set_params(feature_map__n_components=n_components_to_plot)
+clustered_nystroem_svm.fit(data_train, targets_train)
+fourier_approx_svm.set_params(feature_map__n_components=n_components_to_plot)
+fourier_approx_svm.fit(data_train, targets_train)
# predict and plot
-for i, clf in enumerate((kernel_svm, nystroem_approx_svm,
+for i, clf in enumerate((kernel_svm, linear_svm, clustered_nystroem_svm,
fourier_approx_svm)):
# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, m_max]x[y_min, y_max].
- pl.subplot(1, 3, i + 1)
+ plt.subplot(1, 4, i + 1)
Z = clf.predict(flat_grid)
# Put the result into a color plot
Z = Z.reshape(grid.shape[:-1])
- pl.contourf(multiples, multiples, Z, cmap=pl.cm.Paired)
- pl.axis('off')
+ plt.contourf(multiples, multiples, Z, cmap=plt.cm.Paired)
+ plt.axis('off')
# Plot also the training points
- pl.scatter(X[:, 0], X[:, 1], c=targets_train, cmap=pl.cm.Paired)
+ plt.scatter(X[:, 0], X[:, 1], c=targets_train, cmap=plt.cm.Paired)
- pl.title(titles[i])
-pl.tight_layout()
-pl.show()
+ plt.title(titles[i])
+plt.tight_layout()
+plt.show()
sklearn/kernel_approximation.py (41 changed lines)
@@ -15,6 +15,7 @@
from .base import BaseEstimator
from .base import TransformerMixin
+from sklearn.cluster import k_means
from .utils import (array2d, atleast2d_or_csr, check_random_state,
as_float_array)
from .utils.extmath import safe_sparse_dot
@@ -376,6 +377,10 @@ class Nystroem(BaseEstimator, TransformerMixin):
If int, random_state is the seed used by the random number generator;
if RandomState instance, random_state is the random number generator.
+ basis_sampling : string "random" or "kmeans"
+ Form approximation using randomly sampled columns or k-means
+ cluster centers to construct the Nystrom Approximation
+
Attributes
----------
@@ -401,6 +406,10 @@ class Nystroem(BaseEstimator, TransformerMixin):
Comparison",
Advances in Neural Information Processing Systems 2012
+ * Zhang, Kai, and James T. Kwok.
+ "Clustered Nystroem method for large scale manifold learning and
+ dimension reduction",
+ Neural Networks, IEEE Transactions on 21, no. 10 2010
See also
--------
@@ -410,7 +419,8 @@ class Nystroem(BaseEstimator, TransformerMixin):
sklearn.metric.pairwise.kernel_metrics : List of built-in kernels.
"""
def __init__(self, kernel="rbf", gamma=None, coef0=1, degree=3,
- kernel_params=None, n_components=100, random_state=None):
+ kernel_params=None, n_components=100, random_state=None,
+ basis_sampling="random"):
self.kernel = kernel
self.gamma = gamma
self.coef0 = coef0
@@ -418,6 +428,7 @@ def __init__(self, kernel="rbf", gamma=None, coef0=1, degree=3,
self.kernel_params = kernel_params
self.n_components = n_components
self.random_state = random_state
+ self.basis_sampling = basis_sampling
def fit(self, X, y=None):
"""Fit estimator to data.
@@ -446,10 +457,18 @@ def fit(self, X, y=None):
else:
n_components = self.n_components
- n_components = min(n_samples, n_components)
- inds = rnd.permutation(n_samples)
- basis_inds = inds[:n_components]
- basis = X[basis_inds]
+
+ if self.basis_sampling == "random":
+ inds = rnd.permutation(n_samples)
+ basis_inds = inds[:n_components]
+ basis = X[basis_inds]
+ elif self.basis_sampling == "kmeans":
+ # Zhang and Kwok use 5 in their paper so let's do that
+ basis, _, _ = k_means(X, n_components, init='random', max_iter=5, n_init=1, random_state=rnd)
+ # If we are using k_means centers as input, we cannot record basis_inds
+ basis_inds = None
+ else:
+ raise NameError('{0} is not a supported basis_sampling method'.format(self.basis_sampling))
basis_kernel = pairwise_kernels(basis, metric=self.kernel,
filter_params=True,
@@ -457,9 +476,17 @@ def fit(self, X, y=None):
# sqrt of kernel matrix on basis vectors
U, S, V = svd(basis_kernel)
- self.normalization_ = np.dot(U * 1. / np.sqrt(S), V)
+
+ # Handle possible matrix singularity like scipy does in pinv2
+ t = U.dtype.char.lower()
+ factor = {'f': 1E3, 'd': 1E6}
+ cond = factor[t] * np.finfo(t).eps
+ rank = np.sum(S > cond * np.max(S))
+
+ self.normalization_ = np.dot(U[:, : rank] * 1. / np.sqrt(S[: rank]), V[: rank])
+
self.components_ = basis
- self.component_indices_ = inds
+ self.component_indices_ = basis_inds
return self
def transform(self, X):
sklearn/tests/test_kernel_approximation.py (133 changed lines)
@@ -138,32 +138,106 @@ def test_input_validation():
RBFSampler().fit(X).transform(X)
-def test_nystroem_approximation():
+def test_nystroem_approximation_with_number_samples_is_exact():
# some basic tests
rnd = np.random.RandomState(0)
X = rnd.uniform(size=(10, 4))
# With n_components = n_samples this is exact
- X_transformed = Nystroem(n_components=X.shape[0]).fit_transform(X)
+ ny_random = Nystroem(n_components=X.shape[0], basis_sampling='random')
+ X_transformed_random = ny_random.fit_transform(X)
K = rbf_kernel(X)
- assert_array_almost_equal(np.dot(X_transformed, X_transformed.T), K)
+ assert_array_equal(np.sort(ny_random.component_indices_), np.arange(X.shape[0]))
+ assert_array_almost_equal(np.dot(X_transformed_random, X_transformed_random.T), K)
+
+ ny_clustered = Nystroem(n_components=X.shape[0], basis_sampling='kmeans')
+ X_transformed_clustered = ny_clustered.fit_transform(X)
+ K = rbf_kernel(X)
+ # No component indices to report for k-means
+ assert_equal(ny_clustered.component_indices_, None)
+ assert_array_almost_equal(np.dot(X_transformed_clustered, X_transformed_clustered.T), K)
- trans = Nystroem(n_components=2, random_state=rnd)
- X_transformed = trans.fit(X).transform(X)
- assert_equal(X_transformed.shape, (X.shape[0], 2))
- # test callable kernel
- linear_kernel = lambda X, Y: np.dot(X, Y.T)
- trans = Nystroem(n_components=2, kernel=linear_kernel, random_state=rnd)
- X_transformed = trans.fit(X).transform(X)
+def test_nystroem_approximation_returns_appropriate_indices():
+ rnd = np.random.RandomState(0)
+ X = rnd.uniform(size=(10, 4))
+
+ ny_random = Nystroem(n_components=2, basis_sampling='random')
+ X_transformed = ny_random.fit_transform(X)
assert_equal(X_transformed.shape, (X.shape[0], 2))
+ assert_equal(len(ny_random.component_indices_), 2)
+ assert_array_almost_equal(ny_random.components_, X[ny_random.component_indices_])
+
+ ny_clustered = Nystroem(n_components=2, basis_sampling='kmeans')
+ ny_clustered.fit_transform(X)
+ # No component indices to report for k-means
+ assert_equal(ny_clustered.component_indices_, None)
- # test that available kernels fit and transform
- kernels_available = kernel_metrics()
- for kern in kernels_available:
- trans = Nystroem(n_components=2, kernel=kern, random_state=rnd)
- X_transformed = trans.fit(X).transform(X)
- assert_equal(X_transformed.shape, (X.shape[0], 2))
+
+def test_nystroem_approximation_with_singular_kernel_matrix():
+ rnd = np.random.RandomState(0)
+ X = rnd.uniform(size=(10, 4))
+ X = np.concatenate((X, X[-2:, :]), axis=0)
+
+ K = rbf_kernel(X)
+ assert_equal(np.linalg.matrix_rank(K), 10)
+
+ ny_random = Nystroem(n_components=X.shape[0], basis_sampling='random')
+ X_transformed = ny_random.fit_transform(X)
+ assert_equal(X_transformed.shape, (X.shape[0], 12))
+ assert_array_almost_equal(np.dot(X_transformed, X_transformed.T), K)
+
+
+def test_nystroem_approximation_for_multiple_kernels():
+ """test that Nystroem approximates kernel on random data"""
+ rnd = np.random.RandomState(0)
+ X = rnd.uniform(size=(10, 4))
+ trans_not_valid = Nystroem(n_components=2, random_state=rnd,
+ basis_sampling="not_a_valid_basis_sampling")
+ assert_raises(NameError, trans_not_valid.fit, X)
+
+ # Kernel tests to perform with each basis method used
+ def test_nystroem_approximation_with_basis(tested_basis):
+ # Test default kernel
+ trans = Nystroem(n_components=2, random_state=rnd, basis_sampling=tested_basis)
+ transformed = trans.fit(X).transform(X)
+ assert_equal(transformed.shape, (X.shape[0], 2))
+
+ # test callable kernel
+ linear_kernel = lambda X, Y: np.dot(X, Y.T)
+ trans = Nystroem(n_components=2, kernel=linear_kernel, random_state=rnd, basis_sampling=tested_basis)
+ transformed = trans.fit(X).transform(X)
+ assert_equal(transformed.shape, (X.shape[0], 2))
+
+ # test that available kernels fit and transform
+ kernels_available = kernel_metrics()
+ for kern in kernels_available:
+ trans = Nystroem(n_components=2, kernel=kern, random_state=rnd, basis_sampling=tested_basis)
+ transformed = trans.fit(X).transform(X)
+ assert_equal(transformed.shape, (X.shape[0], 2))
+
+ # Test default kernel
+ trans = Nystroem(n_components=2, random_state=rnd, basis_sampling=tested_basis)
+ transformed = trans.fit(X).transform(X)
+ assert_equal(transformed.shape, (X.shape[0], 2))
+
+ # test callable kernel
+ linear_kernel = lambda X, Y: np.dot(X, Y.T)
+ trans = Nystroem(n_components=2, kernel=linear_kernel, random_state=rnd, basis_sampling=tested_basis)
+ transformed = trans.fit(X).transform(X)
+ assert_equal(transformed.shape, (X.shape[0], 2))
+
+ # test that available kernels fit and transform
+ kernels_available = kernel_metrics()
+ for kern in kernels_available:
+ trans = Nystroem(n_components=2, kernel=kern, random_state=rnd, basis_sampling=tested_basis)
+ transformed = trans.fit(X).transform(X)
+ assert_equal(transformed.shape, (X.shape[0], 2))
+
+ # Go through all the kernels with each basis_sampling method
+ basis_sampling_methods = ("random", "kmeans")
+ for current_basis in basis_sampling_methods:
+ yield test_nystroem_approximation_with_basis, current_basis
def test_nystroem_poly_kernel_params():
@@ -172,10 +246,18 @@ def test_nystroem_poly_kernel_params():
X = rnd.uniform(size=(10, 4))
K = polynomial_kernel(X, degree=3.1, coef0=.1)
- nystroem = Nystroem(kernel="polynomial", n_components=X.shape[0],
- degree=3.1, coef0=.1)
- X_transformed = nystroem.fit_transform(X)
- assert_array_almost_equal(np.dot(X_transformed, X_transformed.T), K)
+ nystroem_random = Nystroem(kernel="polynomial", n_components=X.shape[0],
+ degree=3.1, coef0=.1, basis_sampling="random")
+ nystroem_k_means = Nystroem(kernel="polynomial", n_components=X.shape[0],
+ degree=3.1, coef0=.1, basis_sampling="kmeans")
+
+ transformed_k_means = nystroem_k_means.fit_transform(X)
+ transformed_random = nystroem_random.fit_transform(X)
+
+ assert_array_almost_equal(np.dot(transformed_k_means,
+ transformed_k_means.T), K)
+ assert_array_almost_equal(np.dot(transformed_random,
+ transformed_random.T), K)
def test_nystroem_callable():
@@ -190,8 +272,15 @@ def logging_histogram_kernel(x, y, log):
return np.minimum(x, y).sum()
kernel_log = []
- X = list(X) # test input validation
Nystroem(kernel=logging_histogram_kernel,
n_components=(n_samples - 1),
- kernel_params={'log': kernel_log}).fit(X)
+ kernel_params={'log': kernel_log}, basis_sampling="kmeans").fit(X)
+
+ assert_equal(len(kernel_log), n_samples * (n_samples - 1) / 2)
+
+ kernel_log = []
+ Nystroem(kernel=logging_histogram_kernel,
+ n_components=(n_samples - 1),
+ kernel_params={'log': kernel_log}, basis_sampling="random").fit(X)
+
assert_equal(len(kernel_log), n_samples * (n_samples - 1) / 2)