/
NearestPSDKernel.m
179 lines (169 loc) · 7.32 KB
/
NearestPSDKernel.m
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
%% Nearest positive semidefinite kernel
% Behnam Hashemi, February 2016
%%
% (Chebfun Example approx2/NearestPDKernel.m)
% [Tags: #Chebfun2, #SVD, #eigenvalue, #quasimatrix]
function NearestPSDKernel()
%% Introduction
% Bivariate kernels are functions which play an important role in the context
% of integral equations. Symmetric positive definite kernels are ubiquitous
% e.g., in radial basis function (RBF) methods [Ch. 3-4, 2]. Suppose
% that $K(x,y)$ is a given symmetric kernel defined on a domain
% $[a, b] \times [a, b]$, where symmetric means $K(x,y) = K(y,x)$.
% Then $K$ has a spectral expansion
%
% $$ K(x,y) = \sum_{i=1}^{\infty} \lambda_i\ q_i(x)\ q_i(y),\ \ \ \ \ (1)$$
%
% where $\{ q_1, q_2, \cdots \}$ are orthonormal eigenfunctions and
% $|\lambda_1| \geq |\lambda_2| \geq ... $ are eigenvalues of $K$. Although
% mathematically, $K(x,y)$ has an infinite number of eigenvalues, our focus is
% on kernels that numerically have a finite number of eigenvalues. Note
% that the eigenvalues of an analytic kernel on a finite interval tend to
% zero at least as fast as $R^{-n}$ for some fixed $R<1$ which depends on
% the domain of analyticity of the kernel [5]. We consider the problem of
% finding a nearest symmetric positive semidefinite (PSD) kernel $\hat{K}(x,y)$,
% i.e., one that minimizes $||K-\hat{K}||_2$. A solution to this problem is
%
% $$ \hat{K}(x,y) = \sum_{i=1}^{\infty} \max ( \lambda_i,0 ) \ q_i(x)\ q_i(y). $$
%
% In other words, in order to get a nearest positive semidefinite
% kernel, we just need to remove terms in the spectral expansion of $K$
% that correspond to negative eigenvalues. The function $\hat{K}$ is also
% called the _positive semidefinite part_ of $K$. This is a continuous
% analogue of the problem of finding a nearest positive semidefinite matrix
% [1]. See also [4], where the problem is solved for a general (not necessarily
% symmetric) matrix.
%
% In practice, we assume that $K$ can be approximated to accuracy
% $\epsilon_M$ by a finite rank function [8-9]. One can easily compute such
% an approximation in Chebfun, even though Chebfun2 has no _eig_ command
% currently. The point is that the singular value expansion
%
% $$ K(x,y) = U \Sigma V^T = \sum_{i=1}^{\infty} \sigma_i\ u_i(x)\ v_i(y), $$
%
% is related to the spectral expansion (1) as follows:
%
% $$ \Sigma = \mbox{diag}(|\lambda_1|, |\lambda_2|, \cdots), \mbox{ and } V
% = (\mbox{sign}(\lambda_1) u_1, \mbox{sign}(\lambda_2) u_2, \cdots ). $$
%
% See e.g., [p. 31, 3] for details in the discrete case. Let us try some
% symmetric indefinite kernels in Chebfun. The function
% `KHat = nearestPSD(K)` in the following computes $\hat{K}$.
%
% Let us recall that even though spectral expansion of an operator is
% classic [7], it makes sense to talk also about spectral expansion of a
% bivariate function in the context of approximation. But it is not our aim
% here to make this connection precise.
%% A multiquadric kernel
% One of the famous kernels used in the field of RBFs is the multiquadric.
% This kernel is indefinite in general [6], so its approximation by a
% positive semidefinite kernel is nontrivial. More specifically, as Grady
% Wright noticed, the multiquadric is a conditionally negative definite
% kernel, which means that it has one positive eigenvalue and the rest are
% negative. In the following we try $-1$ times the multiquadric, which is a
% conditionally positive definite kernel, and the nearest PSD kernel would
% be of rank one less than the negative of the multiquadric.
c = 0.01;
mq = @(x,y) -sqrt((x.^2+y.^2) + c^2);
K = chebfun2(mq,[-2 2 -2 2])
KHat = nearestPSD(K)
%%
% As we see, one negative eigenvalue of $K$ has been removed to form
% $\hat{K}$.
subplot(1,2,1), contourf(K), axis equal, title('Multiquadric kernel')
subplot(1,2,2), contourf(KHat), axis equal, title('Nearest PSD kernel')
%% A symmetric positive semidefinite kernel of Gaussian bumps
% Here we add $20$ Gaussian bumps with centers at the diagonal of the
% domain, so that the kernel itself is positive semidefinite and therefore
% we expect $\hat{K}$ to be the same as $K$.
gam = 50;
K = chebfun2(0);
rng(1)
for n = 1:20
x0 = 2*rand-1;
dK = chebfun2(@(x,y) exp(-gam*((x-x0).^2+(y-x0).^2)));
K = K + dK;
end
K, KHat = nearestPSD(K)
subplot(1,2,1), plot(K), title('A PSD sum of 20 bumps')
subplot(1,2,2), plot(KHat), title('Nearest PSD kernel')
snapnow
subplot(1,2,1), contourf(K), axis equal, title('A PSD sum of 20 bumps')
subplot(1,2,2), contourf(KHat), axis equal, title('Nearest PSD kernel')
%%
% Let's see how different $K$ and $\hat{K}$ are:
norm(K-KHat,'fro')
%% A symmetric indefinite kernel of Gaussian bumps
% Next we add $20$ Gaussian bumps with centers off the diagonal of the
% domain such that each two of them make a symmetric kernel.
gam = 100;
K = chebfun2(0);
rng(3)
for n = 1:10
x0 = 2*rand-1; y0 = 2*rand-1;
dK = chebfun2(@(x,y) exp(-gam*((x-x0).^2+(y-y0).^2)) ...
+ exp(-gam*((x-y0).^2+(y-x0).^2)));
K = K + dK;
end
K, KHat = nearestPSD(K)
subplot(1,2,1), plot(K), title('An indefinite sum of bumps')
subplot(1,2,2), plot(KHat), title('Nearest PSD kernel')
snapnow
subplot(1,2,1), contourf(K), axis equal, title('An indefinite sum of bumps')
subplot(1,2,2), contourf(KHat), axis equal, title('Nearest PSD kernel')
%% A function with horizontal and vertical ridges
K = cheb.gallery2('waffle')
KHat = nearestPSD(K)
%%
% Here, $14$ negative eigenvalues of $K$ are removed to compute a nearest
% symmetric positive semidefinite kernel.
subplot(1,2,1), plot(K), title('Waffle')
subplot(1,2,2), plot(KHat), title('Nearest PSD kernel')
snapnow
subplot(1,2,1), contourf(K), axis equal, title('Waffle')
subplot(1,2,2), contourf(KHat), axis equal, title('Nearest PSD kernel')
%%
function KHat = nearestPSD(K)
[U,S,V] = svd(K);
s = zeros(size(U,2),1);
for i=1:size(U,2)
s(i) = U(:,i)'*V(:,i); % inner product of eigenfunctions
end
Lambda = sign(s).*diag(S); % eigenvalues of the kernel K
% Nearest positive semidefinite function to K
ind = find(~(Lambda<0)); % indices of nonnegative eigenvalues
LambdaHat = Lambda(ind); % keep only nonnegative eigenvalues
U = U(:,ind); % keep only the corresponding eigefunctions
V = V(:,ind);
KHat = U*diag(LambdaHat)*V'; % form the solution
end
%% References
%
% 1. S. Boyd, Low rank approximation and extremal gain problems, Lecture
% notes for Introduction to Linear Dynamical Systems, Stanford
% University, 2015.
%
% 2. G. Fasshauer, _Meshfree Approximation Methods with MATLAB_, World
% Scientific, 2007.
%
% 3. P.C. Hansen, _Discrete Inverse Problems: Insight and Algorithms_,
% SIAM, 2010.
%
% 4. N.J. Higham, Computing a nearest symmetric positive semidefinite
% matrix, _Linear Algebra and its Applications_, 103 (1988) 103-118.
%
% 5. G. Little and J.B. Reade, Eigenvalues of analytic kernels, _SIAM
% Journal on Mathematical Analysis_ 15 (1984) 133-136.
%
% 6. C. S. Ong, X. Mary, S. Canu and A. J. Smola, Learning with
% non-positive kernels, _Proceedings of the 21st International
% Conference on Machine Learning_, Banff, Canada, 2004.
%
% 7. F. Riesz and B. Sz.-Nagy, _Functional Analysis_, Dover, 1990.
%
% 8. A. Townsend, _Computing with functions in two dimensions_, DPhil
% Thesis, University of Oxford, 2014.
%
% 9. A. Townsend and L.N. Trefethen, An extension of Chebfun to two
% dimensions, _SIAM Journal on Scientific Computing_ 35 (2013) C495-C518.
end