Skip to content

Commit

Permalink
Take advantage of broadcasting/implicit expansion in chebtech.clensha…
Browse files Browse the repository at this point in the history
…w().

This has two benefits:

 - There is a measurable speedup when evaluating a high-degree
   array-valued chebtech with many columns at many points.

 - It is no longer necessary to maintain duplicate versions of the code
   for the scalar and vector cases.

To expand on the second:  with implicit expansion, the *only* difference
between the scalar and vector codes is the indexing of the coefficient
array as c(k) vs. c(k,:).  I am unable to measure a speed difference
between these for the case when c has only one column.

This breaks compatibility with MATLAB versions older than R2016b.
  • Loading branch information
aaustin141 committed Oct 10, 2023
1 parent db207bc commit 2601ab1
Showing 1 changed file with 12 additions and 79 deletions.
91 changes: 12 additions & 79 deletions @chebtech/clenshaw.m
Original file line number Diff line number Diff line change
Expand Up @@ -20,18 +20,10 @@
%
% See also CHEBTECH.FEVAL, CHEBTECH.BARY.

% Copyright 2017 by The University of Oxford and The Chebfun Developers.
% Copyright 2017 by The University of Oxford and The Chebfun Developers.
% See http://www.chebfun.org/ for Chebfun information.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Developer note: Clenshaw is not typically called directly, but by FEVAL().
%
% Developer note: The algorithm is implemented both for scalar and for vector
% inputs. Of course, the vector implementation could also be used for the scalar
% case, but the additional overheads make it a factor of 2-4 slower. Since the
% code is short, we live with the minor duplication.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% X should be a column vector.
if ( size(x, 2) > 1 )
Expand All @@ -40,91 +32,32 @@
x = x(:);
end

% Scalar or array-valued function?
if ( size(c, 2) == 1 )
y = clenshaw_scl(x, c);
else
y = clenshaw_vec(x, c);
end

end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% function y = clenshaw_scl(x, c)
% % Clenshaw scheme for scalar-valued functions.
% bk1 = 0*x;
% bk2 = bk1;
% x = 2*x;
% for k = (size(c,1)-1):-1:1
% bk = c(k) + x.*bk1 - bk2;
% bk2 = bk1;
% bk1 = bk;
% end
% y = c(1) + .5*x.*bk1 - bk2;
% end

% function y = clenshaw_vec(x, c)
% % Clenshaw scheme for array-valued functions.
% x = repmat(x(:), 1, size(c, 2));
% bk1 = zeros(size(x, 1), size(c, 2));
% bk1 = zeros(size(x, 1), size(c, 2));
% bk2 = bk1;
% e = ones(size(x, 1), 1);
% x = 2*x;
% for k = (size(c,1)-1):-1:1
% bk = e*c(k,:) + x.*bk1 - bk2;
% bk2 = bk1;
% bk = c(k,:) + x.*bk1 - bk2;
% bk2 = bk1;
% bk1 = bk;
% end
% y = e*c(1,:) + .5*x.*bk1 - bk2;
% end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% y = c(1,:) + .5*x.*bk1 - bk2;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% These code are the same as above with the exception that it does two steps of
% This code is the same as above with the exception that it does two steps of
% the algorithm at once. This means that the intermediate variables do not need
% to be updated at each step, and can result in a non-trivial speedup. NH 02/14
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function y = clenshaw_scl(x, c)
% Clenshaw scheme for scalar-valued functions.
bk1 = 0*x;
bk2 = bk1;
x = 2*x;
n = size(c,1)-1;
for k = (n+1):-2:3
bk2 = c(k) + x.*bk1 - bk2;
bk1 = c(k-1) + x.*bk2 - bk1;
end
if ( mod(n, 2) )
tmp = bk1;
bk1 = c(2) + x.*bk1 - bk2;
bk2 = tmp;
end
y = c(1) + .5*x.*bk1 - bk2;
end

function y = clenshaw_vec(x, c)
% Clenshaw scheme for array-valued functions.
x = repmat(x(:), 1, size(c, 2));
bk1 = zeros(size(x, 1), size(c, 2));
bk1 = zeros(length(x), size(c, 2));
bk2 = bk1;
e = ones(size(x, 1), 1);
x = 2*x;
n = size(c, 1)-1;
for k = (n+1):-2:3
bk2 = e*c(k,:) + x.*bk1 - bk2;
bk1 = e*c(k-1,:) + x.*bk2 - bk1;
bk2 = c(k,:) + x.*bk1 - bk2;
bk1 = c(k-1,:) + x.*bk2 - bk1;
end
if ( mod(n, 2) )
tmp = bk1;
bk1 = e*c(2,:) + x.*bk1 - bk2;
bk1 = c(2,:) + x.*bk1 - bk2;
bk2 = tmp;
end
y = e*c(1,:) + .5*x.*bk1 - bk2;
y = c(1,:) + .5*x.*bk1 - bk2;

end

0 comments on commit 2601ab1

Please sign in to comment.