In GMRES(), a matrix V is defined with dimensions xkrylovlen by (m+1), where m is the number of iterations per restart and is set to 20 in the code.
The columns of this matrix are accessed through getColumn() and putColumn(), which gather a column into a vector and scatter it back. This is likely to be expensive: consecutive elements of a column lie in different rows of V, and because the Krylov vectors are large, these strided accesses will usually be cache misses. Such scatter/gather can also be a computational bottleneck when a vector unit is available. Transposing V, so that it is (m+1) by xkrylovlen and each Krylov vector is a contiguous row, would eliminate the need for any copying of data. We can then make v and w simple pointers:
// don't need this any more:
//double *v = new double[xkrylovlen];
//double *w = new double[xkrylovlen];
double *v,*w;
and change the getColumn() calls to set the pointer:
// don't need this any more:
//getColumn(v, V, k, xkrylovlen);
v = V[k]; // use this instead
and simply delete the putColumn() calls.
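For `v = V[k]` to give a contiguous Krylov vector, each row of the transposed V has to hold xkrylovlen consecutive doubles. The sketch below shows one way to get that layout. The helpers `newRowMajor` and `delRowMajor` are hypothetical names used only for illustration; the allocator the code actually uses may differ.

```cpp
#include <cstddef>

// Hypothetical helper (not from the original code): allocate a rows-by-cols
// matrix as one contiguous, zero-initialized buffer plus an array of row
// pointers, so that M[r] points at cols consecutive doubles.
double **newRowMajor(std::size_t rows, std::size_t cols) {
  double **M = new double *[rows];
  M[0] = new double[rows * cols]();  // () value-initializes to 0.0
  for (std::size_t r = 1; r < rows; r++)
    M[r] = M[0] + r * cols;          // row r starts r*cols elements in
  return M;
}

// Hypothetical helper: release a matrix created by newRowMajor.
void delRowMajor(double **M) {
  delete[] M[0];  // the single data buffer
  delete[] M;     // the row-pointer array
}
```

With this layout, `double **V = newRowMajor(m + 1, xkrylovlen);` followed by `double *v = V[k];` aliases row k directly, so writes through v land in V without any putColumn() step.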
This change also allows us to vectorize the use of V to modify the Krylov vector at the end of the loop:
// the new code
//
for (int j = 0; j < k; j++)
{
const double yj = y[j];
double* Vj = V[j];
// this will vectorize nicely
for (int i = 0; i < xkrylovlen; i++)
xkrylov[i] += yj * Vj[i];
}
// this was the old code.
//
//for (int jj = 0; jj < xkrylovlen; jj++) {
// tmp = 0.0;
// for (register int l = 0; l < k; l++)
// tmp += y[l] * V[l][jj];
// xkrylov[jj] += tmp;
//}
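As a sanity check on the loop interchange, the two orderings can be put side by side as standalone functions. This is a minimal sketch: the function names `updateRowWise` and `updateElementWise` and the parameter n (standing in for xkrylovlen) are illustrative and do not appear in the original code.

```cpp
// New ordering: the inner loop walks one Krylov vector V[j] contiguously,
// which is the unit-stride pattern a compiler can vectorize.
void updateRowWise(double *x, double *const *V, const double *y,
                   int k, int n) {
  for (int j = 0; j < k; j++) {
    const double yj = y[j];
    const double *Vj = V[j];
    for (int i = 0; i < n; i++)
      x[i] += yj * Vj[i];
  }
}

// Old ordering: for each element, the inner loop hops across the k rows.
void updateElementWise(double *x, double *const *V, const double *y,
                       int k, int n) {
  for (int i = 0; i < n; i++) {
    double tmp = 0.0;
    for (int l = 0; l < k; l++)
      tmp += y[l] * V[l][i];
    x[i] += tmp;
  }
}
```

Both functions add the same linear combination of the first k rows of V to x. The additions happen in a different order, so in floating point the results can differ by rounding even though they are mathematically identical.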