Optimize clover storage and memory traffic #202

Closed
maddyscientist opened this issue Jan 8, 2015 · 7 comments

@maddyscientist
Member

There is a symmetry in the clover term that we do not currently exploit. Exploiting it would reduce both the memory footprint and the memory traffic in the kernels that use the clover term.

Specifically, if we consider the 2x2 block form of each chiral block of the clover matrix, we have

/ A   B \
\ B* -A /

We presently store only B* and don't double store B; however, we do double store A and -A. If we stored only A, the number of real numbers required to store the clover term would drop from 72 to 54, a 25% reduction in both memory footprint and memory traffic.
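To make the counting concrete, here is a minimal Python sketch of the reduced storage for one chiral block. It assumes A is a 3x3 Hermitian matrix (9 reals) and B a general 3x3 complex matrix (18 reals), which is what the 72 and 54 counts imply. The function names and the packing order are hypothetical and are not QUDA's actual layout.

```python
# Sketch only: names and packing order are hypothetical, not QUDA's storage layout.

def hermitian_to_reals(A):
    """Pack a 3x3 Hermitian matrix into 9 reals: 3 diagonal + 3 complex upper entries."""
    out = [A[i][i].real for i in range(3)]
    for i in range(3):
        for j in range(i + 1, 3):
            out += [A[i][j].real, A[i][j].imag]
    return out

def reals_to_hermitian(r):
    """Inverse of hermitian_to_reals: rebuild the full 3x3 Hermitian matrix."""
    A = [[0j] * 3 for _ in range(3)]
    for i in range(3):
        A[i][i] = complex(r[i], 0.0)
    k = 3
    for i in range(3):
        for j in range(i + 1, 3):
            A[i][j] = complex(r[k], r[k + 1])
            A[j][i] = A[i][j].conjugate()
            k += 2
    return A

def pack_chiral_block(A, B):
    """27 reals per chiral block: 9 for Hermitian A, 18 for general complex B."""
    out = hermitian_to_reals(A)
    for row in B:
        for z in row:
            out += [z.real, z.imag]
    return out

def unpack_chiral_block(r):
    """Rebuild the 6x6 block [[A, B], [B^dagger, -A]] from 27 reals."""
    A = reals_to_hermitian(r[:9])
    B = [[complex(r[9 + 2 * (3 * i + j)], r[10 + 2 * (3 * i + j)])
          for j in range(3)] for i in range(3)]
    M = [[0j] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            M[i][j] = A[i][j]
            M[i][j + 3] = B[i][j]
            M[i + 3][j] = B[j][i].conjugate()   # lower-left block is B^dagger
            M[i + 3][j + 3] = -A[i][j]          # lower-right block is reconstructed, not stored
    return M

# Made-up test values (A must be Hermitian).
A = [[1.0 + 0j, 0.5 + 0.25j, -0.125j],
     [0.5 - 0.25j, -2.0 + 0j, 0.75 + 0j],
     [0.125j, 0.75 + 0j, 0.5 + 0j]]
B = [[0.1 + 0.2j, 0.3 - 0.4j, 0.5j],
     [-0.6 + 0j, 0.7 + 0.8j, 0.9 - 1.0j],
     [1.1 + 0j, -1.2j, 1.3 + 1.4j]]

packed = pack_chiral_block(A, B)
M = unpack_chiral_block(packed)

reals_full = 2 * (6 + 2 * 15)     # two chiral blocks, each stored as a 6x6 Hermitian matrix
reals_reduced = 2 * len(packed)   # two chiral blocks, each stored as A and B only
print(reals_full, reals_reduced)
```

The trade-off in this sketch is a negation of A at load time in exchange for 9 fewer reals per chiral block.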

I suspect this optimization is of most utility for the twisted clover formulation, since I believe it requires two clover matrices and is therefore more tightly bound by memory footprint.

This symmetry does not hold directly for the clover inverse, so it would only apply to the kernels with the direct clover term. However, clearly we can use only 54 numbers for the clover inverse, since we could just load the direct matrix (using 54 numbers) and invert on the fly. This may be too computationally expensive, but more insight may be gained from considering the 2x2 block form of the inverse.
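As a sketch of what the 2x2 block form of the inverse looks like, the standard block-inverse (Schur complement) formula can be applied to the matrix above. This assumes that B* denotes the Hermitian conjugate of B and that A and both Schur complements are invertible; neither assumption is stated in this thread. Only the diagonal blocks are written out, with the off-diagonal blocks marked by an asterisk:

$$
\begin{pmatrix} A & B \\ B^\dagger & -A \end{pmatrix}^{-1}
=
\begin{pmatrix}
\left(A + B A^{-1} B^\dagger\right)^{-1} & \ast \\
\ast & -\left(A + B^\dagger A^{-1} B\right)^{-1}
\end{pmatrix}
$$

Under these assumptions the lower diagonal block is minus the upper one only when $B A^{-1} B^\dagger = B^\dagger A^{-1} B$, which need not hold when the 3x3 blocks do not commute.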

Anyway, as a first step, we should replace the clover direct matrix using this reduced storage form. After that is done, we can consider reduced storage options for the inverse.

@maddyscientist
Member Author

OK, on closer inspection I see that the clover term relation isn't quite what I wrote above. Since the regular Wilson diagonal term contribution is included in the clover matrix, the actual form is like this:

/ 1+A  B  \
\ B*  1-A /

so the relation between the upper and lower diagonal parts is slightly more involved, which likely means that the only way to get the memory saving for the inverse matrix is to invert the clover matrix on the fly.

@AlexVaq
Member

AlexVaq commented Jan 21, 2015

OK, understood, so I should get going with FMA soon, and then with the inversion on the fly.

Alex


@mathiaswagner
Member

Can’t you just recover your original form of the matrix using -1?
But maybe I just did not think that through.

@maddyscientist
Member Author

Yes, you can recover it trivially for the regular clover matrix, but for the inverse there is no easy relationship between the upper and lower diagonal blocks. So I believe inversion on the fly is required (happy to be proven wrong though).
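A small numerical check of this point, as a hedged Python sketch: for the direct matrix the lower diagonal block equals 2·1 minus the upper one, while for the inverse the same relation fails. The values of A and B are made up and chosen to be simple enough to verify by hand (diagonal A, B proportional to a cyclic shift). The helper names are hypothetical, and the dense Gauss-Jordan inversion only stands in for whatever an on-the-fly inversion kernel would actually do.

```python
# Sketch only: pure-Python dense linear algebra on one 6x6 chiral block.
# A and B are made-up test values, not lattice data.

def invert(M):
    """Gauss-Jordan inversion with partial pivoting (small dense matrices only)."""
    size = len(M)
    aug = [list(row) + [1 + 0j if i == j else 0j for j in range(size)]
           for i, row in enumerate(M)]
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(size):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[size:] for row in aug]

n = 3
a = [0.5, 0.25, -0.5]   # diagonal of Hermitian A (hypothetical values)
b = 0.25j               # B = b * cyclic shift (hypothetical values)
A = [[complex(a[i]) if i == j else 0j for j in range(n)] for i in range(n)]
B = [[b if j == (i + 1) % n else 0j for j in range(n)] for i in range(n)]

# Direct term including the Wilson diagonal: M = [[1 + A, B], [B^dagger, 1 - A]]
M = [[0j] * (2 * n) for _ in range(2 * n)]
for i in range(n):
    for j in range(n):
        one = 1.0 if i == j else 0.0
        M[i][j] = one + A[i][j]
        M[i][j + n] = B[i][j]
        M[i + n][j] = B[j][i].conjugate()
        M[i + n][j + n] = one - A[i][j]

def block(X, r, c):
    """Return the (r, c) 3x3 block of a 6x6 matrix."""
    return [row[c * n:(c + 1) * n] for row in X[r * n:(r + 1) * n]]

def max_dev_from_two_minus(upper, lower):
    """Largest entry of |lower - (2*1 - upper)|; zero if the relation holds exactly."""
    return max(abs(lower[i][j] - ((2.0 if i == j else 0.0) - upper[i][j]))
               for i in range(n) for j in range(n))

Minv = invert(M)
dev_direct = max_dev_from_two_minus(block(M, 0, 0), block(M, 1, 1))
dev_inverse = max_dev_from_two_minus(block(Minv, 0, 0), block(Minv, 1, 1))
print(dev_direct, dev_inverse)
```

In this example the deviation vanishes for the direct matrix and is of order one for the inverse, consistent with needing either full storage of the inverse or inversion on the fly. A single example does not rule out some other, less obvious relation.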

@AlexVaq
Member

AlexVaq commented Jan 21, 2015

In any case, there are other strong reasons to push the inversion on the fly, so I really think we (I) should start working on this seriously.


@mathiaswagner
Member

Well, the inverse was the thing I did not think about. Maybe one should try anyhow to see what happens with the additional one on the diagonal. I remember there are also some formulas for a diagonal matrix + ‘something’, but I don’t remember which restrictions there are on 'something'.

@weinbe2
Contributor

weinbe2 commented Aug 30, 2024

Addressed by #1213

@weinbe2 weinbe2 closed this as completed Aug 30, 2024