WIP: TFQMR classes and reference kernels #23
Conversation
rel_residual_goal_)) {
        break;
    }
    if (iter % 2 != 0) {
Maybe this if clause can go into a helper function after combining steps 1 and 5, so that you just pass your different inputs to the helper function.
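A minimal sketch of what such a helper could look like. This is purely illustrative: the name `update_direction` and the update formula are assumptions, not the actual Ginkgo code; the point is only that both parity branches call one function with their own inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (names are illustrative, not the actual Ginkgo code):
// the parity-dependent update shared by steps 1 and 5 moves into one
// function, and each call site passes its own vectors and scalar.
void update_direction(std::vector<double>& d, const std::vector<double>& u,
                      double alpha)
{
    for (std::size_t i = 0; i < d.size(); ++i) {
        d[i] = u[i] + alpha * d[i];  // same formula at both call sites
    }
}
```

Both the even and the odd branch would then reduce to a single call with different arguments, instead of duplicating the loop body.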
6276048 to c91b7cb
I didn't go through the whole thing, but I did find some serious problems with it - it seems that something is not right here. See inline comments. (That's why I haven't looked at the rest in more detail.)
Also, I think we have a general problem with reviewing Krylov solvers, see #29.
Spent the whole day on this, but at least something useful came out of it: found a bug here, found several possible improvements, started a discussion on how we review Krylov solvers, reminded myself of the details of CG, and created a draft for better CG documentation.
{
    for (size_type j = 0; j < d->get_num_cols(); ++j) {
        if (alpha->at(j) != zero<ValueType>()) {
            sigma->at(j) = theta->at(j) / alpha->at(j) * eta->at(j);
step_2 description says sigma = (theta^2 / alpha) * eta
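For comparison, a sketch of the per-column update as the step_2 description states it. Whether the description or the code is the intended formula is for the authors to decide; the function name here is made up.

```cpp
#include <cassert>

// Sketch only: if the step_2 description (sigma = (theta^2 / alpha) * eta)
// is the intended formula, the per-column update would read as below.
// Note the code above computes theta / alpha * eta, i.e. theta to the
// first power, which is the mismatch being pointed out.
double sigma_from_description(double theta, double alpha, double eta)
{
    return theta * theta / alpha * eta;  // theta^2 / alpha * eta
}
```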
        } else {
            theta->at(j) = one<ValueType>();
        }
        auto tmp = one<ValueType>() / sqrt(one<ValueType>() + theta->at(j));
description says c_mp1 = 1 / (1 + theta)
Ad->copy_from(d.get());

for (int iter = 0; iter < max_iters_; ++iter) {
    if (iter % 2 == 0) {
Trying to mask two clearly different iterations into one has serious drawbacks:
- It is virtually impossible to understand (I just spent several hours trying to figure out what's going on here!). Krylov methods already have extremely complicated loop invariants, adding control statements to them just to decrease code size makes reasoning about them difficult. You need to mentally unroll the loop anyway, so why not just write it like that in the first place?
- You can merge more kernels than you have currently merged.
- You have quite a few additional operations that are completely unnecessary.
After figuring out what this does (all without understanding the Krylov method, just by looking at the sequence of operations), I found that there must be at least 1 bug in the algorithm here, that you actually need only 5 merged kernels for 2 iterations (instead of 7 for 1) and that you can avoid copying vectors 3 times and scalars 1 time over the course of 2 iterations. This is all by taking only a superficial look, and it might be possible to get even more savings. Of course, it didn't make sense to look into it in more detail, since there's a bug in the algorithm anyway.
See the following file for details: tfqmr.txt
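The restructuring being suggested can be sketched as follows. This is an illustration of the loop shape only, not the actual solver: the half-step bodies are placeholders, and all names are made up. One trip of the loop is two explicit half-iterations, instead of a single body that branches on `iter % 2`.

```cpp
#include <cassert>

// Illustrative restructuring only (not the actual TFQMR solver): the two
// clearly different iterations are written out in sequence, so each half's
// loop invariants can be stated in one place instead of being hidden behind
// a parity check.
struct State {
    int even_steps = 0;
    int odd_steps = 0;
};

void even_half_step(State& s) { ++s.even_steps; }  // placeholder for even-iter work
void odd_half_step(State& s) { ++s.odd_steps; }    // placeholder for odd-iter work

void run_unrolled(State& s, int max_iters)
{
    // one trip of this loop = two half-iterations, written explicitly
    for (int iter = 0; iter < max_iters; iter += 2) {
        even_half_step(s);
        odd_half_step(s);
    }
}
```

With the two halves unrolled like this, kernels that are shared between them become visible and can be merged, which is exactly the saving described above.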
I tried to keep the code as close as possible to the MATLAB code. The idea was that this could be used by users to match the functionality. As you outlined in #29, there may be another, more efficient way to go.
You can merge more kernels than you have currently merged.
Yes, I think we had this discussion on the phone. I would like to see timings to decide whether it is a good idea. But sure, I can do that.
You have quite a few additional operations that are completely unnecessary.
Not sure about that. For example, I think
[u_mp1] = u_m - [alpha] * v ! ERROR: the result is never used
is needed in the update step
pu_m = M^-1 [u_mp1] <- apply
for the even iteration counts.
you actually need only 5 merged kernels for 2 iterations (instead of 7 for 1)
Yes, we have more merging in MAGMA-sparse. And I wanted to see the performance difference because less merging is easier to understand.
I tried to keep the code as close as possible to the MATLAB code. The idea was that this could be used by users to match the functionality.
Functionality isn't dictated by implementation details, we shouldn't blindly stick to something if it doesn't work well.
Yes, I think we had this discussion on the phone. I would like to see timings to decide whether it is a good idea. But sure, I can do that.
I am aware of our discussion:
- Even if you keep scalar updates separate from vector updates, you can still merge more.
- Either way, you should merge Ginkgo kernels, and have some of them call two CUDA kernels. This performance issue is an implementation detail, which can differ for different devices, and shouldn't leak to the top-level solver implementation.
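The layering argued for in the second bullet can be sketched like this. All names are made up and the "kernels" are plain loops standing in for device launches; the point is only the structure: the solver sees one logical step, and whether the backend realizes it as one fused kernel or two separate launches stays a backend implementation detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace backend {
// stands in for one low-level (e.g. CUDA) kernel launch
void scale(std::vector<double>& y, double b)
{
    for (auto& v : y) v *= b;
}
// stands in for a second low-level kernel launch
void axpy(std::vector<double>& y, const std::vector<double>& x, double a)
{
    for (std::size_t i = 0; i < y.size(); ++i) y[i] += a * x[i];
}
}  // namespace backend

// The single solver-level step: y = b * y + a * x. The solver only ever
// calls this; a different device backend could implement it as one fused
// kernel without touching the solver.
void merged_step(std::vector<double>& y, const std::vector<double>& x,
                 double a, double b)
{
    backend::scale(y, b);
    backend::axpy(y, x, a);
}
```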
Not sure about that. For example, I think
[u_mp1] = u_m - [alpha] * v ! ERROR: the result is never used
is needed in the update step
pu_m = M^-1 [u_mp1] <- apply
for the even iteration counts.
u_mp1 gets updated before it's used here: [u_mp1] = w + [beta] * u_m. Just search for u_mp1 in any of the codes (the original file, my transcript of the code, or the unrolled one), and you'll see that there are two consecutive assignments to u_mp1.
This is something that's hard to notice in the original version of the code due to the convoluted implementation. If that's based on MATLAB's code, then their implementation is quite horrible.
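The dead-store pattern being described can be reduced to a minimal example. The inputs and names below are made up (this is not the actual TFQMR code); it only shows why the first assignment is wasted work: the result is overwritten before any read, so the output does not depend on it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal illustration of two consecutive assignments to the same vector:
// the first loop's result is never read, so it is a dead store.
std::vector<double> two_assigns(const std::vector<double>& u_m,
                                const std::vector<double>& v,
                                const std::vector<double>& w,
                                double alpha, double beta)
{
    std::vector<double> u_mp1(u_m.size());
    for (std::size_t i = 0; i < u_m.size(); ++i)
        u_mp1[i] = u_m[i] - alpha * v[i];  // dead store: never read
    for (std::size_t i = 0; i < u_m.size(); ++i)
        u_mp1[i] = w[i] + beta * u_m[i];   // overwrites before any use
    return u_mp1;  // independent of v and alpha
}
```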
Please take some more time to look through my comments (at least the error and opt ones), and point out exactly what's wrong if something is wrong with them.
less merging is easier to understand.
I disagree - we have comments below, so you don't look at the merged call anyway. Where's the difference for the user if he sees calls to step_1, ..., step_5 as opposed to step_1, ..., step_7?
Also this code has far greater issues when it comes to "understanding".
Closing this due to inactivity and interface changes done to the main Ginkgo branch that make this implementation obsolete (e.g. updated EnableLinOp mixin implementation, stopping criteria, loggers). We should probably keep the branch for reference, until a proper TFQMR implementation is made.
This is the initial TFQMR pull request: core functionality + reference kernels.