Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change logic in tensor product kernels to positive statements? #13432

Merged
merged 3 commits into from Feb 22, 2022

Conversation

nfehn
Copy link
Contributor

@nfehn nfehn commented Feb 22, 2022

I was wondering whether there is a reason to write the code as is? Performance aspects?

From a logical perspective, I would prefer to have positive statements. If you agree, I would be willing to refactor all the other occurrences.

@kronbichler
Copy link
Member

There is no difference in terms of performance between the two (at least I hope so), the reason is primarily historical from the time Katharina first wrote the code for the non-add version back in 2009, and we added addition as a second path. So I would be in favor of making this change as you suggest.

@nfehn
Copy link
Contributor Author

nfehn commented Feb 22, 2022

ok, I will probably do all the changes later today. Is this the only file or does it make sense to also consider other files/classes in this PR?

@nfehn
Copy link
Contributor Author

nfehn commented Feb 22, 2022

Another thing I have seen more often is something like

else
  r0 = Number();

For me it looks like this is superfluous? If so, I would, however, say that this should be addressed in a separate PR.

@nfehn
Copy link
Contributor Author

nfehn commented Feb 22, 2022

Furthermore, I believe readability would be greatly improved by introducing empty lines here and there between the many if, else, for branches.

@kronbichler
Copy link
Member

No, the r0 = Number(); statements are really because they augment the if clause of those codes that runs only when the degree is larger than 0 or the number of quadrature points is larger than 1. (The reason why this is not done unconditionally also for the cases where we later overwrite the variables with a different content is that it made compilers to emit unnecessary instructions at some point in the past. It might not be the case any more with better compilers than 12 years ago when the initial optimizations were made, but one would need to check that.)

@nfehn
Copy link
Contributor Author

nfehn commented Feb 22, 2022

So the definition of the variable does not call the default constructor?

@kronbichler
Copy link
Member

Furthermore, I believe readability would be greatly improved by introducing empty lines here and there between the many if, else, for branches.

I agree that this would help for some of the statements. However, I would probably not put them everywhere because if a function consumes too much vertical space it gets more difficult to fit on a screen, which also hampers readability. (But we might want to refactor the code at some point to break out 1D interpolation formulas for the different cases, i.e., use something like in https://github.com/kronbichler/multigrid/blob/experimental_merge_vector/common/matrix_vector_kernel.h that gets picked up in the other loops, to avoid the redundancy mentioned in the last bullet point of the comment #9545 (comment) and to also make the code more understandable.)

@kronbichler
Copy link
Member

ok, I will probably do all the changes later today. Is this the only file or does it make sense to also consider other files/classes in this PR?

I think this is where we collect most statements of this kind, so I think the overlap with other parts is probably not very high and follows a different pattern. Hence, I would consider improvements made here to already give a good benefit.

@kronbichler
Copy link
Member

kronbichler commented Feb 22, 2022

So the definition of the variable does not call the default constructor?

No, it does not. [Edit: It does call a default constructor, but the default constructor does not zero-initialize the memory, so the content of the variable is undefined until it is assigned a value.] This code is written for fundamental types like double or float (or aggregate types like VectorizedArray) where the creation does not zero-initialize the objects. (On purpose, for performance reasons, because these get created and set up in inner loops.)

Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thank you.

@kronbichler
Copy link
Member

/rebuild

Copy link
Member

@kronbichler kronbichler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a number of test failures. I believe they all go back to the error below - can you try to adjust and see if things are working then?

if (add == false)
out[col * stride] = shape_values[col] * in[0];
if (add)
out[col * stride] += shape_values[col] * in[0];
else
out[col * stride] += shape_values[col] * in[0];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
out[col * stride] += shape_values[col] * in[0];
out[col * stride] = shape_values[col] * in[0];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh. I pushed a new commit.

Copy link
Member

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, nice job. Negatives are hard to read.

@bangerth
Copy link
Member

/rebuild

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants