FEPointEvaluation: Implement n_active_entries_per_quadrature_batch #16891
ed6f0a2 to 812c3cd
         ExcMessage(
           "Calling this function only makes sense in fully vectorized mode."));
  if (q == n_q_batches - 1)
    return n_q_points_scalar & (n_lanes_user_interface - 1);
Is there a reason you changed from the modulo operation to this form? In my experience, compilers reduce modulo (power of two) into bitwise and with (power of two - 1), and to me the modulo operation is more readable.
We also use it here, in dealii/include/deal.II/matrix_free/fe_point_evaluation.h (lines 2833 to 2835 in 31a1263):
  if (const unsigned int n_filled_lanes =
        this->n_q_points_scalar & (n_lanes_internal - 1);
      n_filled_lanes > 0)
I can change this to modulo if you prefer.
I don't care too much; I was just wondering why you changed it. I think it is good to be consistent within the code base, and from a quick search I believe the modulo form is used more frequently (though it is hard to filter out other uses of &).
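The equivalence discussed in this thread can be checked in isolation. The following is a minimal sketch, not code from this PR: `remainder_mod`, `remainder_mask`, and `is_power_of_two` are hypothetical helper names introduced only for illustration. It shows that, for unsigned integers, `n % lanes` and `n & (lanes - 1)` agree when `lanes` is a power of two, and that the mask form silently gives a different result otherwise.

```cpp
// Hypothetical helpers for illustration only; none of these are deal.II API.

// Remainder computed with the modulo operator.
constexpr unsigned int remainder_mod(unsigned int n, unsigned int lanes)
{
  return n % lanes;
}

// Remainder computed with a bit mask. Only valid if lanes is a power of two.
constexpr unsigned int remainder_mask(unsigned int n, unsigned int lanes)
{
  return n & (lanes - 1);
}

// True if x has exactly one bit set.
constexpr bool is_power_of_two(unsigned int x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

// Power-of-two lane counts: both forms agree.
static_assert(is_power_of_two(4), "4 is a power of two");
static_assert(remainder_mod(10, 4) == 2, "10 % 4 == 2");
static_assert(remainder_mask(10, 4) == 2, "10 & 3 == 2");

// Non-power-of-two lane count: the mask form no longer computes a remainder.
static_assert(!is_power_of_two(6), "6 is not a power of two");
static_assert(remainder_mod(10, 6) == 4, "10 % 6 == 4");
static_assert(remainder_mask(10, 6) == 0, "10 & 5 == 0");
```

Since SIMD lane counts are powers of two, both forms should give the same value here; the modulo form simply does not depend on that precondition.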
812c3cd to 778ce95
/rebuild
This function makes it possible to loop over only the active lanes in a quadrature batch.
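
To illustrate the intended use, here is a self-contained sketch that does not use deal.II. The free functions `n_batches`, `n_active_entries`, and `sum_active`, the fixed `n_lanes = 4`, and the `std::array` stand-in for a vectorized value are all assumptions made for this example; they are not the signatures added by this PR. The sketch also assumes that a last batch that is exactly full reports `n_lanes` active entries.

```cpp
#include <array>
#include <vector>

// Assumed SIMD width for this sketch; must be nonzero.
constexpr unsigned int n_lanes = 4;

// Number of batches needed to hold n_points scalar quadrature points.
constexpr unsigned int n_batches(unsigned int n_points)
{
  return (n_points + n_lanes - 1) / n_lanes;
}

// Hypothetical stand-in for the per-batch query: number of lanes in batch q
// that carry real quadrature points. Requires q < n_batches(n_points).
constexpr unsigned int n_active_entries(unsigned int n_points, unsigned int q)
{
  // Every batch is full except possibly the last one, which holds the
  // remainder unless the points fill it exactly.
  return (q + 1 < n_batches(n_points) || n_points % n_lanes == 0) ?
           n_lanes :
           n_points % n_lanes;
}

// Sum the values of all real quadrature points, skipping padding lanes.
inline double
sum_active(const std::vector<std::array<double, n_lanes>> &batches,
           unsigned int                                    n_points)
{
  double sum = 0.;
  for (unsigned int q = 0; q < batches.size(); ++q)
    for (unsigned int v = 0; v < n_active_entries(n_points, q); ++v)
      sum += batches[q][v];
  return sum;
}
```

With 10 points and 4 lanes, the first two batches report 4 active entries and the last reports 2, so whatever is stored in the two padding lanes never enters the sum.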
FYI @ritthaler