New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

shift functionality for cfl grid related time step restrictions to directory for spatial operator #526

Merged

nfehn merged 1 commit into exadg:master from nfehn:cfl_condition

Aug 1, 2023

Member

nfehn commented Jul 27, 2023

I am not sure if all the code in the new file (which I have just copied) is correct for simplicial elements. @kronbichler @peterrum could you please have a look at this and let me in case something has to be changed for simplex. Then, I will address this in the other PR on simplex elements (PR #525), once the present PR has been merged.


          shift functionality for cfl condition and other grid related time ste…

a2eee3b

…p restrictions to directory for spatial operator

kronbichler approved these changes

View reviewed changes

Contributor

kronbichler left a comment

Since all code uses either cell->center() or MatrixFree + CellIntegrator, which should be agnostic of the underlying element type, this looks fine. If we want to support mixed meshes, we will obviously need to augment some code, because then we have different CellIntegrator objects on different cells, but that should be it.

When looking through the code, I observed some stylistic issues in terms of HPC programming practices that we could change, but it all those places have pre-existed and I merely noted them now. I can open a PR elsewhere if you do not want them now.

include/exadg/operators/grid_related_time_step_restrictions.h

+                                             double const       exponent_fe_degree = 3.0)
+              {
+                double const time_step =
+                  pow(global_min_cell_diameter, 2.0) / pow(fe_degree, exponent_fe_degree) / diffusivity;

Contributor

kronbichler Jul 30, 2023

Even though this is not in the critical path, we should use an efficient coding style that tries to avoid non-integer powers if possible,

Suggested change

      
                pow(global_min_cell_diameter, 2.0) / pow(fe_degree, exponent_fe_degree) / diffusivity;
          
                std::pow(global_min_cell_diameter, 2) / std::pow(fe_degree, exponent_fe_degree) / diffusivity;

include/exadg/operators/grid_related_time_step_restrictions.h

+                  if(cell->is_locally_owned())
+                  {
+                    // calculate maximum velocity
+                    dealii::Point<dim> point = cell->center();

Contributor

kronbichler Jul 30, 2023

Considering #524, shouldn't we use a mapping here? It won't probably change much, but it might help the consistency in the code base.

Member Author

nfehn Aug 1, 2023

agree

include/exadg/operators/grid_related_time_step_restrictions.h

Comment on lines +101 to +103

+                    U = vel.norm();
+                    if(U > max_U)
+                      max_U = U;

Contributor

kronbichler Jul 30, 2023

This leads to inefficient code. Albeit it won't matter here, it is better to consistently use

Suggested change

      
                  U = vel.norm();
          
                  if(U > max_U)
          
                    max_U = U;
          
                  max_U = std::max(max_U, vel.norm());

The reason is that max typically maps to single-instruction operations available in the instruction set (maxsd and related functions in the case of x86-64), whereas if leads to a conditional branch. If we would like to be even more efficient, we could compute the maximum for vel.norm_square() and only compute the the square root in the end once we return from the function.

include/exadg/operators/grid_related_time_step_restrictions.h

+                                             double const       exponent_fe_degree = 2.0)
+              {
+                double const time_step =
+                  global_min_cell_diameter / pow(fe_degree, exponent_fe_degree) / max_velocity;

Contributor

kronbichler Jul 31, 2023

Suggested change

      
                global_min_cell_diameter / pow(fe_degree, exponent_fe_degree) / max_velocity;
          
                global_min_cell_diameter / (std::pow(fe_degree, exponent_fe_degree) * max_velocity);

include/exadg/operators/grid_related_time_step_restrictions.h

+                  // loop over vectorized array
+                  double dt = std::numeric_limits<double>::max();
+                  for(unsigned int v = 0; v < dealii::VectorizedArray<value_type>::size(); ++v)

Contributor

kronbichler Jul 31, 2023

I think we discussed this elsewhere, but it would be safer to compute this as

Suggested change

      
                for(unsigned int v = 0; v < dealii::VectorizedArray<value_type>::size(); ++v)
          
                for(unsigned int v = 0; v < data.n_active_entries_per_cell_batch(cell); ++v)

because then we do not need to depend on the fact that delta_t_cell is valid.

include/exadg/operators/grid_related_time_step_restrictions.h


		double new_time_step = std::numeric_limits<double>::max();

		double const cfl_p = 1.0 / pow(degree, exponent_fe_degree);

Contributor

kronbichler Jul 31, 2023

I'm curious: Do we import namespace std somewhere, or why is pow without std:: well-defined? I think we should try to be consistent here and elsewhere and use std::pow.

include/exadg/operators/grid_related_time_step_restrictions.h

Comment on lines +248 to +256

+                    if(cfl_condition_type == CFLConditionType::VelocityNorm)
+                    {
+                      delta_t_cell = std::min(delta_t_cell, cfl_p / ut_xi.norm());
+                    }
+                    else if(cfl_condition_type == CFLConditionType::VelocityComponents)
+                    {
+                      for(unsigned int d = 0; d < dim; ++d)
+                        delta_t_cell = std::min(delta_t_cell, cfl_p / (std::abs(ut_xi[d])));
+                    }

Contributor

kronbichler Jul 31, 2023

Now again performance is not important here, but since divisions and square roots are expensive and everything else except ut_xi is constant across all quadrature point, I wonder if it would be better style to compute the maximum of the velocity (either norm_square for the directed one or std::abs(ut_xi[d])) in the loop over q and perform the division and square root outside.

include/exadg/operators/grid_related_time_step_restrictions.h

Comment on lines +281 to +282

		// at a slightly different time in the next time step and so on). This way, it can be ensured
		// that the sequence of time step sizes is exactly reproducible with the results being

Contributor

kronbichler Jul 31, 2023

I know this comment is pre-existing, but it is not ensured that the sequence of time steps is exactly reproducible: Within the gap between the truncation and roundoff, we still have the probability of making the wrong decision, as we saw in a failing test recently.

Suggested change

      
              // at a slightly different time in the next time step and so on). This way, it can be ensured
          
              // that the sequence of time step sizes is exactly reproducible with the results being
          
              // at a slightly different time in the next time step and so on). This way, we get a
          
              // high probability that the sequence of time step sizes is

Member Author

nfehn Aug 1, 2023

I also agree here. One could even ask whether we should truncate at all.

include/exadg/operators/grid_related_time_step_restrictions.h

+                    invJ  = transpose(invJ);
+                    ut_xi = invJ * u_x;
+                    u_va = std::max(u_va, ut_xi.norm());

Contributor

kronbichler Jul 31, 2023

Same here, I would prefer to use

Suggested change

      
                  u_va = std::max(u_va, ut_xi.norm());
          
                  u_va = std::max(u_va, ut_xi.norm_square());

and take the square root in line 341.

Member Author

nfehn Aug 1, 2023

I am not so sure about this type of change, since it probably makes the code harder to read/understand. The question is whether this change is really that relevant regarding the overall performance. If we make the change, the variable names should be changed to _square as well.

Member Author

nfehn commented Aug 1, 2023

I would suggest to merge the present PR and then address the open issues in a separate PR. I will create an issue for this.

nfehn mentioned this pull request

Refactor CFL time step calculations #529

Open

nfehn merged commit cc3c14a into exadg:master

4 checks passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment