Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with linear solver convergence due to choice of Reference viscosity #1814

Open
jaustermann opened this issue Jun 15, 2017 · 2 comments

Comments

@jaustermann
Copy link
Contributor

I've had some issues with choosing the "right" reference viscosity (in models with varying viscosity) in order to get convergence of the linear system and was wondering if there might be some way of improving this. I've found that the choice of the reference viscosity strongly determines the convergence behavior of the linear solver. I assume that is because the reference viscosity is used to precondition the system and so if it's preconditioned well it converges well (right?). I understand that the reference viscosity should be some average of the viscosity model, but it's not clear to me what kind of average it should be and I also found that a "good" (i.e. leading to fast convergence) reference viscosity can vary depending on other aspects of the model run that aren't the viscosity profile. For example, I did two identical runs only switching the surface boundary condition (plate velocities vs. free slip) and I needed to choose a different reference viscosity to get convergence (1E21 vs 8E21). I assume that has to do with the different velocities in the two runs.

It is not very convenient to 'guess' the reference viscosity and I was wondering whether there could be a more internally consistent way for determining this parameter. One option would be to calculate a first guess of the reference viscosity within the code (e.g. a depth mean) and then calculate the residual for different reference viscosities around this mean. The code could then choose the reference viscosity that results in the smallest residual as the reference viscosity to use for the iteration. Another option might be to adjust the reference viscosity for things like RMS velocity, but I'm not sure what other parameters might affect the convergence or what the best way of adjustment would be.

Anyhow, I hope the issue is clear and I'd be interested in hearing some thoughts on this.

@gassmoeller
Copy link
Member

Hi Jacky,
sorry for the late reply. The topic is a bit involved, I also had to read myself into it again.

First a clarification: The reference viscosity is not really used to precondition (at least not in the sense of a preconditioner), it is used to scale the pressure solution and corresponding matrix block compared to the velocity block (more precisely the pressure scaling is reference viscosity / reference length scale, the second one is provided by the geometry model). This is done to achieve comparable orders of magnitude for pressure and velocity to avoid floating-point imprecision. Therefore when you change the reference viscosity, you essentially change how large the variable p_bar is that we solve for (not to be confused with the real pressure p, which stays unchanged) compared to v. Because both the matrix block and the solution variable is scaled the same way, the residual using different scaling is also the same. Therefore we can not use the residual to determine the correct scaling.

That being said I made the same experience, oftentimes lowering the pressure scaling (by lowering the reference viscosity) leads to much less iterations and a faster Stokes solve, and at least the SolCx benchmark solution does not seem to deteriorate (i.e. we are likely not getting a faster solver on the cost of accuracy).
This even holds for models with constant viscosity. Using a viscosity lower than the actual viscosity as reference viscosity speeds up the solution. This means using some kind of viscosity average to determine the reference might not even help. One question for @tjhei and @bangerth in this regard: In the 2012 paper you mention that we use a distance of 10 km (scaled by the total domain size) as reference length instead of the whole domain size, because "We found it is best ..." (section 3.2.4). Do you remember how you did this test? Because for many models I ran it looks like using the total domain size (therefore lowering the pressure scaling by about 2 orders of magnitude) seems to perform better.
@jaustermann do you see the same? I.e. does lowering the reference viscosity by 2 orders of magnitude usually solve the problems?

One thing that worries me a bit: When I increase the reference viscosity the solver breaks down at some point (too many gmres outer iterations), likely because it can no longer find an accurate pressure solution. But when I instead decrease the reference viscosity to absurdly low values (like 32 orders of magnitude lower than the initial value) the solver still converges, and the solution does not change much. If this would be an accuracy problem, should it not occur towards both ends? Am i essentially scaling the velocity vector to zero (compared to pressure), but it does not affect the solution? This would be weird.

@bangerth
Copy link
Contributor

bangerth commented Jul 7, 2017

@jaustermann -- in essence, the reference viscosity (divided by the length scale) is used to multiply the entire second equation. That's because the force balance and the mass conservation equations have different physical units, but we need to compare their residuals when running GMRES (namely, when we compute the norm of the current residual, which consists of the norms of the two equations added together) and to make that comparison useful in any reasonable sense, the two residuals need to have the same physical units. We achieve this by scaling one equation (which implies also scaling the pressure to keep the equations symmetric).

Now, if you choose the reference viscosity very large, then you multiply the second equation with a large factor -- and GMRES will stop as soon as the residual of the second equation is small because the residual of the first is (relatively) small to begin with because of the scaling factor. In other words, GMRES stops once you have found a divergence free field, whether or not it satisfies the force balance. On the other hand, if you choose the reference viscosity small, then you multiply the second equation with something small and GMRES will only stop once the force balance equation is satisfied, whether or not you have a divergence-free field. In the extreme limits of these two cases (choosing the reference viscosity equal to zero or infinity, or something that's numerically equivalent) you end up with a situation where the system is either under- or overdetermined (if the reference viscosity is either infinity or zero); in the first case, you converge quickly to something, in the latter case you won't converge to anything at all -- which is exactly the behavior @gassmoeller observes.

As for why we choose 10km: I believe that it was a compromise between accuracy and stability of the solver with regard to it being too small or too large, but I don't recall the details. It does make sense, though, if you think about what the scaling factor represents: In the first equation, you have the operator div (eta grad u), whereas in the second equation you have div u. So to scale the second, you need to multiply it by (i) 1/L that corresponds to the length scale seen by the differential operator, i.e., the length scale over which the velocity field changes, and (ii) eta. Obviously, the length scale over which the velocity field changes depends on the viscosity itself, but in geophysical situations it is around 10 km if you consider global convection models. That's what we had in mind at the time, but I'm sure there would be a better way today to figure out what an appropriate length scale would be -- I suspect it's going to be something that could be computed from a Rayleigh number, for example because we know that the larger Ra, the smaller the length scale of solution features. I just don't know what the exact scaling is -- is it L = D/Ra or L = D/sqrt(Ra) or something else, where D is the diameter of the domain.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Issue tracker
  
Questions?
Development

No branches or pull requests

3 participants