Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complex ANK+NK #130

Merged
merged 27 commits into from
Mar 19, 2021
Merged

Complex ANK+NK #130

merged 27 commits into from
Mar 19, 2021

Conversation

anilyil
Copy link
Contributor

@anilyil anilyil commented Mar 9, 2021

Purpose

This PR improves the complex ANK and NK solvers, as well as improving how complex residuals and printouts are handled. With @joanibal, we tried to figure out how the complex step method worked for implicit solvers. We concluded that the path of convergence is not important, but in the end, complex residuals must be converged to obtain accurate derivatives. This means that during a complex-step evaluation, we dont need to re-set flow, and just call the NK/ANK solvers repeatedly so that complex residuals converge. This PR has a few changes that makes this possible.

Here is a list of what we did with this PR:

  1. The residual printout is fixed in complex mode. The code used to print the square of the residual, and the imaginary part of the printed value is equivalent to an interesting derivative (derivative of the solver update wrt complex-perturbed DV), but it was not printing the true complex residual. Because we need to converge the complex residuals, we print these directly now. (interestingly, complex residuals go to zero also when the derivative of the solver update wrt complex-perturbed DV also goes to zero, but the residual norm is easier to understand and is always positive)
  2. The funcional printout is also improved. The output now looks like this:
#-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
#  Grid  | Iter | Iter |  Iter  |   CFL   | Step | Lin  |    Wall    |            Res rho              |                    C_drag                       |            totalRes             |
#  level |      | Tot  |  Type  |         |      | Res  | Clock (s)  |                                 |                                                 |                                 |
#-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      1       0      0     None     ----    ----   ----  0.39389E-02   0.14551813E-14 +0.11287700E-129i  0.4710284646449747E-01 +0.1156509033894532E-123i  0.45644342E-12 +0.84458180E-128i
      1       1     21      *NK     ----    1.00  0.000  0.70712E+00   0.13697775E-14 +0.15267370E-132i  0.4710284646449758E-01 +0.1156509263873306E-123i  0.33574505E-12 +0.77222624E-131i

We print all 16 digits of real and complex parts for the functionals because the real part is the value and the complex part is the derivative. However, we do not print all digits for residuals since their magnitude is more important anyways. I have left about 8 digits on both real and complex parts of the residuals for debugging purposes. The nonlinear iteration header is also fixed so that things are aligned. I also explicitly give 3 digits to the exponent because we are now interested in doing smaller complex-steps that underflow and do not affect real parts when the real values are zero.

  1. The complex build process is slightly improved. Complexify.f90 is moved to src_cs. We now recompile only the changed files in complex mode. We used to do this for fortran modules but still would copy over other include files which caused long build times. This is fixed.

  2. The absolute tolerance (atol) in the NK solver is set separately for complex mode. For the real mode, we set this atol to be 0.01 lower than the real L2 convergence target. This makes sure that the NK solver solves the linear system well enough to put us across the finish line, while not doing any extra work. This approach does not work with complex step. Because derivatives lag, complex residuals must be converged, even after the real ones hit machine zero. Furthermore, if we want to converge the complex system without re-setting the flow, then the real residuals will hang around machine zero and we again need the linear solver to actually solve the system and not quit early. Same fix is also applied to the ANK solver.

  3. ANK solver is modified to handle complex mode. This involved several changes to the iteration algorithm as well as the line searches. I am still not 100% on some details, but it works, and the parts I dont understand go to zero as the CFL number is increased, so it should be fine. We only really want to avoid having the complex part diverge. As long as a relatively stable and low value of complex state is maintained, the complex system needs to be converged after the real part converges anyways due to the lag, so again, it should be fine.

  4. We added a nonlinear solver test for complex ANK/NK (but see the unfinished item no 2 below).

Here is a list that we did not do yet, and should be done in a future PR:

  1. The complex build process needs to be cleaned up a bit more. This is not terribly bad, I just did not want to delay this PR for that because the complex build does work now. This will be a future PR.
  2. We should add better ANK and NK solver tests. There are a few features that we want to maintain, and the simple test we added here can be modified to include more tests. Depending on feedback, we can do this in this PR, or in a future PR.
  3. The complex residual computation squares the real and complex parts separately to compute the two separate residuals. This will underflow with a complex step of 1e-200 and the residuals will just print zero. We ideally want to fix this square=>squareroot process so that the values do not underflow.

Type of change

What types of change is it?
Select the appropriate type(s) that describe this PR

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (non-backwards-compatible fix or feature)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Documentation update
  • Maintenance update
  • Other (please describe)

Testing

We added complex-step tests for the ANK and NK solvers.

Checklist

Put an x in the boxes that apply.

  • I have run flake8 and black to make sure the code adheres to PEP-8 and is consistently formatted
  • I have run unit and regression tests which pass locally with my changes
  • I have added new tests that prove my fix is effective or that my feature works
  • I have added necessary documentation

@anilyil anilyil requested a review from a team as a code owner March 9, 2021 09:14
@anilyil anilyil marked this pull request as draft March 9, 2021 09:14
@anilyil anilyil requested a review from joanibal March 9, 2021 09:14
@anilyil
Copy link
Contributor Author

anilyil commented Mar 9, 2021

So I still have a few TODOs before this is ready to merge. The fixes only work for decoupled ANK. I will need to fix the coupled ANK and turbANK until the PR is ready. Also, I want to add a test for multiple modes of ANK/NK in both real and complex modes. Finally, I think we can also update the existing complex tests to use ANK/NK with this PR.

I wanted to create the PR to track these TODOs, and also have azure test the changes. Any feedback is welcome.

@nwu63 I plan on fixing the complex compile in a separate PR. I have most of it laid out mentally, just need to figure out a few details. Once I create that PR, someone else can apply the similar changes to idwarp (I believe its the only other code we have with a similar complex build process?)

@anilyil
Copy link
Contributor Author

anilyil commented Mar 9, 2021

The tests fail because of the complex ank/nk test. That wasnt tagged as cmplx_ as it should be. I will take care of it soon. We will need to update input files as well

src/NKSolver/NKSolvers.F90 Outdated Show resolved Hide resolved
@anilyil
Copy link
Contributor Author

anilyil commented Mar 10, 2021

I have updated the input files and added one real and one complex test for solver combinations. These should pass now. Then I will add more combinations with different options.

@anilyil anilyil marked this pull request as ready for review March 10, 2021 11:43
@anilyil
Copy link
Contributor Author

anilyil commented Mar 10, 2021

This one is good to go for the review. I have added solver combination tests that switch between smoother, ANK, SANK, CSANK, and NK. I test all equation types with these, also with the 2 turbulence solvers that can be used in ANK. I still test the turbulence solvers with Euler and laminar NS because in the past, we had a few issues with these cases seg-faulting. With the current test, we just expect equation modes that does not have any turbulence model to just work regardless of the turbulence options specified. I test both real and complex versions with these.

The test itself uses a simple 2^3 block originally made by @joanibal. I have modified the mesh slightly so there is a wall, and rest of the BCs are farfield. The flow also comes in at an angle to the wall so that Euler residuals are not zero with free stream.

I have also modified existing complex tests to use ANK for the solution. @sseraj can you please check if I did it correctly? For the euler tests at least, I had to set the second order switch to get it to converge; it would oscillate a bit otherwise. Now it looks it is converging okay, but I dont really have a reference on how fast it should be.

The only remaining item that I wanted to possibly do with this PR is the "unfinished" item 3 I listed above. A complex step size of around 1e-160 and below will underflow with the residual norm computation and the residual monitor will print zero for complex residuals. We can maybe include that in the "complex magic" PR that does a few fancy things. We can also update the step sizes in complex tests in that PR so that they use the underflow version. I will create an issue about this so we remember what is going on.

@anilyil anilyil requested a review from sseraj March 10, 2021 16:00
Copy link
Collaborator

@sseraj sseraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the code changes look good from what I understand. Some clarification questions and comments below

src/NKSolver/NKSolvers.F90 Outdated Show resolved Hide resolved
src/NKSolver/NKSolvers.F90 Outdated Show resolved Hide resolved
src/NKSolver/NKSolvers.F90 Outdated Show resolved Hide resolved
src/NKSolver/NKSolvers.F90 Outdated Show resolved Hide resolved
src/NKSolver/NKSolvers.F90 Outdated Show resolved Hide resolved
src_cs/build/Makefile1 Outdated Show resolved Hide resolved
@@ -3043,15 +3061,15 @@ subroutine physicalityCheckANK(lambdaP)
real(kind=realType), pointer :: wvec_pointer(:)
real(kind=realType), pointer :: dvec_pointer(:)
real(kind=alwaysRealType) :: lambdaL ! L is for local
real(kind=realType) :: ratio

real(kind=alwaysRealType) :: tmp ! to receive the global step
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer a name like lambdaP_tmp so it's easier to keep track of what's happening.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or even step_size if that is what it is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed all instances of tmp in the two physicality checks to lambdaP_recv

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works for me

@anilyil
Copy link
Contributor Author

anilyil commented Mar 15, 2021

Thanks for the review @sseraj, I will go through your comments and make the suggested changes. @joanibal can you review this anytime soon? I would rather address your comments together with addressing @sseraj's comments.

@@ -3043,15 +3061,15 @@ subroutine physicalityCheckANK(lambdaP)
real(kind=realType), pointer :: wvec_pointer(:)
real(kind=realType), pointer :: dvec_pointer(:)
real(kind=alwaysRealType) :: lambdaL ! L is for local
real(kind=realType) :: ratio

real(kind=alwaysRealType) :: tmp ! to receive the global step
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or even step_size if that is what it is


! check density
#ifndef USE_COMPLEX
! to have the real mode sliiiightly more efficient, just do stuff with real numbers
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this comment. In real mode aren't you always working with real numbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so in the real mode, I dont need to check if the variable is real. I will rephrase the comment

src/NKSolver/NKSolvers.F90 Show resolved Hide resolved
lambdaP = tmp
#else
! finally, as a safety check, purge the complex part of lambda
lambdaP = cmplx(tmp, 0.0_realType)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So lambdaP is always real?
lambdaL is always real

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't all lambda's always be real

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if all lambdas are always real, then you could probably get rid of your tmp variables

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed the tmp because the MPI_MIN operation in the MPI_Allreduce is not defined for complex numbers. It just fails. So I have to use an always real type to do the allreduce.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had a comment right above as well:

! mpi allreduce is not defined for complex numbers with the min operation
! so we will use the lambdaP_recv variable to receive

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I just realized, lambdaP is not alwaysReal, it is just realType. As a result, I needed that additional always real variable to receive the MPI allreduce's output

src/solver/solvers.F90 Show resolved Hide resolved
tests/reg_tests/test_solver_combos.py Outdated Show resolved Hide resolved
@joanibal
Copy link
Collaborator

Added a few suggestions in my review

@joanibal
Copy link
Collaborator

joanibal commented Mar 16, 2021

also, to be clear.

Does this resolve the issue of the slow convergence of the imaginary part after the real part has converged?

@anilyil
Copy link
Contributor Author

anilyil commented Mar 16, 2021

So slow is a bit relative term here and I am not sure what you have in mind.

After this PR, the complex parts still lag a bit. But, even the small amount of lag is okay for matching 10 digits of accuracy. So for most practical cases, we can just converge the real part tightly (1e-15) and the complex convergence you get there is good enough to get a 10 digit match with the adjoint. You can do more iterations to further nail down the complex residuals, and the accuracy increases (I mean difference between CS and adjoint results get smaller as expected).

The changes to the atol in NK type solvers also mean that even when the complex residuals lag and the real part converged, you can just do one more call to the solver and that makes significant progress in the complex part. This is because before this PR, the linear solver would quit early because it would hit the atol limit. Now because atol is so much lower, it does more work solving the linear system even when real part is converged.

I have also tested this w/o resetting the real flow. With these cases, the real residual starts from machine zero whereas complex residuals start on the order of the complex step. Here, the complex convergence depends on the linear solver tolerance. With a linear solver tolerance of 1e-6, complex residuals converge after like 2-3 calls to the CFD solver. This also supports the theory we have been developing.

Does this answer your last question @joanibal?

@joanibal
Copy link
Collaborator

Yes it does.

Thank you

@anilyil
Copy link
Contributor Author

anilyil commented Mar 16, 2021

I addressed the comments in the fortran code. Now I will go over the tests again. I left all of the conversations as unresolved, please make sure that I addressed your comments @joanibal and @sseraj

@anilyil
Copy link
Contributor Author

anilyil commented Mar 16, 2021

@sseraj is there a particular reason why we have 1e-16 convergence on the rans complex cases? It just hits machine zero and bounces around. Tests pass with an L2 convergence of 1e-15 as well, so I am thinking of just modifying that. Any objections here?

@sseraj
Copy link
Collaborator

sseraj commented Mar 16, 2021

@sseraj is there a particular reason why we have 1e-16 convergence on the rans complex cases? It just hits machine zero and bounces around. Tests pass with an L2 convergence of 1e-15 as well, so I am thinking of just modifying that. Any objections here?

No particular reason that I can think of. If all the tests pass with 1e-15, then that should be fine.

@anilyil
Copy link
Contributor Author

anilyil commented Mar 16, 2021

With my recent changes to the tests, I believe I addressed every comment from both of you. Please make sure if all is good. If the tests also pass, this is good to go for me.

@anilyil anilyil requested review from sseraj and joanibal March 16, 2021 20:31
Copy link
Collaborator

@joanibal joanibal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment about setting ANK_physLSTOl and other variables as always real times.

src/solver/solvers.F90 Show resolved Hide resolved
tests/reg_tests/test_solver_combos.py Outdated Show resolved Hide resolved
src/NKSolver/NKSolvers.F90 Show resolved Hide resolved
src/NKSolver/NKSolvers.F90 Show resolved Hide resolved
@anilyil
Copy link
Contributor Author

anilyil commented Mar 18, 2021

I updated the input files. The cube mesh is now a 4x4x4 block and the Euler tests also converge 12 orders of magnitude with 2 procs like the rest of the cases. Are there any other changes to the tests you want @joanibal ? I think now the last issue is to figure out the passing of alwaysRealType stuff from python to fortran?

@anilyil
Copy link
Contributor Author

anilyil commented Mar 19, 2021

I think I addressed all of the comments with the tests and the code. The only thing I did not address yet is passing alwaysRealType options to Fortran with f2py. I created an issue for this #133 and I think this should be a separate PR because it touches more places in the code. @joanibal and @sseraj are you happy with the current state? Do you want any other changes?

Copy link
Collaborator

@sseraj sseraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating the issue. This is good to go for me.

@ewu63 ewu63 requested a review from joanibal March 19, 2021 17:08
@joanibal joanibal merged commit 6c6443b into mdolab:master Mar 19, 2021
@joanibal joanibal deleted the cmplx_solvers branch March 19, 2021 21:34
@sseraj sseraj mentioned this pull request Mar 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update ANK to work with complex step
4 participants