-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Complex ANK+NK #130
Complex ANK+NK #130
Conversation
…ed like the real part
So I still have a few TODOs before this is ready to merge. The fixes only work for decoupled ANK. I will need to fix the coupled ANK and turbANK until the PR is ready. Also, I want to add a test for multiple modes of ANK/NK in both real and complex modes. Finally, I think we can also update the existing complex tests to use ANK/NK with this PR. I wanted to create the PR to track these TODOs, and also have azure test the changes. Any feedback is welcome. @nwu63 I plan on fixing the complex compile in a separate PR. I have most of it laid out mentally, just need to figure out a few details. Once I create that PR, someone else can apply the similar changes to idwarp (I believe its the only other code we have with a similar complex build process?) |
The tests fail because of the complex ank/nk test. That wasnt tagged as |
I have updated the input files and added one real and one complex test for solver combinations. These should pass now. Then I will add more combinations with different options. |
This one is good to go for the review. I have added solver combination tests that switch between smoother, ANK, SANK, CSANK, and NK. I test all equation types with these, also with the 2 turbulence solvers that can be used in ANK. I still test the turbulence solvers with Euler and laminar NS because in the past, we had a few issues with these cases seg-faulting. With the current test, we just expect equation modes that does not have any turbulence model to just work regardless of the turbulence options specified. I test both real and complex versions with these. The test itself uses a simple 2^3 block originally made by @joanibal. I have modified the mesh slightly so there is a wall, and rest of the BCs are farfield. The flow also comes in at an angle to the wall so that Euler residuals are not zero with free stream. I have also modified existing complex tests to use ANK for the solution. @sseraj can you please check if I did it correctly? For the euler tests at least, I had to set the second order switch to get it to converge; it would oscillate a bit otherwise. Now it looks it is converging okay, but I dont really have a reference on how fast it should be. The only remaining item that I wanted to possibly do with this PR is the "unfinished" item 3 I listed above. A complex step size of around 1e-160 and below will underflow with the residual norm computation and the residual monitor will print zero for complex residuals. We can maybe include that in the "complex magic" PR that does a few fancy things. We can also update the step sizes in complex tests in that PR so that they use the underflow version. I will create an issue about this so we remember what is going on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, the code changes look good from what I understand. Some clarification questions and comments below
src/NKSolver/NKSolvers.F90
Outdated
@@ -3043,15 +3061,15 @@ subroutine physicalityCheckANK(lambdaP) | |||
real(kind=realType), pointer :: wvec_pointer(:) | |||
real(kind=realType), pointer :: dvec_pointer(:) | |||
real(kind=alwaysRealType) :: lambdaL ! L is for local | |||
real(kind=realType) :: ratio | |||
|
|||
real(kind=alwaysRealType) :: tmp ! to receive the global step |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a name like lambdaP_tmp
so it's easier to keep track of what's happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or even step_size if that is what it is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed all instances of tmp in the two physicality checks to lambdaP_recv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works for me
src/NKSolver/NKSolvers.F90
Outdated
@@ -3043,15 +3061,15 @@ subroutine physicalityCheckANK(lambdaP) | |||
real(kind=realType), pointer :: wvec_pointer(:) | |||
real(kind=realType), pointer :: dvec_pointer(:) | |||
real(kind=alwaysRealType) :: lambdaL ! L is for local | |||
real(kind=realType) :: ratio | |||
|
|||
real(kind=alwaysRealType) :: tmp ! to receive the global step |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or even step_size if that is what it is
src/NKSolver/NKSolvers.F90
Outdated
|
||
! check density | ||
#ifndef USE_COMPLEX | ||
! to have the real mode sliiiightly more efficient, just do stuff with real numbers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this comment. In real mode aren't you always working with real numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so in the real mode, I dont need to check if the variable is real. I will rephrase the comment
src/NKSolver/NKSolvers.F90
Outdated
lambdaP = tmp | ||
#else | ||
! finally, as a safety check, purge the complex part of lambda | ||
lambdaP = cmplx(tmp, 0.0_realType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So lambdaP is always real?
lambdaL is always real
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't all lambda's always be real
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if all lambdas are always real, then you could probably get rid of your tmp
variables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed the tmp
because the MPI_MIN operation in the MPI_Allreduce is not defined for complex numbers. It just fails. So I have to use an always real type to do the allreduce.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually had a comment right above as well:
! mpi allreduce is not defined for complex numbers with the min operation
! so we will use the lambdaP_recv variable to receive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I just realized, lambdaP
is not alwaysReal
, it is just realType
. As a result, I needed that additional always real variable to receive the MPI allreduce's output
Added a few suggestions in my review |
also, to be clear. Does this resolve the issue of the slow convergence of the imaginary part after the real part has converged? |
So slow is a bit relative term here and I am not sure what you have in mind. After this PR, the complex parts still lag a bit. But, even the small amount of lag is okay for matching 10 digits of accuracy. So for most practical cases, we can just converge the real part tightly (1e-15) and the complex convergence you get there is good enough to get a 10 digit match with the adjoint. You can do more iterations to further nail down the complex residuals, and the accuracy increases (I mean difference between CS and adjoint results get smaller as expected). The changes to the I have also tested this w/o resetting the real flow. With these cases, the real residual starts from machine zero whereas complex residuals start on the order of the complex step. Here, the complex convergence depends on the linear solver tolerance. With a linear solver tolerance of 1e-6, complex residuals converge after like 2-3 calls to the CFD solver. This also supports the theory we have been developing. Does this answer your last question @joanibal? |
Yes it does. Thank you |
@sseraj is there a particular reason why we have 1e-16 convergence on the rans complex cases? It just hits machine zero and bounces around. Tests pass with an L2 convergence of 1e-15 as well, so I am thinking of just modifying that. Any objections here? |
…nt with the original code now
No particular reason that I can think of. If all the tests pass with 1e-15, then that should be fine. |
With my recent changes to the tests, I believe I addressed every comment from both of you. Please make sure if all is good. If the tests also pass, this is good to go for me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment about setting ANK_physLSTOl and other variables as always real times.
I updated the input files. The cube mesh is now a 4x4x4 block and the Euler tests also converge 12 orders of magnitude with 2 procs like the rest of the cases. Are there any other changes to the tests you want @joanibal ? I think now the last issue is to figure out the passing of alwaysRealType stuff from python to fortran? |
I think I addressed all of the comments with the tests and the code. The only thing I did not address yet is passing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating the issue. This is good to go for me.
Purpose
This PR improves the complex ANK and NK solvers, as well as improving how complex residuals and printouts are handled. With @joanibal, we tried to figure out how the complex step method worked for implicit solvers. We concluded that the path of convergence is not important, but in the end, complex residuals must be converged to obtain accurate derivatives. This means that during a complex-step evaluation, we dont need to re-set flow, and just call the NK/ANK solvers repeatedly so that complex residuals converge. This PR has a few changes that makes this possible.
Here is a list of what we did with this PR:
We print all 16 digits of real and complex parts for the functionals because the real part is the value and the complex part is the derivative. However, we do not print all digits for residuals since their magnitude is more important anyways. I have left about 8 digits on both real and complex parts of the residuals for debugging purposes. The nonlinear iteration header is also fixed so that things are aligned. I also explicitly give 3 digits to the exponent because we are now interested in doing smaller complex-steps that underflow and do not affect real parts when the real values are zero.
The complex build process is slightly improved. Complexify.f90 is moved to src_cs. We now recompile only the changed files in complex mode. We used to do this for fortran modules but still would copy over other include files which caused long build times. This is fixed.
The absolute tolerance (
atol
) in the NK solver is set separately for complex mode. For the real mode, we set this atol to be 0.01 lower than the real L2 convergence target. This makes sure that the NK solver solves the linear system well enough to put us across the finish line, while not doing any extra work. This approach does not work with complex step. Because derivatives lag, complex residuals must be converged, even after the real ones hit machine zero. Furthermore, if we want to converge the complex system without re-setting the flow, then the real residuals will hang around machine zero and we again need the linear solver to actually solve the system and not quit early. Same fix is also applied to the ANK solver.ANK solver is modified to handle complex mode. This involved several changes to the iteration algorithm as well as the line searches. I am still not 100% on some details, but it works, and the parts I dont understand go to zero as the CFL number is increased, so it should be fine. We only really want to avoid having the complex part diverge. As long as a relatively stable and low value of complex state is maintained, the complex system needs to be converged after the real part converges anyways due to the lag, so again, it should be fine.
We added a nonlinear solver test for complex ANK/NK (but see the unfinished item no 2 below).
Here is a list that we did not do yet, and should be done in a future PR:
Type of change
What types of change is it?
Select the appropriate type(s) that describe this PR
Testing
We added complex-step tests for the ANK and NK solvers.
Checklist
Put an
x
in the boxes that apply.flake8
andblack
to make sure the code adheres to PEP-8 and is consistently formatted