Sending zero displacements in a serial-implicit coupling crashes precice (perpendicular-flap tutorial) #1558

Open
mennodeij opened this issue Feb 6, 2023 · 12 comments

@mennodeij
Contributor

mennodeij commented Feb 6, 2023

Describe your setup

Operating system (e.g. Linux distribution and version): Rocky Linux 8.0
preCICE Version: 2.5.0

Describe the problem
I am developing a new FSI adapter and, due to a programming error on my side, the coupling data was lagging behind by one coupling step. This resulted in sending zero displacements to the fluid solver once too often. With a parallel-implicit coupling there was no problem, but with a serial-implicit coupling, preCICE crashes with a vague warning message and a core dump caused by a std::deque assertion failure (see below).

NOTES:

  • The problem was solved by sending the correct displacements from the new FSI adapter.
  • The problem can be reproduced with the attached solid participant, which always sends zero displacements (a sketch of an equivalent participant follows below).
  • The problem has nothing to do with my programming error; that is just how I found it.
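
For reference, here is a minimal sketch of such an always-zero solid participant, written against the preCICE 2.x C++ API (the actual attachment is solid-python.py; the vertex coordinates below are placeholders, not taken from the attachment):

#include <vector>
#include <precice/SolverInterface.hpp>

int main()
{
    // "Solid" participant that ignores the forces it reads and always
    // writes zero displacements, reproducing the crash.
    precice::SolverInterface interface("Solid", "precice-config.serial-implicit.xml", 0, 1);

    const int meshID = interface.getMeshID("Solid-Mesh-Node");
    std::vector<double> coords = {0.0, 0.0, 0.0, 1.0}; // two 2D vertices (placeholders)
    std::vector<int>    ids(2);
    interface.setMeshVertices(meshID, 2, coords.data(), ids.data());

    const int displID = interface.getDataID("Displacement", meshID);
    const int forceID = interface.getDataID("Force", meshID);

    double dt = interface.initialize();
    std::vector<double>       forces(coords.size());
    const std::vector<double> zeros(coords.size(), 0.0);

    while (interface.isCouplingOngoing()) {
        if (interface.isActionRequired(precice::constants::actionWriteIterationCheckpoint()))
            interface.markActionFulfilled(precice::constants::actionWriteIterationCheckpoint());

        interface.readBlockVectorData(forceID, 2, ids.data(), forces.data());
        // The bug trigger: disregard the forces, always write zeros.
        interface.writeBlockVectorData(displID, 2, ids.data(), zeros.data());
        dt = interface.advance(dt);

        if (interface.isActionRequired(precice::constants::actionReadIterationCheckpoint()))
            interface.markActionFulfilled(precice::constants::actionReadIterationCheckpoint());
    }
    interface.finalize();
    return 0;
}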

Error message

---[precice]  relative convergence measure: relative two-norm diff of data "Displacement" = inf, limit = 1.00e-03, normalization = 0.00e+00, conv = true
---[precice]  All converged
---[precice] WARNING:  The coupling residual equals almost zero. There is maybe something wrong in your adapter. Maybe you always write the same data or you call advance without providing new data first or you do not use available read data. Or you just converge much further than actually necessary.
/cm/local/apps/gcc/9.2.0/include/c++/9.2.0/bits/stl_deque.h:1477: std::_Deque_base<_Tp, _Alloc>::_Alloc_traits::reference std::deque<_Tp, _Alloc>::front() [with _Tp = int; _Alloc = std::allocator<int>]: Assertion '__builtin_expect(!this->empty(), true)' failed.
Aborted (core dumped)
precice-crash.zip: https://github.com/precice/precice/files/10660838/precice-crash.zip

Steps to reproduce

  1. Run the attached adapter (solid-python.py) with the attached config (precice-config.serial-implicit.xml) in the perpendicular-flap tutorial, together with the fluid-openfoam solver.

Expected behaviour
If sending zero displacements is supposed to be possible, the crash needs to be fixed.
Otherwise, I would have liked a better explanation of "The coupling residual equals almost zero", especially since the line above reports that the residual is infinite, and a better explanation of why sending zero displacements is not suitable for serial-implicit coupling.

Additional context
None.

@mennodeij mennodeij added the bug preCICE does not behave the way we want and we should look into it (and fix it if possible) label Feb 6, 2023
@mennodeij
Contributor Author

precice-crash.zip

@fsimonis
Member

fsimonis commented Feb 6, 2023

Welcome back @mennodeij!

Some more details (from the zip):

The coupling scheme is configured using IQN-ILS:

<coupling-scheme:serial-implicit>
  <time-window-size value="  0.1000000000000000E-01" />
  <max-time value="  0.5000000000000000E+01" />
  <participants first="Fluid" second="Solid" />
  <exchange data="Displacement" mesh="Solid-Mesh-Node" from="Solid" to="Fluid" />
  <relative-convergence-measure limit="  0.1000000000000000E-02" data="Displacement" mesh="Solid-Mesh-Node" />
  <exchange data="Force" mesh="Solid-Mesh-Node" from="Fluid" to="Solid" />
  <max-iterations value="10" />
  <acceleration:IQN-ILS>
    <initial-relaxation value="  0.1000000000000000E+00" />
    <max-used-iterations value="100" />
    <time-windows-reused value="15" />
    <data name="Displacement" mesh="Solid-Mesh-Node" />
    <!--data name="Force" mesh="Solid-Mesh-Node" /-->
    <filter type="QR2" limit="1e-2" />
    <preconditioner type="residual-sum" />
  </acceleration:IQN-ILS>

</coupling-scheme:serial-implicit>

The assertion is hit inside std::deque<int>::front(). The problem is that the deque is empty.
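
For illustration only (this is not preCICE code), the failure mode is generic: calling front() on an empty deque is undefined behavior, and with libstdc++ assertions enabled it aborts with exactly the message in the report:

#include <deque>

int main()
{
    std::deque<int> q; // empty: nothing was ever pushed
    // Undefined behavior; compiled with -D_GLIBCXX_ASSERTIONS (or a
    // debug libstdc++ build), the next line aborts with the
    // 'front() on empty deque' assertion seen above.
    return q.front();
}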

The warning is located here

if (math::equals(utils::IntraComm::l2norm(_residuals), 0.0)) {
  PRECICE_WARN("The coupling residual equals almost zero. There is maybe something wrong in your adapter. "
               "Maybe you always write the same data or you call advance without "
               "providing new data first or you do not use available read data. "
               "Or you just converge much further than actually necessary.");
}

@fsimonis fsimonis added this to the Version 3.0.0 milestone Feb 6, 2023
@MakisH
Member

MakisH commented Feb 8, 2023

Looks like another (useful!) instance of #1496 (or related to it)

@mennodeij
Contributor Author

Ok, I hope it helps to make serial-implicit with IQN more robust.

@precice-bot

This issue has been mentioned on the preCICE Forum on Discourse. There might be relevant details there:

https://precice.discourse.group/t/openfoam-calculix-fsi-unable-to-converge/1349/4

@MakisH
Member

MakisH commented Nov 29, 2023

A few recent changes in develop (coming up in v3.0.0, as well as in v2.5.1) have probably fixed this issue. @mennodeij could you please confirm? (edit: actually, no, the issue still persists, but I am not sure we can do anything about it)

Just to clarify different cases:

  1. IQN not being able to start from zero (e.g., zero flow, followed by opening valve) was fixed by Add a warning for empty IQN matrix, but keep going #1895
  2. IQN not being able to handle a steady-state (e.g., converging partitioned heat conduction) was fixed by Give warning info other than error by zero-update of V in QN-update #1863
  3. Always giving the same values in a serial-implicit scheme is still an issue.

If I hack the OpenFOAM adapter to always write zero displacement, disregarding the forces I read, I get the following warnings from the second participant, followed by a crash (perpendicular-flap, fluid-openfoam, solid-openfoam, serial-implicit with IQN, using only the Displacement data for the acceleration):

---[precice] WARNING:  The coupling residual equals almost zero. There is maybe something wrong in your adapter. Maybe you always write the same data or you call advance without providing new data first or you do not use available read data. Or you just converge much further than actually necessary.
---[precice] WARNING:  All residual sub-vectors in the residual-sum preconditioner are numerically zero ( sum = 0). This indicates that the data values exchanged between two successive iterations did not change. The simulation may be unstable, e.g. produces NAN values. Please check the data values exchanged between the solvers is not identical between iterations. The preconditioner scaling factors were not updated in this iteration and the scaling factors determined in the previous iteration were used.
[stack trace]
=============
#1  Foam::sigFpe::sigHandler(int) in /usr/lib/openfoam/openfoam2306/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so
#2  ? in /lib/x86_64-linux-gnu/libc.so.6
#3  ? in ~/repos/precice/precice/build/libprecice.so.3
#4  ? in ~/repos/precice/precice/build/libprecice.so.3
#5  ? in ~/repos/precice/precice/build/libprecice.so.3
#6  ? in ~/repos/precice/precice/build/libprecice.so.3
#7  ? in ~/repos/precice/precice/build/libprecice.so.3
#8  ? in ~/repos/precice/precice/build/libprecice.so.3
#9  ? in ~/repos/precice/precice/build/libprecice.so.3
#10  ? in ~/repos/precice/precice/build/libprecice.so.3
#11  precice::Participant::advance(double) in ~/repos/precice/precice/build/libprecice.so.3
#12  preciceAdapter::Adapter::advance() in ~/OpenFOAM/gc-v2306/platforms/linux64GccDPInt32Opt/lib/libpreciceAdapterFunctionObject.so
#13  preciceAdapter::Adapter::execute() in ~/OpenFOAM/gc-v2306/platforms/linux64GccDPInt32Opt/lib/libpreciceAdapterFunctionObject.so
#14  Foam::functionObjects::preciceAdapterFunctionObject::execute() in ~/OpenFOAM/gc-v2306/platforms/linux64GccDPInt32Opt/lib/libpreciceAdapterFunctionObject.so
#15  Foam::functionObjectList::execute() in /usr/lib/openfoam/openfoam2306/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so
#16  Foam::Time::run() const in /usr/lib/openfoam/openfoam2306/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so
#17  Foam::Time::loop() in /usr/lib/openfoam/openfoam2306/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so
#18  ? in /usr/lib/openfoam/openfoam2306/platforms/linux64GccDPInt32Opt/bin/solidDisplacementFoam
#19  ? in /lib/x86_64-linux-gnu/libc.so.6
#20  __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
#21  ? in /usr/lib/openfoam/openfoam2306/platforms/linux64GccDPInt32Opt/bin/solidDisplacementFoam
=============
Floating point exception (core dumped)

But I think this is expected behavior, no? preCICE tells you that something is wrong, but you may have reasons to continue.

Here is the config:

  <coupling-scheme:serial-implicit>
    <time-window-size value="0.01" />
    <max-time value="5" />
    <participants first="Fluid" second="Solid" />
    <exchange data="Force" mesh="Solid-Mesh" from="Fluid" to="Solid" />
    <exchange data="Displacement" mesh="Solid-Mesh" from="Solid" to="Fluid" />
    <max-iterations value="50" />
    <relative-convergence-measure limit="5e-3" data="Displacement" mesh="Solid-Mesh" />
    <relative-convergence-measure limit="5e-3" data="Force" mesh="Solid-Mesh" />
    <acceleration:IQN-ILS>
      <data name="Displacement" mesh="Solid-Mesh" />
      <!-- <data name="Force" mesh="Solid-Mesh" /> -->
      <preconditioner type="residual-sum" />
      <filter type="QR2" limit="1e-2" />
      <initial-relaxation value="0.5" />
      <max-used-iterations value="100" />
      <time-windows-reused value="15" />
    </acceleration:IQN-ILS>
  </coupling-scheme:serial-implicit>

and here is the modified OpenFOAM adapter FSI/Displacement.C:

forAll(cellDisplacement_->boundaryField()[patchID], i)
{
    for (unsigned int d = 0; d < dim; ++d)
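        // Hack to reproduce the issue: write zero instead of the actual
        // boundary displacement (the original assignment is commented out below).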
        buffer[bufferIndex++] = 0;
            // cellDisplacement_->boundaryField()[patchID][i][d];
}

With v2.5.0, there is an error+exit instead of warnings+crash:

---[precice] WARNING:  The coupling residual equals almost zero. There is maybe something wrong in your adapter. Maybe you always write the same data or you call advance without providing new data first or you do not use available read data. Or you just converge much further than actually necessary.
---[precice] ERROR:  Attempting to add a zero vector to the quasi-Newton V matrix. This means that the residuals in two consecutive iterations are identical. If a relative convergence limit was selected, consider increasing the convergence threshold.

A parallel-implicit scheme works anyway.

@uekerman @Fujikawas do you think we can still improve something here, while still allowing simulations to continue?

@uekerman
Member

Thanks for the detailed analysis, everybody!

If the QN system runs empty, we could do an underrelaxation step again. This was the original idea in #1510

But probably beyond v3.0.0

@davidscn
Member

@uekerman @Fujikawas do you think we can still improve something here, while still allowing simulations to continue?

I think we can still do better. The problem you describe above stems from the preconditioner, not from the QN, and I would consider it more of a bug in how we treat the scaling in the preconditioner. You can easily verify this by selecting preconditioner=constant in your example, which does not break but continues the simulation. For steady-state simulations, it might even be desirable to continue the simulation.

The problem is that we divide by zero here

for (size_t k = 0; k < _subVectorSizes.size(); k++) {
  _residualSum[k] += norms[k] / sum;
}

(and similarly in the other preconditioners) if the residual is already zero in the very first iteration.
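
A minimal sketch of the guard this suggests, with names mirroring the snippet above (the fallback of simply skipping the update is an assumption, not necessarily how preCICE should fix it):

#include <cstddef>
#include <vector>

// Update the residual-sum scaling only when the denominator is nonzero;
// otherwise keep the scaling factors from the previous iteration.
void updateResidualSum(std::vector<double>       &residualSum,
                       const std::vector<double> &norms,
                       double                     sum)
{
    if (sum == 0.0)
        return; // residual already zero in the very first iteration

    for (std::size_t k = 0; k < norms.size(); ++k)
        residualSum[k] += norms[k] / sum;
}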

@uekerman
Member

Using a preconditioner when there is only one data vector in the acceleration is useless anyway. Maybe we could forbid this or switch it off automatically.

@davidscn
Member

davidscn commented Nov 30, 2023

Using a preconditioner when there is only one data vector in the acceleration is useless anyway. Maybe we could forbid this or switch it off automatically.

Yes, switching it off automatically sounds like a good idea, as configuration changes otherwise become more intrusive when trying different configurations. The problem above would persist, though.
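
A hypothetical sketch of what such an automatic switch-off could look like at configuration time (entirely made-up names, not preCICE code):

#include <cstddef>
#include <string>

// With a single accelerated data vector there is nothing to scale
// relative to, so fall back to the constant preconditioner instead
// of rejecting the configuration.
std::string choosePreconditioner(std::size_t numAccelerationData,
                                 const std::string &requested)
{
    if (numAccelerationData == 1 && requested != "constant") {
        // a real implementation would emit a warning here
        return "constant";
    }
    return requested;
}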

@MakisH
Member

MakisH commented Nov 30, 2023

So, do we want to somehow address this for v3.0, or postpone it to v3.1?
I think this is more of a corner case that one does not simply end up in with a usual setup. As @mennodeij mentioned in the beginning, and as I imitated in my example, there was an issue in the adapter that led to this.

I get the impression from the discussion that there is more juice to squeeze out of this fruit, and we should not rush it.

@uekerman
Member

uekerman commented Dec 1, 2023

The problem above would persist, though.

You mean that it would also happen if multiple data fields are used in the acceleration and all of them are zero, right?
I have the feeling that this case is already covered by #1863, as all convergence measures then converge. But, of course, this is very implicit.

It should probably not be too complicated to extend the preconditioners with alternative ways to treat the case when sum becomes 0.

But let's indeed move this to v3.1.
