
artificial viscosity visualization postprocessor does not work on more than 1 processor #5238

Closed
jdannberg opened this issue Jul 9, 2023 · 6 comments


@jdannberg (Contributor)

The artificial viscosity visualization postprocessor (both for temperature and composition) does not work on more than one processor.
I have attached the input file that crashes (I had to change the file extension to .txt to be able to upload it).

composition_active.prm.txt

It is basically the composition active cookbook, with the postprocessor added in:

subsection Postprocess
  subsection Visualization
    set List of output variables = density, artificial viscosity
  end
end

This is the backtrace:

-----------------------------------------------------------------------------
-- This is ASPECT, the Advanced Solver for Problems in Earth's ConvecTion.
--     . using deal.II 9.4.0
--     .       with 32 bit indices and vectorization level 2 (256 bits)
--     . using Trilinos 12.18.1
--     . using p4est 2.3.2
--     . running in DEBUG mode
--     . running with 2 MPI processes
-----------------------------------------------------------------------------

-----------------------------------------------------------------------------
-- For information on how to cite ASPECT, see:
--   https://aspect.geodynamics.org/citing.html?ver=2.3.0-pre&sha=21ea4ee94&src=code
-----------------------------------------------------------------------------
Number of active cells: 1,024 (on 6 levels)
Number of degrees of freedom: 22,214 (8,450+1,089+4,225+4,225+4,225)

*** Timestep 0:  t=0 seconds, dt=0 seconds
   Solving temperature system... 0 iterations.
   Solving C_1 system ... 0 iterations.
   Solving C_2 system ... 0 iterations.
   Rebuilding Stokes preconditioner...
   Solving Stokes system... 61+0 iterations.

   Postprocessing:
[juliane-xps:644891] *** Process received signal ***
[juliane-xps:644891] Signal: Floating point exception (8)
[juliane-xps:644891] Signal code: Invalid floating point operation (7)
[juliane-xps:644891] Failing at address: 0x7fd0cfa11cf9
[juliane-xps:644891] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fd0b9a83420]
[juliane-xps:644891] [ 1] /home/juliane/software/deal.II-v9.4.0/deal.II-v9.4.0/lib/libdeal_II.g.so.9.4.0(_ZNK6dealii8internal16VectorOperations11Vector_copyIdfEclEjj+0x179)[0x7fd0cfa11cf9]
[juliane-xps:644891] [ 2] SIGFPE received
/home/juliane/software/deal.II-v9.4.0/deal.II-v9.4.0/lib/libdeal_II.g.so.9.4.0(_ZN6dealii8internal16VectorOperations12parallel_forINS1_11Vector_copyIdfEEEEvRT_jjRKSt10shared_ptrINS_8parallel8internal14TBBPartitionerEE+0x1eb)[0x7fd0cf9ef139]
[juliane-xps:644891] [ 3] --------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
/home/juliane/software/deal.II-v9.4.0/deal.II-v9.4.0/lib/libdeal_II.g.so.9.4.0(_ZN6dealii13LinearAlgebra15ReadWriteVectorIdEaSIfEERS2_RKNS1_IT_EE+0xfd)[0x7fd0cfb73823]
[juliane-xps:644891] [ 4] /home/juliane/software/deal.II-v9.4.0/deal.II-v9.4.0/lib/libdeal_II.g.so.9.4.0(+0xf2e5c8e)[0x7fd0c963fc8e]
[juliane-xps:644891] [ 5] /home/juliane/software/deal.II-v9.4.0/deal.II-v9.4.0/lib/libdeal_II.g.so.9.4.0(+0xf2ddc6a)[0x7fd0c9637c6a]
[juliane-xps:644891] [ 6] /home/juliane/software/deal.II-v9.4.0/deal.II-v9.4.0/lib/libdeal_II.g.so.9.4.0(_ZN6dealii8internal21DataOutImplementation9DataEntryILi2ELi2EdEC2INS_15DataOut_DoFDataILi2ELi2ELi2ELi2EE14DataVectorTypeENS_6VectorIfEEEEPKNS_10DoFHandlerILi2ELi2EEEPKT0_RKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaISN_EERKSH_INS_27DataComponentInterpretation27DataComponentInterpretationESaIST_EET_+0xff)[0x7fd0c96fd43f]
[juliane-xps:644891] [ 7] /home/juliane/software/deal.II-v9.4.0/deal.II-v9.4.0/lib/libdeal_II.g.so.9.4.0(_ZSt11make_uniqueIN6dealii8internal21DataOutImplementation9DataEntryILi2ELi2EdEEJRPKNS0_10DoFHandlerILi2ELi2EEEPKNS0_6VectorIfEERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaISK_EERKSE_INS0_27DataComponentInterpretation27DataComponentInterpretationESaISP_EERNS0_15DataOut_DoFDataILi2ELi2ELi2ELi2EE14DataVectorTypeEEENSt9_MakeUniqIT_E15__single_objectEDpOT0_+0x9d)[0x7fd0c96ec463]
[juliane-xps:644891] [ 8] /home/juliane/software/deal.II-v9.4.0/deal.II-v9.4.0/lib/libdeal_II.g.so.9.4.0(_ZN6dealii15DataOut_DoFDataILi2ELi2ELi2ELi2EE24add_data_vector_internalINS_6VectorIfEEEEvPKNS_10DoFHandlerILi2ELi2EEERKT_RKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaISI_EENS1_14DataVectorTypeERKSC_INS_27DataComponentInterpretation27DataComponentInterpretationESaISP_EEb+0xd61)[0x7fd0c964d4b5]
[juliane-xps:644891] [ 9] /home/juliane/software/aspect-melt/build_debug/aspect(_ZN6dealii15DataOut_DoFDataILi2ELi2ELi2ELi2EE15add_data_vectorINS_6VectorIfEEEEvRKT_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS1_14DataVectorTypeERKSt6vectorINS_27DataComponentInterpretation27DataComponentInterpretationESaISJ_EE+0x173)[0x560e3ba3967b]
[juliane-xps:644891] [10] /home/juliane/software/aspect-melt/build_debug/aspect(_ZN6aspect11Postprocess13VisualizationILi2EE7executeB5cxx11ERN6dealii12TableHandlerE+0xc10)[0x560e3ba25b14]
[juliane-xps:644891] [11] /home/juliane/software/aspect-melt/build_debug/aspect(_ZN6aspect11Postprocess7ManagerILi2EE7executeB5cxx11ERN6dealii12TableHandlerE+0xf3)[0x560e3bb34665]
[juliane-xps:644891] [12] /home/juliane/software/aspect-melt/build_debug/aspect(_ZN6aspect9SimulatorILi2EE11postprocessEv+0xea)[0x560e3adaeea8]
[juliane-xps:644891] [13] /home/juliane/software/aspect-melt/build_debug/aspect(_ZN6aspect9SimulatorILi2EE3runEv+0x702)[0x560e3adabb34]
[juliane-xps:644891] [14] /home/juliane/software/aspect-melt/build_debug/aspect(_Z13run_simulatorILi2EEvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_bbb+0x367)[0x560e3c6a37fa]
[juliane-xps:644891] [15] /home/juliane/software/aspect-melt/build_debug/aspect(main+0x653)[0x560e3c65bcf3]
[juliane-xps:644891] [16] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fd0b98a1083]
[juliane-xps:644891] [17] /home/juliane/software/aspect-melt/build_debug/aspect(_start+0x2e)[0x560e3ad8eb0e]
[juliane-xps:644891] *** End of error message ***
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[37674,1],1]
  Exit code:    1
@bangerth (Contributor)

On it!

@bangerth (Contributor)

The issue is that DataOut wants to copy the vector it is given. It is careful to only copy the part of the vector that the local process actually owns, but because we are dealing with cell vectors (as opposed to DoF vectors), we just pass a sequential Vector in which we own all elements, and so DataOut copies all elements, including the ones that are NaN.

Copying by itself should not trigger any complication with NaNs. The problem is that DataOut copies from a Vector<float> into a ReadWriteVector<double>; it is this float-to-double conversion of the signaling NaN entries that raises the floating point exception.
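
As a side note, here is a minimal, standalone sketch of the mechanism (not ASPECT or deal.II code; feenableexcept is a GNU extension on glibc, and enabling FE_INVALID mimics what a deal.II debug build does): converting a signaling NaN from float to double raises the invalid-operation exception, which then shows up as the SIGFPE in the backtrace above.

#include <cfenv>    // FE_INVALID, feenableexcept (GNU extension on glibc)
#include <limits>

int main()
{
  // Turn "invalid operation" into a SIGFPE, as deal.II does in debug mode.
  feenableexcept(FE_INVALID);

  // The kind of value stored in viscosity_per_cell for cells this process
  // does not own (volatile only to keep the compiler from folding it away).
  volatile float f = std::numeric_limits<float>::signaling_NaN();

  // The float-to-double conversion signals FE_INVALID, so the program
  // aborts here with "Floating point exception", just like in the backtrace.
  volatile double d = f;
}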

@bangerth (Contributor)

Solution 1: We don't set values of that vector to NaN. That would be the easiest, but the NaN serves a purpose and I'm reluctant to do that. @jdannberg, where exactly is the place where you found this?

Solution 2: We use a Vector<double> for the cell-based visualization postprocessors. This has the advantage that we don't need to copy in DataOut, but it changes the interface of these classes incompatibly. Perhaps now is the time to do that. I suspect that there aren't all that many instances of user plugins of that type out there.

Solution 3: We keep the interface, but copy from Vector<float> to Vector<double> ourselves. Not particularly efficient, but probably still fine; we do more expensive things than allocating and copying relatively small, local vectors.

Solution 4: Instead of a Vector<float>, we give these postprocessors a vector that really only stores elements corresponding to locally owned cells. It would be a sequential vector that only stores some of its elements, and I don't actually know whether that is allowed. The locally owned cells may also not be contiguous in the ordering of active cells, so indexing into such a vector may be expensive.

My proposal would be to use solution 3 (a sketch of what that could look like follows below). Thoughts?
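
For reference, here is a minimal sketch of what solution 3 could look like. The helper name to_double_vector and the assumption that the postprocessor hands back a dealii::Vector<float> called viscosity_per_cell are made up for illustration; the point is that the element-wise copy re-creates the NaN entries in double precision via a bit test instead of converting the float signaling NaN, so the copy itself cannot trip the floating point exception, and DataOut afterwards only performs a double-to-double copy.

#include <deal.II/base/signaling_nan.h>
#include <deal.II/lac/vector.h>

#include <cstdint>
#include <cstring>

namespace
{
  // Detect NaN by inspecting the bit pattern, so that the signaling NaN is
  // never fed into a floating point operation (which would raise FE_INVALID).
  bool is_nan_bits(const float x)
  {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    return ((bits & 0x7f800000u) == 0x7f800000u) &&
           ((bits & 0x007fffffu) != 0u);
  }
}

// Hypothetical helper: convert the per-cell float vector to double before
// handing it to DataOut.
dealii::Vector<double>
to_double_vector(const dealii::Vector<float> &viscosity_per_cell)
{
  dealii::Vector<double> result(viscosity_per_cell.size());
  for (unsigned int i = 0; i < viscosity_per_cell.size(); ++i)
    result[i] = is_nan_bits(viscosity_per_cell[i])
                  ? dealii::numbers::signaling_nan<double>()
                  : static_cast<double>(viscosity_per_cell[i]);
  return result;
}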

@jdannberg (Contributor, Author)

@bangerth It is set here:

if (cell->is_artificial()
    ||
    (cell->is_ghost() &&
     parameters.use_artificial_viscosity_smoothing == false))
  {
    // Cells this process does not compute are marked with a signaling NaN:
    viscosity_per_cell[cell->active_cell_index()] = numbers::signaling_nan<T>();
    continue;
  }

@bangerth (Contributor)

I have a patch, but #5251 broke my ability to compile for the moment (I also need to update my two-day-old installation of deal.II). Will get back to it.

@jdannberg (Contributor, Author)

Fixed by #5253 and #5274.
