Segfault after grid adaption #228

rd-contr · 2020-10-15T19:21:21Z

I am trying to run the abl_godunov_cn case with adaption turned on, by turning off constant density and specifying incflo.gradrhoerr. The grid adaption appears to work, but the process segfaults before any new iterations complete on the adapted grid. Solver version and input are below. It is entirely possible that I don't have all of the settings in place for AMR, and I'm open to any suggestions on how best to get AMR to work with ABL cases.

Solver output before segfault
Regrid mesh ... time elapsed = 0.02479205467
Grid summary:
Level 0 343 grids 8000000 cells 100 % of domain
smallest grid: 24 x 24 x 24 biggest grid: 32 x 32 x 32
Level 1 427 grids 3472384 cells 5.4256 % of domain
smallest grid: 8 x 8 x 8 biggest grid: 32 x 32 x 32

Step: 100 dt: 0.3355257033 Time: 56.5592 to 56.8947
CFL: 0.95 (conv: 0.949537 diff: 0 src: 0.0209629 )

Godunov:
System Iters Initial residual Final residual
----------------------------------------------------------------------------
Segfault

Solver Build Settings
AMR-Wind (https://github.com/exawind/amr-wind)

AMR-Wind Git SHA :: 01187b8
AMReX version :: 20.09-80-g61734d3da08b ( 20.09-80-g61734d3da08b )

Exec. date :: Fri Oct 9 19:49:40 2020
Build date :: Oct 5 2020 19:14:13
C++ compiler :: GNU 7.3.0

MPI :: ON (Num. ranks = 96)
GPU :: OFF
OpenMP :: OFF

Solver Input
time.stop_time = 200.0 # Max (simulated) time to evolve
time.max_step = -1 # Max number of time steps

time.fixed_dt = -0.5 # Use this constant dt if > 0
time.cfl = 0.95 # CFL factor

io.KE_int = 1
io.line_plot_int = 1
time.plot_interval = 100 # Steps between plot files
time.checkpoint_interval = -1000 # Steps between checkpoint files
amr.plt_tracer = 1

incflo.gravity = 0. 0. -9.81 # Gravitational force (3D)
incflo.density = 1.0 # Reference density
incflo.constant_density = 0

incflo.use_godunov = 1
#incflo.diffusion_type = 1
transport.viscosity = 1.0e-5
transport.laminar_prandtl = 0.7
transport.turbulent_prandtl = 0.3333
turbulence.model = Smagorinsky
Smagorinsky_coeffs.Cs = 0.135

incflo.physics = ABL
ICNS.source_terms = BoussinesqBuoyancy CoriolisForcing ABLForcing
BoussinesqBuoyancy.reference_temperature = 300.0
ABL.reference_temperature = 300.0
CoriolisForcing.latitude = 41.3
ABLForcing.abl_forcing_height = 90

incflo.velocity = 6.128355544951824 5.142300877492314 0.0

ABL.temperature_heights = 650.0 750.0 1000.0
ABL.temperature_values = 300.0 308.0 308.75

ABL.kappa = .41
ABL.surface_roughness_z0 = 0.15

amr.n_cell = 200 200 200 # Grid cells at coarsest AMRlevel
amr.max_level = 1 # Max AMR level in hierarchy
time.regrid_interval = 50
incflo.gradrhoerr = 0.0000000000003

geometry.prob_lo = 0. 0. 0. # Lo corner coordinates
geometry.prob_hi = 1000. 1000. 1000. # Hi corner coordinates
geometry.is_periodic = 1 1 0 # Periodicity x y z (0/1)

zlo.type = "wall_model"
zlo.temperature_type = "fixed_gradient"
zlo.temperature = 0.0

zhi.type = "slip_wall"
zhi.temperature_type = "fixed_gradient"
zhi.temperature = 0.003 # tracer is used to specify potential temperature gradient

incflo.verbose = 0 # incflo_level

amrex.fpe_trap_invalid = 0 # Trap NaNs


sayerhs · 2020-10-15T19:28:23Z

Non-constant density with ABL simulations is not something that has been tested; terms like BoussinesqBuoyancy assume constant density. Also, I am not sure density is the right criterion for regridding in ABL flows. That said, it is unclear where your segfault is occurring, so a stack trace would be useful if you can provide one.

If you want to just try multiple levels of refinement, have you tried this regression test: https://github.com/Exawind/amr-wind/tree/development/test/test_files/abl_godunov_static_refinement

rd-contr · 2020-10-15T19:44:33Z

Understood on constant versus non-constant density. I was only running with non-constant density because density variation appeared to be the only quantity through which AMR could be triggered from the incflo input settings. I see that incflo and IAMR have different settings to trigger AMR, but it wasn't clear to me whether those were specific to incflo or IAMR, or whether they were an underlying setting from AMReX. I did see the static refinement test case. Unfortunately, the application that I am looking at (not wind energy related) would likely have dynamic regions where adaption needs to take place. Given that the static refinement case is one that has been tested, I will look more closely to see if I could use it for my intended use case. I may also look at the source a little to see what it would take to tag cells for adaption based on something other than density or density gradient.

Let me see if I can get the run to generate a backtrace that I could send related to the segfault. For whatever reason, it didn't automatically generate a backtrace.

sayerhs · 2020-10-15T19:51:55Z

@rd-contr Understood. In that case, I'll recommend the Rayleigh-Taylor case which uses density based dynamic adaptation. It is a better case to explore dynamic adaptation than the ABL case you were playing with https://github.com/Exawind/amr-wind/blob/development/test/test_files/rayleigh_taylor_godunov/rayleigh_taylor_godunov.i

By default AMReX traps segfaults and writes Backtrace.* files, so you might want to look there. You can also disable this trapping by setting the following in the input file:

amrex.throw_exception = 1
amrex.signal_handling = 0

Also if you don't mind, may I ask what application you were considering using AMR-Wind for?

sayerhs · 2020-10-15T20:05:45Z

The density based tagging is implemented here:

amr-wind/amr-wind/incflo_tagging.cpp

Lines 55 to 80 in 8cf1eb8

    
        if (tag_rho or tag_gradrho)
        {
            Array4<Real const> const& rho = den.const_array(mfi);
            Real rhoerr = tag_rho ? rhoerr_v[lev] : std::numeric_limits<Real>::max();
            Real gradrhoerr = tag_gradrho ? gradrhoerr_v[lev] : std::numeric_limits<Real>::max();
            amrex::ParallelFor(bx,
            [tag_rho,tag_gradrho,rhoerr,gradrhoerr,rho,tag]
            AMREX_GPU_DEVICE (int i, int j, int k) noexcept
            {
                if (tag_rho and rho(i,j,k) > rhoerr) {
                    tag(i,j,k) = tagval;
                }
                if (tag_gradrho) {
                    Real ax = amrex::Math::abs(rho(i+1,j,k) - rho(i,j,k));
                    Real ay = amrex::Math::abs(rho(i,j+1,k) - rho(i,j,k));
                    Real az = amrex::Math::abs(rho(i,j,k+1) - rho(i,j,k));
                    ax = amrex::max(ax,amrex::Math::abs(rho(i,j,k) - rho(i-1,j,k)));
                    ay = amrex::max(ay,amrex::Math::abs(rho(i,j,k) - rho(i,j-1,k)));
                    az = amrex::max(az,amrex::Math::abs(rho(i,j,k) - rho(i,j,k-1)));
                    if (amrex::max(ax,ay,az) >= gradrhoerr) {
                        tag(i,j,k) = tagval;
                    }
                }
            });
        }
    }

We have been planning to move to class-based refinement criteria:

amr-wind/amr-wind/utilities/tagging/RefinementCriteria.H

Line 30 in 8cf1eb8

RefinementCriteria() = default;

The static refinement is implemented as a subclass.
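
For anyone following along: the listing above compares absolute differences between face-neighbor cells (not differences divided by the cell size) against the threshold. A minimal standalone sketch of the same idea, applied to an arbitrary scalar such as temperature, might look like the following. `Field3D` and `tag_by_difference` are hypothetical names used only for illustration; they are not AMR-Wind or AMReX types, and the sketch skips boundary cells instead of relying on ghost cells.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a cell-centered scalar field on a single box.
struct Field3D {
    int nx, ny, nz;
    std::vector<double> data;

    Field3D(int nx_, int ny_, int nz_, double v = 0.0)
        : nx(nx_), ny(ny_), nz(nz_),
          data(static_cast<std::size_t>(nx_) * ny_ * nz_, v) {}

    // Flat index with i varying fastest.
    std::size_t index(int i, int j, int k) const {
        return (static_cast<std::size_t>(k) * ny + j) * nx + i;
    }
    double& operator()(int i, int j, int k) { return data[index(i, j, k)]; }
    double operator()(int i, int j, int k) const { return data[index(i, j, k)]; }
};

// Tag (set to 1) every interior cell whose largest absolute difference to a
// face neighbor is >= threshold. Boundary cells are left untagged because
// this sketch has no ghost cells to read from.
std::vector<char> tag_by_difference(const Field3D& f, double threshold) {
    assert(f.nx >= 3 && f.ny >= 3 && f.nz >= 3);
    std::vector<char> tags(f.data.size(), 0);
    for (int k = 1; k < f.nz - 1; ++k) {
        for (int j = 1; j < f.ny - 1; ++j) {
            for (int i = 1; i < f.nx - 1; ++i) {
                const double c = f(i, j, k);
                const double ax = std::max(std::abs(f(i + 1, j, k) - c),
                                           std::abs(c - f(i - 1, j, k)));
                const double ay = std::max(std::abs(f(i, j + 1, k) - c),
                                           std::abs(c - f(i, j - 1, k)));
                const double az = std::max(std::abs(f(i, j, k + 1) - c),
                                           std::abs(c - f(i, j, k - 1)));
                if (std::max({ax, ay, az}) >= threshold) {
                    tags[f.index(i, j, k)] = 1;
                }
            }
        }
    }
    return tags;
}
```

Because the comparison uses raw differences, a threshold chosen on one level corresponds to a different physical gradient on a finer level, which may be one reason the listing above reads a per-level threshold (`gradrhoerr_v[lev]`).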

rd-contr · 2020-10-15T22:36:09Z

Here is a copy of the backtrace.

=== If no file names and line numbers are shown below, one can run
addr2line -Cpfie my_exefile my_line_address
to convert my_line_address (e.g., 0x4a6b) into file name and line number.
Or one can use amrex/Tools/Backtrace/parse_bt.py.

=== Please note that the line number reported by addr2line may not be accurate.
One can use
readelf -wl my_exefile | grep my_line_address'
to find out the offset for that line.

0: ~/amr-wind/bin/amr_wind() [0x694501]
amrex::BLBackTrace::print_backtrace_info(_IO_FILE*)
??:0

1: ~/amr-wind/bin/amr_wind() [0x69621a]
amrex::BLBackTrace::handler(int)
??:0

2: /lib64/libc.so.6(+0x36400) [0x2aaaabddb400]
__restore_rt
??:0

3: /lib64/libc.so.6(gsignal+0x37) [0x2aaaabddb387]
raise
??:0

4: /lib64/libc.so.6(abort+0x148) [0x2aaaabddca78]
abort
??:0

5: /lib64/libc.so.6(+0x78ed7) [0x2aaaabe1ded7]
__libc_message
??:0

6: /lib64/libc.so.6(+0x82aa6) [0x2aaaabe27aa6]
_int_malloc
??:0

7: /lib64/libc.so.6(__libc_malloc+0x4c) [0x2aaaabe2a6fc]
malloc
??:0

8: /app/gnu/7.3.0/lib64/libstdc++.so.6(_Znwm+0x18) [0x2aaaab5993d8]
operator new(unsigned long)
~/objdir/../gcc-7.3.0/libstdc++-v3/libsupc++/new_op.cc:50

9: ~/amr-wind/bin/amr_wind() [0x5eead0]
amrex::ParallelDescriptor::util::DoAllReduceReal(double*, unsigned int, int)
??:0

10: ~/amr-wind/bin/amr_wind() [0x51ebdc]
amr_wind::ABLWallFunction::computeplanar()
??:0

11: ~/amr-wind/bin/amr_wind() [0x51ffc5]
amr_wind::ABLWallFunction::update_umean()
??:0

12: ~/amr-wind/bin/amr_wind() [0x5088e5]
amr_wind::ABL::pre_advance_work()
??:0

13: ~/amr-wind/bin/amr_wind() [0x418ead]
incflo::pre_advance_stage2()
??:0

14: ~/amr-wind/bin/amr_wind() [0x41ecac]
incflo::Evolve()
??:0

15: ~/amr-wind/bin/amr_wind() [0x41017e]
main
??:0

16: /lib64/libc.so.6(__libc_start_main+0xf5) [0x2aaaabdc7555]
__libc_start_main
??:0

17: ~/amr-wind/bin/amr_wind() [0x41675b]
_start
??:0

rd-contr · 2020-10-15T22:38:02Z

Thanks for the tips on where to start poking around in the code for cell tagging for adaption. I'll take a closer look.

sayerhs · 2020-10-16T00:10:01Z

@rd-contr thanks for the backtrace. I see that the error is generated in computeplanar(), which currently has a limitation: at the height where it computes the mean velocities, the finest level must cover the entire x/y domain extent. We didn't anticipate that anyone would be doing dynamic refinement for ABL in the near term, so we decided to simplify the code with that limitation and planned to revisit it in the future.
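
To make the limitation concrete, here is a rough sketch of the kind of check it implies: given the boxes of the finest level, do they cover every cell of the horizontal plane at a given vertical index? This is not the AMR-Wind implementation. `Box` and `covers_plane` are hypothetical names, the boxes are assumed to be disjoint (as the grids of a single AMR level normally are), and indices are assumed to be expressed at the finest level's resolution.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical index-space box with inclusive lower and upper corners.
struct Box {
    int lo[3];
    int hi[3];
};

// Returns true if the (assumed disjoint) boxes cover every cell of the
// nx-by-ny horizontal plane at vertical index k.
bool covers_plane(const std::vector<Box>& boxes, int nx, int ny, int k) {
    assert(nx > 0 && ny > 0);
    long long covered = 0;
    for (const Box& b : boxes) {
        if (k < b.lo[2] || k > b.hi[2]) {
            continue; // this box does not intersect the plane
        }
        // Clip the box footprint to the domain before counting cells.
        const int ilo = std::max(b.lo[0], 0);
        const int ihi = std::min(b.hi[0], nx - 1);
        const int jlo = std::max(b.lo[1], 0);
        const int jhi = std::min(b.hi[1], ny - 1);
        if (ilo <= ihi && jlo <= jhi) {
            covered += static_cast<long long>(ihi - ilo + 1) * (jhi - jlo + 1);
        }
    }
    return covered == static_cast<long long>(nx) * ny;
}
```

With dynamic tagging, the level 1 grids in the original report covered only about 5.4% of the domain, so a check like this would fail at the averaging height.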

By any chance, do you also have the log output captured to a file? I am curious what the density solve is doing that it changed density field sufficiently to trigger regrid.

sayerhs · 2020-10-16T00:37:56Z

@rd-contr Looking at the ABL codebase, I think there is a workaround that will allow you to run dynamic mesh adaptation with ABL (hopefully using something other than density as a criterion). Let's say your Level0 mesh is 10m resolution (the coarsest resolution) and you want max_level = 3. Then you can create a static refinement near the ground (up to 20m, which encompasses the first two cells off the wall in the coarsest level) such that, throughout the domain, the resolution between 0-20m is always the finest resolution (i.e., 1.25m at Level3). I realize this is not a great solution, but at least it will let you continue with your code evaluation.

Currently, the ABL LES wall shear stress model requires planar averaging, and it is tricky to do this properly with varying resolutions near the ground. By ensuring that the first two cells off the ground (in the coarsest mesh) remain at a uniform, static resolution, we can accommodate the current wall function implementation.
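
The numbers in the workaround follow from halving the cell size at each level. A small sketch of that arithmetic, assuming a refinement ratio of 2 between levels (which is what 10m at Level0 and 1.25m at Level3 implies); `cell_size` and `fine_cells_in_band` are hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>

// Cell size at a given AMR level, assuming a refinement ratio of 2.
double cell_size(double dx0, int level) {
    assert(dx0 > 0.0 && level >= 0);
    return dx0 / std::ldexp(1.0, level); // dx0 / 2^level
}

// Number of finest-level cells (in one direction) needed to span the first
// n_coarse level-0 cells, again assuming a refinement ratio of 2.
int fine_cells_in_band(int n_coarse, int max_level) {
    assert(n_coarse > 0 && max_level >= 0 && max_level < 20);
    return n_coarse << max_level; // n_coarse * 2^max_level
}
```

For the example above, `cell_size(10.0, 3)` gives 1.25 and `fine_cells_in_band(2, 3)` gives 16, so the 0-20m band is 16 cells tall at Level3. For the input in the original report (1000m over 200 cells, max_level = 1), the same arithmetic gives 5m at level 0 and 2.5m at level 1.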

rd-contr · 2020-10-16T15:49:30Z

Here is the output that was written to stdout/stderr during the run. The density variations are very small. I ran a case without adaption, looked at the gradients of density in the result, and then set the density gradient for the adaption accordingly. The gradient is incredibly small (on the order of 1e-13), so the settings are really contrived. I was just trying to get it to adapt to see how the code behaves.

run.txt

rd-contr · 2020-10-16T15:50:21Z

Thanks for the suggestion about adapting near the ground. Let me give that a try and see what it does.

sayerhs · 2020-10-16T19:36:34Z

@rd-contr Thanks, I'll take a look at the log file. Also with #229 merged, your previous experiment might work if you want to try.

rd-contr · 2020-10-19T03:17:58Z

Looks like my previous experiment is working with the latest version of the code (I pulled to commit 0fb46db). The resulting adapted grid still doesn't look as I would expect but I'm sure that's because adaption based on density or density gradient is still not the right quantity to target for adaption. I plan to take a look at tagging based on a different quantity and see what that does. I'm going to go ahead and close this issue as the segfault has been addressed by recent updates.

sayerhs · 2021-02-12T22:27:42Z

@rd-contr We recently updated the wall shear-stress model (see PR #335) to behave better with nested mesh refinement near the ground. You might want to check whether the latest options help with your problem.

rd-contr closed this as completed Oct 19, 2020
