Segfault after grid adaption #228

Closed
rd-contr opened this issue Oct 15, 2020 · 13 comments

Comments

@rd-contr

I am trying to run the abl_godunov_cn case with adaption turned on, by turning off constant density and specifying incflo.gradrhoerr. The grid adaption itself appears to work, but the process segfaults before any iteration completes on the adapted grid. The solver version and input are below. It is entirely possible that I don't have all of the settings in place for AMR, so I'm open to any suggestions on how best to get AMR to work with ABL cases.

Solver output before segfault
Regrid mesh ... time elapsed = 0.02479205467
Grid summary:
Level 0 343 grids 8000000 cells 100 % of domain
smallest grid: 24 x 24 x 24 biggest grid: 32 x 32 x 32
Level 1 427 grids 3472384 cells 5.4256 % of domain
smallest grid: 8 x 8 x 8 biggest grid: 32 x 32 x 32

Step: 100 dt: 0.3355257033 Time: 56.5592 to 56.8947
CFL: 0.95 (conv: 0.949537 diff: 0 src: 0.0209629 )

Godunov:
System Iters Initial residual Final residual
----------------------------------------------------------------------------
Segfault

Solver Build Settings
AMR-Wind (https://github.com/exawind/amr-wind)

AMR-Wind Git SHA :: 01187b8
AMReX version :: 20.09-80-g61734d3da08b ( 20.09-80-g61734d3da08b )

Exec. date :: Fri Oct 9 19:49:40 2020
Build date :: Oct 5 2020 19:14:13
C++ compiler :: GNU 7.3.0

MPI :: ON (Num. ranks = 96)
GPU :: OFF
OpenMP :: OFF

Solver Input
time.stop_time = 200.0 # Max (simulated) time to evolve
time.max_step = -1 # Max number of time steps

time.fixed_dt = -0.5 # Use this constant dt if > 0
time.cfl = 0.95 # CFL factor

io.KE_int = 1
io.line_plot_int = 1
time.plot_interval = 100 # Steps between plot files
time.checkpoint_interval = -1000 # Steps between checkpoint files
amr.plt_tracer = 1

incflo.gravity = 0. 0. -9.81 # Gravitational force (3D)
incflo.density = 1.0 # Reference density
incflo.constant_density = 0

incflo.use_godunov = 1
#incflo.diffusion_type = 1
transport.viscosity = 1.0e-5
transport.laminar_prandtl = 0.7
transport.turbulent_prandtl = 0.3333
turbulence.model = Smagorinsky
Smagorinsky_coeffs.Cs = 0.135

incflo.physics = ABL
ICNS.source_terms = BoussinesqBuoyancy CoriolisForcing ABLForcing
BoussinesqBuoyancy.reference_temperature = 300.0
ABL.reference_temperature = 300.0
CoriolisForcing.latitude = 41.3
ABLForcing.abl_forcing_height = 90

incflo.velocity = 6.128355544951824 5.142300877492314 0.0

ABL.temperature_heights = 650.0 750.0 1000.0
ABL.temperature_values = 300.0 308.0 308.75

ABL.kappa = .41
ABL.surface_roughness_z0 = 0.15

amr.n_cell = 200 200 200 # Grid cells at coarsest AMRlevel
amr.max_level = 1 # Max AMR level in hierarchy
time.regrid_interval = 50
incflo.gradrhoerr = 0.0000000000003

geometry.prob_lo = 0. 0. 0. # Lo corner coordinates
geometry.prob_hi = 1000. 1000. 1000. # Hi corner coordinates
geometry.is_periodic = 1 1 0 # Periodicity x y z (0/1)

zlo.type = "wall_model"
zlo.temperature_type = "fixed_gradient"
zlo.temperature = 0.0

zhi.type = "slip_wall"
zhi.temperature_type = "fixed_gradient"
zhi.temperature = 0.003 # tracer is used to specify potential temperature gradient

incflo.verbose = 0 # incflo_level

amrex.fpe_trap_invalid = 0 # Trap NaNs

@sayerhs
Contributor

sayerhs commented Oct 15, 2020

Non-constant density with ABL simulations has not been tested; source terms such as BoussinesqBuoyancy assume constant density. I am also not sure that density is the right criterion for regridding in ABL flows. That said, it is unclear where your segfault is occurring, so a stack trace would be useful if you can provide one.

If you want to just try multiple levels of refinement, have you tried this regression test: https://github.com/Exawind/amr-wind/tree/development/test/test_files/abl_godunov_static_refinement

@rd-contr
Author

Understood on constant versus non-constant density. I was only running with non-constant density because density variation appeared to be the only quantity through which AMR could be triggered using the incflo input settings. I see that incflo and IAMR have different settings to trigger AMR, but it wasn't clear to me whether those were specific to incflo or IAMR, or whether they were underlying AMReX settings. I did see the static refinement test case. Unfortunately, the application that I am looking at (not wind energy related) would likely have dynamic regions where adaption needs to take place. Given that the static refinement case has been tested, I will look more closely to see whether I could use it for my intended use case. I may also look at the source a little to see what it would take to tag cells for adaption based on something other than density or density gradient.

Let me see if I can get the run to generate a backtrace that I could send related to the segfault. For whatever reason, it didn't automatically generate a backtrace.

@sayerhs
Contributor

sayerhs commented Oct 15, 2020

@rd-contr Understood. In that case, I'll recommend the Rayleigh-Taylor case, which uses density-based dynamic adaptation. It is a better case for exploring dynamic adaptation than the ABL case you were playing with: https://github.com/Exawind/amr-wind/blob/development/test/test_files/rayleigh_taylor_godunov/rayleigh_taylor_godunov.i

By default, AMReX traps segfaults and writes Backtrace.* files, so you might want to look there. You can also disable this trapping by setting the following in the input file:

amrex.throw_exception = 1
amrex.signal_handling = 0

Also if you don't mind, may I ask what application you were considering using AMR-Wind for?

@sayerhs
Contributor

sayerhs commented Oct 15, 2020

The density based tagging is implemented here:

    if (tag_rho or tag_gradrho)
    {
        Array4<Real const> const& rho = den.const_array(mfi);
        Real rhoerr = tag_rho ? rhoerr_v[lev] : std::numeric_limits<Real>::max();
        Real gradrhoerr = tag_gradrho ? gradrhoerr_v[lev] : std::numeric_limits<Real>::max();
        amrex::ParallelFor(bx,
        [tag_rho,tag_gradrho,rhoerr,gradrhoerr,rho,tag]
        AMREX_GPU_DEVICE (int i, int j, int k) noexcept
        {
            if (tag_rho and rho(i,j,k) > rhoerr) {
                tag(i,j,k) = tagval;
            }
            if (tag_gradrho) {
                Real ax = amrex::Math::abs(rho(i+1,j,k) - rho(i,j,k));
                Real ay = amrex::Math::abs(rho(i,j+1,k) - rho(i,j,k));
                Real az = amrex::Math::abs(rho(i,j,k+1) - rho(i,j,k));
                ax = amrex::max(ax,amrex::Math::abs(rho(i,j,k) - rho(i-1,j,k)));
                ay = amrex::max(ay,amrex::Math::abs(rho(i,j,k) - rho(i,j-1,k)));
                az = amrex::max(az,amrex::Math::abs(rho(i,j,k) - rho(i,j,k-1)));
                if (amrex::max(ax,ay,az) >= gradrhoerr) {
                    tag(i,j,k) = tagval;
                }
            }
        });
    }
}

We have been planning to use class-based refinement criteria:

    RefinementCriteria() = default;

The static refinement is implemented as a subclass.
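
As a rough illustration of what a class-based refinement criterion could look like, here is a minimal sketch. Only the class name `RefinementCriteria` comes from the comment above; the `should_tag` method, the `StaticBoxRefinement` subclass, and the `any_tags` helper are assumptions made for this example, and the actual AMR-Wind interface may differ.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical interface: one virtual query per cell center.
class RefinementCriteria {
public:
    RefinementCriteria() = default;
    virtual ~RefinementCriteria() = default;

    // Return true if the cell centered at (x, y, z) on `level` should be refined.
    virtual bool should_tag(int level, double x, double y, double z) const = 0;
};

// Static refinement expressed as a subclass: tag every cell whose center
// lies inside a fixed axis-aligned box, regardless of the solution.
class StaticBoxRefinement : public RefinementCriteria {
public:
    StaticBoxRefinement(std::array<double, 3> lo, std::array<double, 3> hi)
        : lo_(lo), hi_(hi) {
        for (int d = 0; d < 3; ++d) {
            assert(lo_[d] < hi_[d]);
        }
    }

    bool should_tag(int /*level*/, double x, double y, double z) const override {
        return x >= lo_[0] && x < hi_[0] &&
               y >= lo_[1] && y < hi_[1] &&
               z >= lo_[2] && z < hi_[2];
    }

private:
    std::array<double, 3> lo_;
    std::array<double, 3> hi_;
};

// A cell is tagged if any registered criterion asks for it.
bool any_tags(const std::vector<std::unique_ptr<RefinementCriteria>>& criteria,
              int level, double x, double y, double z) {
    for (const auto& c : criteria) {
        if (c->should_tag(level, x, y, z)) {
            return true;
        }
    }
    return false;
}
```

With this kind of design, a dynamic criterion (for example one based on a field other than density) would be another subclass registered alongside the static box, without touching the tagging loop.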

@rd-contr
Author

Here is a copy of the backtrace.

=== If no file names and line numbers are shown below, one can run
addr2line -Cpfie my_exefile my_line_address
to convert my_line_address (e.g., 0x4a6b) into file name and line number.
Or one can use amrex/Tools/Backtrace/parse_bt.py.

=== Please note that the line number reported by addr2line may not be accurate.
One can use
readelf -wl my_exefile | grep my_line_address'
to find out the offset for that line.

0: ~/amr-wind/bin/amr_wind() [0x694501]
amrex::BLBackTrace::print_backtrace_info(_IO_FILE*)
??:0

1: ~/amr-wind/bin/amr_wind() [0x69621a]
amrex::BLBackTrace::handler(int)
??:0

2: /lib64/libc.so.6(+0x36400) [0x2aaaabddb400]
__restore_rt
??:0

3: /lib64/libc.so.6(gsignal+0x37) [0x2aaaabddb387]
raise
??:0

4: /lib64/libc.so.6(abort+0x148) [0x2aaaabddca78]
abort
??:0

5: /lib64/libc.so.6(+0x78ed7) [0x2aaaabe1ded7]
__libc_message
??:0

6: /lib64/libc.so.6(+0x82aa6) [0x2aaaabe27aa6]
_int_malloc
??:0

7: /lib64/libc.so.6(__libc_malloc+0x4c) [0x2aaaabe2a6fc]
malloc
??:0

8: /app/gnu/7.3.0/lib64/libstdc++.so.6(_Znwm+0x18) [0x2aaaab5993d8]
operator new(unsigned long)
~/objdir/../gcc-7.3.0/libstdc++-v3/libsupc++/new_op.cc:50

9: ~/amr-wind/bin/amr_wind() [0x5eead0]
amrex::ParallelDescriptor::util::DoAllReduceReal(double*, unsigned int, int)
??:0

10: ~/amr-wind/bin/amr_wind() [0x51ebdc]
amr_wind::ABLWallFunction::computeplanar()
??:0

11: ~/amr-wind/bin/amr_wind() [0x51ffc5]
amr_wind::ABLWallFunction::update_umean()
??:0

12: ~/amr-wind/bin/amr_wind() [0x5088e5]
amr_wind::ABL::pre_advance_work()
??:0

13: ~/amr-wind/bin/amr_wind() [0x418ead]
incflo::pre_advance_stage2()
??:0

14: ~/amr-wind/bin/amr_wind() [0x41ecac]
incflo::Evolve()
??:0

15: ~/amr-wind/bin/amr_wind() [0x41017e]
main
??:0

16: /lib64/libc.so.6(__libc_start_main+0xf5) [0x2aaaabdc7555]
__libc_start_main
??:0

17: ~/amr-wind/bin/amr_wind() [0x41675b]
_start
??:0

@rd-contr
Author

Thanks for the tips on where to start poking around in the code for cell tagging for adaption. I'll take a closer look.

@sayerhs
Contributor

sayerhs commented Oct 16, 2020

@rd-contr Thanks for the backtrace. The error is generated in computeplanar(), which currently requires the finest level to cover the entire x/y extent of the domain at the height where it computes the mean velocities. We didn't anticipate anyone doing dynamic refinement for ABL in the near term, so we simplified the code with that limitation and planned to revisit it in the future.
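
To make the limitation concrete, the following sketch computes a planar mean from finest-level patches and refuses to proceed when those patches do not cover the whole horizontal plane. It does not reproduce the AMR-Wind implementation of computeplanar(); `PlanePatch` and `planar_mean` are hypothetical names, and the patches are assumed not to overlap.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical patch of finest-level data on one horizontal plane.
struct PlanePatch {
    int ilo, ihi, jlo, jhi;  // inclusive index bounds of the patch
    std::vector<double> u;   // (ihi-ilo+1)*(jhi-jlo+1) values on the plane
};

// Mean of u over an nx-by-ny plane assembled from non-overlapping patches.
// Throws if the patches do not cover the whole plane, instead of silently
// averaging over a partial plane.
double planar_mean(const std::vector<PlanePatch>& patches, int nx, int ny) {
    assert(nx > 0 && ny > 0);
    double sum = 0.0;
    long covered = 0;
    for (const auto& p : patches) {
        const long n = static_cast<long>(p.ihi - p.ilo + 1) * (p.jhi - p.jlo + 1);
        if (static_cast<long>(p.u.size()) != n) {
            throw std::invalid_argument("patch data size does not match its bounds");
        }
        for (double v : p.u) {
            sum += v;
        }
        covered += n;
    }
    if (covered != static_cast<long>(nx) * ny) {
        throw std::runtime_error("finest level does not cover the full x/y plane");
    }
    return sum / static_cast<double>(covered);
}
```

An explicit check of this kind turns an incomplete plane into a clear error message rather than undefined behavior further down the call stack.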

By any chance, do you also have the log output captured to a file? I am curious what the density solve is doing that changed the density field enough to trigger a regrid.

@sayerhs
Contributor

sayerhs commented Oct 16, 2020

@rd-contr Looking at the ABL codebase, I think there is a workaround that will allow you to run dynamic mesh adaptation with ABL (hopefully using something other than density as a criterion). Let's say your Level0 mesh is 10m resolution (the coarsest resolution) and you want max_level = 3. You can then create a static refinement near the ground (up to 20m, that is, encompassing the first two cells off of the wall in the coarsest level) such that, throughout the domain, the resolution between 0-20m is always the finest resolution (i.e., 1.25m at Level3). I realize this is not a great solution, but at least it will let you continue on with your code evaluation.

Currently, the ABL LES wall shear stress model requires planar averaging, and it is tricky to do this properly with varying resolutions near the ground. So by ensuring that the first two cells off the ground (in the coarsest mesh) remain at uniform resolution and static, we can accommodate the current wall function implementation.
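
The numbers in the workaround above follow from halving the cell size at each level. The sketch below reproduces that arithmetic; the function names are hypothetical, and a constant refinement ratio of 2 between levels is assumed (which is consistent with 10m at Level0 becoming 1.25m at Level3).

```cpp
#include <cassert>
#include <cmath>

// Cell size on a given AMR level, assuming a constant refinement ratio.
double cell_size(double dx_level0, int level, int ref_ratio = 2) {
    assert(level >= 0 && ref_ratio >= 2);
    return dx_level0 / std::pow(static_cast<double>(ref_ratio), level);
}

// Number of cells of the given level needed to span a vertical extent.
int cells_spanning(double height, double dx_level0, int level, int ref_ratio = 2) {
    return static_cast<int>(std::ceil(height / cell_size(dx_level0, level, ref_ratio)));
}
```

For the example in the comment, `cell_size(10.0, 3)` gives 1.25 and `cells_spanning(20.0, 10.0, 3)` gives 16, so the static band of two coarse cells becomes 16 fine cells in the vertical. Applied to the input posted in this issue (1000m over 200 cells, so 5m at Level0, with max_level = 1), the same reasoning would give a static band from 0 to 10m at 2.5m resolution.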

@rd-contr
Author

Here is the output that was written to stdout/stderr during the run. The density variations are very small. I ran a case without adaption, looked at the gradients of density in the result, and then set the density gradient for the adaption accordingly. The gradient is incredibly small (on the order of 1e-13), so the settings are really contrived. I was just trying to get it to adapt to see how the code behaves.

run.txt

@rd-contr
Author

Thanks for the suggestion about adapting near the ground. Let me give that a try and see what it does.

@sayerhs
Contributor

sayerhs commented Oct 16, 2020

@rd-contr Thanks, I'll take a look at the log file. Also with #229 merged, your previous experiment might work if you want to try.

@rd-contr
Author

Looks like my previous experiment is working with the latest version of the code (I pulled to commit 0fb46db). The resulting adapted grid still doesn't look as I would expect but I'm sure that's because adaption based on density or density gradient is still not the right quantity to target for adaption. I plan to take a look at tagging based on a different quantity and see what that does. I'm going to go ahead and close this issue as the segfault has been addressed by recent updates.

@sayerhs
Contributor

sayerhs commented Feb 12, 2021

@rd-contr We recently updated the wall shear-stress model (see PR #335) to behave better with nested mesh refinement near the ground. You might want to check whether the latest options help with your problem.
