Show convergence bottleneck nodes to users #452
We already depend on TimerOutputs but barely use it. If we hook it up properly, the results can be really helpful, as they are for https://github.com/trixi-framework/Trixi.jl.
We have fairly extensive experience with slowly running models in variable density SEAWAT simulations, since the solute transport component MT3D does adaptive timestepping. Without exception, large flows in small cells are a problem. We analyze those by taking the flow output and computing flow velocities. The same is probably true for Ribasim: if you have output, you can color edges by computed flows. You can also compute derived data, such as average residence times (current storage / current outflow). These can easily be plotted on a map, and will likely help identify which nodes are problematic.
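A minimal sketch of the residence time idea, assuming per-basin storage and outflow are available as plain vectors. The variable names, values and units are made up for illustration and do not reflect the actual Ribasim output format.

```julia
# Hypothetical per-basin output; real names and units depend on the model output.
node_id = [1, 2, 3, 4]
storage = [5.0e4, 1.2e2, 8.0e5, 3.0e3]   # current storage, m^3
outflow = [2.0, 1.5, 4.0, 0.0]           # current outflow, m^3/s

# Residence time = storage / outflow; a basin without outflow never drains.
residence_time = [q > 0 ? s / q : Inf for (s, q) in zip(storage, outflow)]

# Shortest residence time first: these basins react fastest, so they are
# the first candidates to check when the solver takes small time steps.
order = sortperm(residence_time)
for i in order
    println("Basin #", node_id[i], ": residence time = ", residence_time[i], " s")
end
```

The same ranking can be joined to the node geometry and used as a color scale on a map.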
Flow velocities are also appreciated as output from an ecological perspective (as indicated by Ellis Penning).
Another thought: we're currently relying on DifferentialEquations.jl machinery to do the timestepping for us. As far as I know, it compares a higher-order and a lower-order solution to estimate whether the time step size has to be adjusted. Essentially, I expected it to compare This should help us and users to identify what the problematic nodes and connections are.
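To make the higher versus lower order comparison concrete, here is a toy sketch using an embedded Heun-Euler pair on a made-up two-state system. It is not what DifferentialEquations.jl does internally for any particular algorithm; the rates, tolerances and the scaling formula are assumptions chosen to mirror the usual tolerance-scaled error estimate.

```julia
# Toy system: two storages draining linearly, du/dt = -rate .* u.
# The rates are made up; state 2 drains far faster than state 1.
rate = [1.0e-4, 1.0e-1]
f(u) = -rate .* u

u = [1000.0, 1000.0]   # current state
dt = 10.0              # trial time step
abstol, reltol = 1e-6, 1e-3

# Embedded Heun-Euler pair: both solutions reuse the same stage evaluations.
k1 = f(u)
u_low = u .+ dt .* k1                    # first order (explicit Euler)
k2 = f(u_low)
u_high = u .+ (dt / 2) .* (k1 .+ k2)     # second order (Heun)

# Per-state error estimate, scaled by the tolerances.
err = abs.(u_high .- u_low)
scaled = err ./ (abstol .+ reltol .* max.(abs.(u), abs.(u_high)))

worst = argmax(scaled)
println("Scaled errors: ", scaled)
println("State ", worst, " limits the step size")
```

A scaled error well above 1 for a single state means that state alone would cause the step to be rejected or shrunk, which is the per-node information we would like to surface.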
I asked about this on the Julia discourse: https://discourse.julialang.org/t/find-out-which-state-s-cause-convergence-problems/113738
Chris (the man, the myth, the legend) has responded. Something like this works (for certain algorithms):

    (; cache, p) = model.integrator
    (; basin) = p
    # `atmp` is only present in the cache of certain algorithms.
    if hasproperty(cache, :atmp)
        storage_error = abs.(cache.atmp.storage)
        # Largest error first.
        perm = sortperm(storage_error, rev=true)
        println("Basins in descending order of being a convergence bottleneck:")
        for i in perm
            node_id = basin.node_id.values[i]
            err = storage_error[i]
            println("$node_id, (error = $err)")
        end
    end

This should probably be logged, though, and we should have a sensible threshold value for when the error is large enough to mention. That also relates to how these error values should be interpreted, which I don't know exactly how to do yet. Edit: they come from
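A rough sketch of the logged, thresholded variant. `bottleneck_basins` is a hypothetical helper, its inputs stand in for `basin.node_id.values` and `abs.(cache.atmp.storage)`, and the default threshold is a placeholder rather than a recommended value.

```julia
# Hypothetical helper: returns the basins whose error exceeds `threshold`,
# sorted from largest to smallest error.
function bottleneck_basins(node_ids, errors; threshold = 1.0)
    perm = sortperm(errors; rev = true)
    return [(node_id = node_ids[i], error = errors[i]) for i in perm if errors[i] > threshold]
end

# Made-up node IDs and error values for illustration.
bottlenecks = bottleneck_basins([1, 3, 6], [0.02, 4.5, 1.3])

# Only log when at least one basin exceeds the threshold.
if !isempty(bottlenecks)
    @warn "Basins that are convergence bottlenecks, worst first" bottlenecks
end
```

Returning the list instead of printing it keeps the helper testable and leaves the choice of log level to the caller.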
Fixes #452. To do:
- [x] Implement #452 (comment)
- [x] Add some more basins to the model and test that the quickly emptying one is the bottleneck.
Related to #419, if a run doesn't converge or is slower than expected, we need good tools to investigate the issue.
One thing that will help is to look at the solver stats:
Additionally, it would be nice to easily locate the nodes in the network that cause the issues. One thing @Huite suggested was to calculate some stability criterion, like the Basin retention time or Courant number.
The goal of this issue is to collect some tools, some of which could be useful for the documentation.