-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize 2-layer configs for MultiLayerQG
#270
Conversation
MultiLayerQG
90edadb is a temporary hack |
Needs to be cleaned up. But seems to work!! |
test/runtests.jl
Outdated
using BenchmarkTools | ||
|
||
dev = GPU() | ||
|
||
@show nlayers = 2 | ||
prob = GeophysicalFlows.MultiLayerQG.Problem(nlayers, dev) | ||
@btime stepforward!(prob) | ||
|
||
@show nlayers = 3 | ||
prob = GeophysicalFlows.MultiLayerQG.Problem(nlayers, dev) | ||
@btime stepforward!(prob) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this before merging. I just put it here to make sure I don't introduce some slow-down shenanigans...
Qy :: Aphys3D | ||
"rfft plan for FFTs" | ||
rfftplan :: Trfft | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seemed to me we could have NLayerParams{N, ...}
to avoid copy-paste for SingleLayerParams
(or even better, just add N
to Params
...) This is totally fine with me now if it works, we can clean up later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me! It seems possible to reuse one Params
struct by adding a type parameter N
corresponding to the number of layers. Then we can define the aliases
const SingleLayerParams = Params{1}
const TwoLayerParams = Params{2}
I don't see this as necessary for this PR though, just future cleanup.
I agree! Can you open an issue with these ideas for a future cleanup? |
if nlayers==2 | ||
return TwoLayerParams(T(g), T(f₀), T(β), A(ρ), (T(H[1]), T(H[2])), U, eta, T(μ), T(ν), nν, calcFq, T(g′[1]), Qx, Qy, rfftplanlayered) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my own understanding - Why would this make it much faster? It seems we are still calculating S and S-1 in lines 330-334 when nlayers==2, even though they are not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@szy21, this part is not what makes the difference. Either way computing or not the S and S^{-1} matrices only happens once so that was not what was causing the slowdown.
But by defining a new AbstractParameter type for 2-layers then we can add a new method to streamfunctionfrompv!
that works specifically for 2-layers.
See, e.g., this general one
GeophysicalFlows.jl/src/multilayerqg.jl
Lines 560 to 572 in 4c310b8
""" | |
streamfunctionfrompv!(ψh, qh, params, grid) | |
Invert the PV to obtain the Fourier transform of the streamfunction `ψh` in each layer from | |
`qh` using `ψh = params.S⁻¹ qh`. | |
""" | |
function streamfunctionfrompv!(ψh, qh, params, grid) | |
for j=1:grid.nl, i=1:grid.nkr | |
CUDA.@allowscalar @views ψh[i, j, :] .= params.S⁻¹[i, j] * qh[i, j, :] | |
end | |
return nothing | |
end |
versus this 2-layer specific one that has the inversion hard-coded in.
GeophysicalFlows.jl/src/multilayerqg.jl
Lines 586 to 618 in 4c310b8
""" | |
streamfunctionfrompv!(ψh, qh, params::TwoLayerParams, grid) | |
Invert the PV to obtain the Fourier transform of the streamfunction `ψh` for the special | |
case of a two fluid layer configuration. In this case we have, | |
```math | |
ψ̂₁ = - [k⁻² q̂₁ + (f₀² / g′) (q̂₁ / H₂ + q̂₂ / H₁)] / Δ , | |
``` | |
```math | |
ψ̂₂ = - [k⁻² q̂₂ + (f₀² / g′) (q̂₁ / H₂ + q̂₂ / H₁)] / Δ , | |
``` | |
where ``Δ = k² [k² + f₀² (H₁ + H₂) / (g′ H₁ H₂)]``. | |
(Here, the PV-streamfunction relationship is hard-coded to avoid scalar operations | |
on the GPU.) | |
""" | |
function streamfunctionfrompv!(ψh, qh, params::TwoLayerParams, grid) | |
f₀, g′, H₁, H₂ = params.f₀, params.g′, params.H[1], params.H[2] | |
q1h, q2h = qh[:, :, 1], qh[:, :, 2] | |
@. ψh[:, :, 1] = - grid.Krsq * q1h - f₀^2 / g′ * (q1h / H₂ + q2h / H₁) | |
@. ψh[:, :, 2] = - grid.Krsq * q2h - f₀^2 / g′ * (q1h / H₂ + q2h / H₁) | |
for j in 1:2 | |
@. ψh[:, :, j] *= grid.invKrsq / (grid.Krsq + f₀^2 / g′ * (H₁ + H₂) / (H₁ * H₂)) | |
end | |
return nothing | |
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for doing this! It will make my life much easier:)
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
This PR attempts to optimize the 2-layer configuration of
MultiLayerQG
module by hard-coding the PV-streamfunction relationships. To do so we create aTwoLayerParams <: AbstractParams
and then use this type to add methods in bothstreamfunctionfrompv!
andpvfromstreamfunction!
functions that include the hard-coded PV-streamfunction relationships for 2-layer configurations.Some benchmarks for 2- and 3-layer configs with
nx = ny = 128
:with
nlayers=2
before this PR
after this PR
with
nlayers=3
(shouldn't be affected by this PR)before this PR
after this PR
1650-fold faster for 2-layer config. We'll take that as an improvement. (Possibly even faster for higher grid resolution!)
Closes #267.