Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize 2-layer configs for MultiLayerQG #270

Merged
merged 24 commits into from
Nov 21, 2021
Merged

Optimize 2-layer configs for MultiLayerQG #270

merged 24 commits into from
Nov 21, 2021

Conversation

navidcy
Copy link
Member

@navidcy navidcy commented Nov 20, 2021

This PR attempts to optimize the 2-layer configuration of MultiLayerQG module by hard-coding the PV-streamfunction relationships. To do so we create a TwoLayerParams <: AbstractParams and then use this type to add methods in both streamfunctionfrompv! and pvfromstreamfunction! functions that include the hard-coded PV-streamfunction relationships for 2-layer configurations.

Some benchmarks for 2- and 3-layer configs with nx = ny = 128:

with nlayers=2

before this PR

julia> using GeophysicalFlows, BenchmarkTools

julia> nlayers = 2; prob = MultiLayerQG.Problem(nlayers, GPU()); @btime stepforward!(prob)
3.317 s (700757 allocations: 114.42 MiB)

after this PR

julia> using GeophysicalFlows, BenchmarkTools

julia> nlayers = 2; prob = MultiLayerQG.Problem(nlayers, GPU()); @btime stepforward!(prob)
1.923 ms (3385 allocations: 321.58 KiB)

with nlayers=3 (shouldn't be affected by this PR)

before this PR

julia> using GeophysicalFlows, BenchmarkTools

julia> nlayers = 3; prob = MultiLayerQG.Problem(nlayers, GPU()); @btime stepforward!(prob)
6.194 s (1299797 allocations: 213.44 MiB)

after this PR

julia> using GeophysicalFlows, BenchmarkTools

julia> nlayers = 3; prob = MultiLayerQG.Problem(nlayers, GPU()); @btime stepforward!(prob)
5.641 s (1299797 allocations: 213.44 MiB)

1650-fold faster for 2-layer config. We'll take that as an improvement. (Possibly even faster for higher grid resolution!)

Closes #267.

@navidcy navidcy changed the title Ncc/optimize 2layer Optimize 2-layer configs for MultiLayerQG Nov 20, 2021
@navidcy
Copy link
Member Author

navidcy commented Nov 20, 2021

90edadb is a temporary hack

@navidcy navidcy requested review from glwagner and szy21 November 20, 2021 00:57
@navidcy
Copy link
Member Author

navidcy commented Nov 20, 2021

Needs to be cleaned up. But seems to work!!

test/runtests.jl Outdated
Comment on lines 26 to 36
using BenchmarkTools

dev = GPU()

@show nlayers = 2
prob = GeophysicalFlows.MultiLayerQG.Problem(nlayers, dev)
@btime stepforward!(prob)

@show nlayers = 3
prob = GeophysicalFlows.MultiLayerQG.Problem(nlayers, dev)
@btime stepforward!(prob)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this before merging. I just put it here to make sure I don't introduce some slow-down shenanigans...

@navidcy
Copy link
Member Author

navidcy commented Nov 20, 2021

ok @glwagner and @szy21, feel free to review this!

Qy :: Aphys3D
"rfft plan for FFTs"
rfftplan :: Trfft
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed to me we could have NLayerParams{N, ...} to avoid copy-paste for SingleLayerParams (or even better, just add N to Params...) This is totally fine with me now if it works, we can clean up later.

src/multilayerqg.jl Outdated Show resolved Hide resolved
src/multilayerqg.jl Outdated Show resolved Hide resolved
src/multilayerqg.jl Outdated Show resolved Hide resolved
Copy link
Member

@glwagner glwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me! It seems possible to reuse one Params struct by adding a type parameter N corresponding to the number of layers. Then we can define the aliases

const SingleLayerParams = Params{1}
const TwoLayerParams = Params{2}

I don't see this as necessary for this PR though, just future cleanup.

@navidcy
Copy link
Member Author

navidcy commented Nov 20, 2021

Looks great to me! It seems possible to reuse one Params struct by adding a type parameter N corresponding to the number of layers. Then we can define the aliases

const SingleLayerParams = Params{1}

const TwoLayerParams = Params{2}

I don't see this as necessary for this PR though, just future cleanup.

I agree! Can you open an issue with these ideas for a future cleanup?

Comment on lines +344 to +345
if nlayers==2
return TwoLayerParams(T(g), T(f₀), T(β), A(ρ), (T(H[1]), T(H[2])), U, eta, T(μ), T(ν), nν, calcFq, T(g′[1]), Qx, Qy, rfftplanlayered)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own understanding - Why would this make it much faster? It seems we are still calculating S and S-1 in lines 330-334 when nlayers==2, even though they are not needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@szy21, this part is not what makes the difference. Either way computing or not the S and S^{-1} matrices only happens once so that was not what was causing the slowdown.

But by defining a new AbstractParameter type for 2-layers then we can add a new method to streamfunctionfrompv! that works specifically for 2-layers.

See, e.g., this general one

"""
streamfunctionfrompv!(ψh, qh, params, grid)
Invert the PV to obtain the Fourier transform of the streamfunction `ψh` in each layer from
`qh` using `ψh = params.S⁻¹ qh`.
"""
function streamfunctionfrompv!(ψh, qh, params, grid)
for j=1:grid.nl, i=1:grid.nkr
CUDA.@allowscalar @views ψh[i, j, :] .= params.S⁻¹[i, j] * qh[i, j, :]
end
return nothing
end

versus this 2-layer specific one that has the inversion hard-coded in.

"""
streamfunctionfrompv!(ψh, qh, params::TwoLayerParams, grid)
Invert the PV to obtain the Fourier transform of the streamfunction `ψh` for the special
case of a two fluid layer configuration. In this case we have,
```math
ψ̂₁ = - [k⁻² q̂₁ + (f₀² / g′) (q̂₁ / H₂ + q̂₂ / H₁)] / Δ ,
```
```math
ψ̂₂ = - [k⁻² q̂₂ + (f₀² / g′) (q̂₁ / H₂ + q̂₂ / H₁)] / Δ ,
```
where ``Δ = k² [k² + f₀² (H₁ + H₂) / (g′ H₁ H₂)]``.
(Here, the PV-streamfunction relationship is hard-coded to avoid scalar operations
on the GPU.)
"""
function streamfunctionfrompv!(ψh, qh, params::TwoLayerParams, grid)
f₀, g′, H₁, H₂ = params.f₀, params.g′, params.H[1], params.H[2]
q1h, q2h = qh[:, :, 1], qh[:, :, 2]
@. ψh[:, :, 1] = - grid.Krsq * q1h - f₀^2 / g′ * (q1h / H₂ + q2h / H₁)
@. ψh[:, :, 2] = - grid.Krsq * q2h - f₀^2 / g′ * (q1h / H₂ + q2h / H₁)
for j in 1:2
@. ψh[:, :, j] *= grid.invKrsq / (grid.Krsq + f₀^2 / g′ * (H₁ + H₂) / (H₁ * H₂))
end
return nothing
end

Copy link
Collaborator

@szy21 szy21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this! It will make my life much easier:)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimized PV inversion for two layer case in MultilayerQG
3 participants