Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plots replace PyPlot in more examples #75

Merged
merged 23 commits into from
May 19, 2020
Merged

Plots replace PyPlot in more examples #75

merged 23 commits into from
May 19, 2020

Conversation

navidcy
Copy link
Member

@navidcy navidcy commented May 17, 2020

No description provided.

@codecov
Copy link

codecov bot commented May 17, 2020

Codecov Report

Merging #75 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #75   +/-   ##
=======================================
  Coverage   98.63%   98.63%           
=======================================
  Files           6        6           
  Lines         511      511           
=======================================
  Hits          504      504           
  Misses          7        7           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d43e9a...318cbb4. Read the comment docs.

Copy link
Member

@glwagner glwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is a major improvement of the original example and thus I support merging as-is.

I have raised a number of issues that could also be addressed before merging. But this improvements can also be saved for a subsequent PR.

examples/barotropicqg_betaforced.jl Outdated Show resolved Hide resolved
examples/barotropicqg_betaforced.jl Outdated Show resolved Hide resolved
examples/barotropicqg_betaforced.jl Outdated Show resolved Hide resolved
examples/barotropicqg_betaforced.jl Outdated Show resolved Hide resolved

if j%(1000/nsubs)==0; println(log) end

p[1][1][:z] = Array(vs.zeta)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the call to Array is to ensure this works on the GPU? Could be good to add a comment here, especially since this is an example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the problem I wanted you to help me out.
It's a bit complicated... if I don't us Array() then @animate does not use the first frame for vorticity. Very strange....

@@ -55,7 +55,6 @@ kf, dkf = 14.0, 1.5 # forcing wavenumber and width of forcing ring in wavenu

gr = TwoDGrid(nx, Lx)

x, y = gridpoints(gr)
Kr = [ gr.kr[i] for i=1:gr.nkr, j=1:gr.nl]

forcingcovariancespectrum = @. exp(-(sqrt(gr.Krsq)-kf)^2/(2*dkf^2))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is hard to read and somewhat inscrutable. I might infer that kf is a forcing wavenumber... but maybe writing forcing_wavenumber would make this unambiguous. A comment that explains the point of this could help too. dkf could be named forcing_bandwidth.

@@ -55,7 +55,6 @@ kf, dkf = 14.0, 1.5 # forcing wavenumber and width of forcing ring in wavenu

gr = TwoDGrid(nx, Lx)

x, y = gridpoints(gr)
Kr = [ gr.kr[i] for i=1:gr.nkr, j=1:gr.nl]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this line here? I'm not sure where Kr is used, but the next line down seems unrelated.

@@ -69,7 +68,7 @@ seed!(1234) # reset of the random number generator for reproducibility
nothing # hide

# Next we construct function `calcF!` that computes a forcing realization every timestep
function calcFq!(Fh, sol, t, cl, v, p, g)
function calcFq!(Fh, sol, t, cl, vs, pr, gr)
ξ = ArrayType(dev)(exp.(2π*im*rand(Float64, size(sol)))/sqrt(cl.dt))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this allocate a new array every time-step?

You may want to allocate this array outside of the function, and then reference it as a global. I think declaring it const will ensure that it runs on the GPU.

Another option is to make calcFq a callable object that carries around a reference to ξ (rather than referencing it as a global). The complication with this approach is that I think you will need to use Adapt to get it to work on the GPU.

It's probably important to introduce what we think is best practice for this problem in the example. If its complicated, I think that means we simply need to add more simple examples to bridge the gap between simple cases and this sophisticated case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess referencing ξ as a global is probably a simple and adequate solution for this problem.

I'm worried this forcing function won't work (at least not generically, or for large problems) on the GPU if it allocates every time-step, as there is no garbage collection on the GPU.

Also, the syntax on this fairly complex line is very compact, which makes it difficult to read. Putting some white space between the arithmetic operations will make this part of the code easier to read and understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

Copy link
Member Author

@navidcy navidcy May 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One solution is to avoid ξ whatsoever by:

  Fqh .= forcing_spectrum .* ArrayType(dev)(exp.(2π*im*rand(eltype(gr), size(sol)))/sqrt(clock.dt))

But that seems cumbersome, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm... I tried @btime and it made no difference...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your solution still allocates memory. Memory is allocated by this snippet:

rand(eltype(gr), size(sol))

Which is embedded within the expression above. This function creates an array of size(sol).

This function always executes on the CPU, as well.

Note that (the last time I checked) it is not possible to generate random numbers on the GPU within a kernel (and therefore without allocation GPU memory). This function will be slow on the GPU. It may be best to write this code only to execute on the CPU, so that people are not misled into thinking it would be a good idea to run this on the GPU. I'm pretty sure this code would be very slow on the GPU.

One way to avoid allocating memory on the CPU is probably to write a function that executes a loop over the two dimensions of ξ. For example, you can fill an array with random values by writing

for i in eachindex(a)
    @inbounds a[i] = rand()
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also important to note that ArrayType(GPU())(a) creates a new array that has the data in a on the GPU, and therefore allocates memory on the GPU. Because there is no garbage collection on the GPU, the GPU will run out of memory when this function is executed many times.

Copy link
Member

@glwagner glwagner May 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example:

julia> a = zeros(256, 256)
julia> @time a .= rand(Float64, size(a));
  0.000175 seconds (9 allocations: 512.297 KiB)
julia> function fillrand!(a)
           for i in eachindex(a)
               @inbounds a[i] = rand()
           end
           nothing
       end

julia> @time fillrand!(a)
  0.009296 seconds (6.76 k allocations: 315.136 KiB)

julia> @time fillrand!(a)
  0.000686 seconds (4 allocations: 160 bytes)

fillrand!, which allocates 160 bytes rather than 512,297 bytes, is actually slower in this case... but it could speed up time-stepping since the garbage collector won't have to be called... I'm not sure. It probably depends on the problem.

Copy link
Member

@glwagner glwagner May 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, best of all:

using Random

a = zeros(256, 256)

@time rand!(a)

@time rand!(a)

I get

julia> @time rand!(a);
  0.000087 seconds (4 allocations: 160 bytes)

https://docs.julialang.org/en/v1/stdlib/Random/

@navidcy navidcy changed the title Plots replace PyPlot in barotropicqg_betaforced.jl example Plots replace PyPlot in more examples May 18, 2020
@navidcy navidcy linked an issue May 18, 2020 that may be closed by this pull request
@navidcy
Copy link
Member Author

navidcy commented May 18, 2020

@glwagner this PR has become huge. I'm tempted to merge as soon as tests and docs are built properly and open issues regarding

  • are calcF!() functions allocate arrays every time-step?
  • are statements like ξ[1, 1]=0 within calcF!() functions really slow down GPU?
  • why Plots.jl seems to care if, e.g., p[1][1][:z]=prob.vars.zeta or p[1][1][:z]=Array(prob.vars.zeta)?

What do you reckon?

@navidcy
Copy link
Member Author

navidcy commented May 19, 2020

@glwagner, since this PR is about PyPlot.jl->Plots.jl, I'm merging and moving the discussion about memory allocation and forcing functions on CPU/GPU elsewhere.

@navidcy navidcy merged commit 71145eb into master May 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Drop PyPlot.jl and use Plots.jl in examples
2 participants