Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quick fix for threaded assembly example on Julia 1.8 #534

Merged
merged 1 commit into from
Nov 7, 2022

Conversation

kimauth
Copy link
Member

@kimauth kimauth commented Nov 7, 2022

This guarantees the threaded assembly example to work correctly even on Julia 1.8. I am not really sure though what's the best practice pattern to use now - a channel with scratch values I guess?

@fredrikekre
Copy link
Member

I am not really sure though what's the best practice pattern to use now - a channel with scratch values I guess?

Probably something like

@sync begin
workqueue = Channel{Int}(Inf)
foreach(x -> put!(workqueue, x), color)
close(workqueue)
for _ in 1:Threads.nthreads()
Threads.@spawn begin
local scratch = create_scratchbuffer(K, f, dh)
for ci in workqueue
# println("Now I am doing color $(ci)")
assemble_cell!(scratch, ci, grid, dh, C, b)
end
end
end
end
, but :static should be fine too.

@fredrikekre fredrikekre merged commit dd88d1e into master Nov 7, 2022
@fredrikekre fredrikekre deleted the ka/fix_threaded_assembly_example branch November 7, 2022 17:29
@KristofferC
Copy link
Collaborator

Have someone benchmarked the Channel approach?

@fredrikekre
Copy link
Member

IIRC it is more or less equivalent to the :static approach for this example.

@lijas
Copy link
Collaborator

lijas commented Nov 8, 2022

This is what I get:

Channels:

function doassemble(K::SparseMatrixCSC, f, colors, grid::Grid, dh::DofHandler, C::SymmetricTensor{4, dim}) where {dim}
    for color in colors
        @sync begin
            workqueue = Channel{Int}(Inf)
            foreach(x -> put!(workqueue, x), color)
            close(workqueue)
            for _ in 1:Threads.nthreads()
                Threads.@spawn begin
                    local scratch = create_scratchbuffer(K, f, dh)
                    for ci in workqueue
                        # println("Now I am doing color $(ci)")
                        assemble_cell!(scratch, ci, grid, dh, C)
                    end
                end
            end
        end
    end
end
$ julia --project -t 1 docs/src/literate/threaded_assembly.jl 
  0.775692 seconds (250.56 k allocations: 11.765 MiB, 6.53% gc time)
$ julia --project -t 4 docs/src/literate/threaded_assembly.jl 
  0.240576 seconds (1.00 M allocations: 44.171 MiB, 25.31% gc time)

@ threads:

function doassemble(K::SparseMatrixCSC, f, colors, grid::Grid, dh::DofHandler, C::SymmetricTensor{4, dim}) where {dim}
scratches = create_scratchvalues(K, f, dh)
for color in colors
        ## Each color is safe to assemble threaded
        Threads.@threads :static for i in 1:length(color)
            assemble_cell!(scratches[Threads.threadid()], color[i], grid, dh, C)
        end
    end
end
$ julia --project -t 1 docs/src/literate/threaded_assembly.jl 
  0.535967 seconds (14.03 k allocations: 632.516 KiB)
$ julia --project -t 4 docs/src/literate/threaded_assembly.jl 
  0.170042 seconds (55.96 k allocations: 2.461 MiB)

But here create_scratchvalues(..) is also included in the time

Edit: Is it possible to pre-allocate the local sratch_buffers and workqueue in some way (when using channels in the first approach)? For the scratch_buffers, one could use some sort of "pool" which the threads can get the sratch-buffers from...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants