Minor performance improvements for `close!(::MixedDofHandler)` #637

fredrikekre · 2023-03-24T13:13:29Z

This patch tweaks the dof distribution for `MixedDofHandler` slightly:

- `push!` cell dofs directly into `dh.cell_dofs` instead of first pushing to an element celldofs vector and then pushing to the global one. This removes one `append!` per cell, but does not affect the benchmark that much (dict lookup is still dominating).
- Use the computed dict token for extraction in `get_or_create_dofs!`.
- Change keyword arguments in `get_or_create_dofs!` to positional. This does not matter for performance, but gives nicer profile stackframes.
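
The second item is about not looking up the same key twice in the dict. A minimal sketch of that idea follows. It uses the public `get!` function rather than the token-based lookup in the patch, and the function names, the `Int` keys, and the stored `UnitRange` values are assumptions made for illustration, not Ferrite's code.

```julia
# Hypothetical helpers for illustration; not Ferrite's implementation.

# Two lookups per call: haskey + getindex on a hit, haskey + setindex! on a miss.
function dofs_two_lookups!(dict::Dict{Int,UnitRange{Int}}, key::Int, nextdof::Int, ndofs::Int)
    if haskey(dict, key)
        return nextdof, dict[key]
    end
    dofs = nextdof:(nextdof + ndofs - 1)
    dict[key] = dofs
    return nextdof + ndofs, dofs
end

# One lookup per call: get! returns the stored value, or inserts the default
# and returns it. Assumes ndofs >= 1.
function dofs_one_lookup!(dict::Dict{Int,UnitRange{Int}}, key::Int, nextdof::Int, ndofs::Int)
    dofs = get!(dict, key, nextdof:(nextdof + ndofs - 1))
    # A freshly inserted range starts at nextdof; older ranges start below it.
    created = first(dofs) == nextdof
    return (created ? nextdof + ndofs : nextdof), dofs
end
```

Both functions return the updated `nextdof` together with the dofs for `key`, so they can be swapped for each other when comparing timings.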

I used this benchmark code:

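```julia
using Ferrite
using BenchmarkTools

# Build and close a dof handler of type DHT with one vector and one scalar field
function dofhandler(::Type{DHT}, grid) where DHT
    dh = DHT(grid)
    add!(dh, :v, 2, Lagrange{2,RefCube,2}())  # 2-component field, quadratic interpolation
    add!(dh, :s, 1, Lagrange{2,RefCube,1}())  # 1-component field, linear interpolation
    close!(dh)  # dof distribution happens here
    return dh
end

grid = generate_grid(Quadrilateral, (1000, 1000))
@btime dofhandler(MixedDofHandler, $grid)
```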

with the following (disappointing) results:

```
1.605 s (319 allocations: 640.42 MiB) # master
1.573 s (316 allocations: 639.34 MiB) # patch
```

For reference, `DofHandler` gives 1.3 s, 405 MiB for this benchmark.


codecov-commenter · 2023-03-24T14:30:51Z

Codecov Report

Patch coverage: 86.66% and project coverage change: -0.01 ⚠️

Comparison is base (4b330ce) 92.41% compared to head (b4ef9f6) 92.41%.


Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #637      +/-   ##
==========================================
- Coverage   92.41%   92.41%   -0.01%
==========================================
  Files          29       29
  Lines        4432     4429       -3
==========================================
- Hits         4096     4093       -3
  Misses        336      336
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Dofs/MixedDofHandler.jl | `79.18% <86.66%> (-0.28%)` | ⬇️ |


This patch removes the tracking of (internal) cell dofs when distributing dofs for `MixedDofHandler`. There is no need to track these since they are defined as dofs which are never shared with other elements. This almost closes the performance gap between the DofHandlers (benchmark code from #637):

```
1.573 s (316 allocations: 639.34 MiB) # MixedDofHandler master
1.397 s (257 allocations: 543.50 MiB) # MixedDofHandler patch
1.267 s (234 allocations: 405.12 MiB) # DofHandler master/patch
```
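
As a rough illustration of why internal dofs need no tracking, the sketch below numbers shared vertex dofs through a dict but hands out cell-internal dofs directly. The function name, the four-vertex cell tuples, and the choice of one dof per vertex are assumptions made for this example; this is not Ferrite's implementation.

```julia
# Hypothetical sketch, not Ferrite's implementation.
# `cells` holds the global vertex ids of each cell. Every vertex carries one
# dof and every cell carries `ncelldofs` internal dofs.
function distribute_dofs_sketch(cells::Vector{NTuple{4,Int}}, ncelldofs::Int)
    vertexdict = Dict{Int,Int}()  # shared entities still need tracking
    cell_dofs = Int[]             # flat storage for the dofs of all cells
    nextdof = 1
    for cell in cells
        for v in cell
            dof = get!(vertexdict, v, nextdof)
            if dof == nextdof     # vertex seen for the first time
                nextdof += 1
            end
            push!(cell_dofs, dof)
        end
        # Internal dofs belong to this cell only: number them directly,
        # without storing them in any lookup structure.
        for _ in 1:ncelldofs
            push!(cell_dofs, nextdof)
            nextdof += 1
        end
    end
    return cell_dofs, nextdof - 1
end

# Two quadrilaterals sharing the edge between vertices 2 and 3.
cell_dofs, ndofs_total = distribute_dofs_sketch([(1, 2, 3, 4), (2, 5, 6, 3)], 1)
```

Only the vertex dofs pass through the dict. The internal dof of each cell is pushed straight into the flat vector, so it costs no hashing and no extra storage.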


This reduces the memory used in dof distribution for `MixedDofHandler` by storing only the first dof added for each entity, instead of the full range of dofs. This simply copies the logic from `DofHandler`, which also means that `MixedDofHandler` now also supports multiple dofs per face in 2D, but not in 3D, just like `DofHandler`. This closes the performance gap between the DofHandlers (benchmark code from #637):

```
1.397 s (257 allocations: 543.50 MiB) # MixedDofHandler master
1.220 s (249 allocations: 456.05 MiB) # MixedDofHandler patch
1.267 s (234 allocations: 405.12 MiB) # DofHandler master/patch
```
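
A minimal sketch of the first-dof idea, assuming every entity tracked in a given dict carries the same number of dofs (`ndofs >= 1`). The helper name and the tuple keys are hypothetical. On a 64-bit system an `Int` value takes 8 bytes where a `UnitRange{Int}` takes 16, which is where a saving of this kind would come from.

```julia
# Hypothetical sketch, not Ferrite's implementation.
# The dict value is only the first dof of the entity, not the whole range.
function dofs_for_entity!(firstdofs::Dict{K,Int}, key::K, nextdof::Int, ndofs::Int) where K
    first_dof = get!(firstdofs, key, nextdof)
    if first_dof == nextdof       # entity seen for the first time
        nextdof += ndofs
    end
    # The full range is rebuilt on demand from the stored first dof.
    return nextdof, first_dof:(first_dof + ndofs - 1)
end
```

The range is cheap to reconstruct because the dofs of one entity are assumed to be numbered consecutively.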

This patch changes the storage for vertex dof tracking from dictionaries to vectors. This is possible since we already know the number of vertices and their global numbers from the start. Since dof distribution is almost 100% dict hashing, this gives a pretty big performance boost. Running the benchmark code from #637:

```
1.267 s (234 allocations: 405.12 MiB) # DofHandler master
812.388 ms (130 allocations: 290.07 MiB) # DofHandler patch
1.220 s (249 allocations: 456.05 MiB) # MixedDofHandler master
838.601 ms (145 allocations: 341.00 MiB) # MixedDofHandler patch
```

Note that the same optimization can be done for edges/faces too if those were globally enumerated. Fortunately, vertex dofs are the most common case, so implementing this is not a high priority.
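
The vector-based tracking could look roughly like the sketch below, where the global vertex number indexes directly into a preallocated vector and `0` marks a vertex without dofs. The helper name and the `0` sentinel are assumptions made for this example, not necessarily what the patch does.

```julia
# Hypothetical sketch, not Ferrite's implementation.
# vertexdofs[v] holds the first dof of vertex v, or 0 if none is assigned yet.
function vertex_dofs_sketch!(vertexdofs::Vector{Int}, v::Int, nextdof::Int, ndofs::Int)
    first_dof = vertexdofs[v]
    if first_dof == 0             # vertex seen for the first time
        first_dof = nextdof
        vertexdofs[v] = first_dof
        nextdof += ndofs
    end
    return nextdof, first_dof:(first_dof + ndofs - 1)
end

# The number of vertices is known up front, so the storage is allocated once.
nvertices = 6
vertexdofs = zeros(Int, nvertices)
```

Indexing a vector needs no hashing at all, which is consistent with the remark above that dict hashing dominates dof distribution.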

fredrikekre requested a review from kimauth March 24, 2023 13:13

fredrikekre force-pushed the fe/mdh-close branch from 1c5afd5 to b4ef9f6 March 24, 2023 14:18

kimauth approved these changes Mar 24, 2023


fredrikekre merged commit 43a1c19 into master Mar 24, 2023

fredrikekre deleted the fe/mdh-close branch March 24, 2023 14:41

fredrikekre mentioned this pull request Mar 24, 2023

Don't track internal dofs when distributing dofs for MixedDofHandler #639

Merged

fredrikekre added the performance label Mar 24, 2023

fredrikekre mentioned this pull request Mar 24, 2023

Reduce memory footprint in dof distribution for MixedDofHandler #642

Merged

fredrikekre mentioned this pull request Mar 25, 2023

Use vectors instead of dicts for keeping track of vertex dofs #643

Merged

kimauth mentioned this pull request Mar 25, 2023

Tracking progress for merging the dofhandlers #629

Closed
