Use sorted OrderedSet in FieldHandler #654

Closed

fredrikekre

This changes `FieldHandler.cellset` to be a sorted `OrderedSet` instead of a `Set`. This ensures that loops over sub-domains are done in ascending cell order.

Since e.g. cells, node coordinates, and dofs are stored in ascending cell order, this gives a significant performance boost to loops over sub-domains, i.e. assembly-style loops. In particular, this removes the performance gap between `MixedDofHandler` and `DofHandler` in the `create_sparsity_pattern` benchmark in #629.

This is a minimal/initial step towards #625 that can be done before the `DofHandler` merge and rework of `FieldHandler`/`SubDofHandler`.
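
To illustrate why ascending cell order helps, here is a minimal sketch assuming a flat layout where the data for cell `i` starts at position `offsets[i]`. The names `offsets`, `touched`, and `subdomain` are made up for this illustration and are not Ferrite's actual fields.

```julia
# Hypothetical flat storage: the data of cell i starts at offsets[i].
ndofs_per_cell = 4
ncells = 8
offsets = collect(1:ndofs_per_cell:(ndofs_per_cell * ncells + 1))

# Start positions touched when visiting `cells` in the given order.
touched(cells) = [offsets[c] for c in cells]

subdomain = Set([6, 1, 4, 7, 2])

# Iterating the Set directly: the order depends on hashing.
unordered = touched(subdomain)

# Iterating in ascending cell order: positions only move forward in memory.
ordered = touched(sort!(collect(subdomain)))

println("Set order:    ", unordered)
println("sorted order: ", ordered)
```

With the sorted order, the loop walks through the storage front to back, which is the access pattern that tends to be cache friendly.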

fredrikekre commented Mar 29, 2023

Perhaps `BitSet` would be an alternative here, actually. Per sub-domain, that would store at most about `getncells(grid) / 64` `UInt64`s (if the cell IDs in a subdomain are randomly distributed in `1:getncells(grid)`) and at least about `length(set) / 64` (if each subset contains consecutively numbered cells, `1:10`, `11:20`, ...). `Set` stores roughly `length(set) * 3 / 2` entries (I believe dict buffers should be filled up to at most 2/3). Summed over all sub-domains, `BitSet` would therefore be worse only if your sets are randomly numbered and you have more than `64 * 3 / 2 ≈ 100` subdomains, and strictly better if you number your cells according to the subdomains, which I think is common anyway. I don't think this matters much though; this storage is nothing compared to dof storage, for example. Maybe it is easier to use `BitSet` since that is built in :)
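
The estimate above can be checked with a little arithmetic. This is only a sketch of the reasoning, not a measurement: `bitset_words` and `set_slots` are hypothetical helpers that encode the rough rules from the comment (one `UInt64` per 64 IDs spanned, and about 3/2 slots per `Set` element), and the grid size and subdomain count are arbitrary.

```julia
# About one UInt64 word per 64 IDs between the smallest and largest element.
bitset_words(cells) = cld(maximum(cells) - minimum(cells) + 1, 64)

# A hash table kept at most ~2/3 full needs roughly 3/2 slots per element.
set_slots(cells) = cld(3 * length(cells), 2)

ncells = 64_000
nsub = 100
per_sub = ncells ÷ nsub

# Case 1: cells numbered according to the subdomains (1:640, 641:1280, ...).
consecutive = [((k - 1) * per_sub + 1):(k * per_sub) for k in 1:nsub]

# Case 2: each subdomain is spread over almost all of 1:ncells.
interleaved = [k:nsub:ncells for k in 1:nsub]

total(f, sets) = sum(f, sets)

println("consecutive: BitSet words = ", total(bitset_words, consecutive),
        ", Set slots = ", total(set_slots, consecutive))
println("interleaved: BitSet words = ", total(bitset_words, interleaved),
        ", Set slots = ", total(set_slots, interleaved))
```

With 100 subdomains, the spread-out case lands just past the break-even point of roughly `64 * 3 / 2 = 96` subdomains, while the consecutive case needs far fewer words than the `Set` needs slots.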

fredrikekre added a commit that referenced this pull request Mar 30, 2023
This patch creates a `BitSet` of `FieldHandler.cellset` in loops, in
particular in `close!(::DofHandler)` and
`create_sparsity_pattern(::DofHandler)`. Since `BitSet`s are sorted this
ensures that these loops are done in ascending cell order, which gives a
performance boost due to better memory locality.

This is an even smaller change than #654 (and #625) which should be
completely non-breaking since the type of `FieldHandler.cellset` is not
changed. Larger refactoring, such as using `BitSet` or `OrderedSet`, will
be done in the `FieldHandler`/`SubDofHandler` rework.
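
The idea in this commit message can be sketched as follows, assuming the cell set is a plain `Set{Int}`. The function `visit_order` is a hypothetical stand-in for a loop such as the one in `close!` and is not the actual code from the patch.

```julia
# `cellset` stands in for `FieldHandler.cellset`; its type stays `Set{Int}`.
cellset = Set([7, 2, 9, 4])

# Hypothetical loop body: record the order in which cells are visited.
function visit_order(cells)
    order = Int[]
    # Wrapping in a BitSet only for the loop gives ascending iteration order.
    for cellid in BitSet(cells)
        push!(order, cellid)
    end
    return order
end

order = visit_order(cellset)
println(order)
```

Because the `BitSet` is created only at the loop, the stored field keeps its type, which is what makes the change non-breaking.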
fredrikekre marked this pull request as draft March 30, 2023 16:52

KnutAM commented May 6, 2023

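Realized today that, at least for threading, the ordered set must be collected into a vector first, since `Threads.@threads` calls `firstindex` on the iterator and `OrderedSet` does not define it: https://julialang.slack.com/archives/C6SMTHQ3T/p1683401329225109.
Example copied from that thread: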

using OrderedCollections
vec = [1,3,5];
v = zeros(Int, 5);
oset = OrderedSet(vec);
# Works: a Vector supports indexing
Threads.@threads :static for i in vec
    v[i] = Threads.threadid()
end
# Fails: OrderedSet has no firstindex method (see the error below)
Threads.@threads :static for i in oset
    v[i] = Threads.threadid()
end
ERROR: TaskFailedException
Stacktrace:
 [1] wait
   @ .\task.jl:345 [inlined]
 [2] threading_run(fun::var"#45#threadsfor_fun#5"{var"#45#threadsfor_fun#4#6"{OrderedSet{Int64}}}, static::Bool)
   @ Base.Threads .\threadingconstructs.jl:38
 [3] top-level scope
   @ .\threadingconstructs.jl:93

    nested task error: MethodError: no method matching firstindex(::OrderedSet{Int64})
    Closest candidates are:
      firstindex(::Any, ::Any) at abstractarray.jl:402
      firstindex(::Tuple) at tuple.jl:25
      firstindex(::Pair) at pair.jl:50
      ...
    Stacktrace:
     [1] #45#threadsfor_fun#4
       @ .\threadingconstructs.jl:69 [inlined]
     [2] #45#threadsfor_fun
       @ .\threadingconstructs.jl:51 [inlined]
     [3] (::Base.Threads.var"#1#2"{var"#45#threadsfor_fun#5"{var"#45#threadsfor_fun#4#6"{OrderedSet{Int64}}}, Int64})()
       @ Base.Threads .\threadingconstructs.jl:30
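
A minimal sketch of the workaround: collect the set into a `Vector` before the threaded loop. To keep it runnable without extra packages, this uses a `BitSet` from Base as a stand-in for the `OrderedSet` in the example above; the assumption is that the same `collect` call applies to an `OrderedSet`.

```julia
# Stand-in for the cell set; the example above used an OrderedSet instead.
cells = BitSet([1, 3, 5])
v = zeros(Int, 5)

# A Vector supports indexing, which the threaded loop needs.
cellvec = collect(cells)

Threads.@threads for i in cellvec
    v[i] = Threads.threadid()
end

println(v)
```

Only the entries listed in the set are written, so positions 2 and 4 stay zero.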

fredrikekre added a commit that referenced this pull request May 19, 2024
This patch changes the use of `Set` to `OrderedSet` in the grid. This
means that e.g. loops over these sets follow the original specified
order. This is important for performance since it can reduce
cache-misses, for example. Fixes #631. Closes #654.

Co-authored-by: Fredrik Ekre <ekrefredrik@gmail.com>
Co-authored-by: Dennis Ogiermann <termi-official@users.noreply.github.com>