Threading kills performance when extracting values from a vector of dual numbers #31

@danielwe

This is strangely specific, but enabling multithreading causes a slowdown of roughly 200x:

julia> using BenchmarkTools, FastBroadcast, ForwardDiff

julia> N = 100; x = ForwardDiff.Dual.(randn(N), randn(N)); v = zeros(N);

julia> @btime @. $v = ForwardDiff.value($x);  # Baseline
  17.873 ns (0 allocations: 0 bytes)

julia> @btime @.. $v = ForwardDiff.value($x);
  16.712 ns (0 allocations: 0 bytes)

julia> @btime @.. thread=true $v = ForwardDiff.value($x);
  3.101 μs (1 allocation: 48 bytes)

It's the same when extracting partials. Is there something special about the Dual struct layout or how these functions are written that causes multithreaded broadcasting to go haywire here?
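
One possible cross-check, sketched here and not run as part of this report, is to time the same threaded broadcast on a plain Float64 vector next to the Dual case. The control vector `y` and the use of `identity` are assumptions added for illustration. If both timings land in the microsecond range, the cost is more likely fixed threading overhead at this small N than anything about the Dual layout.

```julia
using BenchmarkTools, FastBroadcast, ForwardDiff

N = 100
x = ForwardDiff.Dual.(randn(N), randn(N))  # same kind of Dual vector as in the report
y = randn(N)                               # plain Float64 control (assumed for comparison)
v = zeros(N)

# Control: threaded broadcast over plain Float64 input
@btime @.. thread=true $v = identity($y)

# Case from the report: threaded broadcast over Dual input
@btime @.. thread=true $v = ForwardDiff.value($x)
```

Julia needs to be started with more than one thread (for example `julia -t 4`) for the comparison to exercise the threaded path.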
