@chriselrod commented Aug 18, 2019

This PR inlines every function that receives the mutable structs as arguments, which keeps the structs from escaping. As a result the allocations disappear and performance improves, as the benchmarks below show.
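To illustrate the idea (this is only a minimal sketch, not the package's actual code): the names `CompensatedAcc`, `add!` and `compensated_sum` below are hypothetical. The accumulator is a mutable struct, and the only function it is passed to is marked `@inline`. Once that call is inlined, the compiler can see that the struct never leaves `compensated_sum` and is usually able to keep its fields in registers instead of allocating it on the heap.

```julia
# Hypothetical accumulator type, used here for illustration only.
mutable struct CompensatedAcc
    s::Float64  # running sum
    e::Float64  # accumulated rounding error
end

# Add x to the accumulator with Knuth's two-sum error-free transformation.
# @inline asks the compiler to inline this at the call site, so `acc`
# is never passed to an opaque function and does not escape.
@inline function add!(acc::CompensatedAcc, x::Float64)
    s = acc.s + x
    z = s - acc.s
    acc.e += (acc.s - (s - z)) + (x - z)
    acc.s = s
    return acc
end

# Compensated summation; the accumulator stays local to this function.
function compensated_sum(xs::AbstractVector{Float64})
    acc = CompensatedAcc(0.0, 0.0)
    for x in xs
        add!(acc, x)
    end
    return acc.s + acc.e
end

xs = [1.0, 1e100, 1.0, -1e100]
compensated_sum(xs)  # 2.0, whereas foldl(+, xs) gives 0.0
```

Whether the allocation is actually removed depends on the Julia version and on the inlining decision; `@allocated compensated_sum(xs)` (after a warm-up call) or `@benchmark` is a way to check it on a given setup.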

Before PR:

julia> using AccurateArithmetic, BenchmarkTools

julia> x = rand(1024);

julia> y = rand(1024);

julia> @benchmark sum_kbn($x)
BenchmarkTools.Trial: 
  memory estimate:  592 bytes
  allocs estimate:  5
  --------------
  minimum time:     238.235 ns (0.00% GC)
  median time:      243.833 ns (0.00% GC)
  mean time:        324.037 ns (22.48% GC)
  maximum time:     14.107 μs (96.99% GC)
  --------------
  samples:          10000
  evals/sample:     442

julia> @benchmark sum_oro($x)
BenchmarkTools.Trial: 
  memory estimate:  16 bytes
  allocs estimate:  1
  --------------
  minimum time:     150.962 ns (0.00% GC)
  median time:      152.232 ns (0.00% GC)
  mean time:        156.910 ns (1.02% GC)
  maximum time:     8.306 μs (97.48% GC)
  --------------
  samples:          10000
  evals/sample:     818

julia> @benchmark dot_oro($x,$y)
BenchmarkTools.Trial: 
  memory estimate:  1.16 KiB
  allocs estimate:  9
  --------------
  minimum time:     379.704 ns (0.00% GC)
  median time:      402.424 ns (0.00% GC)
  mean time:        561.343 ns (26.13% GC)
  maximum time:     30.973 μs (97.65% GC)
  --------------
  samples:          10000
  evals/sample:     203

After PR:

julia> using AccurateArithmetic, BenchmarkTools
[ Info: Precompiling AccurateArithmetic [22286c92-06ac-501d-9306-4abd417d9753]

julia> x = rand(1024);

julia> y = rand(1024);

julia> @benchmark sum_kbn($x)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     186.095 ns (0.00% GC)
  median time:      186.521 ns (0.00% GC)
  mean time:        189.388 ns (0.00% GC)
  maximum time:     424.038 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     666

julia> @benchmark sum_oro($x)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     140.254 ns (0.00% GC)
  median time:      140.663 ns (0.00% GC)
  mean time:        142.889 ns (0.00% GC)
  maximum time:     349.425 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     855

julia> @benchmark dot_oro($x,$y)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     219.204 ns (0.00% GC)
  median time:      220.613 ns (0.00% GC)
  mean time:        223.682 ns (0.00% GC)
  maximum time:     479.681 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     501

The above tests were run on:

julia> versioninfo()
Julia Version 1.4.0-DEV.0
Commit 2ef0ed159d* (2019-08-17 17:21 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.0 (ORCJIT, skylake)

ffevotte added a commit that referenced this pull request Aug 18, 2019
@ffevotte ffevotte merged commit 8b410fe into JuliaMath:master Aug 18, 2019
@ffevotte

Thanks a lot!
