
use FMA where possible in fitting #740

Merged
merged 6 commits into from Mar 5, 2024

Conversation

@palday (Member) commented on Jan 26, 2024

Thanks for contributing!

Did behavior change? Did you add new features? If so, please update NEWS.md

  • add entry in NEWS.md
  • after opening this PR, add a reference and run docs/NEWS-update.jl to update the cross-references.

Should we release your changes right away? If so, bump the version:

  • I've bumped the version appropriately

github-actions bot commented on Jan 26, 2024

Benchmark Report for /home/runner/work/MixedModels.jl/MixedModels.jl

Job Properties

  • Time of benchmarks:
    • Target: 5 Mar 2024 - 17:37
    • Baseline: 5 Mar 2024 - 17:42
  • Package commits:
    • Target: 4e554b
    • Baseline: c1f9ca
  • Julia commits:
    • Target: bd47ec
    • Baseline: bd47ec
  • Julia command flags:
    • Target: None
    • Baseline: -C,native,-J/opt/hostedtoolcache/julia/1.10.2/x64/lib/julia/sys.so,-g1,-O3,-e,using Pkg; Pkg.update(); Pkg.add(["BenchmarkTools", "StatsModels"])
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results (those
indicating a possible regression or improvement) are shown below; an empty table means that all
benchmark results remained invariant between builds.

| ID                                | time ratio   | memory ratio |
|-----------------------------------|--------------|--------------|
| ["crossed", "insteval:1"]         | 0.76 (5%) ✅ | 1.00 (1%)    |
| ["crossed", "insteval:2"]         | 0.92 (5%) ✅ | 1.00 (1%)    |
| ["crossed", "machines:1"]         | 1.00 (5%)    | 0.98 (1%) ✅ |
| ["crossed", "mrk17_exp1:1"]       | 0.09 (5%) ✅ | 0.99 (1%)    |
| ["crossed", "penicillin:1"]       | 0.93 (5%) ✅ | 0.99 (1%)    |
| ["crossedvector", "d3:1"]         | 1.43 (5%) ❌ | 1.13 (1%) ❌ |
| ["crossedvector", "mrk17_exp1:2"] | 0.77 (5%) ✅ | 1.00 (1%)    |
| ["nested", "pastes:2"]            | 0.93 (5%) ✅ | 1.00 (1%)    |
| ["singlevector", "sleepstudy:2"]  | 0.95 (5%) ✅ | 0.99 (1%)    |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["crossed"]
  • ["crossedvector"]
  • ["nested"]
  • ["singlevector"]

Julia versioninfo

Target

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  3243 MHz       2063 s          0 s        146 s       2705 s          0 s
       #2  2445 MHz       1998 s          0 s        146 s       2762 s          0 s
       #3  2595 MHz       1677 s          0 s        560 s       2676 s          0 s
       #4  2541 MHz       1140 s          0 s        418 s       3353 s          0 s
  Memory: 15.606491088867188 GB (13623.30078125 MB free)
  Uptime: 494.05 sec
  Load Avg:  1.7  1.34  0.72
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Baseline

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  3243 MHz       3363 s          0 s        277 s       3936 s          0 s
       #2  2517 MHz       2906 s          0 s        274 s       4388 s          0 s
       #3  3002 MHz       2988 s          0 s        744 s       3845 s          0 s
       #4  2445 MHz       2123 s          0 s        598 s       4852 s          0 s
  Memory: 15.606491088867188 GB (13863.64453125 MB free)
  Uptime: 760.79 sec
  Load Avg:  1.72  1.7  1.06
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

codecov bot commented on Jan 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.33%. Comparing base (c1f9ca0) to head (d46079d).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #740   +/-   ##
=======================================
  Coverage   96.33%   96.33%           
=======================================
  Files          34       34           
  Lines        3356     3356           
=======================================
  Hits         3233     3233           
  Misses        123      123           
| Flag    | Coverage Δ           |
|---------|----------------------|
| current | 96.27% <100.00%> (ø) |
| minimum | 96.23% <100.00%> (ø) |

Flags with carried forward coverage won't be shown.


@dmbates (Collaborator) commented on Feb 20, 2024

It seems that muladd is more general than fma and uses fma when it is available. Is there a reason not to prefer muladd over fma?

@palday (Member, Author) commented on Mar 3, 2024

There are two differences between muladd and fma that are potentially relevant:

  1. muladd works with missing, but fma doesn't.
  2. fma will use software FMA if it's not available in hardware, while muladd will simply fall back to the non-fused variant. This can introduce rounding differences in addition to the performance differences (software FMA is slower than an unfused multiply-add, which in turn is slower than hardware FMA).

If it were just (1), I would say use muladd everywhere. But I was unsure about (2) -- I was torn between consistent rounding across architectures and worse performance on legacy hardware. I don't have a strong preference either way, but I guess there is something consistent about using muladd everywhere and not thinking too hard about whether missing can arise in a given context or not.
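
The two differences can be illustrated in the REPL. This is a sketch, not code from the PR; the value `1.0 + 2.0^-27` is chosen so that the low-order bits of the product are lost when the multiply is rounded separately:

```julia
# (1) muladd falls back to the generic x*y + z, so it propagates missing;
#     fma only has methods for hardware float types.
muladd(missing, 1.0, 2.0)   # missing
# fma(missing, 1.0, 2.0)    # throws MethodError

# (2) fma always fuses (in software if need be), so the low-order bits of
#     the product x*x survive into the subtraction; the unfused form loses
#     them when x*x is rounded to Float64 first.
x = 1.0 + 2.0^-27
fma(x, x, -1.0)             # 2.0^-26 + 2.0^-54 (exact)
x * x - 1.0                 # 2.0^-26 (product rounded before subtracting)
# muladd(x, x, -1.0) may give either result, depending on whether the
# hardware provides an FMA instruction.
```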

@palday (Member, Author) commented on Mar 5, 2024

After some more thought, I think it's better to use muladd everywhere. I will update accordingly.

@palday palday marked this pull request as ready for review March 5, 2024 17:19
Resolved (outdated) review threads on:
  • src/linalg.jl
  • src/linalg/rankUpdate.jl
  • src/linearmixedmodel.jl
@palday palday requested a review from dmbates March 5, 2024 17:31
@palday palday merged commit 510dcc3 into main Mar 5, 2024
11 of 12 checks passed
@palday palday deleted the pa/fma branch March 5, 2024 19:35

2 participants