Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@[inline] annotation is ignored when specializing function with fixed type class argument #4191

Open
3 tasks done
TwoFX opened this issue May 16, 2024 · 0 comments
Open
3 tasks done
Labels
bug Something isn't working

Comments

@TwoFX
Copy link
Contributor

TwoFX commented May 16, 2024

Prerequisites

Please put an X between the brackets as you perform the following steps:

Description

The following clumsy benchmark tests two variants of a pair of functions reinsertAux and expand_go (named after the hash map operations they are inspired by). The difference is that the first version passes a function into reinsertAux as a type class, while the second version passes in the function directly.

class ToUInt64 (α : Type u) where
  toUInt64 : α → UInt64

instance : ToUInt64 UInt32 where
  toUInt64 := UInt32.toUInt64

@[inline] def reinsertAux₁ {α : Type u} [ToUInt64 α] (acc : List UInt64) (a : α) : List UInt64 :=
  ToUInt64.toUInt64 a :: acc

def expand₁_go {α : Type u} [ToUInt64 α] (a : List α) : List UInt64 :=
  a.foldl reinsertAux₁ []

@[noinline] def expand₁' {α : Type u} [ToUInt64 α] (a : List α) : IO (List UInt64) := do
  return expand₁_go a

@[inline] def reinsertAux₂ {α : Type u} (f : α → UInt64) (acc : List UInt64) (a : α) : List UInt64 :=
  f a :: acc

def expand₂_go {α : Type u} [ToUInt64 α] (a : List α) : List UInt64 :=
  a.foldl (reinsertAux₂ ToUInt64.toUInt64) []

@[noinline] def expand₂' {α : Type u} [ToUInt64 α] (a : List α) : IO (List UInt64) := do
  return expand₂_go a

def benchmarkSize : Nat := 50000000

@[noinline] def mkNumbers : IO (List UInt32) := do
  let mut a : List UInt32 := []
  for i in [:benchmarkSize] do
    a := i.toUInt32 :: a
  return a

@[noinline] def runBenchmark₁ : IO PUnit := do
  for _ in [:5] do
    let n ← mkNumbers
    let _ ← timeit "expand₁" $ expand₁' n

@[noinline] def runBenchmark₂ : IO PUnit := do
  for _ in [:5] do
    let n ← mkNumbers
    let _ ← timeit "expand₂" $ expand₂' n

def main (_ : List String) : IO UInt32 := do
  runBenchmark₁
  runBenchmark₂
  return 0

It is important for performance that reinsertAux is inlined into expand_go, because only after the inlining it is possible to see that the memory from the input of expand_go can be reused in the constructor application in reinsertAux. Inspecting the IR, we indeed see that the IR for both expand₁_go and expand₂_go gets this right. In fact, the IR for these two functions returned by trace.compiler.ir.reset_reuse is exactly identical. It looks like this:

def List.foldl._at.expand₁_go._spec_1._rarg (x_1 : obj) (x_2 : obj) (x_3 : obj) : obj :=
  case x_3 : obj of
  List.nil →
    ret x_2
  List.cons →
    let x_4 : obj := proj[0] x_3;
    let x_5 : obj := proj[1] x_3;
    let x_9 : obj := reset[2] x_3;
    let x_6 : u64 := app x_1 x_4;
    let x_7 : obj := reuse x_9 in ctor_1[List.cons] x_6 x_2;
    let x_8 : obj := List.foldl._at.expand₁_go._spec_1._rarg x_1 x_7 x_5;
    ret x_8
def List.foldl._at.expand₁_go._spec_1 (x_1 : ◾) : obj :=
  let x_2 : obj := pap List.foldl._at.expand₁_go._spec_1._rarg;
  ret x_2
def expand₁_go._rarg (x_1 : obj) (x_2 : obj) : obj :=
  let x_3 : obj := ctor_0[List.nil];
  let x_4 : obj := List.foldl._at.expand₁_go._spec_1._rarg x_1 x_3 x_2;
  ret x_4

However, this IR isn't actually used due to specialization. The specialized IR is visible at the two runBenchmark functions. Inspecting the reset_reuse IR this time, we get

def reinsertAux₁._at.runBenchmark₁._spec_3 (x_1 : obj) (x_2 : u32) : obj :=
  let x_3 : u64 := UInt32.toUInt64 x_2;
  let x_4 : obj := ctor_1[List.cons] x_3 x_1;
  ret x_4
def List.foldl._at.runBenchmark₁._spec_4 (x_1 : obj) (x_2 : obj) : obj :=
  case x_2 : obj of
  List.nil →
    ret x_1
  List.cons →
    let x_3 : obj := proj[0] x_2;
    let x_4 : obj := proj[1] x_2;
    let x_5 : obj := reinsertAux₁._at.runBenchmark₁._spec_3 x_1 x_3;
    let x_6 : obj := List.foldl._at.runBenchmark₁._spec_4 x_5 x_4;
    ret x_6

for the first version and

def List.foldl._at.runBenchmark₂._spec_3 (x_1 : obj) (x_2 : obj) : obj :=
  case x_2 : obj of
  List.nil →
    ret x_1
  List.cons →
    let x_3 : obj := proj[0] x_2;
    let x_4 : obj := proj[1] x_2;
    let x_8 : obj := reset[2] x_2;
    let x_5 : u64 := UInt32.toUInt64 x_3;
    let x_6 : obj := reuse x_8 in ctor_1[List.cons] x_5 x_1;
    let x_7 : obj := List.foldl._at.runBenchmark₂._spec_3 x_6 x_4;
    ret x_7

for the second version. Here we see that in the first version reinsertAux was not inlined, and so there is no reset/reuse pair for the memory cells of the lists, and this is clearly measurable when we run the code: on my machine, I get

expand₁ 609ms
expand₁ 475ms
expand₁ 519ms
expand₁ 467ms
expand₁ 525ms
expand₂ 309ms
expand₂ 335ms
expand₂ 304ms
expand₂ 337ms
expand₂ 307ms

Context

During benchmarking I noticed that insert performance for the Lean hash map is about 10% better than for the Batteries hash map. This issue is the result of analyzing that observation. The Lean hash map looks like the second version, while the Batteries hash map looks like the first version.

Steps to Reproduce

  1. Run the above code

Expected behavior: Both versions should respect the @[inline] attribute on reinsertAux during specialization.

Actual behavior: Passing in the function via a type class leads to the @[inline] attribute being ignored.

Versions

4.9.0-nightly-2024-05-16

Linux markus-z16 6.8.9-300.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 2 18:59:06 UTC 2024 x86_64 GNU/Linux

Additional Information

[Additional information, configuration or data that might be necessary to reproduce the issue]

Impact

Add 👍 to issues you consider important. If others are impacted by this issue, please ask them to add 👍 to it.

@TwoFX TwoFX added the bug Something isn't working label May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant