
turn multi methods into single methods #65

Closed
Araq opened this issue Sep 28, 2018 · 14 comments

Comments

@Araq
Member

Araq commented Sep 28, 2018

The "multi" dispatch aspect of Nim's multi methods is rarely,
if ever, used in practice, and yet it complicates the language
definition, forces the somewhat odd .base annotation on us
and produces less efficient code than would otherwise be possible.

The proposal is simple: use only the dynamic type of
the first argument to determine which method to call.
This is exactly how Java, C#, C++, Delphi etc. deal with this problem,
and it works well enough. It would allow Nim to generate traditional
vtables, which are often faster than Nim's current approach on modern
hardware. It also simplifies the language spec somewhat and makes code
slightly easier to reason about, since only the runtime type of the
first argument determines which method is called.

So, in summary:

Advantages:

  • Simpler code generation.
  • Probably faster code.
  • Closer to how other languages interpret OO.
  • We can get rid of the .base pragma.

Disadvantages:

  • Yet another breaking change. However, we will look at method-heavy
    Nim code out there and see what it breaks. In particular nimx comes
    to mind.
  • Use cases like the famous collide example stop working.

Migration plan

Introduce {.experimental: "singleMethods".} for Nim devel.
See what packages break with that option turned on.

@yglukhov
Member

Nimx doesn't rely on multimethod nature of the methods. Vtable-like semantics should work just fine.

@mratsim
Collaborator

mratsim commented Sep 29, 2018

Collide:

method collide(self, other: Shape)
method collide(self: Circle, other: Rectangle)
method collide(self: Polygon, other: Circle)
method collide(self: Polygon, other: Rectangle)

Wikipedia cites figures of 2.7–6.5% for how often multiple dispatch is actually used, even in languages that support it.

In practice we could provide sugar macros or templates for this, but I'm not really a heavy user of methods.

@timotheecour
Member

timotheecour commented Sep 29, 2018

Probably faster code.

This alone, if indeed true, would justify the breaking change. If anything, multi-method dispatch could be opted into with a dedicated multimethod keyword instead of method implicitly doing multi-method dispatch.

@zah
Member

zah commented Sep 29, 2018

I think single-dispatch methods implemented through a VTable stored inside the object are a somewhat harmful construct, because they limit the possible polymorphism to the protocols/interfaces envisioned and supported by the original author of the type. An external VTable, stored in a fat pointer, is a superior paradigm. This approach is followed by the concept VTable types and by other modern languages such as Rust. To quote the manual:

In contrast to other programming languages, the vtable in Nim is stored
externally to the object, allowing you to create multiple different vtable
views for the same object. Thus, the polymorphism in Nim is unbounded -
any type can implement an unlimited number of protocols or interfaces not
originally envisioned by the type's author.

With this in mind, I see little long-term practical value in single-dispatch methods. Multi-dispatch methods, on the other hand, can fill their own niche, although it would be possible to replace them with user-defined machinery if we had robust support for compile-time variables and late-stage code generation as discussed here (i.e. the ability to generate additional code after the entire program has gone through the sem pass).

@Araq
Member Author

Araq commented Sep 29, 2018

I think single-dispatch methods implemented through a VTable stored inside the object are a somewhat harmful construct, because they limit the possible polymorphism to the protocols/interfaces envisioned and supported by the original author of the type.

That's not really a big issue: you can always wrap the object in another object that implements the required interface. Also, it's unclear whether fat pointers aren't actually slower in practice, since twice as much data has to be passed around everywhere (just like with a closure today).

Multi-dispatch methods, on the other hand, can fill their own niche, although it would be possible to replace them with user-defined machinery if we had robust support for compile-time variables and late-stage code generation as discussed here (i.e. the ability to generate additional code after the entire program has gone through the sem pass).

If multi dispatch is required, the C solution described at https://en.wikipedia.org/wiki/Multiple_dispatch#C is still the best solution by a wide margin IMO. And in Nim you can wrap the same machinery in a macro.

@zah
Member

zah commented Sep 29, 2018

Please note that the C solution described on Wikipedia violates the Open-Closed Principle, because you have to enumerate all the types participating in the dynamic dispatch in a single place. The current multi-methods in Nim allow you to introduce new derived types and their respective methods across the entire program, hence the need for the whole-program analysis I mentioned.

The violation of the open-closed principle has practical implications for frameworks that allow the user to introduce new implementations of a particular interface that are then plugged into the framework. In fact, the open-closed principle is one of the few legitimate reasons to use OOP for certain applications instead of regular procedural code with case objects.

@Araq
Member Author

Araq commented Sep 29, 2018

Yes, but with a reasonably sized .compileTime array (or a seq instead) and a macro, the open-closed principle still works, especially with the late-stage code generation step you're pushing for anyway. (And I don't mind it either, as long as we ensure it works with incremental compilation.)

Also, the lookup-array version is faster: instead of subtype checking via <=, you can check via ==. This is related to how Clang's homegrown RTTI works, see https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html

@zah
Member

zah commented Sep 29, 2018

Well, OK, but then why do you want to spend time on this now? Why not revisit methods when we have the potential alternative mechanism in place?

@Araq
Member Author

Araq commented Sep 29, 2018

I'm not spending time on this now, but it's something we should sort out before v1. Maybe.

@mratsim
Collaborator

mratsim commented Sep 30, 2018

To give you some numbers on methods performance compared to alternatives:

From https://github.com/mratsim/Arraymancer/blob/master/benchmarks/implementation/proc_method_closure_bench.nim

import times

type FooBase = ref object {.inheritable.}
  dummy: int

type Foo{.final.} = ref object of FooBase
  value : float32


proc inplace_add_proc(x: var Foo, a: float32) =
  x.value += a

proc inplace_add_closure(x: var float32, a: float32) =
  proc add_closure(v: var float32) = v += a
  add_closure(x)

method inplace_add_method(x: FooBase, a: float32) {.base.} =
  discard

method inplace_add_method(x: Foo, a: float32) =
  x.value += a

var bar : Foo
new bar
var start = cpuTime()
for i in 0..<100000000:
  inplace_add_proc(bar, 1.0f)
echo " Proc with ref object ", cpuTime() - start

var x : float32
start = cpuTime()
for i in 0..<100000000:
  inplace_add_closure(x, 1.0f)
echo " Closures ", cpuTime() - start

var baz : Foo
new baz
start = cpuTime()
for i in 0..<100000000:
  inplace_add_method(baz, 1.0f)
echo " Methods ", cpuTime() - start

# Results with -d:release on i5-5257U (dual-core mobile 2.7GHz, turbo 3.1)
# Proc with ref object 0.099993
# Closures 2.708598
# Methods 0.3122219999999998

But this is only with a single type to dispatch to.

Another one much more involved used to select the dispatching techniques for Nimbus VM: https://github.com/status-im/nimbus/wiki/Interpreter-optimization-resources

import random, sequtils, times

type
  Op = enum
    Halt # = 0x0000
    Inc  # = 0x0100
    Dec  # = 0x0110
    Mul2 # = 0x0230
    Div2 # = 0x0240
    Add7 # = 0x0307
    Neg  # = 0x0400

func interp_switch(code: seq[Op], initVal: int): int =

  var
    pc = 0
  result = initVal

  while true:
    case code[pc]:
    of Halt:
      return
    of Inc:
      inc pc
      inc result
    of Dec:
      inc pc
      dec result
    of Mul2:
      inc pc
      result *= 2
    of Div2:
      inc pc
      result = result div 2
    of Add7:
      inc pc
      inc result, 7
    of Neg:
      inc pc
      result = -result

#################################################################################################################

func interp_cgoto(code: seq[Op], initVal: int): int =
  # Requires a dense enum
  var
    pc = 0
  result = initVal

  while true:
    {.computedGoto.}
    let instr = code[pc]
    case instr:
    of Halt:
      return
    of Inc:
      inc pc
      inc result
    of Dec:
      inc pc
      dec result
    of Mul2:
      inc pc
      result *= 2
    of Div2:
      inc pc
      result = result div 2
    of Add7:
      inc pc
      inc result, 7
    of Neg:
      inc pc
      result = -result

#################################################################################################################

func halt(result: var int, stop: var bool) {.inline, nimcall.}=
  stop = true

func inc(result: var int, stop: var bool) {.inline, nimcall.}=
  inc result

func dec(result: var int, stop: var bool) {.inline, nimcall.}=
  dec result

func mul2(result: var int, stop: var bool) {.inline, nimcall.}=
  result *= 2

func div2(result: var int, stop: var bool) {.inline, nimcall.}=
  result = result div 2

func add7(result: var int, stop: var bool) {.inline, nimcall.}=
  inc result, 7

func neg(result: var int, stop: var bool) {.inline, nimcall.}=
  result = -result

# Requires dense enum
type InstrF = proc (result: var int, stop: var bool){.inline, nimcall, noSideEffect, gcsafe, locks: 0.}

type FuncTable = array[Op, InstrF]

const funcTable: FuncTable = [
  Halt: halt,
  Inc: inc,
  Dec: dec,
  Mul2: mul2,
  Div2: div2,
  Add7: add7,
  Neg: neg
]

proc interp_ftable(code: seq[Op], initVal: int): int =
  # Requires dense enum
  var
    pc = 0
    stop = false
  result = initVal

  while not stop:
    funcTable[code[pc]](result, stop)
    inc pc

#################################################################################################################

type
  InstrNext = proc (val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}

  OpH = ref object
    handler: InstrNext

  FuncTableNext = array[Op, OpH]

proc halt(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}
proc inc(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}
proc dec(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}
proc mul2(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}
proc div2(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}
proc add7(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}
proc neg(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}

let funcTableNext: FuncTableNext = [
  Halt: OpH(handler: halt),
  Inc: OpH(handler: inc),
  Dec: OpH(handler: dec),
  Mul2: OpH(handler: mul2),
  Div2: OpH(handler: div2),
  Add7: OpH(handler: add7),
  Neg: OpH(handler: neg)
]

proc halt(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}=
  stop = true

proc inc(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}=

  inc val
  inc pc
  result = funcTableNext[code[pc]]

proc dec(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}=
  dec val
  inc pc
  result = funcTableNext[code[pc]]

proc mul2(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}=
  val *= 2
  inc pc
  result = funcTableNext[code[pc]]

proc div2(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}=
  val = val div 2
  inc pc
  result = funcTableNext[code[pc]]

proc add7(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}=
  inc val, 7
  inc pc
  result = funcTableNext[code[pc]]

proc neg(val: var int, code: seq[Op], pc: var int, stop: var bool): OpH {.inline, nimcall.}=
  val = -val
  inc pc
  result = funcTableNext[code[pc]]

proc interp_handlers(code: seq[Op], initVal: int): int =
  # Requires dense enum
  var
    pc = 0
    stop = false
    oph = funcTableNext[code[pc]]
  result = initVal

  while not stop:
    oph = oph.handler(result, code, pc, stop)

#################################################################################################################

type
  OpD = ref object {.inheritable.}

  Ohalt {.final.}= ref object of OpD
  Oinc {.final.}= ref object of OpD
  Odec {.final.}= ref object of OpD
  Omul2 {.final.}= ref object of OpD
  Odiv2 {.final.}= ref object of OpD
  Oadd7 {.final.}= ref object of OpD
  Oneg {.final.}= ref object of OpD

  FuncTableToken = array[Op, OpD]

method execute(op: OpD, result: var int, stop: var bool) {.base, inline, noSideEffect.} =
  raise newException(ValueError, "To override")

method execute(op: Ohalt, result: var int, stop: var bool) {.inline, noSideEffect.}=
  stop = true

method execute(op: Oinc, result: var int, stop: var bool) {.inline, noSideEffect.}=
  inc result

method execute(op: Odec, result: var int, stop: var bool) {.inline, noSideEffect.}=
  dec result

method execute(op: Omul2, result: var int, stop: var bool) {.inline, noSideEffect.}=
  result *= 2

method execute(op: Odiv2, result: var int, stop: var bool) {.inline, noSideEffect.}=
  result = result div 2

method execute(op: Oadd7, result: var int, stop: var bool) {.inline, noSideEffect.}=
  inc result, 7

method execute(op: Oneg, result: var int, stop: var bool) {.inline, noSideEffect.}=
  result = -result

let funcTableToken: FuncTableToken = [
  Halt: Ohalt(),
  Inc: Oinc(),
  Dec: Odec(),
  Mul2: Omul2(),
  Div2: Odiv2(),
  Add7: Oadd7(),
  Neg: Oneg()
]

proc interp_methods(code: seq[Op], initVal: int): int =
  # Requires dense enum
  var
    pc = 0
    stop = false
    opt: OpD
  result = initVal

  while not stop:
    opt = funcTableToken[code[pc]]
    opt.execute(result, stop)
    inc pc

#################################################################################################################

import random, sequtils, times, os, strutils, strformat

const Nb_Instructions = 1_000_000_000

template bench(impl: untyped) =
  let start = cpuTime()
  let r = impl(instructions, n)
  let stop = cpuTime()
  let elapsed = stop - start
  echo "result: " & $r
  let procname = impl.astToStr
  let mips = (Nb_Instructions.float / (1_000_000.0 * elapsed))
  echo procname & " took " & $elapsed & "s for " & $Nb_Instructions & " instructions: " & $mips & " Mips (M instructions/s)"

proc main(n: int)=
  randomize(42)

  let ops = [Inc, Dec, Mul2, Div2, Add7, Neg]
  let instructions = newSeqWith(Nb_Instructions, rand(ops)) & Halt

  bench(interp_switch)
  bench(interp_cgoto) # requires dense enum (no holes)
  bench(interp_ftable) # requires dense enum (no holes) or tables (instead of arrays)
  bench(interp_handlers) # requires dense enum (no holes) or tables (instead of arrays)
  bench(interp_methods) # requires dense enum (no holes) or tables (instead of arrays)

# Warmup
var start = cpuTime()
block:
  var foo = 123
  for i in 0 ..< 1_000_000_000:
    foo += i*i mod 456
    foo = foo mod 789

# The compiler shouldn't optimize away the results, as cpuTime relies on side effects
var stop = cpuTime()
echo "Warmup: " & $(stop - start) & "s"

# Main loop
let arguments = commandLineParams()
let initial = if arguments.len > 0: parseInt($arguments[0])
              else: 1

main(initial)

## Results on i5-5257U (Broadwell mobile dual core, 2.7GHz, turbo 3.1GHz)
# Note that since Haswell, Intel CPUs have significantly improved switch prediction
# This probably won't carry over to ARM devices

# Warmup: 4.081501s
# result: -14604293096444
# interp_switch took 8.604712000000003s for 1000000000 instructions: 116.2153945419672 Mips (M instructions/s)
# result: -14604293096444
# interp_cgoto took 7.367597000000004s for 1000000000 instructions: 135.7294651159665 Mips (M instructions/s)
# result: -201628509198920 <--- some bug here to fix
# interp_ftable took 8.957571000000002s for 1000000000 instructions: 111.6374070604631 Mips (M instructions/s)
# result: -14604293096444
# interp_handlers took 11.039072s for 1000000000 instructions: 90.58732473164413 Mips (M instructions/s)
# result: -14604293096444
# interp_methods took 23.359635s for 1000000000 instructions: 42.80888806695823 Mips (M instructions/s)

So with 7 instructions, method dispatch achieves only 42 Mips on my machine, vs 135 Mips for computed goto, 116 Mips for switch dispatch, and 111 Mips for a function table.

Note that switch dispatching was substantially improved in the Haswell+ processor generations; this might not carry over to AMD, MIPS and ARM processors.

In my research on dispatch logic, I've come across the Nostradamus distributor, which uses an easily predictable chain of ifs, a bit like the current methods but with much better reported performance (200 Mips on a Pentium 4 :?). The source code is available here; unfortunately I never managed to compile the C++ code or reproduce the results in Nim.

@narimiran narimiran transferred this issue from nim-lang/Nim Jan 2, 2019
@Araq
Member Author

Araq commented Mar 21, 2019

This has been implemented: on devel, methods are now single-dispatch. There is a switch --multiMethods:on for a transition period.

@Araq Araq closed this as completed Mar 21, 2019
@andreaferretti

That is a pretty important change! Is there documentation on how the new single-dispatch methods work? Are they based on vtables (internal, or external as in Go)? How does this relate to the new developments on concepts?

@Araq
Member Author

Araq commented Mar 22, 2019

The implementation strategy wasn't changed for now as it can be done later. I consider Concepts and VTables orthogonal to this change.

@cloudhan

There is a switch --multiMethods:on for a transition period.

@Araq when will this feature finally be deprecated?
