Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concepts building on each other: duplicated procs in C codegen #13982

Open
mratsim opened this issue Apr 14, 2020 · 3 comments
Open

Concepts building on each other: duplicated procs in C codegen #13982

mratsim opened this issue Apr 14, 2020 · 3 comments

Comments

@mratsim
Copy link
Collaborator

mratsim commented Apr 14, 2020

Context

I need to build a "tower" of mathematical objects of the following representation:

  • From a base object with arithmetic routines defined on it (to be accurate modular big int arithmetic)
  • build an extension of degree 2 (equivalent to building complex from 2 real coordinates)
  • then another extension, degree 2 or 3
  • then another extension, degree 2 or 3
  • ...
  • until you reach the required degree.

Example

type Base* = object
  data*: int

type Fp2* = object
  c0*, c1*: Base

type Fp6* = object
  c0*, c1*, c2*: Fp2

type Fp12* = object
  c0*, c1*: Fp6

The arithmetic on extensions with the same degree is the same, i.e. proc on Fp2 and Fp12 are the same. And if we built Fp36 on top of Fp12 it would have the same arithmetic as Fp6.

In short, this is a great place to use concepts, which I did: https://github.com/mratsim/constantine/blob/8a9cb928/constantine/tower_field_extensions/tower_common.nim

However I noticed that my generated codesize was quite huge with lots of useless procedures that were generated for the exact same input types over 15 times:
image

(Another side issue is that compile-time is quite long, about ~10 on my very beefy machine.)

Furthermore, inline procedures defined on Base can somehow be duplicated in the Fp6 file even though nothing in Fp2 is not tagged inline and so there is no reason for those to appear in the Fp6 file.

Use-case

There are actually not many proc defined per types: addition, substraction, squaring, doubling, setZero, setOne, inversion, multiplication but this generates ~20k lines with a lot of duplicate code.

This is problematic because those towers are used to implement pairing-based cryptography which is particularly attractive in embedded and resource-restricted devices for the small key sizes required compared to the security provided (i.e. for 128 bits of security, a 256-bit and 512-bit secret/public keypair is needed compared to RSA 2x3000+ bits) and obviously code size should be small as well.

Reproducing

Minimal example

Here is a minimal example, 4 files are needed

# File: "tower_concepts.nim"
type Base* = object
  data*: int

func setZero*(a: var Base) =
  a.data = 0

func setOne*(a: var Base) =
  a.data = 1

func `+=`*(a: var Base, b: Base) =
  a.data += b.data

func prod*(r: var Base, a, b: Base) =
  r.data = a.data * b.data

# Towering concepts
# -------------------------------------------------------------------

type
  CubicExt* = concept x
    ## Cubic Extension field concept
    type BaseField = auto
    x.c0 is BaseField
    x.c1 is BaseField
    x.c2 is BaseField

  QuadraticExt* = concept x
    ## Quadratic Extension field concept
    not(x is CubicExt)

    type BaseField = auto
    x.c0 is BaseField
    x.c1 is BaseField

  ExtensionField = QuadraticExt or CubicExt

# Initialization
# -------------------------------------------------------------------

func setZero*(a: var ExtensionField) =
  for field in fields(a):
    field.setZero()

func setOne*(a: var ExtensionField) =
  for fieldName, fA in fieldPairs(a):
    when fieldName == "c0":
      fA.setOne()
    else:
      fA.setZero()

# Addition
# -------------------------------------------------------------------

func `+=`*(a: var ExtensionField, b: ExtensionField) =
  for fA, fB in fields(a, b):
    fA += fB
# File: "tower_quadratic.nim"

import tower_concepts

proc prod*(r: var QuadraticExt, a, b: QuadraticExt) =
  mixin prod
  r.c0.prod(a.c0, b.c0)
  r.c1 = a.c1
  r.c1 += b.c1
# File: "tower_cubic.nim"

import tower_concepts

proc prod*(r: var CubicExt, a, b: CubicExt) =
  mixin prod
  r.c0.prod(a.c0, b.c0)
  r.c1.prod(a.c1, b.c2)
  r.c2 = a.c2
  r.c2 += b.c2
# File: "tower_test.nim

import tower_concepts, tower_quadratic, tower_cubic

# Concrete instances
# -------------------------------------------------------------------

type Fp2* = object
  c0*, c1*: Base

type Fp6* = object
  c0*, c1*, c2*: Fp2

type Fp12* = object
  c0*, c1*: Fp6

proc main() =
  var x, y: Fp12
  x.setOne()
  y.setOne()

  x += y
  echo x

  var r: Fp12
  r.prod(x, y)

main()

Compile to a local nimcache, for example (release because the linetrace are quite verbose in the codegen)

nim c -r -d:release --hints:off --warnings:off --outdir:build --nimcache:build/nimcache_concept_dup_code build/tower_test.nim

The prod procedure in quadratic.nim is duplicated:
image

In the actual codebase

To reproduce in the actual codebase:

git clone https://github.com/mratsim/constantine
cd constantine
git checkout 8a9cb92
nim c -d:release --hints:off --warnings:off --outdir:build --nimcache:build/nimcache_test_fp12 tests/test_fp12.nim

And look for the files over 3 MB. Compilation will be long.
The file involved int he tower are in this folder https://github.com/mratsim/constantine/tree/8a9cb928/constantine/tower_field_extensions

@Araq
Copy link
Member

Araq commented Apr 15, 2020

As bad as it might look, we need to investigate if the linker can remove the duplicated code or not. In theory duplicate function elimination is trivial to do...

@mratsim
Copy link
Collaborator Author

mratsim commented Apr 15, 2020

I'm not sure if the linker can handle the sheer size of duplicates, compilers have artificial limits on code analysis (like ~300 statements even if they can be optimized away in later passes).

I didn't realize how bad it is but I have 457 instances of duplicated "prod_complex" if I understood properly the mangling scheme:

Line 13k: 220 instances of prod complex

2020-04-15_16-34

Line 19k: 458 instances

2020-04-15_16-35

And notice how the last prod_generic is calling different (but duplicates) proc with the same name with just an incremented counter at the end.

@Araq
Copy link
Member

Araq commented Apr 16, 2020

From https://stackoverflow.com/questions/15168924/gcc-clang-merging-functions-with-identical-instructions-comdat-folding

The GNU gold linker supports ICF with the --icf option, which needs the GCC option -ffunction-sections to be used.

I'm not saying that it helps, I'm saying we need to see if "--icf" helps.

There is also this https://llvm.org/docs/MergeFunctions.html

And also https://tetzank.github.io/posts/identical-code-folding/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants