Avoid recompiling inlined overload impl #6856
Conversation
@stuartarchibald, tagged you for review since you are more familiar with this part of the code base |
@ehsantn @farahhariri maybe it would be good to add a reproducer or a benchmark to make it easier for reviewers to verify? |
Below is a benchmark that requires installing bodo 2021.2:

```python
import numpy as np
import pandas as pd
import bodo
import numba

# Generate a function with n call sites that each trigger the overload
n = 10
func_text = "def f(A):\n"
func_text += "  a = 1\n"
for i in range(n):
    func_text += "  a += A.sum()\n"
func_text += "  return a\n"
loc_vars = {}
exec(func_text, {"np": np, "pd": pd}, loc_vars)
f = loc_vars["f"]

g = bodo.jit(distributed=False)(f)
A = pd.Series([1.1])
print(g(A))

# Sum per-pass run times from the dispatcher metadata
ss = list(list(g.get_metadata().values())[0]['pipeline_times'].values())[0]
t = 0
for c, v in sorted(ss.items(), key=lambda a: a[1].run, reverse=True):
    print(c, v.run)
    t += v.run
print(t)
```

Without this patch it takes about 3.5s for me (MacBook Pro 2019), but with this patch it takes 3s. |
I think it'd be good to come up with a simple reproducer that is stand alone and uses just Numba, it'll be easier to validate and debug. |
I've tried this:

```python
from numba import njit
from numba.extending import overload
import numpy as np

_GLOBAL = 1234

def _gen_involved():
    _FREEVAR = 0xCAFE

    def foo(a, b, c=12, d=1j, e=None):
        f = a + b
        a += _FREEVAR
        g = np.zeros(c, dtype=np.complex64)
        h = f + g
        i = 1j / d
        # For SSA, zero init, n and t
        n = 0
        t = 0
        if np.abs(i) > 0:
            k = h / i
            l = np.arange(1, c + 1)
            m = np.sqrt(l - g) + e * k
            if np.abs(m[0]) < 1:
                for o in range(a):
                    n += 0
                    if np.abs(n) < 3:
                        break
                n += m[2]
            p = g / l
            q = []
            for r in range(len(p)):
                q.append(p[r])
                if r > 4 + 1:
                    s = 123
                    t = 5
                    if s > 122 - c:
                        t += s
        t += q[0] + _GLOBAL
        return f + o + r + t + r + a + n
    return foo

def fortran(a, b, c=12, d=1j, e=None):
    pass

impl = _gen_involved()
overload(fortran, inline='always')(lambda a, b, c=12, d=1j, e=None: impl)

# Generate a caller with many call sites for the inlined overload
tmp = "def call_many_involved():\n\ta = 0\n"
buf = ['\ta += fortran(1, 1, 1, 1, 1)' for _ in range(10)]
tmp += '\n'.join(buf)
tmp += '\n\treturn a'
l = {}
exec(tmp, {'np': np, 'fortran': fortran}, l)
fn = njit(l['call_many_involved'])
fn()

# Sum per-pass run times from the dispatcher metadata
ss = list(list(fn.get_metadata().values())[0]['pipeline_times'].values())[0]
t = 0
for c, v in sorted(ss.items(), key=lambda a: a[1].run, reverse=True):
    print(c, v.run)
    t += v.run
print(t)
```

but it doesn't seem to demonstrate the effect described. |
This pull request is marked as stale as it has had no activity in the past 3 months. Please respond to this comment if you're still interested in working on this. Many thanks! |
We use `inline="always"` for most of our overloads, and found that implementations being compiled over and over again during typing is a major compilation-time bottleneck. Avoiding that recompilation with this PR saves around 40% of total compilation time in my testing.
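To illustrate the caching idea behind the patch, here is a minimal stand-alone sketch in plain Python. It is not Numba's actual internals: `get_compiled`, `expensive_compile`, and the `("int64",)` type key are all hypothetical stand-ins. The point is only that memoizing the compiled result of an overload implementation by argument types means repeated typing of the same call signature pays the compilation cost once.

```python
from functools import lru_cache

compile_count = 0  # counts how many times we actually "compile"

def expensive_compile(impl, arg_types):
    # Stand-in for compiling an overload impl for a given signature
    global compile_count
    compile_count += 1
    return (impl.__name__, arg_types)

@lru_cache(maxsize=None)
def get_compiled(impl, arg_types):
    # Keyed by (implementation function, argument types): the same
    # signature is compiled once and reused at every later call site.
    return expensive_compile(impl, arg_types)

def impl(a):
    return a + 1

# Ten call sites with the same signature: compiled once, reused nine times.
for _ in range(10):
    get_compiled(impl, ("int64",))

print(compile_count)  # 1
```

Without the cache, each of the ten call sites would recompile `impl` during typing, which is the behavior this PR aims to avoid.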