Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid recompiling inlined overload impl #6856

Closed
wants to merge 1 commit into from

Conversation

ehsantn
Copy link
Collaborator

@ehsantn ehsantn commented Mar 23, 2021

We use inline="always" for most of our overloads, and found that implementations being compiled over and over again during typing is a major compilation time bottleneck. Avoiding recompilation by this PR saves around 40% of total compilation time in my testing.

@sklam
Copy link
Member

sklam commented Mar 23, 2021

@stuartarchibald, tagged you for review since you are more familiar with this part of the code base

@sklam sklam added the Effort - medium Medium size effort needed label Mar 23, 2021
@stuartarchibald stuartarchibald added this to the PR Backlog milestone Mar 24, 2021
@esc
Copy link
Member

esc commented Mar 29, 2021

@ehsantn @farahhariri maybe it would be good to add a reproducer or a benchmark to make it easier for reviewers to verify?

@ehsantn
Copy link
Collaborator Author

ehsantn commented Mar 31, 2021

Below is a benchmark that requires installing bodo 2021.2: conda install bodo=2021.2 -c bodo.ai -c conda-forge

import numpy as np
import pandas as pd
import bodo
import numba

n = 10
func_text = "def f(A):\n"
func_text += "  a = 1\n"
for i in range(n):
    func_text += "  a += A.sum()\n"
func_text += "  return a\n"

loc_vars = {}
exec(func_text, {"np": np, "pd": pd}, loc_vars)
f = loc_vars["f"]
g = bodo.jit(distributed=False)(f)
A = pd.Series([1.1])
print(g(A))
ss = list(list(g.get_metadata().values())[0]['pipeline_times'].values())[0]
t = 0
for c, v in sorted(ss.items(), key=lambda a: a[1].run, reverse=True):
    print(c, v.run)
    t += v.run

print(t)

Without this patch it takes about 3.5s for me (MacBook Pro 2019) but with this patch it takes 3s.

@stuartarchibald
Copy link
Contributor

I think it'd be good to come up with a simple reproducer that is stand alone and uses just Numba, it'll be easier to validate and debug.

@stuartarchibald
Copy link
Contributor

I've tried this:

from numba import njit
from numba.extending import overload
import numpy as np

_GLOBAL = 1234


def _gen_involved():
    _FREEVAR = 0xCAFE

    def foo(a, b, c=12, d=1j, e=None):
        f = a + b
        a += _FREEVAR
        g = np.zeros(c, dtype=np.complex64)
        h = f + g
        i = 1j / d
        # For SSA, zero init, n and t
        n = 0
        t = 0
        if np.abs(i) > 0:
            k = h / i
            l = np.arange(1, c + 1)
            m = np.sqrt(l - g) + e * k
            if np.abs(m[0]) < 1:
                for o in range(a):
                    n += 0
                    if np.abs(n) < 3:
                        break
                n += m[2]
            p = g / l
            q = []
            for r in range(len(p)):
                q.append(p[r])
                if r > 4 + 1:
                    s = 123
                    t = 5
                    if s > 122 - c:
                        t += s
                t += q[0] + _GLOBAL

        return f + o + r + t + r + a + n

    return foo

def fortran(a, b, c=12, d=1j, e=None):
    pass

impl = _gen_involved()
overload(fortran, inline='always')(lambda a, b, c=12, d=1j, e=None: impl)

tmp = "def call_many_involved():\n\ta = 0\n"

buf = ['\ta += fortran(1, 1, 1, 1, 1)' for _ in range(10)]
tmp += '\n'.join(buf)
tmp += '\n\treturn a'
l = {}
exec(tmp, {'np': np, 'fortran': fortran}, l)

fn = njit(l['call_many_involved'])
fn()
ss = list(list(fn.get_metadata().values())[0]['pipeline_times'].values())[0]
t = 0
for c, v in sorted(ss.items(), key=lambda a: a[1].run, reverse=True):
    print(c, v.run)
    t += v.run

print(t)

but it seems to not be demonstrating the effect described.

@stuartarchibald stuartarchibald added 4 - Waiting on author Waiting for author to respond to review and removed 3 - Ready for Review labels Apr 23, 2021
@github-actions
Copy link

This pull request is marked as stale as it has had no activity in the past 3 months. Please respond to this comment if you're still interested in working on this. Many thanks!

@github-actions github-actions bot added the stale Marker label for stale issues. label Mar 13, 2023
@github-actions github-actions bot added the abandoned - stale PRs automatically closed due to stale status label Mar 20, 2023
@github-actions github-actions bot closed this Mar 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4 - Waiting on author Waiting for author to respond to review abandoned - stale PRs automatically closed due to stale status Effort - medium Medium size effort needed stale Marker label for stale issues.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants