Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue #528

Closed
splhack opened this issue Jan 12, 2020 · 4 comments · Fixed by #530
Closed

Performance issue #528

splhack opened this issue Jan 12, 2020 · 4 comments · Fixed by #530

Comments

@splhack
Copy link
Contributor

splhack commented Jan 12, 2020

I'm observing performance regression on master. master takes 196 seconds to compile, although a bit old hwtypes2 takes 3 seconds. will bisect later, or create a minimal repro code. probably related to namedtuple vs Product.

Versions

  • Python 3.7.5 (ubuntu 18.04, python3.7 package)
  • self build coreir (C++) from github master, should be same as binary wheel
  • pip3 install -e . with github master
    • coreir 2.0.29
    • fault 2.0.22
    • hwtypes 1.3.1
    • loam 0.1a0
    • mantle 1.0.12
    • pyverilog 1.2.0

compile.py

set_mantle_target("coreir")
main = getattr(__import__("top"), "main")
compile("build/top", main, output="coreir-verilog", verilator_debug=True)

python3 -m cProfile compile.py with a bit old hwtypes2

result

         5286265 function calls (5046047 primitive calls) in 3.070 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
975501/945911    0.081    0.000    0.218    0.000 {built-in method builtins.isinstance}
523332/523314    0.080    0.000    0.083    0.000 {built-in method builtins.hasattr}
   504223    0.065    0.000    0.090    0.000 inspect.py:63(ismodule)
   456967    0.027    0.000    0.027    0.000 {method 'get' of 'dict' objects}
   163085    0.010    0.000    0.010    0.000 {method 'append' of 'list' objects}
   129510    0.021    0.000    0.024    0.000 {built-in method builtins.getattr}
   126032    0.013    0.000    0.013    0.000 array.py:81(N)
121490/119523    0.008    0.000    0.010    0.000 {built-in method builtins.len}
    90873    0.008    0.000    0.008    0.000 array.py:85(T)
    90422    0.007    0.000    0.007    0.000 digital.py:125(__eq__)
    85449    0.026    0.000    0.126    0.000 {built-in method _abc._abc_instancecheck}
    85449    0.012    0.000    0.138    0.000 abc.py:137(__instancecheck__)0

python3 -m cProfile compile.py with master

result

         244539593 function calls (182472917 primitive calls) in 196.085 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
32132748/17026257    4.450    0.000   20.546    0.000 {built-in method builtins.isinstance}
 31368834    6.167    0.000   11.556    0.000 {built-in method builtins.issubclass}
16505762/16472856    2.277    0.000    3.285    0.000 {built-in method _abc._abc_subclasscheck}
16505762/16472856    2.262    0.000    5.543    0.000 abc.py:141(__subclasscheck__)
 15427462    1.639    0.000    1.639    0.000 adt_meta.py:248(fields)
 15257154    2.707    0.000    4.333    0.000 adt_meta.py:257(is_bound)
 15056698    5.106    0.000    5.110    0.000 weakref.py:134(__getitem__)
 15042191    3.754    0.000   16.967    0.000 typing.py:710(__instancecheck__)
 15042191    4.791    0.000   13.213    0.000 typing.py:713(__subclasscheck__)
15042063/17023    9.367    0.000  192.463    0.011 adt_meta.py:66(__getitem__)
15042063/17023    6.510    0.000  192.477    0.011 adt_meta.py:290(__getitem__)
15031417/17023    4.848    0.000  192.427    0.011 tuple.py:61(_from_idx)
15014394/13679410    4.816    0.000   60.835    0.000 tuple.py:78(<genexpr>)
  1593088    0.245    0.000    1.934    0.000 tuple.py:39(<genexpr>)
  1545490    0.096    0.000    0.096    0.000 digital.py:116(__eq__)
  1538891    0.134    0.000    0.134    0.000 array.py:88(N)
  1500788    0.113    0.000    0.113    0.000 array.py:92(T)
733659/727392    0.578    0.000    0.907    0.000 array.py:112(__eq__)
526863/526845    0.086    0.000    0.088    0.000 {built-in method builtins.hasattr}
   506267    0.069    0.000    0.093    0.000 inspect.py:63(ismodule)
   459001    0.027    0.000    0.027    0.000 {method 'get' of 'dict' objects}
361524/50216    1.354    0.000   62.162    0.001 {built-in method builtins.any}
   332334    0.026    0.000    0.026    0.000 {method 'append' of 'list' objects}
   171416    0.016    0.000    0.016    0.000 adt_meta.py:224(<lambda>)
   161170    0.029    0.000    0.036    0.000 {built-in method builtins.getattr}
143727/141760    0.014    0.000    0.016    0.000 {built-in method builtins.len}
    85449    0.056    0.000    0.282    0.000 {built-in method _abc._abc_instancecheck}
    85449    0.013    0.000    0.295    0.000 abc.py:137(__instancecheck__)
    73645    0.011    0.000    0.015    0.000 ast.py:181(iter_fields)
@splhack
Copy link
Contributor Author

splhack commented Jan 12, 2020

bisected. last known good commit (no errors and no performance regression) is 7a73ef4.
70bbf54 took 168 seconds without error. this is the first commit compile works without error after 7a73ef4.
seems like f56a1eb introduced performance regression and compile error. It took 188 seconds, also this error, which was not repro on 70bbf54.

Traceback (most recent call last):
  File "compile.py", line 5, in <module>
    compile("build/top", main, output="coreir-verilog", verilator_debug=True)
  File "magma/magma/compile.py", line 42, in compile
    uniquification_pass(main, opts.get("uniquify", "UNIQUIFY"))
  File "magma/magma/uniquification.py", line 99, in uniquification_pass
    pass_.run()
  File "magma/magma/uniquification.py", line 63, in run
    super().run()
  File "magma/magma/passes/passes.py", line 62, in run
    self._run(self.main)
  File "magma/magma/passes/passes.py", line 53, in _run
    self._run(inst_defn)
  File "magma/magma/passes/passes.py", line 53, in _run
    self._run(inst_defn)
  File "magma/magma/passes/passes.py", line 53, in _run
    self._run(inst_defn)
  [Previous line repeated 1 more time]
  File "magma/magma/passes/passes.py", line 59, in _run
    self(defn)
  File "magma/magma/uniquification.py", line 46, in __call__
    key = _hash(definition)
  File "magma/magma/uniquification.py", line 33, in _hash
    hash_struct = _make_hash_struct(definition)
  File "magma/magma/uniquification.py", line 26, in _make_hash_struct
    repr_ = repr(definition)
  File "magma/magma/circuit.py", line 141, in __repr__
    s += repr(cls.interface)
  File "magma/magma/interface.py", line 57, in __repr__
    output = input.value()
  File "magma/magma/tuple.py", line 271, in value
    return tuple_(dict(zip(self.keys(),ts)))
  File "magma/magma/tuple.py", line 436, in tuple_
    return type("anon", (t,), decl)(*args)
  File "magma/magma/tuple.py", line 105, in __call__
    obj = cls.__new__(cls, *args)
  File "hwtypes/hwtypes/adt.py", line 19, in __new__
    return cls[idx].__new__(cls[idx], *value)
  File "hwtypes/hwtypes/adt_meta.py", line 294, in __getitem__
    return super().__getitem__(idx)
  File "hwtypes/hwtypes/adt_meta.py", line 75, in __getitem__
    return cls._from_idx(idx)
  File "magma/magma/tuple.py", line 85, in _from_idx
    t = mcs(class_name, bases, {'__module__' : cls.__module__}, fields=idx)
  File "magma/magma/tuple.py", line 45, in __new__
    t = type.__new__(mcs, name, bases, namespace, **kwargs)
TypeError: Cannot create a consistent method resolution
order (MRO) for bases Tuple[Digital, Digital, Digital, (omit)

@splhack
Copy link
Contributor Author

splhack commented Jan 12, 2020

reverting these 4 commits fixed performance regression. No pytest errors (421 passed, 7 skipped in 16.91s), yes test_namedtuple_seq works.
The diff from master is https://gist.github.com/splhack/55c720be34a0aba8d71c37af22a9c11f

git revert -n 70bbf54635866eb8e061fb6000d716b1df5abcf6
git revert -n eaa24e90df730cf8e084b8bddaa2843b01778dce
git revert -n 13733f4046aa035bc1492ba216608a628d7733d8
git revert -n f56a1eb23aa5a84441f9ef64823cfad950e4a731

@leonardt
Copy link
Collaborator

Thanks for tracking this down, it's an interesting issue. I agree that it's likely an issue with the Product type logic. I'll have to do some more investigation to see what the exact issue is.

Is it possible to share you top file with the circuit definition so I can take a look at it? Or at least the types that are used? It may have to do with nested types (my guess looking at the profile log, it seems that it's spending a lot of time the inspect module, and adt_meta and tuple.py). It may be that we aren't cacheing when we should be so it's calling _from_idx many times to redefine the same type.

@splhack
Copy link
Contributor Author

splhack commented Jan 13, 2020

Ok, I found a minimal repro code.

import magma as m


class TestType(m.Product):
    V1 = m.UInt[32]
    V2 = m.UInt[32]
    V3 = m.UInt[32]
    V4 = m.UInt[32]
    V5 = m.UInt[32]
    V6 = m.UInt[32]
    V7 = m.UInt[32]


@m.circuit.sequential(async_reset=False)
class TestLogic:
    def __init__(self):
        self.V1: m.UInt[32] = 0
        self.V2: m.UInt[32] = 0
        self.V3: m.UInt[32] = 0
        self.V4: m.UInt[32] = 0
        self.V5: m.UInt[32] = 0
        self.V6: m.UInt[32] = 0
        self.V7: m.UInt[32] = 0

    def __call__(self, I: TestType, SEL: m.Bit) -> TestType:
        t = m.namedtuple(
            V1=self.V1,
            V2=self.V2,
            V3=self.V3,
            V4=self.V4,
            V5=self.V5,
            V6=self.V6,
            V7=self.V7,
        )
        if SEL:
            new_t = I
        else:
            new_t = t

        self.V1 = new_t.V1
        self.V2 = new_t.V2
        self.V3 = new_t.V3
        self.V4 = new_t.V4
        self.V5 = new_t.V5
        self.V6 = new_t.V6
        self.V7 = new_t.V7
        return new_t


m.compile("build/top", TestLogic, output="coreir-verilog")
  • 7a73ef4

    /usr/bin/time python3 test.py
    0.43user 0.00system 0:00.44elapsed 100%CPU (0avgtext+0avgdata 61080maxresident)k

  • 6d6a9d0 (master)

    /usr/bin/time python3 test.py
    375.29user 0.19system 6:15.49elapsed 99%CPU (0avgtext+0avgdata 162344maxresident)k

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants