python: skip pytest_pycollect_makeitem work on certain names
When a Python object (module/class/instance) is collected, for each name
in `obj.__dict__` (and up its MRO) the pytest_pycollect_makeitem hook is
called for potentially creating a node for it.
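
As a rough illustration of the iteration described above, the candidate names for an instance can be gathered as follows. This is a simplified sketch, not pytest's actual code; `candidate_names` and `Sample` are hypothetical names used only for the example.

```python
def candidate_names(obj):
    """Return the names in obj.__dict__ and in each class __dict__ along the MRO."""
    dicts = [getattr(obj, "__dict__", {})]
    for basecls in type(obj).__mro__:
        dicts.append(basecls.__dict__)
    seen = set()
    names = []
    for dic in dicts:
        # Copy the keys first, since a dict may change while it is being walked.
        for name in list(dic):
            if name not in seen:
                seen.add(name)
                names.append(name)
    return names


class Sample:
    def test_something(self):
        pass


names = candidate_names(Sample())
dunder = [n for n in names if n.startswith("__") and n.endswith("__")]
print(f"{len(names)} candidate names, {len(dunder)} of them dunder attributes")
```

Even for a class with a single test method, most of the candidate names are dunder attributes inherited from `object`, and each of them would be passed to the hook.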

These Python objects have a bunch of builtin attributes that are
extremely unlikely to be collected. But because they are present on
nearly every object, dispatching the hook for them ends up being mildly
expensive and also clutters debugging output such as PYTEST_DEBUG=1.

Let's just ignore these attributes.

The list was composed by looking at the attributes of an empty module,
an empty class and an empty instance on CPython 3.8.
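
One way such a list could be assembled is to take the union of `dir()` over an empty module, class and instance. This is only a sketch: the commit does not show the exact procedure, and `builtin_attribute_names` is a hypothetical helper. The result varies between Python versions, and a module created with `types.ModuleType` rather than imported from a file lacks names set by the import system, such as `__file__`.

```python
import types


def builtin_attribute_names():
    """Collect attribute names present on an empty module, class and instance."""

    class Empty:
        pass

    empty_module = types.ModuleType("empty")
    names = set()
    for obj in (empty_module, Empty, Empty()):
        names.update(dir(obj))
    return frozenset(names)


names = builtin_attribute_names()
print(sorted(names))
```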

On the pandas test suite at commit 04e9e0afd476b1b8bed930e47bf60e,
running collection only, this gives about a 5% improvement (irrelevant
profile lines snipped):

Before:

```
         51195095 function calls (48844352 primitive calls) in 39.089 seconds

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
226602/54    0.145    0.000   38.940    0.721 manager.py:90(_hookexec)
    72227    0.285    0.000   20.146    0.000 python.py:424(_makeitem)
    72227    0.171    0.000   16.678    0.000 python.py:218(pytest_pycollect_makeitem)
```

After:

```
          48410921 function calls (46240870 primitive calls) in 36.950 seconds

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 181429/54    0.113    0.000   36.777    0.681 manager.py:90(_hookexec)
     27054    0.130    0.000   17.755    0.001 python.py:465(_makeitem)
     27054    0.121    0.000   16.219    0.001 python.py:218(pytest_pycollect_makeitem)
```
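
The headline numbers can be checked directly from the two profiles. The short calculation below uses only figures copied from the output above; the variable names are made up for the example.

```python
# Figures copied from the "Before" and "After" profiles.
before_seconds, after_seconds = 39.089, 36.950
before_hook_calls, after_hook_calls = 72227, 27054

time_saved_pct = 100 * (before_seconds - after_seconds) / before_seconds
hook_calls_skipped = before_hook_calls - after_hook_calls
hook_calls_skipped_pct = 100 * hook_calls_skipped / before_hook_calls

print(f"total time reduced by {time_saved_pct:.1f}%")
print(f"hook dispatches skipped: {hook_calls_skipped} ({hook_calls_skipped_pct:.1f}%)")
```

The drop in `_hookexec` calls (226602 to 181429) is the same 45173 calls, which is consistent with the skipped names accounting for the whole difference.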
bluetech committed Aug 22, 2020
1 parent d69abff commit d4e7868
Showing 1 changed file with 46 additions and 0 deletions.
46 changes: 46 additions & 0 deletions src/_pytest/python.py
@@ -341,6 +341,50 @@ def reportinfo(self) -> Tuple[Union[py.path.local, str], int, str]:
        return fspath, lineno, modpath


# As an optimization, these names are pre-ignored when iterating over
# an object during collection -- the pytest_pycollect_makeitem hook is
# not even called for them.
# The list was composed from the attributes of an empty module, an empty
# class and an empty instance on CPython 3.8.
IGNORED_NAMES = frozenset(
    (
        "__builtins__",
        "__cached__",
        "__class__",
        "__delattr__",
        "__dict__",
        "__dir__",
        "__doc__",
        "__eq__",
        "__file__",
        "__format__",
        "__ge__",
        "__getattribute__",
        "__gt__",
        "__hash__",
        "__init__",
        "__init_subclass__",
        "__le__",
        "__loader__",
        "__lt__",
        "__module__",
        "__name__",
        "__ne__",
        "__new__",
        "__package__",
        "__reduce__",
        "__reduce_ex__",
        "__repr__",
        "__setattr__",
        "__sizeof__",
        "__spec__",
        "__str__",
        "__subclasshook__",
        "__weakref__",
    )
)


class PyCollector(PyobjMixin, nodes.Collector):
    def funcnamefilter(self, name: str) -> bool:
        return self._matches_prefix_or_glob_option("python_functions", name)
@@ -402,6 +446,8 @@ def collect(self) -> Iterable[Union[nodes.Item, nodes.Collector]]:
            # Note: seems like the dict can change during iteration -
            # be careful not to remove the list() without consideration.
            for name, obj in list(dic.items()):
                if name in IGNORED_NAMES:
                    continue
                if name in seen:
                    continue
                seen.add(name)
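
The effect of the added check can be seen in a standalone sketch. This is not pytest's code: `make_item` stands in for the hook dispatch, and the shortened `IGNORED` set, the `collect_names` helper and the sample dicts are hypothetical.

```python
# A shortened stand-in for IGNORED_NAMES, for illustration only.
IGNORED = frozenset(("__doc__", "__module__", "__dict__", "__weakref__"))


def collect_names(dicts, make_item, ignored=IGNORED):
    """Call make_item once per unique, non-ignored name across dicts."""
    seen = set()
    results = []
    for dic in dicts:
        # list() guards against the dict changing during iteration.
        for name, obj in list(dic.items()):
            if name in ignored:
                continue  # skipped before any dispatch happens
            if name in seen:
                continue  # an earlier dict already provided this name
            seen.add(name)
            results.append(make_item(name, obj))
    return results


calls = []
dicts = [
    {"__doc__": None, "test_a": 1},
    {"test_a": 2, "__module__": "m", "helper": 3},
]
dispatched = collect_names(dicts, lambda name, obj: calls.append(name) or name)
print(dispatched)  # ['test_a', 'helper']
```

Because the ignored names sit in a `frozenset`, the extra check is a single hash lookup per name, which is far cheaper than dispatching a hook call.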