
Add lru_cache on parse for observe to speed up instantiation time #1344

Merged
2 commits merged on Nov 11, 2020

Conversation

@aaronayres35 (Contributor) commented Nov 9, 2020

related to #1316

Checklist
- [ ] Tests
- [ ] Update API reference (docs/source/traits_api_reference)
- [ ] Update User manual (docs/source/traits_user_manual)
- [ ] Update type annotation hints in traits-stubs

This is an alternate approach to the one taken in #1343. Here, rather than moving the parsing logic into the decorator, we simply cache the results of parsing, so that when many instantiations occur we don't need to re-parse the expression every time. This solution gives a comparable decrease in the time to instantiate (2), without increasing the time it takes to construct the class (1). See benchmark reports with n=100000 below:
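A minimal sketch of the caching approach (not the actual traits source; `parse` here is a stand-in for the real Lark-based parser in `traits.observation`, which returns an `ObserverExpression`):

```python
from functools import lru_cache


# Hypothetical stand-in for the real text parser: we return a tagged
# tuple instead of an ObserverExpression so the caching is visible.
@lru_cache(maxsize=128)
def parse(text):
    return ("parsed", text)


# The first call does the (expensive) parse work; instantiating many
# objects of the same class repeats the same string, so subsequent
# calls are cache hits and skip the parser entirely.
parse("name")
parse("name")
```

The key point is that the cost moves from "once per instantiation" to "once per distinct expression string".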

On master:

Given a subclass of HasTraits with a trait defined, e.g. name = Str(),
compare having a method with no decorator to one that is decorated with
@on_trait_change("name"), one decorated with @observe("name"), and one
with @observe(trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning the trait to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 8.84740 1.00 (2): 0.14315 1.00 (3): 0.00659 1.00
@on_trait_change('name') -
(1): 9.19530 1.04 (2): 0.71941 5.03 (3): 0.00672 1.02
@observe('name') -
(1): 8.91179 1.01 (2): 6.65963 46.52 (3): 0.00647 0.98
@observe(trait('name')) -
(1): 9.09999 1.03 (2): 2.89405 20.22 (3): 0.00676 1.03


Given a subclass of HasTraits with a trait referring to an instance of
HasTraits, e.g. child = Instance(AnotherClass) where AnotherClass has a
name trait, compare having a method with no decorator to one that is
decorated with @on_trait_change("child.name"), one decorated with
@observe("child.name"), and one with
@observe(trait('child').trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 9.43381 1.00 (2): 0.28772 1.00 (3): 0.14178 1.00 (4): 0.00701 1.00
@on_trait_change('child.name') -
(1): 9.43325 1.00 (2): 9.59384 33.34 (3): 2.50125 17.64 (4): 0.00989 1.41
@observe('child.name') -
(1): 9.53141 1.01 (2): 16.08950 55.92 (3): 6.59454 46.51 (4): 0.00783 1.12
@observe(trait('child').trait('name')) -
(1): 10.35650 1.10 (2): 7.84243 27.26 (3): 6.84757 48.30 (4): 0.00743 1.06


Given a subclass of HasTraits with a trait that is defined as Property
that depends on a simple trait, compare having the Property be defined
as Property(), Property(depends_on="name"), Property(observe="name"),
and Property(observe=trait('name')). (Note that this is a feature
currently available on master only).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for changing the trait being depended-on / observed.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.54505 1.00 (2): 0.13050 1.00 (3): 0.00500 1.00
depends_on='name' -
(1): 14.24326 0.98 (2): 6.05126 46.37 (3): 0.00699 1.40
observe='name' -
(1): 14.86066 1.02 (2): 6.35408 48.69 (3): 0.00582 1.16
observe=trait('name') -
(1): 15.03425 1.03 (2): 2.80149 21.47 (3): 0.00625 1.25


Identical to the simple property case only using the @cached_property
decorator on the property's getter method.

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for changing the trait being depended-on / observed.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.31380 1.00 (2): 0.12962 1.00 (3): 0.00494 1.00
depends_on='name' -
(1): 14.46543 1.01 (2): 12.26901 94.66 (3): 0.00716 1.45
observe='name' -
(1): 16.66431 1.16 (2): 6.47019 49.92 (3): 0.00660 1.34
observe=trait('name') -
(1): 16.14596 1.13 (2): 3.07010 23.69 (3): 0.00633 1.28


Given a subclass of HasTraits with a trait that is defined as Property
that depends on an extended trait, compare having the Property be
defined as Property(), Property(depends_on="child.name"),
Property(observe="child.name"), and
Property(observe=trait('child').trait('name')). (Note that this is a
feature currently available on master only).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 16.51180 1.00 (2): 0.29474 1.00 (3): 0.13879 1.00 (4): 0.00637 1.00
depends_on='child.name' -
(1): 15.35540 0.93 (2): 9.91735 33.65 (3): 2.11387 15.23 (4): 0.00843 1.32
observe='child.name' -
(1): 15.38781 0.93 (2): 15.66122 53.14 (3): 6.44849 46.46 (4): 0.00780 1.22
observe=trait('child').trait('name') -
(1): 16.48509 1.00 (2): 7.87570 26.72 (3): 6.81021 49.07 (4): 0.00761 1.19


Identical to the extended property case only using the @cached_property
decorator on the property's getter method.

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 15.64841 1.00 (2): 0.27936 1.00 (3): 0.13353 1.00 (4): 0.00595 1.00
depends_on='child.name' -
(1): 15.42324 0.99 (2): 20.08643 71.90 (3): 4.19624 31.43 (4): 0.01049 1.76
observe='child.name' -
(1): 16.19206 1.03 (2): 15.96901 57.16 (3): 6.68527 50.07 (4): 0.00805 1.35
observe=trait('child').trait('name') -
(1): 16.36513 1.05 (2): 7.72487 27.65 (3): 6.68223 50.04 (4): 0.00738 1.24


On this branch:

Given a subclass of HasTraits with a trait defined, e.g. name = Str(),
compare having a method with no decorator to one that is decorated with
@on_trait_change("name"), one decorated with @observe("name"), and one
with @observe(trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning the trait to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 8.17364 1.00 (2): 0.12738 1.00 (3): 0.00683 1.00
@on_trait_change('name') -
(1): 8.83223 1.08 (2): 0.74051 5.81 (3): 0.00617 0.90
@observe('name') -
(1): 8.87711 1.09 (2): 3.05000 23.94 (3): 0.00700 1.02
@observe(trait('name')) -
(1): 9.21969 1.13 (2): 2.97178 23.33 (3): 0.00632 0.93


Given a subclass of HasTraits with a trait referring to an instance of
HasTraits, e.g. child = Instance(AnotherClass) where AnotherClass has a
name trait, compare having a method with no decorator to one that is
decorated with @on_trait_change("child.name"), one decorated with
@observe("child.name"), and one with
@observe(trait('child').trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 9.48384 1.00 (2): 0.28387 1.00 (3): 0.12588 1.00 (4): 0.00628 1.00
@on_trait_change('child.name') -
(1): 9.63600 1.02 (2): 9.65629 34.02 (3): 2.60447 20.69 (4): 0.00977 1.55
@observe('child.name') -
(1): 9.63945 1.02 (2): 8.05027 28.36 (3): 6.76104 53.71 (4): 0.00859 1.37
@observe(trait('child').trait('name')) -
(1): 10.45232 1.10 (2): 8.45396 29.78 (3): 7.92926 62.99 (4): 0.00750 1.19


Given a subclass of HasTraits with a trait that is defined as Property
that depends on a simple trait, compare having the Property be defined
as Property(), Property(depends_on="name"), Property(observe="name"),
and Property(observe=trait('name')). (Note that this is a feature
currently available on master only).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for changing the trait being depended-on / observed.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.50994 1.00 (2): 0.13294 1.00 (3): 0.00452 1.00
depends_on='name' -
(1): 14.42461 0.99 (2): 6.18670 46.54 (3): 0.00623 1.38
observe='name' -
(1): 15.31489 1.06 (2): 2.93529 22.08 (3): 0.00556 1.23
observe=trait('name') -
(1): 15.02808 1.04 (2): 2.90651 21.86 (3): 0.00618 1.37


Identical to the simple property case only using the @cached_property
decorator on the property's getter method.

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for changing the trait being depended-on / observed.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.41725 1.00 (2): 0.12930 1.00 (3): 0.00509 1.00
depends_on='name' -
(1): 14.60439 1.01 (2): 12.38442 95.78 (3): 0.00686 1.35
observe='name' -
(1): 15.16266 1.05 (2): 2.93375 22.69 (3): 0.00617 1.21
observe=trait('name') -
(1): 15.26573 1.06 (2): 2.85343 22.07 (3): 0.00622 1.22


Given a subclass of HasTraits with a trait that is defined as Property
that depends on an extended trait, compare having the Property be
defined as Property(), Property(depends_on="child.name"),
Property(observe="child.name"), and
Property(observe=trait('child').trait('name')). (Note that this is a
feature currently available on master only).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 15.36892 1.00 (2): 0.27916 1.00 (3): 0.13284 1.00 (4): 0.00653 1.00
depends_on='child.name' -
(1): 15.43856 1.00 (2): 10.26390 36.77 (3): 2.20055 16.57 (4): 0.00879 1.35
observe='child.name' -
(1): 16.02348 1.04 (2): 7.97794 28.58 (3): 6.91317 52.04 (4): 0.00817 1.25
observe=trait('child').trait('name') -
(1): 16.53586 1.08 (2): 7.83010 28.05 (3): 6.83465 51.45 (4): 0.00754 1.15


Identical to the extended property case only using the @cached_property
decorator on the property's getter method.

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 15.75994 1.00 (2): 0.27890 1.00 (3): 0.13005 1.00 (4): 0.00641 1.00
depends_on='child.name' -
(1): 15.57283 0.99 (2): 20.48732 73.46 (3): 4.33380 33.32 (4): 0.00860 1.34
observe='child.name' -
(1): 16.04158 1.02 (2): 8.01820 28.75 (3): 6.89069 52.99 (4): 0.00805 1.26
observe=trait('child').trait('name') -
(1): 16.58640 1.05 (2): 7.91414 28.38 (3): 6.77448 52.09 (4): 0.00772 1.21


@aaronayres35
Copy link
Contributor Author

see #1343 (comment)

I'm copying this comment here to continue the discussion on this PR. I realize the original question I wrote wasn't great. The functools.lru_cache docs say "Decorator to wrap a function with a memoizing callable that saves up to the maxsize most recent calls." So the cache will hold the results of up to 128 distinct calls to the function, i.e. in response to:

Perhaps we could try to work out how many ObserverExpression the cache can hold

I believe it can hold maxsize ObserverExpression objects, or more specifically, the returned results of 128 different text inputs to the parse function.

I'm still unsure what a realistic number for maxsize would be in practice here.
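To make the keying concrete — a small illustration (using a stand-in `fake_parse`, not the real parser) that the cache holds one entry per distinct input string, shared across all callers:

```python
from functools import lru_cache

parse_count = 0


@lru_cache(maxsize=128)
def fake_parse(text):
    # Stand-in for the real parse; this body only runs on a cache miss.
    global parse_count
    parse_count += 1
    return ("expr", text)


# Two different classes observing the same trait name would hit the
# same cache entry, so the string is parsed only once and both get
# the identical cached object back:
first = fake_parse("name")
second = fake_parse("name")
assert first is second
assert parse_count == 1
```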

@kitchoi (Contributor) commented Nov 10, 2020

Thanks. For some reason I thought the limit was measured in memory space, sorry about that. What you said makes sense.
Then we should probably estimate how much memory a full cache is going to take? Even if it results in no action (as in, it does not take a great deal of memory), it would be good to know for future reference. And if it does take up a lot of space, we might want to look into optimizing memory usage.

@aaronayres35 (Contributor Author)

Then we should probably estimate how much memory a full cache is going to take? Even if it should result in no action (as in, it does not take a good deal of memory), it would be good to know for future reference. And if it should take up a lot of space, we might want to look into optimizing memory usage.

@kitchoi an instance of ObserverExpression is 56 bytes (according to sys.getsizeof, note from the docs: "Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.")

$ python
>>> import sys
>>> from traits.observation.expression import ObserverExpression
>>> sys.getsizeof(ObserverExpression())
56

Likewise, a sanity check of calling sys.getsizeof on the value returned by the parse method when running the benchmarking script with n=1 also gives 56 bytes.

As for the full memory usage of the cache, I'm not certain, since I don't know exactly what the cache stores (at minimum it must also store the function's inputs; I believe it keeps a dictionary of the form {input: output}). Based on this Stack Overflow discussion https://stackoverflow.com/questions/23477284/memory-aware-lru-caching-in-python it seems we could get a better estimate using psutil, but I'm not sure it's worth investigating. Another quick check I did was the following:

$ python
>>> import sys
>>> from traits.observation.expression import ObserverExpression
>>> sys.getsizeof({'some.string': ObserverExpression()})
240

@kitchoi (Contributor) commented Nov 10, 2020

I just tried to analyze the number of unique names used in a few ETS-using projects using this script:

AST analyser

import ast
import os


_ON_TRAIT_CHANGE = "on_trait_change"
_OBSERVER = "observe"
_ALL_NAMES = (_ON_TRAIT_CHANGE, _OBSERVER)


class ImportAnalyzer(ast.NodeVisitor):
    """ NodeVisitor for analyzing an AST."""

    def __init__(self, filepath):
        self.filepath = filepath
        self.names = set()

    def visit_Call(self, node):
        """ Handle all calls, which include both decorator and method calls,
        and more...
        """
        if isinstance(node.func, ast.Name):
            # Decorator
            if node.func.id in _ALL_NAMES:
                self._add_if_node_is_constant(node.args[0])

        if isinstance(node.func, ast.Attribute):
            if node.func.attr == _ON_TRAIT_CHANGE:
                # instance method HasTraits.on_trait_change
                self._handle_method_call(node, "name")
            if node.func.attr == _OBSERVER:
                # instance method HasTraits.observe
                self._handle_method_call(node, "expression")

    def _handle_method_call(self, node, keyword_name):
        """ Handle HasTraits.observe or HasTraits.on_trait_change method call.

        Parameters
        ----------
        node : ast.Call
        keyword_name : str
            Name for the keyword argument, i.e. 'name' for on_trait_change,
            'expression' for observe.
        """
        if len(node.args) < 2:
            # Then there may be keyword arguments
            for keyword in node.keywords:
                if keyword.arg == keyword_name:
                    self._add_if_node_is_constant(keyword.value)
                    break
            else:
                # This may be the anytrait support in on_trait_change
                print(
                    f"Ignoring occurrence at "
                    f"{self.filepath}:{node.lineno}:{node.col_offset}"
                )
        else:
            self._add_if_node_is_constant(node.args[1])

    def _add_if_node_is_constant(self, node):
        """ For a given node that represents either the name or the expression
        given to on_trait_change and observe (respectively), extract the string
        used and add it to the names on this object.
        """
        if isinstance(node, ast.Constant):
            self.names.add(node.value)
        elif isinstance(node, ast.List):
            # Overloaded signature
            for item in node.elts:
                self._add_if_node_is_constant(item)
        elif isinstance(node, ast.Call):
            # this is the expression object for observe,
            # e.g. @observe(trait('name'))
            # This will be a dynamically generated value. Give up.
            print(
                f"Ignoring occurrence at "
                f"{self.filepath}:{node.lineno}:{node.col_offset}"
            )
        else:
            # This will be a dynamically generated value. Give up.
            print(
                f"Ignoring occurrence at "
                f"{self.filepath}:{node.lineno}:{node.col_offset}"
            )


def analyze(filepath):
    """ Return all the distinct names given to on_trait_change or observe

    Note that dynamically created names are not captured.

    Parameters
    ----------
    filepath : str
        Path to a python file.

    Returns
    -------
    names : set
        Set of names.
    """
    with open(filepath, "r", encoding="utf-8") as fp:
        content = fp.read()
    try:
        tree = ast.parse(content)
    except SyntaxError:
        print(f"SyntaxError in {filepath}")
        return set()
    analyzer = ImportAnalyzer(filepath)
    analyzer.visit(tree)
    return analyzer.names


def is_python_file(filename):
    """ Return true if the file is a Python file."""
    if filename.endswith('.py'):
        return True

    # Fall back to sniffing the start of the file for a python shebang.
    try:
        with open(filename, 'rb') as f:
            text = f.read(128)
    except IOError:
        return False
    if not text:
        return False
    first_line = text.splitlines()[0]
    return first_line.startswith(b'#!') and b'python' in first_line

def iter_source_paths(path):
    """ Recursively discover Python files in the given path and yield
    the path for each file.

    If the given path is already a Python file, yield the path as is.
    """
    if os.path.isdir(path):
        for dirpath, dirnames, filenames in os.walk(path):
            for filename in filenames:
                full_path = os.path.join(dirpath, filename)
                if is_python_file(full_path):
                    yield full_path

    if os.path.isfile(path) and is_python_file(path):
        yield path


if __name__ == "__main__":
    """ Usage: python ast_analyzer.py path [path, ...] """

    import sys

    names = set()
    for path in sys.argv[1:]:
        for source_file in iter_source_paths(path):
            names |= analyze(source_file)
    print("Got these names", names)
    print("Total number of distinct names: ", len(names))

For example, I ran the above like this (having checked out ETS source code):

python ast_analysis.py traitsui/traitsui pyface/pyface envisage/envisage apptools/apptools mayavi/mayavi chaco/chaco enable/enable

For the ETS-using projects, the largest number of unique names I came across was 140.

But the ETS stack (that is, all of the traitsui, pyface, envisage, apptools and mayavi code) adds another 250 unique names (quite a lot of which could come from unused code, e.g. wx-specific code vs. qt-specific code).

On the other hand, the ObserverExpression objects take up very little space (as @aaronayres35 has reported). Even taking into account the overhead of managing the LRU cache, 2048-100000 objects in the cache add about 10-30MB to memory usage, and probably most of that is the cache overhead (or other system activities).

For the latter, I ran this script and compared the result with and without the @lru_cache(maxsize=None) decorator (I changed maxsize so I get a larger value to compare).

Ad-hoc script to measure memory usage (worst-case scenario: no hits, all misses)
import psutil
from traits.api import HasTraits, Str

class Foo(HasTraits):
    pass


names = ["name{}".format(i) for i in range(100000)]

for name in names:
    Foo.add_class_trait(name, Str())

foo = Foo()

def connect_observer():
    for name in names:
        foo.observe(print, name)


start = psutil.virtual_memory()
connect_observer()
end = psutil.virtual_memory()

# Print usage in MB
print((start.available - end.available) / 1024.0 / 1024.0)

We can probably afford a larger cache? 500? or 1000? (change it to the nearest power-of-two, of course)

@mdickinson (Member)

an instance of ObserverExpression is 56 bytes

Don't be fooled! It's much more than that. As you say, this doesn't take into account any of the attributes - it's just the object structure itself. That 56 bytes maps as something like 32 bytes for GC header, 8 bytes for type pointer, 8 bytes for reference count, 8 bytes for pointer to the __dict__, but it doesn't include the size of the __dict__ itself.

In Python 3.6, even an empty dict uses 240 bytes (though in Python 3.8+ it's smaller):

Python 3.6.12 (default, Sep  6 2020, 12:48:16) 
[GCC 4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getsizeof({})
240

You're likely looking at around 1000 bytes for a simple ObserverExpression. (The way to find out would be to generate millions of "typical" ObserverExpression objects and see how the memory use reported by something like top changes.)

We also need to remember that for a while yet we're still going to have people wanting to use Traits on crazy-old-and-slow systems (ex: 32-bit Windows 7), so we probably want to be fairly conservative with the cache size. maxsize=128 feels about right to me.

@mdickinson (Member)

A quick not-very-scientific test for ObserverExpression size: after doing

>>> from traits.observation.api import parse
>>> exprs = [parse("bob.alice.items") for _ in range(10**6)]

my Python process memory usage (as reported by top) jumps from not very much to approximately 2.9 GB. So that's somewhere in the region of 2000-3000 bytes per not-too-simple, not-too-complicated ObserverExpression instance.

@mdickinson (Member)

This LGTM. I'd suggest doing this in addition to whatever else we do to speed up the parsing, rather than as an alternative. It's a cheap and easy solution that provides an immediate benefit, but the downside is that we get a potentially surprising sudden change in performance characteristics if/when the cache size is maxed out (so we're parsing 129 expressions repeatedly rather than <= 128). This isn't about the particular number "128" - we'll have that behaviour change at some point no matter what the limit is. My guess is that we may not ever actually hit that behaviour change, but if some project does, then it'll be somewhat painful to dig out the cause of the sudden performance regression. So we should still do what we can to speed things up even in the absence of a cache.

On the limit: I'd expect that 128 is plenty big enough given expected usage patterns. Our goal shouldn't be to have every pattern ever parsed in the cache, but just the recently used ones, on the basis that the problematic patterns involve creating lots of objects of the same type, usually in quick succession.
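The performance cliff described above is easy to reproduce with a toy cache (maxsize shrunk to 2 so the effect is visible): cycling through maxsize + 1 distinct patterns evicts on every call, so nothing ever hits and the parser runs every time.

```python
from functools import lru_cache


@lru_cache(maxsize=2)  # tiny cache so eviction is easy to trigger
def parse(text):
    # Stand-in for the real (expensive) parser.
    return ("expr", text)


# Three distinct patterns cycled through a two-entry LRU cache: each
# call evicts the entry that the next call would have needed, so all
# nine calls are misses and zero are hits.
for _ in range(3):
    for pattern in ("a", "b", "c"):
        parse(pattern)

info = parse.cache_info()
assert info.hits == 0
assert info.misses == 9
```

With maxsize + 1 or more patterns in rotation the cache degrades to no caching at all, which is exactly the "sudden change in performance characteristics" concern.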

@kitchoi (Contributor) commented Nov 11, 2020

For future reference, my script in the comment above had a little bug: it uses the full system memory (psutil.virtual_memory) instead of just the memory used by the process (psutil.Process.memory_info). So I was probably seeing influence from other processes on the machine. It is fixed here:

Updated script to measure memory usage
import psutil
from traits.api import HasTraits, Str, Instance

class Foo(HasTraits):
    pass


names = ["name{}".format(i) for i in range(1000000)]

for name in names:
    Foo.add_class_trait(name, Str())
    Foo.add_class_trait(name + "_child", Instance(Foo))

foo = Foo()

def connect_observer():
    for name in names:
        foo.observe(print, name + "_child." + name)


psutil_process = psutil.Process()

start = psutil_process.memory_info()
connect_observer()
end = psutil_process.memory_info()

# Print usage in MB
print((end.vms - start.vms) / 1024.0 / 1024.0)

Trying 1000000 objects gives ~3.6GB of memory usage, which is in order-of-magnitude agreement with @mdickinson's comment above.

@@ -254,6 +254,7 @@ def _handle_tree(tree, default_notifies=None):
tree.children, default_notifies=default_notifies)


@lru_cache(maxsize=128)
Member

One (non-blocking) suggestion: it's probably worth pulling the 128 out into a (private) named constant, and defining that constant near the top of the file with a comment line explaining what it's for. E.g., something like:

#: Maximum number of parsed observation patterns stored in the LRU cache
_OBSERVE_PATTERN_CACHE_MAXSIZE = 128

(but I haven't put any thought into a good name, so feel free to come up with something better)

Member

Hmm. It's really an expression cache, not a pattern cache. Given that it's a private variable and that the observation context should be clear, maybe _EXPRESSION_CACHE_MAXSIZE?

Contributor

I too prefer a module level constant (the truly desperate can change the value if they needed to).

Contributor

I agree with the module-level constant. Unfortunately, since the cache is created via a decorator at import time (I think), there is no way to change it at run time by changing the constant; you'd need to patch the source.
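That is indeed how lru_cache behaves: the maxsize value is read once, when the decorator runs at import time, so rebinding a module constant afterwards cannot resize the existing cache. A sketch (the constant name is the hypothetical one suggested above):

```python
from functools import lru_cache

# Hypothetical module-level constant, as suggested in the review.
_EXPRESSION_CACHE_MAXSIZE = 128


@lru_cache(maxsize=_EXPRESSION_CACHE_MAXSIZE)  # value captured here, once
def parse(text):
    return ("expr", text)


# Rebinding the constant later has no effect on the already-built cache:
_EXPRESSION_CACHE_MAXSIZE = 256
assert parse.cache_info().maxsize == 128
```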

Contributor

there is no way to change it at run time by changing the constant - you'd need to patch the source.

Ah yes you are right. Nevermind that then, let's leave that to the truly desperate.

Contributor Author

yikes, I made this change locally but forgot to push... I will open a trivial separate PR adding the module level constant

@kitchoi (Contributor) commented Nov 11, 2020

Trying 1000000 objects gives ~3.6GB of memory usage, which is in order-of-magnitude agreement with @mdickinson's comment above.

Oops, I made a mistake there, yet again. Sorry about that. I should compare the difference with and without the @lru_cache, otherwise the number includes a whole bunch of objects that would be there anyway even without the @lru_cache.

The difference for 1000000 distinct expressions between with and without @lru_cache is more like 500MB, not 3GB.

This information was evaluated on Python 3.9, OSX 10.15.

(Just wanted to correct the data for the future reference.)

Co-authored-by: Mark Dickinson <mdickinson@enthought.com>
@aaronayres35 (Contributor Author)

This LGTM. I'd suggest doing this in addition to whatever else we do to speed up the parsing, rather than as an alternative.

My only question here is with regard to #1343. With that PR, we parse at decoration time rather than at instantiation time. That led to a speed-up in instantiation time comparable to this PR's, but a corresponding slowdown in construction time. In theory, every class that is imported is instantiated at least once, so that should still be a net benefit. However, in that scenario I imagine the cache would have a very negligible impact, since the only time the cache would be used is when two different classes happen to have observed traits of the same name (i.e. the same string being passed to the parse function).

I guess, as I write this comment, I don't really see a cost to doing both, so it makes sense to do them in addition to each other. Either way, both solutions are just trying to decrease the total number of times we parse, and neither makes the other parse more.

@mdickinson (Member) commented Nov 11, 2020

I think it's worth doing both. I'm not too worried about the slowdown in construction time.

Do you have measurements indicating what kind of time we're talking about here? For example, on a typical application codebase, what sort of total impact on application startup time would you expect? I'd expect it to be on the order of a fraction of a second, but if it's more than that then we might need to dig deeper. Parsing these expressions really shouldn't be that big of a deal.

It's possible that the Lark package is giving us a really inefficient parser - I've never looked at the generated code. But we'd need to be sure that that's the case before doing anything drastic like writing our own parser, or using a different parser generator.

@aaronayres35 (Contributor Author) commented Nov 11, 2020

I think it's worth doing both. I'm not too worried about the slowdown in construction time.

Do you have measurements indicating what kind of time we're talking about here? For example, on a typical application codebase, what sort of total impact on application startup time would you expect? I'd expect it to be on the order of a fraction of a second, but if it's more than that then we might need to dig deeper. Parsing these expressions really shouldn't be that big of a deal.

Currently the only measurements I have are the results of the benchmarking script (see the PR description on #1343). I can try it with an application though to see the impact on application startup time.

@kitchoi (Contributor) commented Nov 11, 2020

I don't know if this would be useful... I updated the first script in this comment to count the number of static decorators, static Property depends_on usages, and dynamic on_trait_change/observe calls, and ran it against a project building end-user applications plus its library stack.

This is the script I currently have:

AST analyzer
import ast
import os


_ON_TRAIT_CHANGE = "on_trait_change"
_OBSERVER = "observe"
_ALL_NAMES = (_ON_TRAIT_CHANGE, _OBSERVER)


class CodeAnalyzer(ast.NodeVisitor):
    """ NodeVisitor for analyzing an AST.

    Parameters
    ----------
    filepath : str
        Path to the Python file, for reporting purposes.
    type_ : str
        One of 'decorator', 'property', 'dynamic'
        decorator for @on_trait_change static usage.
        property for Property(..., depends_on=...) static usage
        dynamic for HasTraits.on_trait_change dynamic usage

    Attributes
    ----------
    unique_names : set
        Set of unique names encountered, e.g. 'child_items,child'
    n_occurrences : int
        Number of all occurrences captured.
    n_ignores : int
        Number of occurrences where the name cannot be computed because
        it is dynamically composed.
    """

    def __init__(self, filepath, type_):
        self.filepath = filepath
        self.type = type_
        self.unique_names = set()
        self.n_occurrences = 0
        self.n_ignores = 0

    def visit_Call(self, node):
        """ Handle all calls, which include both decorator and method calls,
        and more...
        """
        if self.type == "decorator" and isinstance(node.func, ast.Name):
            # Decorator
            if node.func.id in _ALL_NAMES:
                self._add_if_node_is_constant(node.args[0])

        if self.type == "property" and isinstance(node.func, ast.Name):
            # Property depends_on
            if node.func.id == "Property":
                for keyword in node.keywords:
                    if keyword.arg == "depends_on":
                        self._add_if_node_is_constant(keyword.value)
                        break
                    if keyword.arg == "observe":
                        self._add_if_node_is_constant(keyword.value)
                        break

        if self.type == "dynamic" and isinstance(node.func, ast.Attribute):
            if node.func.attr == _ON_TRAIT_CHANGE:
                # instance method HasTraits.on_trait_change
                self._handle_method_call(node, "name")
            if node.func.attr == _OBSERVER:
                # instance method HasTraits.observe
                self._handle_method_call(node, "expression")

        # Keep descending, so that calls nested inside other calls
        # (e.g. a Property(...) inside another call) are visited too.
        self.generic_visit(node)

    def _handle_method_call(self, node, keyword_name):
        """ Handle HasTraits.observe or HasTraits.on_trait_change method call.

        Parameters
        ----------
        node : ast.Call
        keyword_name : str
            Name for the keyword argument, i.e. 'name' for on_trait_change,
            'expression' for observe.
        """
        if len(node.args) < 2:
            # Then there may be keyword arguments
            for keyword in node.keywords:
                if keyword.arg == keyword_name:
                    self._add_if_node_is_constant(keyword.value)
                    break
            else:
                # This may be the anytrait support in on_trait_change
                self.n_occurrences += 1
                self.n_ignores += 1
                print(
                    f"Ignoring occurrence at "
                    f"{self.filepath}:{node.lineno}:{node.col_offset}"
                )
        else:
            self._add_if_node_is_constant(node.args[1])

    def _add_if_node_is_constant(self, node):
        """ For a given node that represents either the name or the expression
        given to on_trait_change and observe (respectively), extract the string
        used and add it to the names on this object.
        """
        self.n_occurrences += 1
        if isinstance(node, ast.Constant):
            self.unique_names.add(node.value)
        elif isinstance(node, ast.List):
            # Overloaded signature
            for item in node.elts:
                self._add_if_node_is_constant(item)
        elif isinstance(node, ast.Call):
            # this is the expression object for observe,
            # e.g. @observe(trait('name'))
            # This will be a dynamically generated value. Give up.
            self.n_ignores += 1
            print(
                f"Ignoring occurrence at "
                f"{self.filepath}:{node.lineno}:{node.col_offset}"
            )
        else:
            # This will be a dynamically generated value. Give up.
            self.n_ignores += 1
            print(
                f"Ignoring occurrence at "
                f"{self.filepath}:{node.lineno}:{node.col_offset}"
            )


def analyze(filepath, type_):
    """ Return all the distinct names given to on_trait_change or observe

    Note that dynamically created names are not captured.

    Parameters
    ----------
    filepath : str
        Path to a python file.
    type_ : str
        One of 'decorator', 'property' or 'dynamic':
        'decorator' for @on_trait_change static usage,
        'property' for Property(..., depends_on=...) static usage,
        'dynamic' for HasTraits.on_trait_change dynamic usage.

    Returns
    -------
    analyzer : CodeAnalyzer
        The analyzer, holding the unique names, the occurrence count and
        the ignore count.
    """
    with open(filepath, "r", encoding="utf-8") as fp:
        content = fp.read()
    analyzer = CodeAnalyzer(filepath, type_)
    try:
        tree = ast.parse(content)
    except SyntaxError:
        print(f"SyntaxError in {filepath}")
        # Return the empty analyzer so callers can still read its attributes.
        return analyzer
    analyzer.visit(tree)
    return analyzer


def is_python_file(filename):
    """ Return true if the file looks like a Python file."""
    if filename.endswith('.py'):
        return True

    try:
        with open(filename, 'rb') as f:
            text = f.read(128)
            if not text:
                return False
    except IOError:
        return False

    # Extensionless files: treat a python shebang as a Python script.
    first_line = text.splitlines()[0]
    return first_line.startswith(b'#!') and b'python' in first_line


def iter_source_paths(path):
    """ Recursively discover Python files in the given path and yield
    the path for each file.

    If the given path is already a Python file, yield the path as is.
    """
    if os.path.isdir(path):
        for dirpath, dirnames, filenames in os.walk(path):
            for filename in filenames:
                full_path = os.path.join(dirpath, filename)
                if is_python_file(full_path):
                    yield full_path

    if os.path.isfile(path) and is_python_file(path):
        yield path


if __name__ == "__main__":
    """ Usage: python ast_analyzer.py path [path, ...] """

    import sys

    names = set()
    n_property = 0
    n_decorator = 0
    n_method = 0
    n_total = 0
    for path in sys.argv[1:]:
        for source_file in iter_source_paths(path):

            analyzer = analyze(source_file, "property")
            names |= analyzer.unique_names
            n_property += analyzer.n_occurrences
            n_total += analyzer.n_occurrences

            analyzer = analyze(source_file, "decorator")
            names |= analyzer.unique_names
            n_decorator += analyzer.n_occurrences
            n_total += analyzer.n_occurrences

            analyzer = analyze(source_file, "dynamic")
            names |= analyzer.unique_names
            n_method += analyzer.n_occurrences
            n_total += analyzer.n_occurrences

    print("Got these names", names)
    print("Total number of distinct names: ", len(names))
    print("Total number of Property depends_on/observe: ", n_property)
    print("Total number of decorator: ", n_decorator)
    print("Total number of dynamic on_trait_change/observe: ", n_method)
    print("Total number of occurrences: ", n_total)

This is how I ran it:

python ast_analyzer.py <project-source-dir> <another-library-source-dir> traitsui/traitsui pyface/pyface envisage/envisage apptools/apptools  mayavi/mayavi/

And this is the result:

Total number of distinct names:  514
Total number of Property depends_on/observe:  261
Total number of decorator:  288
Total number of dynamic on_trait_change/observe:  664
Total number of occurrences:  1213

Perhaps these numbers could be scaled against the benchmark timings to estimate the overall impact?
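A back-of-the-envelope version of that scaling, using the master-branch (2) figures from the PR description and the occurrence count from the AST script — a sketch only, assuming each occurrence triggers exactly one parse:

```python
# Benchmark figures from the PR description, master branch, n = 100000:
observe_total = 6.65963   # (2) instantiation total with @observe('name')
control_total = 0.14315   # (2) undecorated control
n = 100_000

per_parse = (observe_total - control_total) / n   # extra cost per instantiation
occurrences = 1213   # total occurrences counted by the AST script

estimated = per_parse * occurrences
print(f"~{per_parse * 1e6:.0f}us per parse, ~{estimated * 1e3:.0f}ms "
      f"if each occurrence parses once")
```

In reality each decorated expression is re-parsed on every instantiation, so the true cost scales with instance counts; that is exactly why caching the parse pays off in applications that create many objects.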

@mdickinson
Member

We should really move this discussion to the other PR. (And let's merge this one!)
