Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Use a dict (as an ordered set) instead of a list for menus registry #74

Merged
merged 1 commit into from
Nov 9, 2022

Conversation

aganders3
Copy link
Contributor

@aganders3 aganders3 commented Nov 5, 2022

This changes the MenusRegistry class to use a dict in place of a list for the internal storage of menu items. The benefit here is when disposing it does not have to repeatedly call list.remove, which performs a linear search each time. Instead the dict acts as an ordered set and removing items takes constant time.

This is probably not an important optimization in general usage, but I believe it fixes a significant regression in napari test times. I think it's due to this autouse fixture introduced a few weeks ago: https://github.com/napari/napari/blob/1cc90a74211c6cbf3fc644da5866f9752cf1ac78/napari/conftest.py#L448
There may be other ways to fix/workaround it (maybe the fixture just needs better scoping?) so this PR is just meant to start a discussion.

Here's a chunk of some profiling output that pointed me to this bottleneck:
Screen Shot 2022-11-04 at 8 31 11 PM

Here's a link to a successful CI run using this change in my napari fork. CI times are somewhere around 25-30% (~4 min) faster for most platforms (macOS is still ridiculously slow for some reason):
https://github.com/aganders3/napari/actions/runs/3397096945

I saw similar results running tests locally with test time decreasing significantly:

% /usr/bin/time pytest --maxfail=0  # test with released app-model
...
      217.81 real       544.61 user        75.52 sys

% pip install -e ~/src/app-model  # install app-model from this branch
% /usr/bin/time pytest --maxfail=0  # test with app-model from this branch
...
      134.89 real       370.35 user        70.27 sys

@tlambert03 tlambert03 changed the title Use a dict (as an ordered set) instead of a list for menus registry refactor: Use a dict (as an ordered set) instead of a list for menus registry Nov 7, 2022
@tlambert03
Copy link
Member

This is probably not an important optimization in general usage, but I believe it fixes a significant regression in napari test times... There may be other ways to fix/workaround it (maybe the fixture just needs better scoping?) so this PR is just meant to start a discussion.

Thanks for opening the PR @aganders3

I would like to have two parallel discussions about it:

  • one question whether this is really the best way to model the menu registry. I implemented this as a sequence rather than a set, because the vscode menu registry that was the inspiration for this pattern also allows for multiple registrations. At first glance, I find it hard to think of a reason why we would ever need to register the same MenuOrSubmenu twice (so I'm inclined to merge this PR), but I'd us to consider what is lost in the process of making it set-like before doing it just for the sake of napari's performance.

  • the second is to emphasize that napari should use app-model as it wants, but that development of app-model itself should be considered independently of napari (i.e. app-model doesn't exist purely for napari). So, I definitely do think that there are things that could be done with that fixture to improve test performance in napari. That auto-use fixture was a quick-and-dirty approach. You could consider either just not auto-using it (which would probably be the best thing to do), and also consider mocking/patching any of the expensive cleanup that you guys don't care about for the average test case. In other words, take care of napari performance on the napari side, and if you find something that is truly in error in app-model, fix it here.

@aganders3
Copy link
Contributor Author

aganders3 commented Nov 8, 2022

Great points - thank you for the guidance!

Point 1 is good context - I didn't even consider multiple registrations. I will defer to you on whether this is something you want to handle or if this performance benefit is worth the tradeoff (maybe there's some way to have both).

This also prompted a bit more reflection on my implementation here and I see some issues/errors. First, I think I missed at least one type hint. Also, while dictionaries preserve insertion order this is not always intuitive. For example even if we disallow multiple registrations we would want to decide the proper order (i.e. preserve position by the earliest or latest insertion).

I also think it's a little ugly to implement the ordered set this way (dict with None for all values) so perhaps a small OrderedSet class would be worthwhile.

For point 2 I will open an issue in the napari repo for further discussion on the test suite performance.

@aganders3
Copy link
Contributor Author

Here's another idea using a doubly-linked list that maintains the performance benefits while allowing multiple insertions. I'm going to convert the PR to draft for now pending further discussion.

diff --git a/src/app_model/registries/_menus_reg.py b/src/app_model/registries/_menus_reg.py
index 3386615..0957ba0 100644
--- a/src/app_model/registries/_menus_reg.py
+++ b/src/app_model/registries/_menus_reg.py
@@ -1,5 +1,6 @@
 from __future__ import annotations
 
+from dataclasses import dataclass
 from typing import (
     Any,
     Callable,
@@ -22,6 +23,47 @@ from ..types._constants import DisposeCallable
 MenuId = str
 
 
+@dataclass
+class _MenusRegistryNode:
+    """Doubly-linked node containing a reference to a MenuOrSubmenu item."""
+
+    item: Optional[MenuOrSubmenu] = None
+    prv: Optional[_MenusRegistryNode] = None
+    nxt: Optional[_MenusRegistryNode] = None
+
+    def _remove(self) -> None:
+        """Remove this node from its list."""
+        if self.prv and self.nxt:
+            self.prv.nxt, self.nxt.prv = self.nxt, self.prv
+        else:
+            raise LookupError("node is not in a list")
+
+
+class _MenusRegistryDLL:
+    """Doubly-linked list of nodes referencing MenuOrSubmenu items."""
+
+    def __init__(self) -> None:
+        # head and tail are dummy nodes
+        self._head = _MenusRegistryNode()
+        self._tail = _MenusRegistryNode()
+        self._head.nxt, self._tail.prv = self._tail, self._head
+
+    def append(self, item: MenuOrSubmenu) -> _MenusRegistryNode:
+        """Append a new item to the end of the list, returning the newly created node."""
+        new = _MenusRegistryNode(item, self._tail.prv, self._tail)
+        if new.prv:
+            new.prv.nxt = new
+        self._tail.prv = new
+        return new
+
+    def __iter__(self) -> Iterator[MenuOrSubmenu]:
+        # skip over the head, yield until we hit the tail (node with None item)
+        cur = self._head.nxt
+        while cur and cur.nxt and cur.item:
+            yield cur.item
+            cur = cur.nxt
+
+
 class MenusRegistry:
     """Registry for menu and submenu items."""
 
@@ -29,7 +71,7 @@ class MenusRegistry:
     menus_changed = Signal(set)
 
     def __init__(self) -> None:
-        self._menu_items: Dict[MenuId, List[MenuOrSubmenu]] = {}
+        self._menu_items: Dict[MenuId, _MenusRegistryDLL] = {}
 
     def append_menu_items(
         self, items: Sequence[Tuple[MenuId, MenuOrSubmenu]]
@@ -49,15 +91,12 @@ class MenusRegistry:
         changed_ids: Set[str] = set()
         disposers: List[Callable[[], None]] = []
         for menu_id, item in items:
-            item = MenuItem.validate(item)  # type: ignore
-            menu_list = self._menu_items.setdefault(menu_id, [])
-            menu_list.append(item)
             changed_ids.add(menu_id)
+            item = MenuItem.validate(item)  # type: ignore
+            menu_list = self._menu_items.setdefault(menu_id, _MenusRegistryDLL())
+            item_node = menu_list.append(item)
 
-            def _remove(lst: list = menu_list, _item: Any = item) -> None:
-                lst.remove(_item)
-
-            disposers.append(_remove)
+            disposers.append(item_node._remove)
 
         def _dispose() -> None:
             for disposer in disposers:
@@ -83,7 +122,7 @@ class MenusRegistry:
     def get_menu(self, menu_id: MenuId) -> List[MenuOrSubmenu]:
         """Return menu items for `menu_id`."""
         # using method rather than __getitem__ so that subclasses can use arguments
-        return self._menu_items[menu_id]
+        return list(self._menu_items[menu_id])
 
     def __repr__(self) -> str:
         name = self.__class__.__name__

@aganders3 aganders3 marked this pull request as draft November 8, 2022 19:14
@tlambert03
Copy link
Member

hmm, while I admire the doubly-linked list implementation :) I think it's a bit "heavy" for this here. that is, python already has enough containers, and I'm not sure this relatively inconsequential part of the code really merits a whole new container object that someone should familiarize themselves with.

I think I prefer the dict-as-ordered-set approach better. To respond to some of your earlier comments/concerns: I think it's a common enough pattern that it's intuitive "enough" (at least as much as the doubly-linked list), and I'm not yet too concerned about the rare case of losing insertion order (which should only happen in the case of re-registration?). Also, don't forget that the order of appearance in a menu in the app is determined by _MenuItemBase.order anyway... so order of registration probably isn't "mission critical" here.

Basically, I think your original changes in this PR are probably the most intuitive and straight-forward, and i think my original sequence-based implementation was probably a bit "YAGNI".

I'm happy to merge it as is once you're ready

@aganders3 aganders3 marked this pull request as ready for review November 9, 2022 01:19
@aganders3
Copy link
Contributor Author

Yeah I kind of figured the linked-list was over-baked, haha. Writing linked lists in Python is always fun though.

I'm obviously new to the project, so if this is good enough for you that works for me. Just one question because I actually had missed this part:

Also, don't forget that the order of appearance in a menu in the app is determined by _MenuItemBase.order anyway... so order of registration probably isn't "mission critical" here.

Does the order matter at all then? Could this just be a normal set?

Thanks for all the review!

@tlambert03
Copy link
Member

Does the order matter at all then? Could this just be a normal set?

good question... lemme take a look

@tlambert03
Copy link
Member

ok, I took a look... technically, it shouldn't matter, but as a "fallback" ordering it's nice to maintain insert order.
I did find that a couple tests do fail intermittently if you use a set and run the tests multiple times (tests/test_app.py::test_sorting and tests/test_qt/test_qmenu.py::test_menu).

The sorting of menu items is defined here:

https://github.com/napari/app-model/blob/194385eb4dd378a8219c8d1039e3d45c6d04958f/src/app_model/registries/_menus_reg.py#L141-L152

and basically, if a menu item or submenu item don't declare a group or order ... then it will fallback to whatever the insert order was. So, I think using a dict does remove one more possible confusing thing (I can imagine an app not using either order or group and wondering why their menus are ordered differently each time).

so let's go with the None-valued dict!

@tlambert03 tlambert03 merged commit 4c20668 into pyapp-kit:main Nov 9, 2022
@aganders3 aganders3 deleted the dictmenu branch November 9, 2022 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

None yet

2 participants