Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cache project-crumbs #18

Closed
roife opened this issue Sep 30, 2023 · 25 comments
Closed

Add cache project-crumbs #18

roife opened this issue Sep 30, 2023 · 25 comments

Comments

@roife
Copy link
Contributor

roife commented Sep 30, 2023

I noticed that the breadcrumb--project-crumbs-1 function is being called frequently, which leading to performance overhead. However, the project-crumbs does not change frequently, so how about caching it to optimize the performance?

@joaotavora
Copy link
Owner

joaotavora commented Sep 30, 2023 via email

@roife
Copy link
Contributor Author

roife commented Oct 5, 2023

Well, I opened org.el, the built-in file of Emacs. Then I moved the cursor around and read the code. Below is the performance analysis report I obtained using profiler-report:

        1333  47% - redisplay_internal (C function)
        1222  43%  - eval
        1222  43%   - if
        1157  41%    - breadcrumb--header-line
        1153  40%     - mapcar
        1153  40%      - funcall
        1137  40%       - breadcrumb-project-crumbs
        1127  39%        - breadcrumb--project-crumbs-1
        1072  38%         - project-current
        1072  38%          - project--find-in-directory
        1072  38%           - run-hook-with-args-until-success
        1072  38%            - project-try-vc
         762  27%             - locate-dominating-file
         739  26%                #<compiled 0x15c08eb5e5ecdc67>
          16   0%                abbreviate-file-name
         265   9%             + project--value-in-dir
          25   0%               vc-file-getprop
           4   0%             + mapcar
           3   0%             + mapconcat
          44   1%         - file-relative-name
          11   0%            file-remote-p
           4   0%            #<compiled 0xa5e3fa2baf7af>
           4   0%         - file-name-base
           4   0%          - file-name-sans-extension
           4   0%             file-name-sans-versions
           3   0%           breadcrumb--format-project-node
           7   0%          breadcrumb--summarize
          16   0%       + breadcrumb-imenu-crumbs
           4   0%     + cl-remove-if
          61   2%    + +mode-line-active
          80   2%  + jit-lock-function
          19   0%  + tab-bar-make-keymap
          ... (misc, omitted)

@joaotavora
Copy link
Owner

Can't reproduce:

path/to/emacs -Q -l path/to/breadcrumb.el -f breadcrumb-mode path/to/emacs/org/org.el -f profiler-start

Now, select "cpu", scroll down to line 300 more or less, then

M-x profiler-stop RET
M-x profiler-report RET

I get this:

        1394  49% - redisplay_internal (C function)
         455  16%  - eval
         441  15%   - breadcrumb--header-line
         441  15%    - let
         438  15%     - cl-remove-if
         429  15%      - mapcar
         429  15%       - funcall
         274   9%        - breadcrumb-project-crumbs
         269   9%         - breadcrumb--summarize
         240   8%          - if
         240   8%           - breadcrumb--project-crumbs-1
         240   8%            - catch
         240   8%             - let*
         105   3%              + file-relative-name
          81   2%              + while
          29   1%              - project-current
          26   0%               - project--find-in-directory
          26   0%                - run-hook-with-args-until-success
          26   0%                 - project-try-vc
          26   0%                  - vc-file-getprop
          26   0%                     expand-file-name
          13   0%                split-string
           5   0%              + throw
           3   0%              + if
          22   0%          - let
          22   0%           + let*
           7   0%            breadcrumb--length
         155   5%        - breadcrumb-imenu-crumbs
         155   5%         - let*
         151   5%          + if
           4   0%          + and
           9   0%      - apply
           9   0%         cl-remove
           3   0%       mapconcat
          10   0%   + unless
           2   0%   + mode-line-eol-desc
           2   0%   + if
          56   1%  + jit-lock-function
          10   0%  + file-remote-p
           3   0%  + mode-line-default-help-echo
           2   0%    redisplay--pre-redisplay-functions
         730  25%   Automatic GC
         368  12% - command-execute
         365  12%  - call-interactively
         261   9%   - funcall-interactively
         257   9%    - next-line
         252   8%     - line-move
         125   4%      + line-move-visual
          95   3%        line-pixel-height
          18   0%      + line-move-partial
           4   0%      + truncated-partial-width-window-p
           2   0%      + default-line-height
           4   0%    + execute-extended-command
          95   3%   + byte-code
         315  11% - timer-event-handler
         315  11%  - apply
         311  10%   - #<lambda 0x195111a913d285>
         311  10%    - if
         311  10%     - progn
         311  10%      - save-current-buffer
         311  10%       + let
           4   0%   - #<compiled -0x13ed6d2b6d5f54fe>
           4   0%    + eldoc-print-current-symbol-info
          11   0% - jit-lock--antiblink-post-command
           4   0%  + syntax--lbp
           2   0%  + syntax-ppss
           8   0% + ...
           6   0% - undo-auto--add-boundary
           6   0%  + undo-auto--boundaries
           4   0%   eldoc-pre-command-refresh-echo-area
           4   0% - clear-minibuffer-message
           4   0%    timerp

@joaotavora
Copy link
Owner

This was with Emacs 30.

I can reproduce your problem with Emacs 28 though.

It seems that your project.el doesn't have a caching strategy. You can get a newer version of project.el from GNU ELPA on Emacs 28, though by invoking M-x package-list-packages and navigating that menu. It's not easy (M-x package-install will notably not work, because reasons) so I'll ping @dgutov, project.el's maintainer for some easier tips on how to upgrade project.el.

I think you have to restart to ensure the newer project.el is loaded after you do this.

Anyway, when you do upgrade, the problem goes away. No need to duplicate the caching in breadcrumb. Let's keep it lower down where it can be useful to many other packages.

@roife
Copy link
Contributor Author

roife commented Oct 7, 2023

Thank you for your contribution! I will consider upgrading my Emacs version first.

@alastairdb
Copy link

A bit more information:

  • Emacs version: 29.2
  • Breadcrumb version: 1.0.1 (dcb6e2e)
  • Project version: 0.10.0

Following the profile report above:

> emacs -Q -l breadcrumb.el -f breadcrumb-mode /nix/store/wq7b9287kw2403mk52qllcml6jqz8rbk-emacs-gtk3-29.1/share/emacs/29.1/lisp/jka-compr.el.gz -f profiler-start

The file being edited is problematic because one of the directores above it in the directory tree (/nix/store) is very large:

> ls -1 /nix/store | wc -l
67284

        7546  94% - command-execute
        7488  93%  - funcall-interactively
        7475  93%   - mwheel-scroll
        7419  92%    - scroll-up
        7419  92%     - eval
        7419  92%      - breadcrumb--header-line
        7419  92%       - let
        7419  92%        - cl-remove-if
        7419  92%         - mapcar
        7419  92%          - funcall
        7415  92%           - breadcrumb-project-crumbs
        7415  92%            - breadcrumb--summarize
        7411  92%             - if
        7411  92%              - breadcrumb--project-crumbs-1
        7411  92%               - catch
        7411  92%                - let*
        7391  92%                 - project-current
        7391  92%                  - project--find-in-directory
        7391  92%                   - project-try-vc
        7347  91%                    - locate-dominating-file
        7347  91%                       #<compiled 0x15c08eb5e7da5c67>
          44   0%                    + project--value-in-dir
          16   0%                 + while
           4   0%             + let
           4   0%           + breadcrumb-imenu-crumbs
           4   0%      mouse-wheel--get-scroll-window
           4   0%    - run-with-timer
           4   0%       run-at-time
          13   0%   - execute-extended-command
          13   0%    - command-execute
          13   0%       funcall-interactively
          58   0%  + byte-code
         410   5% - ...
         410   5%    Automatic GC
          31   0% + normal-top-level
          24   0% + timer-event-handler
           7   0% + jit-lock-function

My current fix is to advise the function project--find-in-directory so that it ignores all directories starting with /nix/store

@joaotavora
Copy link
Owner

Project version: 0.10.0

In that case, you do seem to be running the latest project.el ,so I don't understand this part:

 7391  92%                   - project-try-vc
        7347  91%                    - locate-dominating-file

Because when project.el is doing the caching described in eariler comments, you should be seeing something like:

  26   0%                 - project-try-vc
          26   0%                  - vc-file-getprop

In any case, this seems much more like a project.el issue than a breadcrumb.el one, so I'll refer you to @dgutov or to the Emacs bug tracker.

@dgutov
Copy link
Collaborator

dgutov commented Mar 4, 2024

@alastairdb Could you try this patch?

diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index aa92a73336e..2e1b4d96e4a 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -556,7 +556,7 @@ project-try-vc
                  ;; Maybe limit count to 100 when we can drop Emacs < 28.
                  (setq last-matches
                        (condition-case nil
-                           (directory-files d nil marker-re t)
+                           (directory-files d nil marker-re t 100)
                          (file-missing nil))))))
              (backend
               (cl-find-if

It requires Emacs 28+, that's why I originally dropped this optimization. But it would be useful to know whether it really helps in your case, or if some other solution is needed.

@alastairdb
Copy link

alastairdb commented Mar 5, 2024

Yes, thanks a lot. The code is still spending a lot of time in locate-dominating-file, but a lot less. And scrolling is smooth.

Some other figures:

  • The file I am scrolling through is ca 500 lines long.
  • It takes me about 20 turns of the mouse scroll wheel to get from top to bottom
  • During this time the breadcrumb header is recalculated 200 times
  • Each time there is a search for a project that is never found

It seems that when a project is encountered in a directory, this is recorded via vc-file-setprop. I wondered why the non-existence of a project in a directory is not recorded. Then the search for a project would only happen once while scrolling, not 200 times.

        4865  93% - command-execute
        4849  93%  - byte-code
        4849  93%   - read-extended-command
        4849  93%    - read-extended-command-1
        4801  92%     - completing-read-default
        4744  91%      - redisplay_internal (C function)
        4744  91%       - eval
        4744  91%        - breadcrumb--header-line
        4744  91%         - let
        4744  91%          - cl-remove-if
        4744  91%           - mapcar
        4744  91%            - funcall
        4744  91%             - breadcrumb-project-crumbs
        4744  91%              - progn
        4744  91%               - breadcrumb--summarize
        4744  91%                - if
        4744  91%                 - breadcrumb--project-crumbs-1
        4744  91%                  - catch
        4744  91%                   - let*
        4740  91%                    - project-current
        4740  91%                     - project--find-in-directory
        4740  91%                      - project-try-vc
        4689  90%                       - locate-dominating-file
        4689  90%                          #<compiled 0x15c08eb5e7da5c67>
          41   0%                       + project--value-in-dir
           4   0%                         #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_23>
          27   0%      + command-execute
           4   0%      + undo-auto--undoable-change
          16   0%  - funcall-interactively
          12   0%   - execute-extended-command
          12   0%    - command-execute
          12   0%       funcall-interactively
           4   0%     scroll-up-command
         241   4% - ...
         241   4%    Automatic GC
          28   0% + normal-top-level
          26   0% + redisplay_internal (C function)
          16   0% + timer-event-handler

@dgutov
Copy link
Collaborator

dgutov commented Mar 7, 2024

The code is still spending a lot of time in locate-dominating-file, but a lot less. And scrolling is smooth.

That's great!

I wondered why the non-existence of a project in a directory is not recorded. Then the search for a project would only happen once while scrolling, not 200 times.

Basically, because we don't have a solid cache invalidation strategy yet. Nobody notices that because removing project markers (deleting a repo, etc) is a very rare even, whereas creating one is something everybody does on an occasion (e.g. with git init), so caching that a file belongs to no project means that whenever somebody creates a new project marker in a directory with some existing visited buffers, they'll have to restart Emacs. Improving that is TBD.

For now, I will check in (EDIT: a version of) that patch (which should take care of the most egregious slowdown) and perhaps recommend that the clients of project.el sensitive to such low latency do some caching on their own. Even caching the current project for the buffer for 1 second should make this overhead go away, if the scrolling seems smooth enough for you already.

@dgutov
Copy link
Collaborator

dgutov commented Mar 7, 2024

@alastairdb

Hmm, could you verify, though, that the patch really makes a difference?

That is, that the call (directory-files d nil marker-re t 100) takes less time than (directory-files d nil marker-re t), when d is that large directory you were referring to.

The reason I felt safe to skip this limit originally, is because the call is supposed to only return the files whose names match marker-re. Having a lot of them (> 100?!) should be a very rare situation. Does /nix/store have a lot of files with matching names?

@alastairdb
Copy link

@dgutov I think you are correct. marker-re does not match any files in that big directory. And the following experiment seems to show that adding the count does not help:

ELISP> (setq test-file "/nix/store/wq7b9287kw2403mk52qllcml6jqz8rbk-emacs-gtk3-29.1/share/emacs/29.1/lisp/jka-compr.el.gz")
"/nix/store/wq7b9287kw2403mk52qllcml6jqz8rbk-emacs-gtk3-29.1/share/emacs/29.1/lisp/jka-compr.el.gz"
ELISP> (with-current-buffer (find-file-noselect test-file)
         (let (result)
           (dolist (fun (list #'project-try-vc  #'project-try-vc-patched))
             (setq project-find-functions (list fun))
             (push (cons fun (benchmark-elapse
                                 (dotimes (i 200)
                                   (project-current)))) result))
           result))
((project-try-vc-patched . 47.261065855)
 (project-try-vc . 48.136315776))

@dgutov
Copy link
Collaborator

dgutov commented Mar 7, 2024

But you said that scrolling is smooth now? Did that change due to some other reason, or did you miss the failing scenario last time?

If you simply counted the number of calls (but they are in practice fast enough), then I think I'll just recommend the second step: that breadcrump adds a 1-second cache for the current project's value.

@joaotavora
Copy link
Owner

then I think I'll just recommend the second step: that breadcrump adds a 1-second cache for the current project's value.

Why doesn't project add this?

@alastairdb
Copy link

But you said that scrolling is smooth now? Did that change due to some other reason, or did you miss the failing scenario last time?

Yes. I was a bit too hasty in my assessment. If you scroll in lots of small increments the delay is not so noticeable. That's why I ended up calling project-current 200 times in my last message. This does not exactly match the manual scrolling behaviour but it is close to it and is a more reliable yardstick than scrolling by hand

@dgutov
Copy link
Collaborator

dgutov commented Mar 7, 2024

@joaotavora

Why doesn't project add this?

Like I said: Improving that is TBD.

But an optimal caching strategy can depend on the application, so it's not a problem when an application does some caching.

@joaotavora
Copy link
Owner

It's project.el who knows how much work it takes to request a project, and it's project.el who knows about it's limitations, so it should be project.el to know do whatever (optimal/suboptimal) caching it can. This seems clear to me. Then remove it with the TBD happens.

@dgutov
Copy link
Collaborator

dgutov commented Mar 7, 2024

A solution in project.el should be application-independent. So, pretty much perfect. That's harder than doing a single application-specific strategy.

@joaotavora
Copy link
Owner

I just think it's odd for a package that knows its limitations not to be the one addressing them. Maybe not all backends require this caching or face this limitation. Maybe the limitation is worse for remote files. Eglot also uses a lot of project-current btw. Anyway, your call, I'll give you commit rights here and you can hack this in breadcrumb.el if you want.

@joaotavora
Copy link
Owner

A solution in project.el should be application-independent. So, pretty much perfect.

This really makes no sense to me. Like, right now, there is no solution. So there is 100% imperfection, 0%perfection. You'll only jump for 0% to 100% directly, no middle ground?

@dgutov
Copy link
Collaborator

dgutov commented Mar 7, 2024

The current situation is a middle ground: it does some caching. We could review some other middle-ground too, but it probably should have a better argument than "one application benefits from it".

Anyway, your call, I'll give you commit rights here and you can hack this in breadcrumb.el if you want.

Thank you for the invitation, I'll see if I find the time.

@joaotavora
Copy link
Owner

should have a better argument than "one application benefits from it".

As far as I understand, everyone calling project-current more than once for a given project-less buffer with will benefit. I presume you introduced the existing caching before breadcrumb.el was born, so whoever you were benefiting is also suffering in "no project" scenarios. Maybe the suffering is not in terms of freezing Emacs like it is here, but it's probably something.

@shipmints
Copy link

@dgutov I can confirm that project-0.11.1 (with upgraded xref) works much better. Thank you for this additional work.

One thing which I'd like to consider is for breadcrumb to have an option to inhibit project and another to inhibit file name entirely on the breadcrumb line. This is often duplicative to information on frame title, mode-line, tab name, etc. I could submit a PR for that if any interest or perhaps this is already underway. If enabled, I might not even have noticed the non-imenu performance issues.

My cache used buffer-local variables vs. project dir properties. I was able to leverage the locals for other reasons and avoid constantly calling project-current. I still might do that but I will use the native cache for a while and see how it feels.

I have an alternative project find function that goes first in 'project-find-functions list that uses the same project-vc-extra-root-markers. I've absconded (vc-file-getprop dir 'project-vc) (vc-file-setprop dir 'project-vc project) to use the project cache inside that function.

The reason I have that function is that we use a variety of project configurations with the more complex ones in a large monorepo hierarchy and where independent, discrete subprojects are convenient to identify; e.g., web front end vs. server back end. Using private markers such as "venv" (for python) or ".project" (custom marker) allow us to do this.

I have a "super root" marker, tagged as ".project.root", that allows tooling to identify non-vc master project roots vs subproject roots, tagged as ".project" tag; e.g., documentation files that do not live in vc but for which project tools work nicely.

magit doesn't care how we break up subprojects in this way, so this works swimmingly.

I mention this use case FYI so future project work can respect such complex project arrangements.

@joaotavora
Copy link
Owner

@dgutov I can confirm that project-0.11.1 (with upgraded xref) works much better. Thank you for this additional work.

I don't understand what the slowdown was (did you read the "reproducible benchmarks or it didn't happen"? It still goes for whatever you are trying to communicate in #38). But all the same, good news and cheers to @dgutov whatever you did.

One thing which I'd like to consider is for breadcrumb to have an option to inhibit project and another to inhibit file name entirely on the breadcrumb line. This is often duplicative to information on frame title, mode-line, tab name, etc. I could submit a PR for that if any interest or perhaps this is already underway. If enabled, I might not even have noticed the non-imenu performance issues.

A valid concern, surely. But completely outside the scope of this issue -- which deals (or dealt) specifically with the slowdown introduced by the fact that project.el doesn't (didn't?) cache the absence of a project for a given file (because that cache is hard to invalidate reliably). And thus, in that particular case, breadcrumb performance would suffer substantially. So there was the idea to introduce an expiring cache in breadcrumb.el (or in project.el). That idea hasn't materialized yet, and maybe it doesn't need to

For these "duplicate information" concerns, see the README for how to build your own breadcrumbish modeline constructs. Open a new issue/conversation if that's not enough.

@dgutov
Copy link
Collaborator

dgutov commented Jul 22, 2024

@joaotavora

But all the same, good news and cheers to @dgutov whatever you did.

I'm not sure what that was, actually. Perhaps the fix for https://debbugs.gnu.org/69740? Otherwise it depends on which version the upgrade was from. And the version of xref shouldn't matter.

I mean, there were some significant improvements for project-files recently (in large repos), but this package doesn't call project-files.

@shipmints

I have a "super root" marker, tagged as ".project.root", that allows tooling to identify non-vc master project roots vs subproject roots, tagged as ".project" tag; e.g., documentation files that do not live in vc but for which project tools work nicely.

Without going into too much detail (which I'd be happy to address somewhere else, but using this bug tracker seems like an imposition), a non-vc master project is indeed a use case which the current features don't cover, and a custom project-find-functions might be the best choice for now, especially considering that there's no easy way (in such projects) to use git ls-files to speed up file listing either. You're welcome to file a feature request, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants