Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is LSP doing too much work when watching folders #2741

Closed
rtpg opened this issue Mar 25, 2021 · 9 comments
Closed

Is LSP doing too much work when watching folders #2741

rtpg opened this issue Mar 25, 2021 · 9 comments

Comments

@rtpg
Copy link
Contributor

rtpg commented Mar 25, 2021

So my understanding is that when file watching is happening, we only watch directories through setting up a notification watch on various sub directories.

However, in lsp-watch-root-folder, I'm seeing some of the following:

  • calling (lsp--directory-files-recursively dir ".*" t) which includes directories but also just includes files as well
  • checking the file watch threshhold against the number of files, instead of the number of directories (if we're thinking of a conservative estimate of 5-10 files in each folder, we're looking at much more)
  • On top of that when we're setting up the watches themselves, instead of directly generating a list of directories to watch, we are again building up a list of files and then filtering that list back down.

the full list of files thing, in particular, has caused my machine to basically hang on certain degenerate cases (in particular where I had a folder with a bunch of test data all inside a single folder)

I think that it would make sense to rework this function to make it much less likely for people to hit file watch limit warnings (that might not actually be applicable) but I wanted to first throw it by people here to make sure I wasn't missing an important detail here.

I'm including a reference to the function I was looking over here:

lsp-mode/lsp-mode.el

Lines 1712 to 1754 in ed9308e

(defun lsp-watch-root-folder (dir callback &optional watch warn-big-repo?)
"Create recursive file notification watch in DIR.
CALLBACK will be called when there are changes in any of
the monitored files. WATCHES is a hash table directory->file
notification handle which contains all of the watch that
already have been created."
(let* ((dir (if (f-symlink? dir)
(file-truename dir)
dir))
(watch (or watch (make-lsp-watch :root-directory dir))))
(lsp-log "Creating watch for %s" dir)
(when (or
(not warn-big-repo?)
(not lsp-file-watch-threshold)
(let ((number-of-files (length (lsp--directory-files-recursively dir ".*" t))))
(or
(< number-of-files lsp-file-watch-threshold)
(condition-case _err
(lsp--ask-about-watching-big-repo number-of-files dir)
('quit)))))
(condition-case err
(progn
(puthash
dir
(file-notify-add-watch dir
'(change)
(lambda (event)
(lsp--folder-watch-callback event callback watch)))
(lsp-watch-descriptors watch))
(seq-do
(-rpartial #'lsp-watch-root-folder callback watch)
(seq-filter (lambda (f)
(and (file-directory-p f)
(not (gethash (if (f-symlink? f)
(file-truename f)
f)
(lsp-watch-descriptors watch)))
(not (lsp--string-match-any (lsp-file-watch-ignored-directories) f))
(not (-contains? '("." "..") (f-filename f)))))
(directory-files dir t))))
(error (lsp-log "Failed to create a watch for %s: message" (error-message-string err)))
(file-missing (lsp-log "Failed to create a watch for %s: message" (error-message-string err)))))
watch))

@yyoncho
Copy link
Member

yyoncho commented Mar 25, 2021

Refer to #713 (comment)

Closing as duplicate.

@yyoncho yyoncho closed this as completed Mar 25, 2021
@yyoncho
Copy link
Member

yyoncho commented Mar 25, 2021

This is the correct comment link: #713 (comment)

@rtpg
Copy link
Contributor Author

rtpg commented Mar 25, 2021

Sorry, I think that that comment is related to more specific optimizations, but this issue in particular is just about how (based on my reading of the elisp) we are only creating file watches on directories in the current version anyways. So my proposal here would be to modify the function that builds up the watch list to only return directories (since other files will be filtered out anyways), since that function is only being called in one place. And then the "large file warning" could also happen less often.

Would you be interested in a proposal patch for this? I think it would be self-contained and I have time to work on this.

@yyoncho
Copy link
Member

yyoncho commented Mar 25, 2021

I am not sure whether OS uses the same resources to watch one folder with 5 files and one folder with 1000 files. Thus it is not clear whether this solution is correct. Someone proposed changing the function which counts the files to stop when we hit the limit (this will require a change of wording in the error message). I think that this is a good improvement and also it will solve your issue as well. WDYT?

@rtpg
Copy link
Contributor Author

rtpg commented Mar 26, 2021

So I looked around the various notification libraries used by emacs in . I am not 100% sure of this but my reading up of all the libraries involved and trying to figure out the implementations leans me to believe that directory watches' costs don't increase based on the number of files (basically I think the internals are like "when file edit happens, signal to directory as well". The actual file watch is a single struct for inotify and kqueue at least)

I tried messing around with this problem and was able to get a different watch function that behaves identically in my tests, but is about 20% faster than the existing function. 20% of the time in this implementation is spent just getting files, meaning 80% are spent doing other stuff... I feel like it should be possible to do better on this though. I also think there's a way that we should be able to operate strictly on absolute paths....

Just for a memo, here's what I ended up with (please forgive the spaghetti e-lisp...

(defun path-is-watchable-directory (path basedir)
  (let
      ((full-path (f-join basedir path)))
  (and (f-dir-p full-path)
       (not (equal path "."))
       (not (equal path ".."))
       ;; (not (member path '("./" "../")))
       (not (lsp--string-match-any (lsp-file-watch-ignored-directories) (f-join basedir path)))
       )
  )
  )

(defun files-in-dir (dir)
  (directory-files dir)
  )
(defun new-directory-files-recursively (dir)
  "Copy of `directory-files-recursively' but it skips `lsp-file-watch-ignored-directories'."
  (let* ((dir (directory-file-name dir))
         ;; When DIR is "/", remote file names like "/method:" could
         ;; also be offered.  We shall suppress them.
         (tramp-mode (and tramp-mode (file-remote-p (expand-file-name dir)))))
    (apply #'nconc
           ;; we provide the based directory...
           (list dir) ;; add the base directory
           ;; and then apply this function to everything that is watchable
           (-map
            #'new-directory-files-recursively
            (-map
             (lambda (path) (f-join dir path))
             ;; filter out anything that doesn't apply
             (-filter (lambda (path) (path-is-watchable-directory path dir))
                      (files-in-dir dir))
             )
            )
           )
    )
  )
(defun new-watch-root (dir callback &optional watch warn-big-repo?)
  "Create recursive file notification watch in DIR.
CALLBACK will be called when there are changes in any of
the monitored files. WATCHES is a hash table directory->file
notification handle which contains all of the watch that
already have been created."
  (let* ((dir (if (f-symlink? dir)
                  (file-truename dir)
                dir))
         (watch (or watch (make-lsp-watch :root-directory dir)))
         (dirs-to-watch (new-directory-files-recursively dir)))
    (lsp-log "Creating watch for %s" dir)
    (when (or
           (not warn-big-repo?)
           (not lsp-file-watch-threshold)
           (let ((number-of-files (length dirs-to-watch)))
             (or
              (< number-of-files lsp-file-watch-threshold)
              (condition-case _err
                  (lsp--ask-about-watching-big-repo number-of-files dir)
                ('quit)))))
      (dolist (current-dir dirs-to-watch)
        (condition-case err
                (progn
                (puthash
                current-dir
                (file-notify-add-watch current-dir
                                        '(change)
                                        (lambda (event)
                                        (lsp--folder-watch-callback event callback watch)))
                (lsp-watch-descriptors watch))
                )
                (error (lsp-log "Failed to create a watch for %s: message" (error-message-string err)))
                (file-missing (lsp-log "Failed to create a watch for %s: message" (error-message-string err))))
        ))
    watch))

(defun profile-functions ()
    (let* ((dir "/home/rtpg/proj/my_proj")
           (callback (lambda (elt) (print! "changed!"))))
      (profiler-reset)
      (profiler-start 'cpu)
      (new-watch-root dir callback)
      (lsp-log "Done")
      (lsp-watch-root-folder dir callback)
      (lsp-log "Done with root as well")
      (profiler-stop)
      (profiler-report)
    )
   )

@rtpg
Copy link
Contributor Author

rtpg commented Mar 26, 2021

image
And here are the profiling results... I mean it's definitely faster! But still a bit iffy in my opinion. This might be enough perf for my usages though

@yyoncho
Copy link
Member

yyoncho commented Mar 26, 2021

Ok, let's try it. Willing to turn this into a PR?

@yyoncho
Copy link
Member

yyoncho commented Mar 26, 2021

As a side note, if you are interested - you may try also to incorporate emacs threads in establishing the watches. I have that on my todo list. The idea is to do all the IO work in bkg thread and do thread-yeild from time to time to allow user events to be handled.

@rtpg
Copy link
Contributor Author

rtpg commented Mar 28, 2021

Alright! I'll write up a PR and look at the background thread stuff. Thanks for being so responsive, it's very motivating to try and work through this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants