Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REPLCompletions: async cache PATH scan #52833

Merged
merged 12 commits into from
Jan 13, 2024

Conversation

IanButterworth
Copy link
Sponsor Member

@IanButterworth IanButterworth commented Jan 10, 2024

Fixes #52765

In shell> when first typing a word tab completion hinting searches the entire PATH list for completions again and again for every char entry.

On slower systems the hinting can make typing quite slow because of this i.e. #52765

This treats the contents reachable by PATH as static for a given session unless the PATH string changes, by caching it once at startup async and then if PATH changes at tab-completion time.

A modification would be to instead do something like update the cache on an async timer.

Base.generating_output() || Timer(t->REPLCompletions.cache_path(), 0; interval=60)

@IanButterworth
Copy link
Sponsor Member Author

IanButterworth commented Jan 11, 2024

Some numbers

Natively on an M2 Mac

% JULIA_DEBUG=REPLCompletions ./julia -q -t2
julia> ┌ Debug: caching PATH files took 0.011616084 seconds
│   length(pathdirs) = 13
│   length(path_cache) = 1563
└ @ REPL.REPLCompletions ~/Documents/GitHub/julia/usr/share/julia/stdlib/v1.11/REPL/src/REPLCompletions.jl:334
julia> 

with -t1 it takes 0.036s.

On a Windows VM via Parallels on the same M2 Mac

> $env:JULIA_DEBUG="REPLCompletions"
> ./julia -q -t2
julia> ┌ Debug: caching PATH files took 0.245052708 seconds
│   length(pathdirs) = 16
│   length(path_cache) = 5127
└ @ REPL.REPLCompletions \\Mac\Home\Downloads\julia-1efddebd6b\share\julia\stdlib\v1.11\REPL\src\REPLCompletions.jl:334
julia>

with -t1 it takes 0.46s

Clearly it's a good thing to do based on these Windows numbers, and I think @topolarity has durations in the many seconds on WSL.

Copy link
Sponsor Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much don't like this happening (even in the background) just because REPL happens to be loaded. But I like that it would be cached. Probably should just be cached per-statement though? Seems very awkward if rm("foo") and touch("foo") don't work correctly

@IanButterworth
Copy link
Sponsor Member Author

IanButterworth commented Jan 12, 2024

Moving to per statement sounds good. I think we do want to do it in the background still, just because of the pathological case @topolarity has on WSL where the PATH scan takes > 5s

Seems very awkward if rm("foo") and touch("foo") don't work correctly

Note that this is for files reached by PATH only. Files in the current dir, or a specific path aren't cached, so your examples won't change behavior.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 12, 2024

Yes, the change to move to the background (when first needed by that statement) is very good too. We should try not to run arbitrary computations in the REPL/UI task which more properly belong to a worker task (until the user hits tab).

@IanButterworth
Copy link
Sponsor Member Author

Ok. I've moved it to cache in a worker thread at the first moment it would be needed, which is likely to be a hint when typing initially in shell>, meaning the completion might not be available for that char because it doesn't wait at all, but should be on most systems for the next char. (with -t2 caching takes 0.01s for me).

Side note. I noticed we don't provide these PATH completions for command strings. That seems logical to add.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 12, 2024

meaning the completion might not be available for that char because it doesn't wait at all

Not waiting does seem optimal to me. Lag is generally bad. Can we hook into the refresh_wait worker Task though, or similar, so that as soon as the completion becomes available that is gets the completion printed (unless it is canceled and restarted by the next character being pressed)?

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 12, 2024

Side note. I noticed we don't provide these PATH completions for command strings. That seems logical to add.

Side side note: we use completely different code for completions inside shell> and backtick'd command strings, where the completions for shell are significantly better than those available for backticks. We should fix that.

@IanButterworth
Copy link
Sponsor Member Author

Can we hook into the refresh_wait worker Task though..

Sounds good. I think that's what #52847 is trying to achieve

@IanButterworth
Copy link
Sponsor Member Author

Come to think of it, if #52847 is done, then cache_PATH() could be done (if needed) synchronously in complete_path and the whole completion calculation would be async on a worker thread, with the appropriate delayed update/cancel.

@topolarity
Copy link
Member

Yeah, crawling PATH is quite slow on my WSL2 instance since WSL automatically appends the Windows PATH to the Linux one:

┌ Debug: caching PATH files took 25.125992393 seconds
│   length(pathdirs) = 36length(PATH_cache) = 7730
└ @ REPL.REPLCompletions ~/repos/julia/usr/share/julia/stdlib/v1.11/REPL/src/REPLCompletions.jl:338

Even so, with the latest changes the shell feels immediately responsive 👍

The Julia start-up time also looks comparable to before this PR, and it still seems to have useful auto-complete for many commands while it continues scanning.

Overall, it feels much better than before. Thanks!

@IanButterworth
Copy link
Sponsor Member Author

0.01s vs 25s... YMMV... Good to know we can handle that case nicely.

I'll go ahead and merge when CI is green

@IanButterworth IanButterworth merged commit 893e720 into JuliaLang:master Jan 13, 2024
6 of 7 checks passed
@IanButterworth IanButterworth deleted the ib/repl_cache_path branch January 13, 2024 02:56
# Look for files in PATH as well. These are cached in `cache_PATH` in a separate task in REPL init.
# If we cannot get lock because its still caching just pass over this so that initial
# typing isn't laggy. If the PATH string has changed since last cache re-cache it
cached_PATH_changed() && Base.errormonitor(Threads.@spawn REPLCompletions.cache_PATH())
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also rescan every command? How do we make sure this gets periodically refreshed so when the user changes the filesystem (e.g. installs a missing package), the completions aren't wrong afterwards?

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Where would you call this? I'm a little lost in where to.

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we spawn it every time it's needed, unless its already running?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, it could be interesting to spawn an update almost continually if the past one is finished

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

shell> prompt very sluggish when typing first word on WSL
3 participants