Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poor globbing performance in nfs-mounted directory with lots of subdirectories. #9891

Closed
kaysond opened this issue Jul 14, 2023 · 4 comments
Closed
Labels
bug Something that's not working as intended performance Purely performance-related enhancement without any changes in black box output
Milestone

Comments

@kaysond
Copy link

kaysond commented Jul 14, 2023

Running fish 3.6.1 on SLES 12 (with clean home dir). For additional context, please see IlanCosman/tide#420

I have an nfs-mounted directory with ~4000 subdirectories. If I do something like set -l glob /path/a*/, where /path/ has lots of subdirs starting with a, it takes upwards of 2 min to complete. Doing an ls /path even at its worst, never takes more than ~10s. I'm not sure what the globbing is doing that is slowing it down.

The globbing does not appear to cache the results between commands, even though the filesystem itself does. If I do set -l glob /path/a*/; set l glob /path/ab*/ where the result is the same number of directories, the second glob runs instantly. When I run the same command a second time, though, it still takes 2 minutes, even though its immediately after.

@faho
Copy link
Member

faho commented Jul 14, 2023

What does strace say? This is really gonna be one where we would need to reduce the syscalls.

You want something like strace -c fish -c 'set glob /path/a*/'

@faho
Copy link
Member

faho commented Jul 14, 2023

Sidenote:

The globbing does not appear to cache the results between commands, even though the filesystem itself does

Yeah, as a rule we don't do caching like that - in most cases the filesystem cache will do better than we do and cache eviction is terrible.

I don't believe a cache is the correct answer here or most other places and don't want to add it.

@faho faho closed this as completed in 6823f5e Jul 14, 2023
@faho
Copy link
Member

faho commented Jul 14, 2023

Okay, you're gonna love this one, we checked if a file existed that we already knew existed.

Since we're now using the same number of syscalls as "ls", and it's all stat-ish (newfstatat vs statx), I'm gonna call this one done for now.

Thanks for filing!

@faho faho added bug Something that's not working as intended performance Purely performance-related enhancement without any changes in black box output and removed needs more info labels Jul 14, 2023
@faho faho added this to the fish 3.6.2 milestone Jul 14, 2023
@kaysond
Copy link
Author

kaysond commented Jul 14, 2023

Okay, you're gonna love this one, we checked if a file existed that we already knew existed.

Since we're now using the same number of syscalls as "ls", and it's all stat-ish (newfstatat vs statx), I'm gonna call this one done for now.

Thanks for filing!

Saw the same thing in strace! Thanks for the quick fix!

faho added a commit that referenced this issue Oct 1, 2023
This confirmed that a file existed via access(file, F_OK).

But we already *know* that it does because this is the expansion for
the "trailing slash" - by definition all wildcard components up to
here have already been checked.

And it's not checking for directoryness either because it does F_OK.

This will remove one `access()` per result, which will cut the number
of syscalls needed for a glob that ends in a "/" in half.

This brings us on-par with e.g. `ls` (which uses statx while we use
newfstatat, but that should have about the same results)

Fixes #9891.

(cherry picked from commit 6823f5e)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something that's not working as intended performance Purely performance-related enhancement without any changes in black box output
Projects
None yet
Development

No branches or pull requests

2 participants