Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce the number of calls to the filesystem to see if a module file exists #14582

Merged
merged 1 commit into from
Oct 27, 2022

Conversation

schveiguy
Copy link
Member

In a work project, this reduces the number of stat calls from 127_416 to 9_909.

This works by short-circuiting the stat calls if the parent directory is known not to exist.

With current dmd and this project, due to the -I parameters for each dependency (of which there are many), importing one module from phobos would result in 335 stat calls.

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @schveiguy!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#14582"

@@ -139,7 +234,6 @@ nothrow:
}
FileName.free(n2.ptr);
}
FileName.free(n.ptr);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, I moved this to a scope(exit) on line 217 because it was a memory leak if the package.d file was found.

@nordlow
Copy link
Contributor

nordlow commented Oct 21, 2022

Why do we need to look at all the parent paths rather than just bottom-most parent?

@schveiguy
Copy link
Member Author

Top-most parent might exist, even though the module isn't there. e.g. https://github.com/MartinNowak/io/tree/master/src/std/io

What we really need is to determine if any parent doesn't exist.

@nordlow
Copy link
Contributor

nordlow commented Oct 21, 2022

Sorry, I don't understand this without a use case.

Do you want to cache calls to directory scans?

@schveiguy
Copy link
Member Author

schveiguy commented Oct 21, 2022

So if project a has source modules in:

projecta/source/foo/a

and project b has source modules in:

projectb/source/foo/b

and you import foo.b.bar, then this will check in order:

1. is projecta/foo/b cached? No
2. is projecta/foo cached? No
3. is projecta cached? No
4. is projecta a directory? Yes, cache it (stat)
5. is projecta/foo a directory? Yes, cache it (stat)
6. is projecta/foo/b a directory?  No -> cache and stop (stat)

7.  is projectb/foo/b cached? No
8.  is projectb/foo cached? No
9.  is projectb cached? No
10. is projectb a directory? Yes, cache it (stat)
11. is projectb/foo a directory? Yes, cache it (stat)
12. is projectb/foo/b a directory?  Yes -> cache and start looking for module. (6 stats)

Your proposal would have it just do:

1. is projecta/foo cached? No
2. is projecta/foo a directory? Yes, cache it and start looking for module (6 stats)

3. is projectb/foo cached? No
4. is projectb/foo a directory? Yes, cache it and start looking for module (6 stats)

Less steps, but the stats are more (12 vs 11).

Think about vibe.d, which has all its projects start with vibe.xxx, those would see no benefit from the caching.

@schveiguy
Copy link
Member Author

Do you want to cache calls to directory scans?

There are no directory scans here.

@maxhaton
Copy link
Member

@schveiguy I retriggered failing CircleCI

}
}
// found a parent that is cached (or reached the end of the stack).
// step 2, traverse back up the stack storing either false if the
Copy link
Contributor

@nordlow nordlow Oct 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

traverse back up

This should be

"traverse back down"

right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah could be. Not really great terminology.

@nordlow
Copy link
Contributor

nordlow commented Oct 22, 2022

With current dmd and this project, due to the -I parameters for each dependency (of which there are many), importing one module from phobos would result in 335 stat calls.

I guess it would be value to the reviewers of this PR documenting how much this number drop when compiling Phobos with the branch of this PR.

@dlang-bot dlang-bot merged commit a2865d7 into dlang:master Oct 27, 2022
@schveiguy schveiguy deleted the lowerstatcount branch October 27, 2022 17:01
@nordlow
Copy link
Contributor

nordlow commented Oct 29, 2022

Great work getting this merged so quickly.

On Ubuntu 22.10, I just tested dmd master as

echo > app.d && strace -e newfstatat dmd -c app.d -o-

at it seems there's still some redundant stats of .d files happening as show in the output

...
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/phobos/core", 0x7ffce054b990, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core", {st_mode=S_IFDIR|0775, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/attribute.di", 0x7ffce054bc90, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/attribute.d", {st_mode=S_IFREG|0664, st_size=8456, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/attribute.d", {st_mode=S_IFREG|0664, st_size=8456, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=8456, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal", {st_mode=S_IFDIR|0775, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/hash.di", 0x7ffce054bc90, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/hash.d", {st_mode=S_IFREG|0664, st_size=28155, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/hash.d", {st_mode=S_IFREG|0664, st_size=28155, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=28155, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/traits.di", 0x7ffce054bbe0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/traits.d", {st_mode=S_IFREG|0664, st_size=25079, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/traits.d", {st_mode=S_IFREG|0664, st_size=25079, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=25079, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/entrypoint.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/entrypoint.d", {st_mode=S_IFREG|0664, st_size=1155, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/entrypoint.d", {st_mode=S_IFREG|0664, st_size=1155, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=1155, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array", {st_mode=S_IFDIR|0775, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/appending.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/appending.d", {st_mode=S_IFREG|0664, st_size=6055, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/appending.d", {st_mode=S_IFREG|0664, st_size=6055, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=6055, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/comparison.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/comparison.d", {st_mode=S_IFREG|0664, st_size=6812, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/comparison.d", {st_mode=S_IFREG|0664, st_size=6812, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=6812, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/equality.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/equality.d", {st_mode=S_IFREG|0664, st_size=7478, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/equality.d", {st_mode=S_IFREG|0664, st_size=7478, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=7478, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/casting.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/casting.d", {st_mode=S_IFREG|0664, st_size=4048, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/casting.d", {st_mode=S_IFREG|0664, st_size=4048, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=4048, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/concatenation.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/concatenation.d", {st_mode=S_IFREG|0664, st_size=2825, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/concatenation.d", {st_mode=S_IFREG|0664, st_size=2825, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=2825, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/construction.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/construction.d", {st_mode=S_IFREG|0664, st_size=7811, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/construction.d", {st_mode=S_IFREG|0664, st_size=7811, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=7811, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/arrayassign.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/arrayassign.d", {st_mode=S_IFREG|0664, st_size=9662, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/arrayassign.d", {st_mode=S_IFREG|0664, st_size=9662, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=9662, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/capacity.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/capacity.d", {st_mode=S_IFREG|0664, st_size=3066, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/array/capacity.d", {st_mode=S_IFREG|0664, st_size=3066, ...}, 0) = 0
newfstatat(3, "", {st_mode=S_IFREG|0664, st_size=3066, ...}, AT_EMPTY_PATH) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/dassert.di", 0x7ffce054bcb0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/dassert.d", {st_mode=S_IFREG|0664, st_size=17769, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/per/.local/dlang/linux/bin64/src/druntime/import/core/internal/dassert.d", {st_mode=S_IFREG|0664, st_size=17769, ...}, 0) = 0
...

. In this run, Every .d under /home/per/.local/dlang/linux/bin64/src/druntime/import file is checked twice, @schveiguy.

@schveiguy
Copy link
Member Author

schveiguy commented Oct 31, 2022

.d under /home/per/.local/dlang/linux/bin64/src/druntime/import file is checked twice

I think that's the part that's reading the file data, not from this PR. Was it happening before this change?

The checks for the empty path also don't seem to be reasonable, but I don't know where those come from either.

@nordlow
Copy link
Contributor

nordlow commented Nov 2, 2022

.d under /home/per/.local/dlang/linux/bin64/src/druntime/import file is checked twice

I think that's the part that's reading the file data, not from this PR. Was it happening before this change?

The checks for the empty path also don't seem to be reasonable, but I don't know where those come from either.

You can have gdb stop on a system call using catchpoints. For details see

According to the docs there is no way to filter on specific (string) arguments passed to the syscall, though. That would be very convenient in this case. Do you know of a way to accomplish such filtering, @ibuclaw?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants