Improving listDirectoriesInDirectory by using std::fs #7974
I'm not totally sure I understand -- is this mostly changing how the allocations work or is there something more nuanced?
if (!listDirectoriesInDirectory(site, directories, true).ok()) {
  return;

if (isPlatform(PlatformType::TYPE_WINDOWS)) {
Is this a change that we should apply on all platforms? Is there something windows specific about it?
Yes, the listDirectoriesInDirectory helper takes a long time to complete on Windows. This is a field issue reported by Fleet customers - see here. I decided to play it safe and use the new implementation on Windows only.
@@ -84,12 +84,43 @@ void genPackage(const std::string& path, Row& r, Logger& logger) {
  }
}

Status stdListDirectoriesInDirectory(const fs::path& path,
Is this useful enough that we should put it in filesystem.cpp?
My idea was to release only this new logic and then wait a couple of releases to see how it behaves. I can benchmark it on different platforms before deciding whether to use it in filesystem.cpp.
Thanks @directionless for looking into this! The overall goal is to fix a field issue found by Fleet customers using the python_packages table on Windows - see here.
My gut sense is that we should see if we need this everywhere, and not just here. But I can be persuaded to just approve it here. @alessandrogario or @Smjert, does either of you have an opinion?
I think that as long as the logic just has to return a list of directories, without needing to parse any regex from SQL, it should definitely not use the

The other thing is to make sure that the behavior is the same (with tests), particularly around symlinks/junctions.
…tem helpers. Adding testcases to cover different directories listing scenarios
@Smjert I've replaced the platform's listDirectoriesInDirectory implementation with the new logic and added test cases to check different directory listing scenarios.
Potential behavioral changes:
I have just run a quick test on this and found the following:
I also found that the two implementations behave in the same way when it comes to listing dot-directories.
As discussed in office hours, the symlink listing is a little unclear. Is there a risk of looping? Like if
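On the looping question, a small sketch may help frame it. In the C++17 standard library, `std::filesystem::recursive_directory_iterator` does not descend into directory symlinks unless `directory_options::follow_directory_symlink` is requested, so a symlink pointing back at an ancestor is listed once and not entered. The helper below is hypothetical (`countDirsNoFollow` is not a name from this PR), and it says nothing about how Windows junctions are treated, which depends on the implementation and still needs testing.

```cpp
#include <cstddef>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from the PR): counts the directories seen by a
// recursive walk that uses the default options, i.e. directory symlinks
// are reported but not followed.
std::size_t countDirsNoFollow(const fs::path& root) {
  std::error_code ec;
  std::size_t count = 0;
  fs::recursive_directory_iterator it(root, fs::directory_options::none, ec);
  fs::recursive_directory_iterator end{};
  for (; !ec && it != end; it.increment(ec)) {
    std::error_code entry_ec;
    // A symlink to a directory counts as a directory, but the iterator
    // does not recurse into it with directory_options::none.
    if (it->is_directory(entry_ec)) {
      ++count;
    }
  }
  return count;
}
```

With a layout such as `root/a/loop -> root`, the walk visits `a` and `loop` and then stops, because `loop` is never entered. Passing `follow_directory_symlink` instead is where a cycle could become a problem.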
…s to test this scenario and to compare the previous logic with the current one
I have also included a test case that compares the previous logic with the new one and ensures that both list the same number of regular directories and junctions.
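As an illustration of this kind of comparison, the sketch below checks that two listings contain the same paths regardless of order, which is slightly stricter than comparing counts. `sameListing` is a hypothetical helper, not the test added in this PR, and it assumes both implementations return paths in the same textual form.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical check (not from the PR): true when both listings hold the
// same paths, ignoring the order in which they were returned.
bool sameListing(std::vector<std::string> a, std::vector<std::string> b) {
  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());
  return a == b;
}
```

If the two implementations differ in separators or trailing slashes, the paths would need normalizing before a comparison like this is meaningful.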
@Smjert @directionless This PR is something we can consider for inclusion in the upcoming release.
This relates to #6679 and Fleet PR #8243
There was a 10x performance increase when using this logic (see osqueryi_custom.exe results below)