Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use different search algorithm in Strings::contains_any #1255

Merged
merged 14 commits into from
Oct 31, 2023

Conversation

Thomas1664
Copy link
Contributor

@Thomas1664 Thomas1664 commented Oct 28, 2023

Using std::boyer_moore_horspool_searcher results in approx. 3x speedup of Strings::contains_any() which is used by contains_any_ignoring_c_comments() during post build lint.

Note that this search algorithm is only useful when dealing with long strings: cppreference

@Thomas1664

This comment was marked as outdated.

for (const auto& subject : to_find)
{
auto found =
std::search(source.begin(), source.end(), boyer_moore_horspool_searcher(subject.begin(), subject.end()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating the searcher should be hoist out of the loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind I'm an idiot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I do think this should be hoist to just after

auto files = fs.get_regular_files_recursive(dir, IgnoreErrors{});

in postbuildlint.cpp.

My concern is that contains_any looks like it's doing contains, not boyer_moore things, and I could easily see that being used in places boyer_moore would be inappropriate.

Copy link
Member

@BillyONeal BillyONeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I either think contains_any should be renamed to indicate that it only makes sense for long haystacks now, and/or it should be given a View<boyer_moore_horspool_searcher> instead of the raw text.

@BillyONeal BillyONeal merged commit fe0ad6c into microsoft:main Oct 31, 2023
5 checks passed
@BillyONeal
Copy link
Member

Thanks!

@Thomas1664 Thomas1664 deleted the search-alg branch October 31, 2023 06:36
@autoantwort
Copy link
Contributor

I now get 467 warnings of the form

/Users/leanderSchulten/git_projekte/vcpkg-tool/include/vcpkg/base/strings.h:51: warning: 'boyer_moore_horspool_searcher<std::__wrap_iter<const char *>>' is deprecated: std::experimental::boyer_moore_horspool_searcher will be removed in LLVM 17. Use std::boyer_moore_horspool_searcher instead [-Wdeprecated-declarations]
In file included from /Users/leanderSchulten/git_projekte/vcpkg-tool/src/vcpkg/base/hash.cpp:6:
/Users/leanderSchulten/git_projekte/vcpkg-tool/include/vcpkg/base/strings.h:51:62: warning: 'boyer_moore_horspool_searcher<std::__wrap_iter<const char *>>' is deprecated: std::experimental::boyer_moore_horspool_searcher will be removed in LLVM 17. Use std::boyer_moore_horspool_searcher instead [-Wdeprecated-declarations]
    using boyer_moore_horspool_searcher = std::experimental::boyer_moore_horspool_searcher<std::string::const_iterator>;
                                                             ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/experimental/functional:341:7: note: 'boyer_moore_horspool_searcher<std::__wrap_iter<const char *>>' has been explicitly marked deprecated here
class _LIBCPP_DEPRECATED_BOYER_MOORE_HORSPOOL_SEARCHER _LIBCPP_TEMPLATE_VIS boyer_moore_horspool_searcher {
      ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/experimental/functional:90:60: note: expanded from macro '_LIBCPP_DEPRECATED_BOYER_MOORE_HORSPOOL_SEARCHER'
#  define _LIBCPP_DEPRECATED_BOYER_MOORE_HORSPOOL_SEARCHER _LIBCPP_DEPRECATED_("std::experimental::boyer_moore_horspool_searcher will be removed in LLVM 17. Use std::boyer_moore_horspool_searcher instead")
                                                           ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/__config:836:53: note: expanded from macro '_LIBCPP_DEPRECATED_'
#      define _LIBCPP_DEPRECATED_(m) __attribute__((deprecated(m)))
                                                    ^

@Thomas1664
Copy link
Contributor Author

Honestly, I don't know what the best solution is. I could check for the feature test macro but then we don't get informed when we can remove this workaround.

BillyONeal added a commit to BillyONeal/vcpkg-tool that referenced this pull request Mar 26, 2024
This should have the effect of disabling this on macOS which hopefully fixes the Python2 hang.

Related: microsoft#1255
BillyONeal added a commit that referenced this pull request Mar 28, 2024
This should have the effect of disabling this on macOS which hopefully fixes the Python2 hang.

Related: #1255
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants