-
Notifications
You must be signed in to change notification settings - Fork 18.8k
Description
I'd like to propose a new group of string search functions in the strings package that can efficiently search for multiple needles.
Rationale
Looking at StackOverflow, the common advice given to people trying to search for multiple needles in a string is to match each one in turn. This is inefficient, but it looks like it's the common implementation. We could provide an efficient implementation, probably saving enough CPU cycles worldwide to make it worthwhile.
API sketch
I propose adding the following:
// ContainsAnySubstring reports whether any of the substrings are within s.
ContainsAnySubstring(s string, substrings []string) bool
// IndexAnySubstring returns the index of the first instance of any substring in s, of -1 if no substring is present in s.
IndexAnySubstring(s string, substrings []string) int
Implementation Sketch
The strings library already implements the Rabin-Karp algorithm, which is efficient in the average case. (And no less efficient than the common naive implementation in the worst case, which is anyway exceedingly unlikely.) It looks like both functions I propose could be built with minimal changes to the existing internals.