Skip to content

proposal: strings: add multiple string search ContainsAnySubstring and IndexAnySubstring #51143

@the80srobot

Description

@the80srobot

I'd like to propose a new group of string search functions in the strings package that can efficiently search for multiple needles.

Rationale

Looking at StackOverflow, the common advice given to people trying to search for multiple needles in a string is to match each one in turn. This is inefficient, but it looks like it's the common implementation. We could provide an efficient implementation, probably saving enough CPU cycles worldwide to make it worthwhile.

API sketch

I propose adding the following:

// ContainsAnySubstring reports whether any of the substrings are within s.
ContainsAnySubstring(s string, substrings []string) bool

// IndexAnySubstring returns the index of the first instance of any substring in s, of -1 if no substring is present in s.
IndexAnySubstring(s string, substrings []string) int

Implementation Sketch

The strings library already implements the Rabin-Karp algorithm, which is efficient in the average case. (And no less efficient than the common naive implementation in the worst case, which is anyway exceedingly unlikely.) It looks like both functions I propose could be built with minimal changes to the existing internals.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions