bytes, strings: optimize Trim for ASCII cutsets #46446

@dsnet

I analyzed usages of strings.Trim (and related functions in both strings and bytes) in all source code known by the module proxy:

  • 76.6% have a cutset of len=1, and
  • 13.4% have a cutset of len=2.

Given that the vast majority of usages have a cutset of len=1, I argue we should optimize more heavily for that case. Currently, there is some optimization for cutsets of len=1, but it lives inside the internal makeCutsetFunc function. This is suboptimal because makeCutsetFunc allocates a closure over that single byte. I believe we should handle cutsets with len=1 directly in Trim, TrimRight, and TrimLeft.
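To make the proposal concrete, here is a minimal sketch of what a len=1 fast path could look like for TrimRight. The names trimRight and trimRightByte are hypothetical and chosen for illustration only; this is not the standard library's code, and anything other than a single ASCII byte simply falls back to strings.TrimRight.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// trimRightByte is a hypothetical helper: it strips trailing copies of the
// single byte c by reslicing, so it builds no closure and allocates nothing.
func trimRightByte(s string, c byte) string {
	for len(s) > 0 && s[len(s)-1] == c {
		s = s[:len(s)-1]
	}
	return s
}

// trimRight is a sketch of the proposed dispatch: handle a one-byte ASCII
// cutset directly, and defer to the standard library for everything else.
func trimRight(s, cutset string) string {
	if s == "" || cutset == "" {
		return s
	}
	if len(cutset) == 1 && cutset[0] < utf8.RuneSelf {
		return trimRightByte(s, cutset[0])
	}
	return strings.TrimRight(s, cutset)
}

func main() {
	fmt.Printf("%q\n", trimRight("hello==", "=")) // prints "hello"
}
```

The same shape would apply to TrimLeft (scan from the front) and Trim (both ends). The check against utf8.RuneSelf keeps the fast path restricted to cutsets that are a single ASCII character.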

Example benchmark for strings.TrimRight("hello==", "="):

BenchmarkStd   	18299240	        69.16 ns/op	      24 B/op	       1 allocs/op
BenchmarkOpt   	226575349	         5.330 ns/op	       0 B/op	       0 allocs/op

The suggested optimization results in a roughly 13x speedup (69.16 ns/op down to 5.330 ns/op) and eliminates the allocation for this common case.
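The comparison above could be reproduced with something like the following sketch. It is not the benchmark that produced the numbers in this issue: trimRightByte is a hypothetical stand-in for the optimized path, and testing.Benchmark is used only so the program runs standalone without go test. Results depend on hardware and Go version; releases newer than the one this issue was filed against may already include a similar fast path, in which case the two lines will look alike.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the compiler from optimizing the benchmarked calls away.
var sink string

// trimRightByte is a hypothetical single-byte fast path (no closure, no allocation).
func trimRightByte(s string, c byte) string {
	for len(s) > 0 && s[len(s)-1] == c {
		s = s[:len(s)-1]
	}
	return s
}

func main() {
	// Baseline: the standard library call from the issue.
	std := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = strings.TrimRight("hello==", "=")
		}
	})
	// Candidate: the hand-written single-byte loop.
	opt := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = trimRightByte("hello==", '=')
		}
	})
	fmt.Println("std:", std.String(), std.MemString())
	fmt.Println("opt:", opt.String(), opt.MemString())
}
```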

P.S. There is currently an optimization for cutsets that are entirely ASCII. It is still worth keeping, as 99.3% of all cutsets are pure ASCII.
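For readers unfamiliar with the idea, an all-ASCII fast path can be built on a small bitset with one bit per ASCII character. The sketch below is an illustration under that assumption, not a copy of the standard library's implementation; asciiSet, makeASCIISet, contains, and trimRightASCII are hypothetical names.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// asciiSet is a 128-bit set, one bit per ASCII character.
type asciiSet [4]uint32

// makeASCIISet builds a set from chars. It reports ok=false if chars
// contains any non-ASCII byte, in which case the caller must use a
// rune-aware path instead.
func makeASCIISet(chars string) (as asciiSet, ok bool) {
	for i := 0; i < len(chars); i++ {
		c := chars[i]
		if c >= utf8.RuneSelf {
			return as, false
		}
		as[c/32] |= 1 << (c % 32)
	}
	return as, true
}

// contains reports whether the byte c is in the set.
func (as *asciiSet) contains(c byte) bool {
	return c < utf8.RuneSelf && as[c/32]&(1<<(c%32)) != 0
}

// trimRightASCII strips trailing bytes that are members of the set.
func trimRightASCII(s string, as *asciiSet) string {
	for len(s) > 0 && as.contains(s[len(s)-1]) {
		s = s[:len(s)-1]
	}
	return s
}

func main() {
	if as, ok := makeASCIISet(" \t\n"); ok {
		fmt.Printf("%q\n", trimRightASCII("hello \t\n", &as))
	}
}
```

Membership is a shift and a mask per byte, so no UTF-8 decoding is needed when both the cutset and the trimmed bytes are ASCII.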
