Skip to content

bytes, strings: optimize Contains with fast-path for sub-slices  #24979

@dsnet

Description

@dsnet

Consider the following:

func Benchmark(b *testing.B) {
	buf := make([]byte, 1<<20)
	rand.Read(buf)
	for n := 64; n <= len(buf); n <<= 1 {
		b.Run(fmt.Sprintf("%d", n), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				bytes.Contains(buf, buf[n-64:n])
			}
		})
	}
}

On my machine, this prints:

Benchmark/64-8         	100000000	        11.8 ns/op
Benchmark/128-8        	100000000	        22.3 ns/op
Benchmark/256-8        	50000000	        29.1 ns/op
Benchmark/512-8        	30000000	        56.9 ns/op
Benchmark/1024-8       	20000000	       101 ns/op
Benchmark/2048-8       	10000000	       195 ns/op
Benchmark/4096-8       	 5000000	       393 ns/op
Benchmark/8192-8       	 2000000	       976 ns/op
Benchmark/16384-8      	 1000000	      1749 ns/op
Benchmark/32768-8      	  300000	      3968 ns/op
Benchmark/65536-8      	  200000	      8312 ns/op
Benchmark/131072-8     	  100000	     18300 ns/op
Benchmark/262144-8     	   50000	     35918 ns/op
Benchmark/524288-8     	   20000	     75942 ns/op
Benchmark/1048576-8    	   10000	    156854 ns/op

In this situation, the substring is sliced out of the parent slice. It should be know that the parent contains the substring in O(1) with something similar to:

func Contains(b, subslice []byte) bool {
	if (*reflect.SliceHeader).(unsafe.Pointer(&b)).Data <= (*reflect.SliceHeader).(unsafe.Pointer(&subslice)).Data && (*reflect.SliceHeader).(unsafe.Pointer(&b)).Data + uintptr(len(b)) >= (*reflect.SliceHeader).(unsafe.Pointer(&subslice)).Data + uintptr(len(subslice)) {
		return true
	}
	return Index(b, subslice) != -1
}

(the above code is not correct as there is special consideration to manipulating unsafe.Pointer, but the general approach is the same)

Metadata

Metadata

Assignees

No one assigned

    Labels

    NeedsDecisionFeedback is required from experts, contributors, and/or the community before a change can be made.Performance

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions