The issue is caused by the overhead of calling time.Now() in StopTimer/StartTimer. The measured time of the target code equals its execution time plus the overhead of calling time.Now(). One can verify this using pprof.
Assume the target code takes T ns per run and the overhead of calling time.Now() is t ns. If the target code runs N times, the total measured time is T*N+t, so the measured average for a single iteration is T+t/N. The systematic measurement error is therefore t/N, which shrinks as N grows.
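To make the T+t/N relationship concrete, here is a minimal sketch that just evaluates the model for a few values of N. The function name measuredPerIteration and the numbers (T = 2 ns, t = 50 ns) are made up for illustration and are not measurements.

```go
package main

import "fmt"

// measuredPerIteration returns the per-iteration average a benchmark would
// report under the model above: each run of the target code takes targetNs,
// and a single timer overhead of overheadNs is spread over n iterations.
// (Hypothetical helper, not part of the testing package.)
func measuredPerIteration(targetNs, overheadNs float64, n int) float64 {
	return targetNs + overheadNs/float64(n)
}

func main() {
	// Illustrative values only: T = 2 ns, t = 50 ns.
	const targetNs, overheadNs = 2.0, 50.0
	for _, n := range []int{1, 10, 1000, 1000000} {
		avg := measuredPerIteration(targetNs, overheadNs, n)
		fmt.Printf("N=%-8d measured=%.5f ns/op  error=%.5f ns\n", n, avg, avg-targetNs)
	}
}
```

With these assumed numbers the error dominates the result at N=1 and becomes negligible once N reaches the thousands, which is why the problem only shows up when the timed region is very short.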
Another approach is to measure the overhead of calling time.Now(), then subtract that overhead from the benchmark result.
Since this issue only affects microbenchmarks, and people who write such tests tend to know what they are doing, I am closing this because there is no further action to take.