
Rule S6608: Benchmark is benchmarking the wrong things #9390

Closed
Apollo3zehn opened this issue Jun 4, 2024 · 2 comments · Fixed by #9412
@Apollo3zehn

I got an S6608 warning from SonarLint in VS Code and ran the benchmark referenced in the rule's documentation.

The result is:

|      Method | SampleSize | LoopSize |        Mean |     Error |    StdDev |
|------------ |----------- |--------- |------------:|----------:|----------:|
|   ElementAt |    1000000 |     1000 | 11,614.7 ns | 199.34 ns | 204.71 ns |
|       Index |    1000000 |     1000 |  7,756.9 ns |  54.88 ns |  45.83 ns |
|       First |    1000000 |     1000 |  6,223.9 ns |  23.47 ns |  19.60 ns |
| First_Index |    1000000 |     1000 |    483.3 ns |   9.31 ns |  14.50 ns |
|        Last |    1000000 |     1000 |  6,483.0 ns | 128.93 ns | 188.98 ns |
|  Last_Index |    1000000 |     1000 |    380.2 ns |   7.19 ns |   7.06 ns |

The results suggest that ElementAt and Index are slow, but the actual problem is that on every loop iteration a random value is generated, which is much slower than the array access itself:

var index = random.Next(0, SampleSize);

A simple benchmark tells me that random.Next(0, SampleSize) alone takes around 7,000 ns, so it has a huge impact on the benchmark results. Without random.Next(0, SampleSize), the ElementAt method becomes considerably faster and Index is on par with First_Index and Last_Index:

|      Method | SampleSize | LoopSize |        Mean |     Error |    StdDev |
|------------ |----------- |--------- |------------:|----------:|----------:|
|   ElementAt |    1000000 |     1000 |  3,450.9 ns |  68.35 ns | 104.38 ns |
|       Index |    1000000 |     1000 |    530.0 ns |   9.63 ns |   8.54 ns |
|       First |    1000000 |     1000 |  6,739.5 ns | 134.67 ns | 209.66 ns |
| First_Index |    1000000 |     1000 |    528.3 ns |   3.57 ns |   2.78 ns |
|        Last |    1000000 |     1000 |  6,803.1 ns |  44.47 ns |  39.42 ns |
|  Last_Index |    1000000 |     1000 |    459.6 ns |   1.66 ns |   1.30 ns |

The modified code is shown below. The only disadvantage of dropping the random index is that the CPU cache is now used optimally, which makes the results look slightly better than real random access would.

To avoid the risk of the code being optimized out, I added a return value to all benchmark methods (https://fransbouma.github.io/BenchmarkDotNet/RulesOfBenchmarking.htm#avoid-dead-code-elimination).

private List<byte> data;
private Random random;

[Params(1_000_000)]
public int SampleSize;

[Params(1_000)]
public int LoopSize;

[GlobalSetup]
public void Setup()
{
    random = new Random(42);
    var bytes = new byte[SampleSize];
    random.NextBytes(bytes);
    data = bytes.ToList();
}

[Benchmark]
public int ElementAt()
{
    int result = default;

    for (int i = 0; i < LoopSize; i++)
    {
        result = data.ElementAt(i);
    }

    return result;
}

[Benchmark]
public int Index()
{
    int result = default;

    for (int i = 0; i < LoopSize; i++)
    {
        result = data[i];
    }

    return result;
}

[Benchmark]
public int First()
{
    int result = default;

    for (int i = 0; i < LoopSize; i++)
    {
        result = data.First();
    }

    return result;
}
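For comparison, the ~7,000 ns cost of the generator alone can be isolated in its own benchmark. A minimal sketch (the method name RandomNext_Only is mine, not part of the original gist):

```csharp
// Hypothetical companion benchmark: measures only the cost of the random
// number generation that the original loop bodies contained, so the array
// access numbers above can be put into perspective.
[Benchmark]
public int RandomNext_Only()
{
    int result = default;

    for (int i = 0; i < LoopSize; i++)
    {
        result = random.Next(0, SampleSize);
    }

    return result;
}
```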
@costin-zaharia-sonarsource
Member

Hi @Apollo3zehn,

Thanks for the feedback. Designing a good benchmark is not an easy task and it's easy to fall into all kinds of traps :)

I think the current approach, in the case of ElementAt and Index, was comparing apples with apples, in the sense that the random numbers were generated in the same way in both cases. If only the relative difference is considered, the results do reflect which method is faster.

I see your point that Random.Next() is much more expensive than the element access, giving the false impression that accessing the elements is much more expensive than it actually is.

Our first thought to improve this was to use a predefined list of randomly generated indices (built in the setup), but that would mean accessing arrays twice in each of the benchmarked methods, again inflating the numbers.
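That rejected alternative would have looked roughly like this (my sketch, not the code that was actually considered; `indices` is a hypothetical field, and `data`, `SampleSize`, and `LoopSize` are the members from the benchmark class above):

```csharp
private int[] indices; // hypothetical field holding precomputed random indices

[GlobalSetup]
public void Setup()
{
    var random = new Random(42);
    var bytes = new byte[SampleSize];
    random.NextBytes(bytes);
    data = bytes.ToList();

    // Precompute the random indices once, outside the measured code.
    indices = new int[LoopSize];
    for (int i = 0; i < LoopSize; i++)
        indices[i] = random.Next(0, SampleSize);
}

[Benchmark]
public int Index()
{
    int result = default;

    // Two array accesses per iteration (indices[i] and data[...]),
    // which is why this variant still inflates the numbers.
    for (int i = 0; i < LoopSize; i++)
        result = data[indices[i]];

    return result;
}
```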

Another approach we considered was to use a pseudo-random number generator fast enough not to skew the numbers while still providing a degree of randomness, for example one of the generators invented by George Marsaglia, such as xorshift.
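A minimal sketch of such a generator (Marsaglia's xorshift32 with his published shift constants 13/17/5; the class name and the bounded `Next(int)` helper are mine):

```csharp
using System;
using System.Diagnostics;

// Marsaglia's xorshift32: three shift/xor operations per number, far cheaper
// than System.Random, yet random enough to defeat cache-friendly access
// patterns in a benchmark loop.
public sealed class XorShift32
{
    private uint state;

    // The state must never be zero, or the generator gets stuck at zero.
    public XorShift32(uint seed) => state = seed != 0 ? seed : 2463534242u;

    public uint Next()
    {
        uint x = state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        return state = x;
    }

    // Maps the raw 32-bit output into [0, bound), like Random.Next(0, bound).
    // Note: simple modulo reduction has a slight bias, acceptable here.
    public int Next(int bound) => (int)(Next() % (uint)bound);
}
```

The sequence is fully determined by the seed, so benchmark runs stay reproducible.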

We ended up with the simple solution you suggested: removing the calls to the random number generator.

On the PC I use, the numbers look like this; I agree this gives a more consistent general view of all the benchmarked methods.

BenchmarkDotNet=v0.13.5, OS=Windows 10 (10.0.19045.4412/22H2/2022Update)
11th Gen Intel Core i7-11850H 2.50GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=8.0.301
  [Host]   : .NET 8.0.6 (8.0.624.26715), X64 RyuJIT AVX2
  .NET 8.0 : .NET 8.0.6 (8.0.624.26715), X64 RyuJIT AVX2

|      Method |      Job |  Runtime | SampleSize | LoopSize |       Mean |     Error |    StdDev |
|------------ |--------- |--------- |----------- |--------- |-----------:|----------:|----------:|
|   ElementAt | .NET 8.0 | .NET 8.0 |    1000000 |     1000 | 3,403.4 ns |  28.52 ns |  26.67 ns |
|       Index | .NET 8.0 | .NET 8.0 |    1000000 |     1000 |   478.0 ns |   6.93 ns |   6.48 ns |
|       First | .NET 8.0 | .NET 8.0 |    1000000 |     1000 | 6,160.0 ns |  57.66 ns |  53.93 ns |
| First_Index | .NET 8.0 | .NET 8.0 |    1000000 |     1000 |   485.7 ns |   5.81 ns |   5.15 ns |
|        Last | .NET 8.0 | .NET 8.0 |    1000000 |     1000 | 6,034.3 ns |  20.34 ns |  16.98 ns |
|  Last_Index | .NET 8.0 | .NET 8.0 |    1000000 |     1000 |   408.3 ns |   2.54 ns |   2.38 ns |

I do have a question about the return value. I'm not sure it makes a difference, as we are also assigning the retrieved value in the current implementation. I did a test and the numbers look very similar:

|      Method | SampleSize | LoopSize |       Mean |    Error |   StdDev |
|------------ |----------- |--------- |-----------:|---------:|---------:|
|   ElementAt |    1000000 |     1000 | 3,755.0 ns | 48.69 ns | 40.66 ns |
|       Index |    1000000 |     1000 |   481.4 ns |  6.45 ns |  5.72 ns |
|       First |    1000000 |     1000 | 6,240.3 ns | 57.20 ns | 53.51 ns |
| First_Index |    1000000 |     1000 |   433.8 ns |  1.15 ns |  1.08 ns |
|        Last |    1000000 |     1000 | 6,472.4 ns | 19.50 ns | 18.24 ns |
|  Last_Index |    1000000 |     1000 |   359.9 ns |  0.82 ns |  0.68 ns |

The table above is from the version without a return statement.

Do you have more context about this? Or is there something that I'm missing?

@Apollo3zehn
Author

Apollo3zehn commented Jun 6, 2024

> Our first thought to improve this was to use a predefined list of randomly generated numbers (done in the setup), but that would also mean accessing arrays twice in each of the benchmarked methods. Leading again to numbers greater than they actually are.

That was also my first idea and then I came to the same conclusion :-)

> I do have a question about the return value. I'm not sure if it makes a difference, as we are also assigning the retrieved value in the current implementation.
>
> Do you have more context about this? Or is there something that I'm missing?

In the current implementation you assign the value to a discard (_) and the method has no return value. That could let the compiler optimize away the very line we want to benchmark. Good practice for benchmarks is therefore to make use of the return value, as also stated here: Avoid dead code elimination.

I agree that there is no difference in this specific benchmark with the current .NET 8 compiler, but that might change without us noticing.
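The contrast can be sketched like this (method names are mine; both assume the `data` and `LoopSize` members from the benchmark class above):

```csharp
// Anti-pattern (sketch): the result is discarded and the method returns void,
// so the JIT is in principle free to eliminate the loop body entirely.
[Benchmark]
public void Index_Discard()
{
    for (int i = 0; i < LoopSize; i++)
        _ = data[i];
}

// Safer: returning the last value forces the JIT to keep the element access,
// and BenchmarkDotNet consumes the return value for exactly this reason.
[Benchmark]
public int Index_Return()
{
    int result = default;

    for (int i = 0; i < LoopSize; i++)
        result = data[i];

    return result;
}
```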

Best Kanban automation moved this from Review approved to Validate Peach Jun 7, 2024
@costin-zaharia-sonarsource costin-zaharia-sonarsource added this to the 9.27 milestone Jun 7, 2024
@costin-zaharia-sonarsource costin-zaharia-sonarsource moved this from Validate Peach to Done in Best Kanban Jun 10, 2024