unknown Bug/NotBug about span benchmarks #54758

Closed
rootflood opened this issue Jun 25, 2021 · 5 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Comments

@rootflood

Description

I don't know whether this is a bug. I recently saw an issue about a performance impact with spans: #54672.
I used that benchmark but got an unexpected result. I use one class as the base of the benchmark and other classes as sub-benchmarks. The subclasses contain no benchmarks of their own; they only exist to test something that seemed strange to me, and the only benchmark is in the base class. When I run the same method through different subclasses, the results differ, even though no method from the subclasses is ever called. I don't know what the problem is or whether it comes from my code, because I'm not an expert in this field.

the bench repo

Configuration

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19043.985 (21H1/May2021Update)
.NET SDK=5.0.100
[Host] : .NET 5.0.0 (5.0.20.51904), X64 RyuJIT
DefaultJob : .NET 5.0.0 (5.0.20.51904), X64 RyuJIT

Benchmark

    public abstract class BenchmarksBase
    {
        protected int[] _array;

        [Params(1_000_000)]
        public int Count { get; set; }

        [GlobalSetup]
        public void GlobalSetup()
        {
            _array = new int[Count];
        }
        
        [Benchmark]
        public int Span_ForEach()
        {
            var Res = 0;
            var source = _array.AsSpan();
            for (int i = 0; i < 1000; i++)
            {
                Res = 0;
                foreach (var item in source)
                    Res += item;
            }
            return Res;
        }
    }

Sub Benchmarks

    public class Benchmarks1 : BenchmarksBase
    {}
    public class Benchmarks2 : BenchmarksBase
    {
        public int Array_For()
        {
            var array = _array;
            var Res = 0;
            for (int To = 0; To < 1000; To++)
            {
                Res = 0;
                for (int i = 0; i < array.Length; i++)
                    Res += array[i];
            }
            return Res;
        }
    }

    public class Benchmarks3 : BenchmarksBase
    {
        public int Array_For()
        {...} // same like Benchmarks2.Array_For()

        public int Array_For1()
        {...} // same like Benchmarks2.Array_For()

        public int Array_For2()
        {...} // same like Benchmarks2.Array_For()

        public int Array_For3()
        {...} // same like Benchmarks2.Array_For()
    }

Benchmark Results

Benchmarks1 class:

| Method | Count | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | Code Size |
|---|---|---|---|---|---|---|---|---|---|
| Span_ForEach | 1000000 | 736.6 ms | 10.36 ms | 11.52 ms | - | - | - | - | 65 B |

Benchmarks2 class:

| Method | Count | Mean | Error | StdDev | Code Size | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| Span_ForEach | 1000000 | 1.054 s | 0.0064 s | 0.0057 s | 65 B | - | - | - | - |

Benchmarks3 class:

| Method | Count | Mean | Error | StdDev | Code Size | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| Span_ForEach | 1000000 | 733.8 ms | 8.68 ms | 7.25 ms | 0 KB | - | - | - | 1 KB |

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

dotnet-issue-labeler bot added the untriaged label (New issue has not been triaged by the area owner) on Jun 25, 2021
@huoyaoyuan
Member

The Allocated column hints that the method bodies are different.
You can use SharpLab to compare the asm of the method bodies.

@rootflood
Author

Thanks, you are right, but in SharpLab there is only one "Span_ForEach" method, with a single body:

BenchmarksBase.Span_ForEach()
    L0000: mov rax, [rcx+8]
    L0004: test rax, rax
    L0007: jne short L000f
    L0009: xor edx, edx
    L000b: xor ecx, ecx
    L000d: jmp short L0016
    L000f: lea rdx, [rax+0x10]
    L0013: mov ecx, [rax+8]
    L0016: xor eax, eax
    L0018: xor r8d, r8d
    L001b: xor r9d, r9d
    L001e: test ecx, ecx
    L0020: jle short L0034
    L0022: movsxd r10, r9d
    L0025: mov r10d, [rdx+r10*4]
    L0029: add r8d, r10d
    L002c: inc r9d
    L002f: cmp r9d, ecx
    L0032: jl short L0022
    L0034: inc eax
    L0036: cmp eax, 0x3e8
    L003b: jl short L0018
    L003d: mov eax, r8d
    L0040: ret

But in the asm from debug mode on my machine, "Span_ForEach" does have a different body, if I understand correctly. It looks like the copies are optimized in different ways, but I didn't expect this difference.
Before this issue, I found a similar difference in another benchmark of mine here, where the timings of "Array_ForEach" and "Array_ForEach_Inside" are unexpectedly different.

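| Method | Count | Mean | Error | StdDev | Ratio | RatioSD | Code Size | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Array_ForEach | 1000000 | 742.0 ms | 12.16 ms | 10.78 ms | baseline | | 69 B | - | - | - | 560 B |
| Array_ForEach_Inside | 1000000 | 1,047.6 ms | 5.31 ms | 4.44 ms | 1.41x slower | 0.02x | 52 B | - | - | - | 1,240 B |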

It looks like that problem can lead to the same problem in span performance. In the benchmark I described at the start of this issue, one class is 1.43x slower than the others, like what was found in #54672.

jeffschwMSFT added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) on Jun 26, 2021
@adamsitnik
Member

This is a memory alignment issue. I've provided a full answer in dotnet/BenchmarkDotNet#1733 (comment).
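
For readers who want to look at alignment on their own machine, here is a minimal sketch. It assumes the comment above refers to where the array's data lands in memory, which this thread does not spell out. The `AlignmentCheck` class and its `AddressModulo` helper are hypothetical names introduced for illustration and are not part of the benchmark repo. The sketch pins an array and reports the offset of its first element within a 32-byte and a 64-byte boundary.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the original benchmark code.
public static class AlignmentCheck
{
    // Returns the address of the array's first element modulo `boundary`.
    public static long AddressModulo(int[] array, int boundary)
    {
        // Pin the array so the GC cannot move it while its address is read.
        GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
        try
        {
            return handle.AddrOfPinnedObject().ToInt64() % boundary;
        }
        finally
        {
            handle.Free();
        }
    }

    public static void Main()
    {
        // Same size as the Count parameter used in the benchmark.
        var array = new int[1_000_000];
        Console.WriteLine($"offset within 32 bytes: {AddressModulo(array, 32)}");
        Console.WriteLine($"offset within 64 bytes: {AddressModulo(array, 64)}");
    }
}
```

If the printed offsets differ between processes or between benchmark classes, that would be consistent with an alignment effect, although it does not prove one; the linked answer remains the authoritative explanation.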

adamsitnik removed the untriaged label (New issue has not been triaged by the area owner) on Jun 28, 2021
@adamsitnik
Member

cc @kunalspathak

@dotnet dotnet locked as resolved and limited conversation to collaborators Jul 28, 2021