JIT does not inline function with generic struct argument #116740

Closed
@hbirler

Description

I have a tiny function Op2 that takes a single generic struct argument and is called in a hot loop. The JIT refuses to inline this function. However, if I make the argument non-generic, the function is inlined.

This happens even though the function is marked with the [MethodImpl(MethodImplOptions.AggressiveInlining)] attribute.

Reproduction Steps

The code (also on Compiler Explorer, with BenchmarkDotNet removed: https://csharp.godbolt.org/z/Gxej17TsP):

using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace umbee_cli;

[DisassemblyDiagnoser(printSource: true)]
public class Test
{
    private int sum = 0;

    record struct Iu1();
    record struct Iu2();

    interface IVariables
    {
        public int Get<T>(T v);
        public void Set<T>(T v, int i);    
    }

    struct Variables : IVariables
    {
        private int iu1;
        private int iu2;


        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public int Get<T>(T v)
        {
            switch (v)
            {
                case Iu1 _: return iu1;
                case Iu2 _: return iu2;
            }
            return 0;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public void Set<T>(T v, int i)
        {
            switch (v)
            {
                case Iu1 _: iu1 = i; break;
                case Iu2 _: iu2 = i; break;
            }
        }
    }
    
    [Benchmark]
    public void ScanProduce()
    {
        Variables variables = new Variables();
        for (int i = 0; i < 100; i++)
        {
            variables.Set(new Iu1(), i);
            Op2(variables);
        }
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    void Op2<TVar>(TVar variables) where TVar : struct, IVariables
    {
        sum += variables.Get(new Iu1());
    }
}
class Program
{
    public static void Main(string[] args)
    {
        var summary = BenchmarkRunner.Run<Test>();
    }
}

The ASM:
.NET 10.0.0 (10.0.25.27814), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

; umbee_cli.Test.ScanProduce()
;         Variables variables = new Variables();
;         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
;         for (int i = 0; i < 100; i++)
;              ^^^^^^^^^
;             variables.Set(new Iu1(), i);
;             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
;             Op2(variables);
;             ^^^^^^^^^^^^^^^
       push      rsi
       push      rbx
       sub       rsp,28
       mov       rbx,rcx
       xor       esi,esi
M00_L00:
       mov       [rsp+20],esi
       xor       edx,edx
       mov       [rsp+24],edx
       mov       rdx,[rsp+20]
       mov       rcx,rbx
       call      qword ptr [7FFCD27EF150]; umbee_cli.Test.Op2[[umbee_cli.Test+Variables, umbee-cli]](Variables)
       inc       esi
       cmp       esi,64
       jl        short M00_L00
       add       rsp,28
       pop       rbx
       pop       rsi
       ret
; Total bytes of code 49
; umbee_cli.Test.Op2[[umbee_cli.Test+Variables, umbee-cli]](Variables)
;         sum += variables.Get(new Iu1());
;         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       add       [rcx+8],edx
       ret
; Total bytes of code 4

Although Op2 is extremely tiny, it is not inlined. The same happens when I pass the struct by ref.
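For reference, the by-ref variant mentioned here might look like the following minimal sketch. The class name `ByRefRepro`, the `Run` method, and the public `Sum` field are hypothetical names introduced for this illustration; the nested types mirror the reproduction above. The sketch only shows the call shape and does not by itself demonstrate whether the JIT inlines the call.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical wrapper class; only the Op2 signature differs from the reproduction.
public class ByRefRepro
{
    public int Sum;

    record struct Iu1();
    record struct Iu2();

    interface IVariables
    {
        int Get<T>(T v);
        void Set<T>(T v, int i);
    }

    struct Variables : IVariables
    {
        private int iu1;
        private int iu2;

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public int Get<T>(T v)
        {
            // Select the slot by the static type of the key.
            switch (v)
            {
                case Iu1 _: return iu1;
                case Iu2 _: return iu2;
            }
            return 0;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public void Set<T>(T v, int i)
        {
            switch (v)
            {
                case Iu1 _: iu1 = i; break;
                case Iu2 _: iu2 = i; break;
            }
        }
    }

    public void Run()
    {
        Variables variables = new Variables();
        for (int i = 0; i < 100; i++)
        {
            variables.Set(new Iu1(), i);
            Op2(ref variables); // struct passed by reference instead of by value
        }
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    void Op2<TVar>(ref TVar variables) where TVar : struct, IVariables
    {
        Sum += variables.Get(new Iu1());
    }
}
```

After `Run()` completes, `Sum` holds 0 + 1 + ... + 99, the same value the by-value version accumulates.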

Expected behavior

I would expect the JIT to inline Op2 within ScanProduce.

Here is the ASM when I change the signature of Op2 to void Op2(Variables variables):
.NET 10.0.0 (10.0.25.27814), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

; umbee_cli.Test.ScanProduce()
;         Variables variables = new Variables();
;         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
;         for (int i = 0; i < 100; i++)
;              ^^^^^^^^^
;             variables.Set(new Iu1(), i);
;             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
;             Op2(variables);
;             ^^^^^^^^^^^^^^^
       xor       eax,eax
M00_L00:
       add       [rcx+8],eax
       inc       eax
       cmp       eax,64
       jl        short M00_L00
       ret
; Total bytes of code 13

Actual behavior

JIT does not inline Op2.

Regression?

No response

Known Workarounds

No response

Configuration

.NET 10.0.0 (10.0.25.27814), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
Windows 11

The release version of .NET 9 also has this issue, so this does not seem to be a recent regression.

Other information

Some context: I am trying to build an efficient data pipeline in C#. My goal is to implement operators using generics and instantiate these generics at runtime for a given user query.
For example, for the following query:

select sum(x)
from generate_series(0, 99) t(x);

I have ScanProduce, which produces the values between 0 and 99, and Op1, which adds up the values given by ScanProduce into a result.

I need a way of setting and getting values so that these operators can pass information among themselves based on the user-defined query. The hope is that the JIT can inline everything, avoiding copies and spills.

The rough sketch of the processing would be:
Query -> Prepare struct Variables with reflection -> Generic instantiation -> JIT -> Very efficient code for executing query
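The "Generic instantiation" step of this sketch could look roughly like the following. This is a minimal illustration under stated assumptions, not the actual implementation: `SingleSlot`, `Pipeline`, `SumSeries`, and `Run` are hypothetical names, the `IVariables` interface is simplified to a single non-generic slot, and the struct type is assumed to already exist rather than being prepared with reflection per query. The reflection calls used (`Type.GetMethod`, `MethodInfo.MakeGenericMethod`, `MethodInfo.Invoke`) are standard `System.Reflection` APIs.

```csharp
using System;
using System.Reflection;

// Simplified, hypothetical version of the interface: one int slot, no generic keys.
public interface IVariables
{
    int Get();
    void Set(int i);
}

// Hypothetical stand-in for a Variables struct that would be chosen per query.
public struct SingleSlot : IVariables
{
    private int value;
    public int Get() => value;
    public void Set(int i) => value = i;
}

public static class Pipeline
{
    // Generic operator: because TVar is a struct, the JIT compiles a
    // dedicated method body for each distinct TVar.
    public static int SumSeries<TVar>(int start, int end) where TVar : struct, IVariables
    {
        TVar variables = default;
        int sum = 0;
        for (int i = start; i <= end; i++)
        {
            variables.Set(i);       // producer writes the value
            sum += variables.Get(); // consumer reads it back
        }
        return sum;
    }

    // Reflection is used once, to select the instantiation; the loop itself
    // is ordinary generic code that can be debugged with breakpoints.
    public static int Run(Type variablesType, int start, int end)
    {
        MethodInfo open = typeof(Pipeline).GetMethod(nameof(SumSeries))
            ?? throw new InvalidOperationException("SumSeries not found");
        MethodInfo closed = open.MakeGenericMethod(variablesType);
        return Convert.ToInt32(closed.Invoke(null, new object[] { start, end }));
    }
}
```

With these assumptions, `Pipeline.Run(typeof(SingleSlot), 0, 99)` corresponds to the example query and returns 4950, the sum of 0 through 99.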

I hope to avoid reflection within the implementation of the operators and to rely on generics as much as possible, so that I can use the debugger effectively (breakpoints, etc.) when debugging operators.

Metadata

Labels

area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
