Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine field types based on a derived type's generic instantiation of its base class to devirtualize virtual calls #80564

Open
NinoFloris opened this issue Jan 12, 2023 · 2 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Milestone

Comments

@NinoFloris
Copy link
Contributor

NinoFloris commented Jan 12, 2023

As the title suggests, I'd like to see field types get refined based on the generic instantiation of a generic base class implemented by a derived type.

Given the following example and sharplab

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKBu2FwymzAwAIy2BhCAOwDcYUDEIA8AFQB8NAN402CjgGY29Rs1YcUbAOpQAliIAU4tv2wAbAK4wAlAG4aAXxo1cMSzAAmHUmwCSvBg8AkIiUCDcfILCYvpB0tRy1Ips8ewA+ubWMI60KYrEKhAxBl4wWroGxulmlja2bPKKMk0FqWwA9J1sAMoY0BWGbLgQbBgAFhUAUv6mXhAwuLwA5OwAIlwAomkYAHTNHVn1FQC8dTlsAFRsSnmpLtSPdAxMLOycAGonITHhEoE0GxxAB5KwYIHiX5hISSSLQ2JQAG8RKpADuU1gwIR4Xh0RhSNB4MSyVSAAcoBARKxvGxYNgFrwLABPbH4xFsDKQUKI+6FFTfHI4oQmYVQNjcv5CRqnSScyUEtjnBW81ztNgUqkwGk+NRvTRE9g48QQEyBC4NPkKIpsEpCMoVYjaPSGGBm3gWuxKuVc9nhPYu4zG03ZBoOZxq9yeHycXoTaDBP1CSKCmxi0S4ePCIHpIGBRM88Ikw50jyMlkBIJizlefSwVhivIlm1xhNioz56swABm3e1GH0gjFjRAwDwbp7fdYg5gw7YMme6s11JEPjtUAdaSCUULJqMmYTntsJdJHWtAHZPdc2OQAAy3q1sRepG1Oqqu9Z1/v7rPsUNe2Uay/BskygANqjdYMjH/Wxw3yF9ilKfRykqQM3QPYQj29Ngx3ccDXWgk44MeIA==

using System;

abstract class Converter<T>
{
    public abstract void Write(T value);
}

sealed class IntConverter: Converter<int>
{
    int _value;

    public override void Write(int value) 
    { 
       // Store it so the JIT doesn't DCE it.
       _value = value * 3;
    }
}

abstract class ValueConverter<TIn, TOut, TConverter>: Converter<TIn>
    where TConverter: Converter<TOut>
{
    protected readonly TConverter _converter;
    public ValueConverter(TConverter converter) => _converter = converter;

    protected abstract TOut ConvertTo(TIn value);
    public override void Write(TIn value) => _converter.Write(ConvertTo(value));
}

sealed class ShortConverter: ValueConverter<short, int, IntConverter>
{
    readonly IntConverter _directConverter;

    public ShortConverter(IntConverter effectiveConverter) :base(effectiveConverter) {}

    protected override int ConvertTo(short value)
    {
        return value * 100;
    }

    public void WriteDirect(short value) => _directConverter.Write(ConvertTo(value));

    public override void Write(short value) => base.Write(value);
}

Looking at the disassembly of ShortConverter.WriteDirect quickly we can see the JIT fully inlined the entire call chain including IntConverter.Write as expected.

In the disassembly of ShortConverter.Write we see the following:

  • ValueConverter.Write is devirtualized and inlined because ShortConverter is sealed.
  • The implementation of ConvertTo provided by ShortConverter is also devirtualized and inlined.
  • No devirtualization for _converter.Write and the JIT emits code for a virtual call.

What I want to understand is why the JIT won't devirtualize that final call. With the derived type information it should know the TConverter _converter field can only ever be of type IntConverter.

I understand that the CLR does not specialize code if it only involves reference types. (though I can imagine in inheritance hierarchies this statement could be a lot more nuanced?) So if I could construct ValueConverter and would disassemble Write I can understand how that's not inlining to whatever reference type TConverter I instantiated it at.

In this case though it's a new derived class (and a sealed one at that) and the most accurate information is not incorporated into the code that is being jitted. The method has to be jitted because it's simply a new method, at that point I don't see an argument for not flowing the most accurate type information anymore?

Is this a missed optimization or is this blocked for some specific reason to prevent code bloat? If it's the latter case would it be an idea to block inlining but not devirtualization?

Additionally (and I know this creeps closer to monomorphization) I would personally prefer that such an optimization would also kick in for sealed derived classes that don't redefine base virtual methods iff its base class was generic. This should give the perf benefits without immediately bringing a lot of code duplication as I see it. @AndyAyersMS

@NinoFloris NinoFloris added the tenet-performance Performance related issue label Jan 12, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 12, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 12, 2023
@ghost
Copy link

ghost commented Jan 12, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

As the title suggest, I'd like to see field types get refined based on the generic instantiation of a generic base class implemented by a derived type.

Given the following example and sharplab

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKBu2FwymzAwAIy2BhCAOwDcYUDEIA8AFQB8NAN402CjgGY29Rs1YcUbAOpQAliIAU4tv2wAbAK4wAlAG4aAXxo1cMSzAAmHUmwCSvBg8AkIiUCDcfILCYvpB0tRy1Ips8ewA+ubWMI60KYrEKhAxBl4wWroGxulmlja2bPKKMk0FqWwA9J1sAMoY0BWGbLgQbBgAFhUAUv6mXhAwuLwA5OwAIlwAomkYAHTNHVn1FQC8dTlsAFRsSnmpLtSPdAxMLOycAGonITHhEoE0GxxAB5KwYIHiX5hISSSLQ2JQAG8RKpADuU1gwIR4Xh0RhSNB4MSyVSAAcoBARKxvGxYNgFrwLABPbH4xFsDKQUKI+6FFTfHI4oQmYVQNjcv5CRqnSScyUEtjnBW81ztNgUqkwGk+NRvTRE9g48QQEyBC4NPkKIpsEpCMoVYjaPSGGBm3gWuxKuVc9nhPYu4zG03ZBoOZxq9yeHycXoTaDBP1CSKCmxi0S4ePCIHpIGBRM88Ikw50jyMlkBIJizlefSwVhivIlm1xhNioz56swABm3e1GH0gjFjRAwDwbp7fdYg5gw7YMme6s11JEPjtUAdaSCUULJqMmYTntsJdJHWtAHZPdc2OQAAy3q1sRepG1Oqqu9Z1/v7rPsUNe2Uay/BskygANqjdYMjH/Wxw3yF9ilKfRykqQM3QPYQj29Ngx3ccDXWgk44MeIA==

using System;

abstract class Converter<T>
{
    public abstract void Write(T value);
}

sealed class IntConverter: Converter<int>
{
    int _value;

    public override void Write(int value) 
    { 
       // Store it so the JIT doesn't DCE it.
       _value = value * 3;
    }
}

abstract class ValueConverter<TIn, TOut, TConverter>: Converter<TIn>
    where TConverter: Converter<TOut>
{
    protected readonly TConverter _converter;
    public ValueConverter(TConverter converter) => _converter = converter;

    protected abstract TOut ConvertTo(TIn value);
    public override void Write(TIn value) => _converter.Write(ConvertTo(value));
}

sealed class ShortConverter: ValueConverter<short, int, IntConverter>
{
    readonly IntConverter _directConverter;

    public ShortConverter(IntConverter effectiveConverter) :base(effectiveConverter) {}

    protected override int ConvertTo(short value)
    {
        return value * 100;
    }

    public void WriteDirect(short value) => _directConverter.Write(ConvertTo(value));

    public override void Write(short value) => base.Write(value);
}

Looking at the disassembly of ShortConverter.WriteDirect quickly we can see the JIT fully inlined the entire call chain including IntConverter.Write as expected.

In the disassembly of ShortConverter.Write we see the following:

  • ValueConverter.Write is devirtualized and inlined because ShortConverter is sealed.
  • The implementation of ConvertTo provided by ShortConverter is also devirtualized and inlined.
  • No devirtualization for _converter.Write and the JIT emits code for a virtual call.

What I want to understand is why the JIT won't devirtualize that final call. With the derived type information it should know the TConverter _converter field can only ever be of type IntConverter.

I understand that the CLR does not specialize code if it only involves reference types. (though I can imagine in inheritance hierarchies this statement could be a lot more nuanced?) So if I could construct ValueConverter and would disassemble Write I can understand how that's not inlining to whatever reference type TConverter I instantiated it at.

In this case though it's a new derived class (and a sealed one at that) and the most accurate information is not incorporated into the code that is being jitted. The method has to be jitted because it's simply a new method, at that point I don't see an argument for not flowing the most accurate type information anymore?

Is this a missed optimization or is this blocked for some specific reason to prevent code bloat? If it's the latter case would it be an idea to block inlining but not devirtualization?

Additionally (and I know this creeps closer to monomorphization) I would personally prefer that such an optimization would also kick in for sealed derived classes that don't redefine base virtual methods iff its base class was generic. This should give the perf benefits without immediately bringing a lot of code duplication as I see it. @AndyAyersMS

Author: NinoFloris
Assignees: -
Labels:

tenet-performance, area-CodeGen-coreclr, untriaged

Milestone: -

@AndyAyersMS
Copy link
Member

I know there are some cases where the jit could use more precise ("exact") contexts for parsing inlinees, if the root method is not a shared instance (eg #76714 (comment)).

IIRC plumbing this sort of thing through everywhere seemed tricky, but it probably deserves a deeper look.

@AndyAyersMS AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label Jan 12, 2023
@AndyAyersMS AndyAyersMS added this to the Future milestone Jan 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

2 participants