Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Expose xarch serialize instruction. #66467

Closed
anthonycanino opened this issue Mar 10, 2022 · 20 comments
Closed

[API Proposal]: Expose xarch serialize instruction. #66467

anthonycanino opened this issue Mar 10, 2022 · 20 comments
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime.Intrinsics
Milestone

Comments

@anthonycanino
Copy link
Contributor

anthonycanino commented Mar 10, 2022

Background and motivation

Currently, the cpuid instruction also functions as a full serializing instruction --- one that ensures all modifications to flags, registers, and memory are drained before the next instruction can execute --- on x86 hardware. Executing a serializing instructions on P6 and more recent processor families constrain speculative execution because the results of speculatively executed instructions are discarded (see Chapter 8: Multiple-Processor Management in the Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A)

However, cpuid has overhead in that it clobbers EAX, EBX, ECX, and EDX registers. Intel exposes a new instruction, serialize under the cpuid feature flag SERIALIZE which acts as a full serializing instruction without the overhead of clobbering the registers as cpuid does.

Full details of the serialize instruction can be found in Chapter 2.1 Instruction Set Reference in Intel Architecture Instruction Set Extensions and Future Features.

API Proposal

namespace System.Runtime.Intrinsics.X86
{
    /// <summary>
    /// This class provides access to Intel SERIALIZE hardware instruction via intrinsics
    /// </summary>
    [Intrinsic]
    [CLSCompliant(false)]
    public abstract class X86Serialize : X86Base
    {
        internal Serialize() { }

        public static new bool IsSupported { get => IsSupported; }

        [Intrinsic]
        public new abstract class X64 : X86Base.X64
        {
            internal X64() {  }

            public static new bool IsSupported { get => IsSupported; }
        }

        /// <summary>
        /// void _serialize (void);
        /// </summary>
        public static void Serialize() => Serialize();

    }
}

API Usage

Broadly speaking, the developer can issue X86Serialize.Serialize() whenever they want to serialize the instruction stream before a branching conditional:

X86Serialize.Serialize();
if (someCondition)
{
  // ...
}

Alternative Designs

dotnet developers currently need to call System.Runtime.Intrinsics.X86.X86Base.CpuId() to issue a full serializing instruction which has the extra overhead stated above as opposed to issuing the serialize instruction.

Risks

As serialize functions as an alternative to cpuid with less overhead and a 0-arg use, we don't foresee any additional risks here.

@anthonycanino anthonycanino added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Mar 10, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Mar 10, 2022
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost
Copy link

ghost commented Mar 10, 2022

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

Currently, the cpuid instruction also functions as a full serializing instruction --- one that ensures all modifications to flags, registers, and memory are drained before the next instruction can execute --- on x86 hardware. Executing a serializing instructions on P6 and more recent processor families constrain speculative execution because the results of speculatively executed instructions are discarded (see Chapter 8: Multiple-Processor Management in the Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A)

However, cpuid has overhead in that it clobbers EAX, EBX, ECX, and EDX registers. Intel exposes a new instruction, serialize under the cpuid feature flag SERIALIZE which acts as a full serializing instruction without the overhead of clobbering the registers as cpuid does.

Full details of the serialize instruction can be found in Chapter 2.1 Instruction Set Reference in Intel Architecture Instruction Set Extensions and Future Features.

API Proposal

namespace System.Runtime.Intrinsics.X86
{
    /// <summary>
    /// This class provides access to Intel SERIALIZE hardware instruction via intrinsics
    /// </summary>
    [Intrinsic]
    [CLSCompliant(false)]
    public abstract class Serialize : X86Base
    {
        internal Serialize() { }

        public static new bool IsSupported { get => IsSupported; }

        [Intrinsic]
        public new abstract class X64 : X86Base.X64
        {
            internal X64() {  }

            public static new bool IsSupported { get => IsSupported; }
        }

        /// <summary>
        /// void _serialize (void);
        /// </summary>
        public static void Serialize() => Serialize();

    }
}

API Usage

Broadly speaking, the developer can issue Serialize.Serialize() whenever they want to serialize the instruction stream before a branching conditional:

Serialize.Serialize();
if (someCondition)
{
  // ...
}

Alternative Designs

dotnet developers currently need to call System.Runtime.Intrinsics.X86.X86Base.CpuId() to issue a full serializing instruction which has the extra overhead stated above as opposed to issuing the serialize instruction.

Risks

As serialize functions as an alternative to cpuid with less overhead and a 0-arg use, we don't foresee any additional risks here.

Author: anthonycanino
Assignees: -
Labels:

api-suggestion, area-System.Runtime.Intrinsics, untriaged

Milestone: -

@tannergooding tannergooding added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation untriaged New issue has not been triaged by the area owner labels Mar 10, 2022
@tannergooding
Copy link
Member

Thanks for the proposal here! I've gone ahead and marked it ready for review.

Notably, we can't name the class and method both Serialize so we will need to disambiguate somewhere. Just an initial thought/suggestion is X86Serialize.Serialize which would also ensure the class isn't itself more generally conflicting.

I don't have a strong preference, however.

@anthonycanino
Copy link
Contributor Author

anthonycanino commented Mar 10, 2022

Thanks for looking at it! X86Serialize.Serialize is fine with me. I'll adjust.

@anthonycanino
Copy link
Contributor Author

Hi @tannergooding , I can see this API hasn't been discussed yet (I believe) in the API proposal review meetings.

Is there a long backlog of discussions before this would get discussed? Thanks!

@tannergooding tannergooding added this to the 7.0.0 milestone Mar 30, 2022
@tannergooding
Copy link
Member

There is a decent backlog of issues which can be seen here: https://apireview.net/.

If it is important for this to make it into .NET 7 (which will ship around November) than we can prioritize it further.

@anthonycanino
Copy link
Contributor Author

Thank you @tannergooding , I think this is ok for now. Thanks for moving it to the .NET 7 milestone.

@terrajobst
Copy link
Member

terrajobst commented Apr 12, 2022

Video

  • Looks good as proposed
namespace System.Runtime.Intrinsics.X86;

[Intrinsic]
[CLSCompliant(false)]
public abstract class X86Serialize : X86Base
{
    public static new bool IsSupported { get; }

    [Intrinsic]
    public new abstract class X64 : X86Base.X64
    {
        public static new bool IsSupported { get; }
    }

    public static void Serialize();
}

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Apr 12, 2022
@tannergooding
Copy link
Member

@anthonycanino, this is approved now and we can feel free to implement it whenever someone has availability.

@anthonycanino
Copy link
Contributor Author

@tannergooding is adding the intrinsic lowering for the mono llvm path required for this implemtation, as seen in (#61707) or is adding it to RyuJIT enough?

@tannergooding
Copy link
Member

It shouldn't be required since this is a new ISA and IsSupported will report false.

However, it is generally preferable to bring both online where possible. CC. @fanyang-mono

@fanyang-mono
Copy link
Member

Yes, if it is possible, it would be nice to add intrinsics support to Mono at the same time as CoreCLR. And I could answer questions if there is any, regarding to Mono.

@anthonycanino
Copy link
Contributor Author

Would that include the mini and llvm lowering pathway of mono, or just one?

@anthonycanino
Copy link
Contributor Author

Hi @tannergooding can you answer a question regarding adding a new ISA to RyuJIT? Currently, I am adding just two lines to the InstructionSetDesc.txt...

instructionset     ,X86   ,X86Serialize   ,   ,27 ,X86SERIALIZE ,x86serialize
; ...
implication        ,X86   ,X86SERIALIZE   ,X86Base

and generating with gen.bat in the directory. This change alone makes superpmi report a bunch of missing methods and method diffs, with no other adjustments. I assume it is because the JIT/EE interface has changed? Updating the jit guuid didn't fix the issue.

I'm trying to get the superpmi diffs stable so I can test a branch of mine that does implement the intrinsic.

@tannergooding
Copy link
Member

Right, superpmi won't work as expected when you change the JIT/EE version.

I'm not sure what the best way to get a diff in this case is. CC. @dotnet/jit-contrib who may have advice here

@AndyAyersMS
Copy link
Member

AndyAyersMS commented Apr 25, 2022

jit-diffs --pmi will work but it requires a couple of steps and before/after builds.

You can find instructions here: https://gist.github.com/EgorBo/a72bfa70f08a4d890241037c0fc1fb93

(replace --crossgen <path> with --pmi and add -f )

@anthonycanino
Copy link
Contributor Author

@AndyAyersMS will this work in my case though? Since I am creating a new ISA, I don't think I can isolate the base jit changes from the C# changes because of the re-generated corejit interface via InstructionSetDesc.txt.

Essentially, I cannot only edit the jit due to the new ISA being needed for the JIT intrinsic nodes etc.

@AndyAyersMS
Copy link
Member

Yes, it should work -- you would be comparing results from two entirely different builds.

@fanyang-mono
Copy link
Member

Would that include the mini and llvm lowering pathway of mono, or just one?

Both mini and LLVM.

@tannergooding
Copy link
Member

This was completed in #68677

@ghost ghost locked as resolved and limited conversation to collaborators Jun 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime.Intrinsics
Projects
None yet
Development

No branches or pull requests

6 participants