Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify instruction set definition #33730

Merged

Conversation

davidwrighton
Copy link
Member

  • Build simple DSL to describe the instruction set support of the compiler/jit/etc
  • Parse DSL and produce data structures useable throughout our compilation environment
  • This is in support of adding more granular instruction set support to crossgen2, but this change is pulled out into its own PR to ease reviewing cost

- Build simple DSL to describe the instruction set support of the compiler/jit/etc
- Parse DSL and produce data structures useable throughout our compilation environment
- This is in support of adding more granular instruction set support to crossgen2, but this change is pulled out into its own PR to ease reviewing cost
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 18, 2020
@AndyAyersMS
Copy link
Member

cc @dotnet/jit-contrib

InstructionSet_SSE41_X64=24,
InstructionSet_SSE42_X64=25,
#endif // TARGET_AMD64
#ifdef TARGET_X86
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why duplicate TARGET_X86 and TARGET_AMD64, when the latter is just an extension of the former?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now an autogenerated file, and adding the XARCH handling is actually a fair amount of complexity to the generator.

@@ -40,7 +42,6 @@ class CORJIT_FLAGS
CORJIT_FLAG_TARGET_P4 = 9,
CORJIT_FLAG_USE_FCOMI = 10, // Generated code may use fcomi(p) instruction
CORJIT_FLAG_USE_CMOV = 11, // Generated code may use cmov instruction
CORJIT_FLAG_USE_SSE2 = 12, // Generated code may use SSE-2 instructions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note for a potential future PR, if we are removing this, it might be worth revisiting some of the others as well.

CMOVcc was in the Pentium Pro (1995) while SSE came later in the Pentium III (1999).
SSE2 came in the Pentium 4 which I presume is the TARGET_P4 we have.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This corjit flag isn't really removed. We now have a separate InstructionSet_SSE2 for it, along with all of the rest of the instruction set stuff.

We could probably remove the CMOV and FCOMI stuff as it isn't really used for anything, and the TARGET_P4 is only tied to an optimization that I bet we never hit. (TARGET_P4 generates code that runs slower on any other core than a P4, but runs subtly better on a P4). I'm not really trying to make a bigger change than this though, so if you want to adjust that, please do it in another PR.

#define READYTORUNINSTRUCTIONSET_H
enum ReadyToRunInstructionSet
{
READYTORUN_INSTRUCTION_Sse=1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange that we are mixing x86 and ARM instruction sets in here where we separate them in other parts of the runtime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ReadyToRun constants are generally defined in an architecture independent way, and this just follows that convention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this going to expand beyond 32-bits?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These constants are simply increasing numbers. The planned encoding for this is actually as an array of raw numbers so there is no limit to 32 different instruction sets at all, the limit is more like 2^30 different instruction sets supported, which is definitely enough.

}

if (jitFlags.IsSet(JitFlags::JIT_FLAG_USE_LZCNT) && JitConfig.EnableLZCNT())
if (!JitConfig.EnableSSE())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous nesting was also mirroring the implementation hierarchy. Is this going to remain correct based on VM checks (and also take into account the COMPlus_EnableIsa=0 checks)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That is what the code that calls EnsureInstructionSetFlagsAreValid is doing. It represents the hierarchy of support described in /src/coreclr/src/tools/Common/JitInterface/ThunkGenerator/InstructionSetDesc.txt, and means we don't have to duplicate the hierarchy in the VM, the JIT, crossgen2, and wherever else we put this thing.

@@ -34,95 +34,95 @@ PAL_GetJitCpuCapabilityFlags(CORJIT_FLAGS *flags)
// available, using the latest kernel for release should be sufficient.
#ifdef HWCAP_AES
if (hwCap & HWCAP_AES)
CPUCompileFlags.Set(CORJIT_FLAGS::CORJIT_FLAG_HAS_ARM64_AES);
CPUCompileFlags.Set(InstructionSet_Aes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to keep arm somewhere in this name, so it can be differentiated from other platforms that have AES?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the C++ world, the InstructionSet_XXX enumeration values are architecture specific. In the C# world they are not. The Instruction_XXX names were specifically chosen to match the existing names as used in the jit to make the delta for this change reviewable.

case InstructionSet.ARM64_Sha1: return ReadyToRunInstructionSet.Sha1;
case InstructionSet.ARM64_Sha256: return ReadyToRunInstructionSet.Sha256;
case InstructionSet.ARM64_Atomics: return ReadyToRunInstructionSet.Atomics;
case InstructionSet.ARM64_Vector64: return ReadyToRunInstructionSet.;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a Base entry?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, you caught me, this file isn't actually attached to the build yet. Once this is merged, I'll rebase my existing ISA changes to use this stuff, and … well I'll have to fix this problem. I'd rather not invest too much on the R2R instruction set stuff at this time in this PR as I don't have the code in the branch that will use any of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this, even though this file still isn't attached to the build.

flags.Set(InstructionSet.X86_PCLMULQDQ);
flags.Set(InstructionSet.X86_SSE3);
flags.Set(InstructionSet.X86_SSSE3);
flags.Set(InstructionSet.X86_LZCNT);
Copy link
Member

@tannergooding tannergooding Mar 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is mirroring the current state, but LZCNT was released (Haswell) the same time as AVX2, BMI1, and BMI2. I don't think it makes sense to have it here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I REALLY don't want to change any of the behavior of the system with this change. Also, fortunately, there isn't actually any use of this instruction in corelib, so it doesn't really matter.

If we want to change the current state of the world, we should do it in another PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to change the current state of the world, we should do it in another PR.

I'm fine with that, but we definitely are using it from multiple places in corelib: https://source.dot.net/#System.Private.CoreLib/BitOperations.cs,f10c861b0040aaea,references

)
}
}
else if (targetArchitecture == TargetArchitecture.X64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could combine the x86 and x64 codepaths and just add the *_X64 variants for x64.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no guarantee that the X86 and X64 variants actually use the same constants. (It so happens they do at the moment, but that is one of the few actually negative consequences of this change, which is that the constants could become out of sync.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is unfortunate, as it means we have a lot of duplicated code between the two of them (and that will likely grow over time).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does seem unfortunate. Does it really add a lot of complexity to share them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it would be quite a bit of complexity, as the generator is pretty dumb. Mind you, the followon work is expected to delete this (the only logic which adds this duplication of effort) in favor of some string handling which will make all of this unified again. I hope to put together that PR by end of week, assuming I can get this in tonight or tomorrow morning, so I don't see much value in investing in a bit of code that should only live for about 48-96 hours before beginning active replacement.

@tannergooding
Copy link
Member

Overall the changes look good and they will likely make maintaining this easier across the runtime as a whole.

I'll log separate bugs for a couple of the things I called out (like us using LZCNT as a "baseline" in S.P.Corelib where it is an AVX2 era instruction for Intel).

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires a new jit guid, right?

Would be good to double-check that this is zero diff.

// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

// DO NOT EDIT THIS FILE! IT IS AUTOGENERATED
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice for the autogenerated files to list the source files they depend on (and maybe also the commands to actually propagate updates).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, a proliferation of mystery isn't ideal.

@@ -8270,7 +8270,7 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
return false;
}

bool compSupports(InstructionSet isa) const
bool compSupports(CORINFO_InstructionSet isa) const
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we should remove this method, retype compSupportsISA as CORINFO_InstructionSetFlags, and have consumers call the equivalent methods defined on that type. But this can be done as a follow-up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the idea was to take the new names we discussed in the other WIP PR that I have, but I didn't want to merge all of that with this.

@@ -1181,10 +1181,17 @@ void Zapper::InitializeCompilerFlags(CORCOMPILE_VERSION_INFO * pVersionInfo)
}

// .NET Core requires SSE2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// .NET Core requires SSE2.

Copy link
Contributor

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a nice simplification. I have a question and a couple of comments.

if (hwCap & HWCAP_DCPOP)
CPUCompileFlags.Set(CORJIT_FLAGS::CORJIT_FLAG_HAS_ARM64_DCPOP);
// if (hwCap & HWCAP_DCPOP)
// CPUCompileFlags.Set(CORJIT_FLAGS::CORJIT_FLAG_HAS_ARM64_DCPOP);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these (and those below) commented out?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't currently used (they represent ISAs we don't actually support today). I would guess that is why they are commented out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have final names for these, and they aren't used. OTOH, when it becomes time to bring them back, uncommenting should be convenient.

)
}
}
else if (targetArchitecture == TargetArchitecture.X64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does seem unfortunate. Does it really add a lot of complexity to share them?


; Definition of X86 instruction sets

instructionset,X86,Sse,,1,SSE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be easier to read (and hopefully not significantly harder for the tool to parse) if the columns lined up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

- Fix CORJIT_FLAGS::Add, Reset, and IsEmpty
- Make it harder to mess up 64bit instruction set variants by adding helper function to fill them in.
…need to add a large number of #ifdefs to the jit
@davidwrighton
Copy link
Member Author

@AndyAyersMS I've verified there are no longer any jit diffs.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks David. This looks good to me.

Copy link
Contributor

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks!

InstructionSet_POPCNT=15,
InstructionSet_Vector128=16,
InstructionSet_Vector256=17,
InstructionSet_BMI1_X64=18,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just making sure, do these need to be defined so x86 can report false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I got everything working, but I still couldn't build x86 …. and the amount of idefs that were going to be required was entirely unreasonable.

if (HasInstructionSet(InstructionSet_POPCNT))
AddInstructionSet(InstructionSet_POPCNT_X64);
#endif // TARGET_AMD64
#ifdef TARGET_X86
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just confirming, this is meant to be empty and isn't a bug?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, having the generator skip if empty or leave a comment if empty might be more clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

@@ -46,18 +47,21 @@ implication ,X86 ,POPCNT ,SSE42
instructionset ,X86 , , , ,Vector128
instructionset ,X86 , , , ,Vector256

; Definition of X64 instruction sets
; Definition of X64 instruction sets (Define )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange to define instructionset64bit for a non-64bit architecture.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, custom dsls are weird. But avoiding very large diffs that don't really provide value is worth it.

Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just had a couple questions about the x86 vs x64 definitions.

@davidwrighton davidwrighton merged commit f80a514 into dotnet:master Mar 19, 2020
davidwrighton added a commit to davidwrighton/runtime that referenced this pull request Mar 21, 2020
jkotas pushed a commit that referenced this pull request Mar 21, 2020
davidwrighton added a commit to davidwrighton/runtime that referenced this pull request Mar 23, 2020
davidwrighton added a commit that referenced this pull request Mar 24, 2020
* Revert "Revert "Unify instruction set definition (#33730)""

This reverts commit 5a71f14.

* Ensure that 64 bit variants of instruction sets are handled correctly
- Make sure to enable them based on the related 32bit instruction sets before disabling instruction sets that are enabled but not compatible with the instruction set hierarchy the runtime is designed for.

* Update jit EE version interface as this revert changes the interface back to the pre-revert state.
@ghost ghost locked as resolved and limited conversation to collaborators Dec 10, 2020
@davidwrighton davidwrighton deleted the generated_instruction_set_processing branch April 20, 2021 17:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants