Intel architecture improvements for .NET 9 #93196

Open
12 of 33 tasks
BruceForstall opened this issue Oct 9, 2023 · 11 comments

@BruceForstall
Member

BruceForstall commented Oct 9, 2023

This issue describes planned improvements to Intel architecture (x86, x64) ISA support for .NET 9.

In .NET 8, AVX-512 ISA support was added (see #77034). In .NET 9, this support will be improved further and used for better performance, especially through expanded use of AVX-512 in the libraries. Investigation and implementation work will also begin on support for the newly announced AVX10.

Libraries work

Vector<T>

  • Consider Vector<T> expanding to Vector512<T>, either automatically or opt-in.

AVX10

AVX10 is a new set of vector ISA extensions, described here. We expect to begin preliminary work to support AVX10 in .NET 9, covering at least the parts that map most directly to the already supported AVX-512. The arch-avx10 GitHub label should be added to all related PRs and issues: https://github.com/dotnet/runtime/labels/arch-avx10

  • Convert remaining AVX2 implementations to Vector256 (to be "help wanted")
  • Add VM/JIT AVX10 awareness: CPUID enumeration and detection
  • Propose a new AVX10 API: [API Proposal]: Expose AVX10 converged vector ISA #98069
  • (Q2'24) Do JIT codegen implementation of the API
  • (Q2'24) Add AVX10 APIs
  • Enhance Vector256 codegen with AVX10 instructions (related to what has already been done for AVX512VL)
  • (Q2'24) Allow additional 16 YMM registers for AVX10
  • Allow embedded rounding for YMM/ZMM (related: Enable EVEX embedded rounding support in xarch emitter #93154)
  • Allow AVX-512 optimizations for YMM (e.g., scalar conversion, vpternlog)
  • Identify test plan for .NET 9 sign-off
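
To illustrate the "CPUID enumeration and detection" item in the list above, here is a minimal sketch of how AVX10 capabilities could be decoded from raw CPUID output. It assumes the bit layout from the initial AVX10 architecture specification (CPUID.(EAX=07H,ECX=01H):EDX[19] for presence, and leaf 24H, sub-leaf 0, EBX for the version and vector lengths); check the current revision of the specification before relying on it. The names `CpuidRegs`, `Avx10Info`, and `DecodeAvx10` are hypothetical and are not taken from the dotnet/runtime sources.

```cpp
#include <cstdint>

// Raw register values returned by one CPUID query. How they are obtained
// (compiler intrinsic, inline assembly, OS API) is deliberately left out.
struct CpuidRegs {
    std::uint32_t eax, ebx, ecx, edx;
};

// Hypothetical summary type for this sketch; not a dotnet/runtime type.
struct Avx10Info {
    bool supported;        // AVX10 is enumerated at all
    std::uint8_t version;  // converged vector ISA version (1 for AVX10.1)
    bool v128;             // 128-bit vector length supported
    bool v256;             // 256-bit vector length supported
    bool v512;             // 512-bit vector length supported
};

// Decodes AVX10 information from two CPUID results:
//   leaf7Sub1  = CPUID with EAX=07H, ECX=01H
//   leaf24Sub0 = CPUID with EAX=24H, ECX=00H
// Assumption: the caller has already verified that both leaves are within
// the maximum basic leaf reported by CPUID leaf 0.
inline Avx10Info DecodeAvx10(const CpuidRegs& leaf7Sub1, const CpuidRegs& leaf24Sub0) {
    // Assumed layout: EDX bit 19 of leaf 7, sub-leaf 1 enumerates AVX10.
    const bool supported = ((leaf7Sub1.edx >> 19) & 1u) != 0;
    if (!supported) {
        return Avx10Info{false, 0, false, false, false};
    }
    // Assumed layout: EBX[7:0] is the version, EBX[16..18] are the lengths.
    return Avx10Info{
        true,
        static_cast<std::uint8_t>(leaf24Sub0.ebx & 0xFFu),
        ((leaf24Sub0.ebx >> 16) & 1u) != 0,
        ((leaf24Sub0.ebx >> 17) & 1u) != 0,
        ((leaf24Sub0.ebx >> 18) & 1u) != 0,
    };
}
```

Keeping the decoding separate from the CPUID query itself means the logic can be exercised with synthetic register values on machines that do not have AVX10 hardware.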

RyuJIT feature work

RyuJIT optimization work

Debugging / diagnostics work

API design work

JCC erratum

@BruceForstall added the area-CodeGen-coreclr and User Story labels Oct 9, 2023
@BruceForstall added this to the 9.0.0 milestone Oct 9, 2023
@ghost

ghost commented Oct 9, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


@BruceForstall added this to Needs Triage in .NET Core CodeGen via automation Oct 9, 2023
@BruceForstall moved this from Needs Triage to Team User Stories in .NET Core CodeGen Oct 9, 2023
@BruceForstall self-assigned this Oct 9, 2023
@MichalPetryka
Contributor

Is there any interest in adding the workaround for the JCC erratum (#35730) in .NET 9? I've seen minor codegen improvements reported as huge regressions because the code started to hit this issue.

@BruceForstall
Member Author

> Is there any interest in adding the workaround for the JCC erratum (#35730) in .NET 9? I've seen minor codegen improvements reported as huge regressions because the code started to hit this issue.

@AndyAyersMS has expressed a desire to at least have a mode that could be used for performance testing to avoid the JCC erratum. Whether we could enable this always would depend on how uniform the improvements would be. It is expected there would be some code size regressions -- possibly significant -- due to the need to insert NOPs.

@MichalPetryka
Contributor

> It is expected there would be some code size regressions -- possibly significant -- due to the need to insert NOPs.

Didn't we already accept that tradeoff with loop alignment?

@BruceForstall
Member Author

> Didn't we already accept that tradeoff with loop alignment?

Yes, but this could be a very different magnitude of regression that will need to be measured.

@BruceForstall
Member Author

I went ahead and created #93243 related to adding a JIT mode to avoid the JCC erratum, and linked it here.
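
For readers unfamiliar with why the workaround costs code size, here is a rough sketch of the kind of check involved. It assumes the condition Intel describes for the erratum: a jump instruction (or macro-fused compare-and-jump pair) is affected when it crosses a 32-byte boundary or ends exactly on one. The function names are hypothetical, and this is not the JIT's implementation; it only shows how many padding bytes would push one affected instruction to the start of the next 32-byte block.

```cpp
#include <cstdint>

// Alignment granularity relevant to the JCC erratum (assumption: 32 bytes,
// per Intel's published mitigation guidance).
constexpr std::uint32_t kJccBoundary = 32;

// Returns true if an instruction occupying [offset, offset + length)
// crosses a 32-byte boundary or ends exactly on one.
inline bool TouchesJccBoundary(std::uint64_t offset, std::uint32_t length) {
    if (length == 0) {
        return false;  // nothing to place, so nothing can be affected
    }
    const std::uint64_t end = offset + length;  // one past the last byte
    const bool crosses = (offset / kJccBoundary) != ((end - 1) / kJccBoundary);
    const bool endsOnBoundary = (end % kJccBoundary) == 0;
    return crosses || endsOnBoundary;
}

// Returns the number of padding bytes (for example NOPs) to insert before
// the instruction so that it starts at the next 32-byte boundary, or 0 if
// it is not affected. Assumes length < 32, which holds for a single x86
// instruction or a macro-fused pair.
inline std::uint32_t JccPaddingBytes(std::uint64_t offset, std::uint32_t length) {
    if (!TouchesJccBoundary(offset, length)) {
        return 0;
    }
    return static_cast<std::uint32_t>(kJccBoundary - (offset % kJccBoundary));
}
```

For example, a 6-byte jump at offset 30 would need 2 bytes of padding, while the same jump at offset 0 needs none. Every affected jump can add padding in this way, which is where the code size growth mentioned above comes from.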

@Spacefish
Contributor

I added Vector512 support for Min/Max of simple numeric datatypes in this PR: #93369

@huoyaoyuan
Member

What about the upcoming APX extension? It looks like a major change to x86-64. I can see discussions about the ABI for APX on the GCC mailing lists: https://gcc.gnu.org/pipermail/gcc/2023-July/242154.html https://gcc.gnu.org/pipermail/gcc-help/2023-August/142801.html

Maybe it's too early for .NET to adopt APX, but I'd like to see the estimated timeline. Should we wait for MSVC to define the calling convention?

@tannergooding
Member

We want to have hardware available on which it can run.

While Intel hasn't given an official timeline yet, such hardware most likely won't be available within the lifetime of .NET 9, which ships in November 2024 and will be out of support around May 2026.

I expect this work will be done for .NET 10, which will likely ship around November 2025 (assuming we don't change our current release pacing) and be out of support in November 2028.

@MichalPetryka
Contributor

MichalPetryka commented Feb 27, 2024

> We want to have hardware available on which it can run.

Would using Intel SDE not be enough for testing the support for it? It seems to already have support for emulating AVX10 and APX.

@tannergooding
Member

There's no point in scheduling work to be done for hardware that doesn't exist yet, particularly if that hardware is unlikely to exist within the lifetime of a release.

That is, we know that AVX10 is going to exist for Granite Rapids, as per the official announcement: https://www.intel.com/content/www/us/en/content-details/784267/intel-advanced-vector-extensions-10-intel-avx10-architecture-specification.html. The AVX10.1 work is correspondingly happening in .NET 9.

While no official release date has been announced for APX, it is unlikely to happen in a timeframe that makes .NET 9 a good choice to target.
