Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NativeAOT: Add win-x86 support #99372

Merged
merged 32 commits into from
Mar 11, 2024

Conversation

filipnavara
Copy link
Member

@filipnavara filipnavara commented Mar 6, 2024

Passes the NativeAOT smoke tests in Debug and Checked configurations:

Time [secs] | Total | Passed | Failed | Skipped | Assembly Execution Summary
============================================================================
      0.079 |     1 |      1 |      0 |       0 | nativeaot.CustomMain.XUnitWrapper.dll
      0.053 |     1 |      1 |      0 |       0 | nativeaot.GenerateUnmanagedEntryPoints.XUnitWrapper.dll
      6.308 |    16 |     16 |      0 |       0 | nativeaot.SmokeTests.XUnitWrapper.dll
      0.059 |     1 |      1 |      0 |       0 | nativeaot.StartupHook.XUnitWrapper.dll
----------------------------------------------------------------------------
      6.499 |    19 |     19 |      0 |       0 | (total)

Math helpers in ASM

Due to calling convention differences it was easier to implement the subset of math helper function in native runtime in ASM code. Most of the methods call C runtime. Some of them depend on xmm registers which rises the baseline (fully updated Windows 7 requires SSE2 instruction set anyway). The implementation is mostly minimum viable prototype and I anticipate that few of the helper methods will be moved to managed code soon.

@ghost
Copy link

ghost commented Mar 6, 2024

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Keeping this as draft as I may split it into more PRs. There are some unresolved issues which may warrant a discussion first.

Calling conventions and name mangling

The first commit in the PR deals with the calling convention and name mangling. Generally, there are three unmanaged calling conventions on x86:

  • stdcall (_name@argSize) with caller-push, callee-pop, all arguments on stack pushed right-to-left
  • cdecl (_name) with caller-push, caller-pop, all arguments on stack pushed right-to-left
  • fastcall (@name@argSize) with caller-push, callee-pop, first two arguments in ECX, EDX, rest pushed on stack right-to-left

In addition, there are managed calling conventions:

  • The default managed convention is similar to fastcall above with caller-push, callee-pop, and first two arguments in ECX, EDX; unlike fastcall the rest of parameters is pushed on stack left-to-right
  • Some math helpers use custom calling convention (eg. RhpLLsh, RhpLRsh and RhpLRsz)

We need to bridge between the managed calling convention and the unmanaged one. Some helper calls are implemented in the managed code, while others are implemented in the native code. There are two parts to this bridging - argument reordering and name mangling.

The argument reordering is handled with the COOP_PINVOKE_HELPER macro which swaps argument order for any method with 4 or more parameters.

For name mangling we basically have two options - using the fastcall one, or using something else like undecorated names. I prototyped both but this PR uses the undecorated names. Using the fastcall mangled ones would mean that we have to add knowledge about method signatures to JitHelper.cs on the managed side. That proved to be tedious and also doesn't extend well to the custom calling convention used by several helpers. The undecorated names are easy to handle on the managed side but tricky to handle on the native side. MSVC doesn't have a native way to export methods under alternative names. The trick is to use alternatename linker/pragma. The downside of the trick is that linker doesn't see the comment pragma if there's no other reference to the .obj file inside a static library. There happened to be exactly two files that have only COOP_PINVOKE_HELPER methods and no other referenced code. As a workaround I added a dummy reference from the .asm files for one function from each of those two files.

Name mangling of imports and exports

Import libraries on x86 require mangled names. Likewise, the export are usually decorated and then stripped by the linker (through .def file or other means). The code is roughly based on https://github.com/bflattened/runtime/pull/60/files. The mangling is used for UnmanagedCallersOnly exports and DllImport imports through DirectPInvoke.

Data variables are mangled as _name (ie. _ prefix). This had to be tweaked on few spots where variables are shared between the managed and native runtime code.

Math imports from LibC

Other platforms used RuntimeImport for imports of math functions from the C runtime library. Since RuntimeImport assumes the managed calling convention this approach doesn't work for x86. I changed the imports to use DllImport + SuppressGCTransition (under TARGET_X86 #ifdef). The downside is that it comes with a GC poll call. I don't have a satisfactory solution for this.

Math helpers in ASM

Due to calling convention differences it was easier to implement the subset of math helper function in native runtime in ASM code. Most of the methods call C runtime. Some of them depend on xmm registers which rises the baseline (fully updated Windows 7 requires SSE2 instruction set anyway). The implementation is mostly minimum viable prototype and I anticipate that few of the helper methods will be moved to managed code soon.

Author: filipnavara
Assignees: -
Labels:

area-NativeAOT-coreclr

Milestone: -

@filipnavara
Copy link
Member Author

cc @SingleAccretion please check the bits that would conflict with the LLVM branch.

@filipnavara
Copy link
Member Author

filipnavara commented Mar 6, 2024

Hmm. Looks like the build is broken on the CI machine. I'll check on the matching MSVC version. Worst case it's possible to split the macro magic into zero-argument and 1+-argument macros. That works even with the old preprocessor.

Also, ARM hits this annoyance:

/__w/1/s/src/tests/nativeaot/CustomMain/CustomMainNative.cpp:17:17: error: '__cdecl' calling convention is not supported for this target [-Werror,-Wignored-attributes]
extern "C" void __cdecl IncrementExitCode(int32_t amount);

(will fix it globally once other builds finish/fail)

@jkotas
Copy link
Member

jkotas commented Mar 6, 2024

Math imports from LibC

This looks fine. These calls should really be DllImports w/ SuppressGCTransition on all plaforms (#13820, #39474 was an attempt to do that).

Or maybe we bite the bullet and reimplement them in C# (#9001). We have been copying floating point math methods implementation for System.Numerics.Tensors, so we have taken on the burden of maintaining out own float algorithms anyway.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please split the math helpers and the find&replaces changes the FCalls/QCalls into a separate PR that will be merged first, so that the rest is easier to review?

src/coreclr/nativeaot/Runtime/i386/MiscStubs.asm Outdated Show resolved Hide resolved
src/coreclr/nativeaot/Runtime/CommonMacros.h Outdated Show resolved Hide resolved
src/coreclr/nativeaot/Runtime/EHHelpers.cpp Outdated Show resolved Hide resolved
@filipnavara
Copy link
Member Author

Could you please split the math helpers and the find&replaces changes the FCalls/QCalls into a separate PR that will be merged first, so that the rest is easier to review?

Sure. FWIW these changes are mostly isolated in the first commit, so viewing commit by commit is easier for this draft PR.

@@ -119,7 +119,9 @@

<PropertyGroup>
<!-- CLR NativeAot only builds in a subset of the matrix -->
<NativeAotSupported Condition="('$(TargetOS)' == 'windows' or '$(TargetOS)' == 'linux' or '$(TargetOS)' == 'osx' or '$(TargetOS)' == 'maccatalyst' or '$(TargetOS)' == 'iossimulator' or '$(TargetOS)' == 'ios' or '$(TargetOS)' == 'tvossimulator' or '$(TargetOS)' == 'tvos' or '$(TargetOS)' == 'freebsd') and ('$(TargetArchitecture)' == 'x64' or '$(TargetArchitecture)' == 'arm64' or '$(TargetArchitecture)' == 'arm')">true</NativeAotSupported>
<_NativeAotSupportedOS Condition="'$(TargetOS)' == 'windows' or '$(TargetOS)' == 'linux' or '$(TargetOS)' == 'osx' or '$(TargetOS)' == 'maccatalyst' or '$(TargetOS)' == 'iossimulator' or '$(TargetOS)' == 'ios' or '$(TargetOS)' == 'tvossimulator' or '$(TargetOS)' == 'tvos' or '$(TargetOS)' == 'freebsd'">true</_NativeAotSupportedOS>
Copy link
Member

@am11 am11 Mar 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we can invert it NativeAotUnsupportedOS; that set seems to be getting smaller. 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to do that eventually;)

Copy link
Member Author

@filipnavara filipnavara Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the unsupported OSes are tizen, wasi, and browser. Not sure of the status for haiku, illumos, solaris and netbsd which happen to have some build system support.

Unsupported archs are RV64, LA64, PowerPC (supported by Mono), WASM, and linux-x86. We may be able to shrink this list before .NET 9 ships.

Copy link
Member

@am11 am11 Mar 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, something like:

<NativeAotUnsupportedOS Condition="$([MSBuild]::ValueOrDefault(',browser,haiku,illumos,netbsd,tizen,wasi,', '').Contains(',$(TargetOS),'))">true</NativeAotUnsupportedOS>

usage: Condition="'$(NativeAotUnsupportedOS)' != 'true'"

Unsupported archs are RV64, LA64, PowerPC (supported by Mono), WASM, and linux-x86

and s390x.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this only needs to create a subset of what can already build with CoreCLR but does not build for native AOT (so that we don't break ./build.sh clr for a platform that already works otherwise).

I don't think browser/wasi/powerpc/s390 builds with CoreCLR, so we don't mind if native AOT makes ./build.sh clr more broken. If someone brings up a new platform and wants native AOT out of the way, they can exclude it, same way they can exclude anything else they don't (yet) want as part of bringup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think browser/wasi/powerpc/s390 builds with CoreCLR,

Given we have to ensure that clr.iltools and clr.packages remain intact:

buildArgs: -s mono+libs+host+packs+libs.tests+clr.iltools+clr.packages -c $(_BuildConfig) /p:ArchiveTests=true

we might as well just list the 'not yet supported' ones without classifying them further.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given we have to ensure that clr.iltools and clr.packages remain intact:

Clr.iltools "just works" because clr.iltools doesn't include coreclr VM and doesn't include any native AOT parts either, same for clr.packages.

Does building the clr subset (the entire clr subset, not just ilasm/dasm) work on s390 today? If not, I don't see why we'd need an extra exclusion for the native aot part. We shouldn't need to add extra exclusions (that are specific to native AOT) every time someone adds an obscure platform for mono. We either skip the entire clr build (i.e. we do the thing where build is skipped for all of coreclr, not just native AOT - except for iltools), or let it do whatever it does (break, probably). I think the latter is fine because that's how it is today.

@filipnavara
Copy link
Member Author

filipnavara commented Mar 8, 2024

Rebased on top of main. I'll update the top comment in the morning.

@filipnavara filipnavara marked this pull request as ready for review March 8, 2024 14:39
@@ -446,31 +447,33 @@ public PInvokeDelegateWrapperHashtable(InteropStateManager interopStateManager,
}
}

private sealed class PInvokeLazyFixupFieldHashtable : LockFreeReaderHashtable<MethodDesc, PInvokeLazyFixupField>
private readonly record struct PInvokeLazyFixupFieldKey(MethodDesc Method, int SignatureBytes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the SignatureBytes part of the key? The number is derived from MethodDesc - it should not need to be part of the key.

Copy link
Member Author

@filipnavara filipnavara Mar 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s the part I don’t like but I don’t know how to propagate it into the value in LockFreeReaderHashtable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made another attempt at this, but it still feels odd. LockFreeReaderHashtable is designed to create the value from the key and there's no easy way to pass the additional information into the PInvokeLazyFixupField constructor.

@filipnavara
Copy link
Member Author

filipnavara commented Mar 9, 2024

I started running a bigger set of tests. It looks promising but there are several notable issues that we hit:

  • Many of the float math methods in CRT (ceilf, floorf, etc.) are not real methods in the native code and they cannot be P/Invoked. Surprisingly, this doesn't affect the methods operating on double.
  • Plenty of the mcc_* tests don't compile. I suspect that it's likely lack of varargs and some tailcail support in majority of cases.
  • rem_r4 / rem_r8 tests fail. These go through the math helpers but not the ones that I implemented in assembly code.
  • At least one exception handling test is failing (throwinfinallyintryfilter2).

Results on Debug build:

Time [secs] | Total | Passed | Failed | Skipped | Assembly Execution Summary
============================================================================
      0.165 |    27 |     26 |      1 |       0 | baseservices.baseservices
      0.020 |     7 |      7 |      0 |       0 | baseservices.exceptions.baseservices-exceptions
      0.016 |     2 |      2 |      0 |       0 | baseservices.threading.threading_group1
      0.016 |     1 |      1 |      0 |       0 | baseservices.threading.threading_group2
      0.014 |     1 |      1 |      0 |       0 | CoreMangLib.CoreMangLib
      0.016 |     2 |      2 |      0 |       0 | Exceptions.Exceptions
      0.559 |    33 |     33 |      0 |       0 | JIT.CodeGenBringUpTests.JIT.CodeGenBringUpTests
      0.243 |    75 |     75 |      0 |       0 | JIT.Directed.Directed_1
      0.034 |    11 |     11 |      0 |       0 | JIT.Directed.Directed_2
      0.517 |    43 |     43 |      0 |       0 | JIT.Generics.JIT.Generics
      0.090 |   113 |    113 |      0 |       0 | JIT.IL_Conformance.IL_Conformance
      0.018 |     1 |      1 |      0 |       0 | JIT.jit64.jit64_1
      0.146 |    44 |     44 |      0 |       0 | JIT.jit64.jit64_3
      0.013 |     1 |      1 |      0 |       0 | JIT.jit64.jit64_4
      0.179 |    46 |     45 |      1 |       0 | JIT.jit64.jit64_5
      0.018 |     2 |      2 |      0 |       0 | JIT.JIT_d
      0.015 |     3 |      3 |      0 |       0 | JIT.JIT_do
      0.022 |    10 |     10 |      0 |       0 | JIT.JIT_others
      0.256 |     1 |      1 |      0 |       0 | nativeaot.CustomMain.XUnitWrapper.dll
      0.205 |     1 |      1 |      0 |       0 | nativeaot.GenerateUnmanagedEntryPoints.XUnitWrapper.dll
      6.658 |    16 |     16 |      0 |       0 | nativeaot.SmokeTests.XUnitWrapper.dll
      0.226 |     1 |      1 |      0 |       0 | nativeaot.StartupHook.XUnitWrapper.dll
----------------------------------------------------------------------------
      9.446 |   441 |    439 |      2 |       0 | (total)

The failed EH test:

Failed test: throwinfinallyintryfilter2::Main (JIT\jit64\eh\basics\throwinfinallyintryfilter2\throwinfinallyintryfilter2.dll) (JIT.jit64.jit64_5)
in try
  in try
    in try
    in finally
  in filter
in filter
    in finally
  in filter
in filter
caught an exception!

FAILED!

[EXPECTED OUTPUT]
in try
  in try
    in try
    in finally
  in filter
in filter
caught an exception!

[DIFF RESULT]
< caught an exception!
---
>     in finally

The baseservices tests is supposed to be skipped through issues.targets but that didn't trigger:

<test name="baseservices\RuntimeConfiguration\TestConfigTester\TestConfigTester.dll" type="TestConfigTester" method="RunTests" time="0.000702" result="Fail"><failure exception-type="System.ArgumentNullException"><message><![CDATA[Value cannot be null. (Parameter 'path1')]]></message><stack-trace><![CDATA[   at System.ArgumentNullException.Throw(String) + 0x2f
   at System.ArgumentNullException.ThrowIfNull(Object, String) + 0x29
   at System.IO.Path.Combine(String, String) + 0x23
   at TestConfigTester.RunTests() + 0x99
   at Program.<<Main>$>g__TestExecutor1|0_2(StreamWriter tempLogSw, StreamWriter statsCsvSw, Program.<>c__DisplayClass0_0&) + 0xe2]]></stack-trace></failure></test>

@filipnavara filipnavara force-pushed the not-a-win-x86-naot-001 branch from 6533c0d to 543a4c8 Compare March 10, 2024 05:32
@filipnavara
Copy link
Member Author

(Rebased to resolve conflicts with main)

@jkotas
Copy link
Member

jkotas commented Mar 10, 2024

@MichalStrehovsky Could you please review this PR as well?

…"atoi" is imported multiple times with different calling convention
Copy link
Member

@MichalStrehovsky MichalStrehovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the two things. Thank you!

public MethodSignature NativeSignature
{
get;
set;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be passed to the constructor? We have plenty of prior art for similar situations:

protected struct SerializedFrozenObjectKey : IEquatable<SerializedFrozenObjectKey>
{
public readonly MetadataType OwnerType;
public readonly int AllocationSiteId;
public readonly TypePreinit.ISerializableReference SerializableObject;
public SerializedFrozenObjectKey(MetadataType ownerType, int allocationSiteId, TypePreinit.ISerializableReference obj)
{
Debug.Assert(ownerType.HasStaticConstructor);
OwnerType = ownerType;
AllocationSiteId = allocationSiteId;
SerializableObject = obj;
}
public override bool Equals(object obj) => obj is SerializedFrozenObjectKey && Equals((SerializedFrozenObjectKey)obj);
public bool Equals(SerializedFrozenObjectKey other) => OwnerType == other.OwnerType && AllocationSiteId == other.AllocationSiteId;
public override int GetHashCode() => HashCode.Combine(OwnerType.GetHashCode(), AllocationSiteId);
}

Above struct contains TypePreinit.ISerializableReference SerializableObject but it's not part of equality/hashcode computation. It just smuggles the value through the hash table.

I don't think we have prior art for mutable type system entities on the other hand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn’t find a way to do that with LockFreeReaderHashtable (see comments above). I tried several but it always felt odd.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn’t find a way to do that with LockFreeReaderHashtable (see comments above). I tried several but it always felt odd.

Does this not work? 60f1f17

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#99372 (comment)

That comment was on the record that was making the "passenger field" part of the key because that's how records work.

What I linked uses it as a passenger, exactly how it's used in the #99372 (comment) and elsewhere within the compiler. It's not part of hash code, it's not part of equality checks.

Copy link
Member Author

@filipnavara filipnavara Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood the comment then. I thought that it generally should not be part of the key (although the record part doesn't really matter here since the equality is handled in the overridden methods).

9dd4866

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Thanks. I hate this approach a bit less than the field with setter even if I would ideally prefer something similar to ConcurrentDictionary.GetOrAdd.)

filipnavara and others added 2 commits March 11, 2024 08:51
…rd.csproj

Co-authored-by: Michal Strehovský <MichalStrehovsky@users.noreply.github.com>
@MichalStrehovsky MichalStrehovsky merged commit 60d73db into dotnet:main Mar 11, 2024
181 checks passed
@filipnavara filipnavara deleted the not-a-win-x86-naot-001 branch March 11, 2024 10:59
@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants