Provide a "fail at runtime" mechanism for better LLVMFullAOT testability #63654

jkoritzinsky · 2022-01-11T22:59:50Z

We would like to add the ability to defer some AOT compilation errors until execution time.

When the AOT compiler is generating code, there are some assemblies that contain code that it cannot handle (for example platform-dependent code that is never executed on mobile that depends on assemblies that are only present on desktop Windows). In this case instead of preventing the whole assembly from compiling, we should defer the actual error until execution time.

Additionally the runtime tests in src/tests include some test cases that intentionally have malformed metadata or that are expected to not compile with AOT. In this case, we would like to allow some of the methods in the assemblies to be AOTed, without the problematic test cases blocking the testing of the whole assembly.

Tasks:

Add an aot compiler option to allow some class of errors to be AOT-time warnings and to defer the actual error to runtime (initially as our generic "Attempting to JIT compile method... while running in aot-only mode") [mono] Add an 'allow-loader-errors' AOT option. #64640
Adjust mono AOT compiler errors and warnings to emit the prefix that MSBuild uses to capture warnings/errors from Exec tasks
Add allow-errors option to the MonoAOTCompiler task
Enable 'allow-errors' mode in CI, re-enable disabled tests
For cases where it makes sense, make Mono AOT produce the same warning/error codes as NativeAOT
Add some support for specific execution-time AOT failure errors (either by emitting stub method bodies, or adding some error table to the AOT images, or something else)

Original issue, below

In the src/tests test tree, we test a number of corner cases of invalid code, including type layouts and UnmanagedCallersOnly usage. When running tests on Mono with LLVMFullAOT, some of these failures, including ones seen #63320 and in some tests I've listed below cause AOT-time failures. As a result, we need to exclude these assemblies from running through the AOT compiler.

Today, we handle this by excluding the tests from AOT compilation via issues.targets. However, we're looking at moving away from issues.targets and towards using [ActiveIssue] attributes throughout our whole repo. This gets into a problem where we can't easily exclude methods from LLVMFullAOT, but we can't easily mark an assembly as "skip AOT" in the attribute model.

I talked with @imhameed offline, and we thought of an idea that is similar to some mechanisms used within CoreCLR's NativeAOT for some scenarios like IL stub generation: Provide a mode for the Mono AOT compiler that, on failure, emits a stub method body that throws the exception that would have been thrown if the method was generated at runtime. In this mode, AOT compilation would succeed for cases where the program will ultimately fail at runtime, whereas today it fails at AOT-compilation time.

This will allow us to re-enable some tests on llvmfullaot runs or at least exclude them in a manner more accessible with [ActiveIssue] attributes.

Tests that are disabled today because they fail at AOT-time instead of run time:

All tests in the Loader/classloader/explicitlayout/objrefandnonobjrefoverlap subtree
Loader/classloader/explicitlayout/NestedStructs/case03
Loader/classloader/explicitlayout/NestedStructs/case04
Loader/classloader/explicitlayout/NestedStructs/case05
The RuntimeMarshallingDisabled_NativeAssemblyEnabled test suite in Implement System.Runtime.CompilerServices.DisabledRuntimeMarshallingAttribute on CoreCLR-family of runtimes/type systems #63320

There may be others in that section that are disabled for the same reason but based on the issue metadata it looks like they fail at runtime.

cc: @MichalStrehovsky I don't know if there's other places in NativeAOT where a mechanism like this might be useful for testing scenarios.
cc: @trylek as this affects our work with refactoring the test tree

The text was updated successfully, but these errors were encountered:

trylek · 2022-01-11T23:25:38Z

This sounds like a good direction to me. /cc @davidwrighton and @jkotas in case they might have additional feedback to share.

imhameed · 2022-01-11T23:36:40Z

cc @lambdageek @vargaz

MichalStrehovsky · 2022-01-11T23:49:38Z

cc: @MichalStrehovsky I don't know if there's other places in NativeAOT where a mechanism like this might be useful for testing scenarios.

The NativeAOT compiler does this here:

runtime/src/coreclr/tools/aot/ILCompiler.RyuJit/Compiler/RyuJitCompilation.cs

Lines 182 to 214 in 032129c

    
           // If we previously failed to import the method, do not try to import it again and go 
        
           // directly to the error path. 
        
           if (exception == null) 
        
           { 
        
               try 
        
               { 
        
                   corInfo.CompileMethod(methodCodeNodeNeedingCode); 
        
               } 
        
               catch (TypeSystemException ex) 
        
               { 
        
                   exception = ex; 
        
               } 
        
           } 
        
           if (exception != null) 
        
           { 
        
               // TODO: fail compilation if a switch was passed 
        
               // Try to compile the method again, but with a throwing method body this time. 
        
               MethodIL throwingIL = TypeSystemThrowingILEmitter.EmitIL(method, exception); 
        
               corInfo.CompileMethod(methodCodeNodeNeedingCode, throwingIL); 
        
               if (exception is TypeSystemException.InvalidProgramException 
        
                   && method.OwningType is MetadataType mdOwningType 
        
                   && mdOwningType.HasCustomAttribute("System.Runtime.InteropServices", "ClassInterfaceAttribute")) 
        
               { 
        
                   Logger.LogWarning("COM interop is not supported with full ahead of time compilation", 3052, method, MessageSubCategory.AotAnalysis); 
        
               } 
        
               else 
        
               { 
        
                   Logger.LogWarning($"Method will always throw because: {exception.Message}", 1005, method, MessageSubCategory.AotAnalysis); 
        
               } 
        
           }

We have a well-defined set of exception types that map to runtime exceptions (TypeLoadException, MarshalDirectiveException, BadImageFormatException, InvalidProgramException, FileNotFoundException, etc.). If one is thrown during a method body compilation, we catch it and recompile a method body that throws the matching exception type (and localized exception message) at runtime when the method body is reached.

vargaz · 2022-01-12T04:03:58Z

The mono AOT compiler used to silently ignore AOT methods whose compilation caused an exception, but this was changed because those failures showed up at runtime, causing confusion with customers, so now the failures show up at build time.

MichalStrehovsky · 2022-01-12T04:08:26Z

those failures showed up at runtime, causing confusion with customers, so now the failures show up at build time.

What was the failure mode? If they got "Attempting to JIT compile method... while running in aot-only mode" I can see how that would be confusing, but it can be done better.

We did this feature in .NET Native and NativeAOT because it's very common for customers to have broken/mismatched dependencies and AOT compilation failure resulted in customer sadness.

vargaz · 2022-01-12T04:22:35Z

Yes, 'attempting to JIT compile method' was the common runtime error. The most common cause was mismatched dependencies. If these mismatches do not cause a build failure, then the resulting app would work sometimes, and would fail sometimes, which is not desirable either.

MichalStrehovsky · 2022-01-12T04:26:44Z

the resulting app would work sometimes, and would fail sometimes, which is not desirable either.

It matches the behavior of non-AOT configuration, making the AOT step transparent to the user.

vargaz · 2022-01-12T04:34:55Z

The build system for the mobile platforms have access to all the assemblies, so in theory, it can actually check whenever the assemblies are matching, and report some kind of user friendly error. This is not done right now, so the failures show up during the AOT process or at runtime.

MichalStrehovsky · 2022-01-12T05:16:13Z

For the scope of this bug, I think it would be sufficient to restore the old behavior under a command line switch. This would allow the amalgamated test to compile and the ActiveIssue on the method would prevent the "Attempting to JIT compile" exception from happening.

Generating the throwing stub is nice to have. I still believe it's better to allow these things to compile in general because it's pretty widespread (we still get bug reports every couple months for cases where even this is not sufficient - the customer has input that is so broken we don't detect it when compiling a method, only later when we generate data structures and at that point there's no spot to generate a throwing body; the fix for those is usually doing more eager validation during method compilation to hit these problems earlier).

lambdageek · 2022-01-12T16:35:22Z

I don't think this is the right approach. We want to test what is the actual behavior that customers experience. FullAOT fails the build at conpile-time if there is a conpile-time error. Deferring errors to runtime would introduce a new behavior different from what users experience.

If we want to productize "defer AOT errors to runtime", then we can test that configuration. Otherwise this is just adding complexity to our tests.

The right approach, IMO, is for the test infrastructure to understand FullAOT testing and to be able to mark assemblies as expected to fail AOT.

jkoritzinsky · 2022-01-12T17:39:55Z

The problem with marking assemblies as expected to fail AOT is that we lose coverage of other tests in that assembly. For example, in #63320, only one test case has failures during AOT compilation, but it would be useful to run the other test cases in the FullAOT configuration. As it is today, we're losing test coverage. Additionally, as we move forward with consolidating assemblies in the src/tests tree to reduce build times, we'd end up with less ability to disable tests at a smaller scale without disabling tons of other tests.

MichalStrehovsky · 2022-01-24T01:45:44Z

I don't think this is the right approach. We want to test what is the actual behavior that customers experience. FullAOT fails the build at conpile-time if there is a conpile-time error. Deferring errors to runtime would introduce a new behavior different from what users experience.

I think we should change that behavior for fullAOT so that it can defer the failure to runtime.

Errors deferred to runtime is the behavior users get sans AOT compilation. The runtime doesn't fail to load an assembly if one of the references is missing at startup: instead, it lets the code run until a problematic codepath is hit. The deferred failure mode is why people often don't notice that they're shipping broken assemblies and there are so many of those in the wild.

We design/ship our tools (IL Linker, crossgen2, NativeAOT) to be tolerant to such garbage input. Mono fullAOT should align with how other tools handle it.

lambdageek · 2022-02-01T15:22:29Z

I'm more open to this as a customer-visible change. In terms of badness, I think having a "special testing mode" that is unlike what customers actual use is a bad idea.

On the other hand, having a customer-visible option for FullAOT to be tolerant of bad input and to signal problems at runtime seems less bad. (I still think it should be an option, not an uncontrollable change in behavior. Detecting problems at compile time is good.)

There will be new problems that "AOT errors as warnings" will bring. For example it will be harder to diagnose mismatches between the compile-time and the runtime environment (e.g. referenced assemblies missing at AOT-time but present at runtime).

This can be used to avoid aborting the AOT process if a loader error occurs. The methods which fail to load will not be AOTed and the failures will happen at runtime. Related: dotnet#63654

lambdageek · 2022-02-02T16:25:20Z

I will update this issue to be a tracking issue for the whole user experience but some of the things we will need to do include:

Add an aot compiler option to allow some class of errors to be AOT-time warnings and to defer the actual error to runtime (initially as our generic "Attempting to JIT compile method... while running in aot-only mode") [mono] Add an 'allow-loader-errors' AOT option. #64640
Adjust mono AOT compiler errors and warnings to emit the prefix that MSBuild uses to capture warnings/errors from Exec tasks
Add allow-errors option to the MonoAOTCompiler task
Enable 'allow-errors' mode in CI, re-enable disabled tests
For cases where it makes sense, make Mono AOT produce the same warning/error codes as NativeAOT
Add some support for specific execution-time AOT failure errors (either by emitting stub method bodies, or adding some error table to the AOT images, or something else)

Does this make sense? /cc @vargaz @marek-safar

This can be used to avoid aborting the AOT process if a loader error occurs. The methods which fail to load will not be AOTed and the failures will happen at runtime. Related: #63654

lambdageek · 2022-02-07T14:40:59Z

/cc @SamMonoRT @steveisok

jandupej · 2023-08-31T11:51:39Z

Some of the disabled unit tests mentioned here are being addressed in #91261

jkoritzinsky added the area-Codegen-AOT-mono label Jan 11, 2022

dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jan 11, 2022

vargaz mentioned this issue Feb 1, 2022

[mono] Add an 'allow-loader-errors' AOT option. #64640

Merged

lambdageek added this to the 7.0.0 milestone Feb 1, 2022

lambdageek removed the untriaged New issue has not been triaged by the area owner label Feb 1, 2022

SamMonoRT assigned fanyang-mono and jandupej Aug 4, 2022

SamMonoRT modified the milestones: 7.0.0, 8.0.0 Aug 4, 2022

MichalStrehovsky mentioned this issue Jan 23, 2023

[API Proposal]: InlineArrayAttribute #61135

Closed

LeVladIonescu self-assigned this May 19, 2023

jandupej removed this from the 8.0.0 milestone May 19, 2023

jandupej added this to the Future milestone May 19, 2023

jandupej mentioned this issue Sep 7, 2023

[mono][aot] Type load checks do not fail at compile time but produce a runtime exception #91261

Merged

SamMonoRT assigned matouskozak and unassigned LeVladIonescu and jandupej Oct 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Provide a "fail at runtime" mechanism for better LLVMFullAOT testability #63654

Provide a "fail at runtime" mechanism for better LLVMFullAOT testability #63654

jkoritzinsky commented Jan 11, 2022 •

edited by LeVladIonescu

trylek commented Jan 11, 2022

imhameed commented Jan 11, 2022

MichalStrehovsky commented Jan 11, 2022

vargaz commented Jan 12, 2022

MichalStrehovsky commented Jan 12, 2022

vargaz commented Jan 12, 2022

MichalStrehovsky commented Jan 12, 2022

vargaz commented Jan 12, 2022

MichalStrehovsky commented Jan 12, 2022

lambdageek commented Jan 12, 2022

jkoritzinsky commented Jan 12, 2022

MichalStrehovsky commented Jan 24, 2022

lambdageek commented Feb 1, 2022

lambdageek commented Feb 2, 2022

lambdageek commented Feb 7, 2022

jandupej commented Aug 31, 2023

Provide a "fail at runtime" mechanism for better LLVMFullAOT testability #63654

Provide a "fail at runtime" mechanism for better LLVMFullAOT testability #63654

Comments

jkoritzinsky commented Jan 11, 2022 • edited by LeVladIonescu

trylek commented Jan 11, 2022

imhameed commented Jan 11, 2022

MichalStrehovsky commented Jan 11, 2022

vargaz commented Jan 12, 2022

MichalStrehovsky commented Jan 12, 2022

vargaz commented Jan 12, 2022

MichalStrehovsky commented Jan 12, 2022

vargaz commented Jan 12, 2022

MichalStrehovsky commented Jan 12, 2022

lambdageek commented Jan 12, 2022

jkoritzinsky commented Jan 12, 2022

MichalStrehovsky commented Jan 24, 2022

lambdageek commented Feb 1, 2022

lambdageek commented Feb 2, 2022

lambdageek commented Feb 7, 2022

jandupej commented Aug 31, 2023

jkoritzinsky commented Jan 11, 2022 •

edited by LeVladIonescu