Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a "fail at runtime" mechanism for better LLVMFullAOT testability #63654

Open
1 of 6 tasks
jkoritzinsky opened this issue Jan 11, 2022 · 16 comments
Open
1 of 6 tasks
Assignees
Milestone

Comments

@jkoritzinsky
Copy link
Member

jkoritzinsky commented Jan 11, 2022

We would like to add the ability to defer some AOT compilation errors until execution time.

When the AOT compiler is generating code, there are some assemblies that contain code that it cannot handle (for example platform-dependent code that is never executed on mobile that depends on assemblies that are only present on desktop Windows). In this case instead of preventing the whole assembly from compiling, we should defer the actual error until execution time.

Additionally the runtime tests in src/tests include some test cases that intentionally have malformed metadata or that are expected to not compile with AOT. In this case, we would like to allow some of the methods in the assemblies to be AOTed, without the problematic test cases blocking the testing of the whole assembly.

Tasks:

  • Add an aot compiler option to allow some class of errors to be AOT-time warnings and to defer the actual error to runtime (initially as our generic "Attempting to JIT compile method... while running in aot-only mode") [mono] Add an 'allow-loader-errors' AOT option. #64640
  • Adjust mono AOT compiler errors and warnings to emit the prefix that MSBuild uses to capture warnings/errors from Exec tasks
  • Add allow-errors option to the MonoAOTCompiler task
  • Enable 'allow-errors' mode in CI, re-enable disabled tests
  • For cases where it makes sense, make Mono AOT produce the same warning/error codes as NativeAOT
  • Add some support for specific execution-time AOT failure errors (either by emitting stub method bodies, or adding some error table to the AOT images, or something else)

Original issue, below


In the src/tests test tree, we test a number of corner cases of invalid code, including type layouts and UnmanagedCallersOnly usage. When running tests on Mono with LLVMFullAOT, some of these failures, including ones seen #63320 and in some tests I've listed below cause AOT-time failures. As a result, we need to exclude these assemblies from running through the AOT compiler.

Today, we handle this by excluding the tests from AOT compilation via issues.targets. However, we're looking at moving away from issues.targets and towards using [ActiveIssue] attributes throughout our whole repo. This gets into a problem where we can't easily exclude methods from LLVMFullAOT, but we can't easily mark an assembly as "skip AOT" in the attribute model.

I talked with @imhameed offline, and we thought of an idea that is similar to some mechanisms used within CoreCLR's NativeAOT for some scenarios like IL stub generation: Provide a mode for the Mono AOT compiler that, on failure, emits a stub method body that throws the exception that would have been thrown if the method was generated at runtime. In this mode, AOT compilation would succeed for cases where the program will ultimately fail at runtime, whereas today it fails at AOT-compilation time.

This will allow us to re-enable some tests on llvmfullaot runs or at least exclude them in a manner more accessible with [ActiveIssue] attributes.

Tests that are disabled today because they fail at AOT-time instead of run time:

There may be others in that section that are disabled for the same reason but based on the issue metadata it looks like they fail at runtime.

cc: @MichalStrehovsky I don't know if there's other places in NativeAOT where a mechanism like this might be useful for testing scenarios.
cc: @trylek as this affects our work with refactoring the test tree

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jan 11, 2022
@trylek
Copy link
Member

trylek commented Jan 11, 2022

This sounds like a good direction to me. /cc @davidwrighton and @jkotas in case they might have additional feedback to share.

@imhameed
Copy link
Contributor

cc @lambdageek @vargaz

@MichalStrehovsky
Copy link
Member

cc: @MichalStrehovsky I don't know if there's other places in NativeAOT where a mechanism like this might be useful for testing scenarios.

The NativeAOT compiler does this here:

// If we previously failed to import the method, do not try to import it again and go
// directly to the error path.
if (exception == null)
{
try
{
corInfo.CompileMethod(methodCodeNodeNeedingCode);
}
catch (TypeSystemException ex)
{
exception = ex;
}
}
if (exception != null)
{
// TODO: fail compilation if a switch was passed
// Try to compile the method again, but with a throwing method body this time.
MethodIL throwingIL = TypeSystemThrowingILEmitter.EmitIL(method, exception);
corInfo.CompileMethod(methodCodeNodeNeedingCode, throwingIL);
if (exception is TypeSystemException.InvalidProgramException
&& method.OwningType is MetadataType mdOwningType
&& mdOwningType.HasCustomAttribute("System.Runtime.InteropServices", "ClassInterfaceAttribute"))
{
Logger.LogWarning("COM interop is not supported with full ahead of time compilation", 3052, method, MessageSubCategory.AotAnalysis);
}
else
{
Logger.LogWarning($"Method will always throw because: {exception.Message}", 1005, method, MessageSubCategory.AotAnalysis);
}
}

We have a well-defined set of exception types that map to runtime exceptions (TypeLoadException, MarshalDirectiveException, BadImageFormatException, InvalidProgramException, FileNotFoundException, etc.). If one is thrown during a method body compilation, we catch it and recompile a method body that throws the matching exception type (and localized exception message) at runtime when the method body is reached.

@vargaz
Copy link
Contributor

vargaz commented Jan 12, 2022

The mono AOT compiler used to silently ignore AOT methods whose compilation caused an exception, but this was changed because those failures showed up at runtime, causing confusion with customers, so now the failures show up at build time.

@MichalStrehovsky
Copy link
Member

those failures showed up at runtime, causing confusion with customers, so now the failures show up at build time.

What was the failure mode? If they got "Attempting to JIT compile method... while running in aot-only mode" I can see how that would be confusing, but it can be done better.

We did this feature in .NET Native and NativeAOT because it's very common for customers to have broken/mismatched dependencies and AOT compilation failure resulted in customer sadness.

@vargaz
Copy link
Contributor

vargaz commented Jan 12, 2022

Yes, 'attempting to JIT compile method' was the common runtime error. The most common cause was mismatched dependencies. If these mismatches do not cause a build failure, then the resulting app would work sometimes, and would fail sometimes, which is not desirable either.

@MichalStrehovsky
Copy link
Member

the resulting app would work sometimes, and would fail sometimes, which is not desirable either.

It matches the behavior of non-AOT configuration, making the AOT step transparent to the user.

@vargaz
Copy link
Contributor

vargaz commented Jan 12, 2022

The build system for the mobile platforms have access to all the assemblies, so in theory, it can actually check whenever the assemblies are matching, and report some kind of user friendly error. This is not done right now, so the failures show up during the AOT process or at runtime.

@MichalStrehovsky
Copy link
Member

For the scope of this bug, I think it would be sufficient to restore the old behavior under a command line switch. This would allow the amalgamated test to compile and the ActiveIssue on the method would prevent the "Attempting to JIT compile" exception from happening.

Generating the throwing stub is nice to have. I still believe it's better to allow these things to compile in general because it's pretty widespread (we still get bug reports every couple months for cases where even this is not sufficient - the customer has input that is so broken we don't detect it when compiling a method, only later when we generate data structures and at that point there's no spot to generate a throwing body; the fix for those is usually doing more eager validation during method compilation to hit these problems earlier).

@lambdageek
Copy link
Member

I don't think this is the right approach. We want to test what is the actual behavior that customers experience. FullAOT fails the build at conpile-time if there is a conpile-time error. Deferring errors to runtime would introduce a new behavior different from what users experience.

If we want to productize "defer AOT errors to runtime", then we can test that configuration. Otherwise this is just adding complexity to our tests.

The right approach, IMO, is for the test infrastructure to understand FullAOT testing and to be able to mark assemblies as expected to fail AOT.

@jkoritzinsky
Copy link
Member Author

The problem with marking assemblies as expected to fail AOT is that we lose coverage of other tests in that assembly. For example, in #63320, only one test case has failures during AOT compilation, but it would be useful to run the other test cases in the FullAOT configuration. As it is today, we're losing test coverage. Additionally, as we move forward with consolidating assemblies in the src/tests tree to reduce build times, we'd end up with less ability to disable tests at a smaller scale without disabling tons of other tests.

@MichalStrehovsky
Copy link
Member

I don't think this is the right approach. We want to test what is the actual behavior that customers experience. FullAOT fails the build at conpile-time if there is a conpile-time error. Deferring errors to runtime would introduce a new behavior different from what users experience.

I think we should change that behavior for fullAOT so that it can defer the failure to runtime.

Errors deferred to runtime is the behavior users get sans AOT compilation. The runtime doesn't fail to load an assembly if one of the references is missing at startup: instead, it lets the code run until a problematic codepath is hit. The deferred failure mode is why people often don't notice that they're shipping broken assemblies and there are so many of those in the wild.

We design/ship our tools (IL Linker, crossgen2, NativeAOT) to be tolerant to such garbage input. Mono fullAOT should align with how other tools handle it.

@lambdageek
Copy link
Member

I'm more open to this as a customer-visible change. In terms of badness, I think having a "special testing mode" that is unlike what customers actual use is a bad idea.

On the other hand, having a customer-visible option for FullAOT to be tolerant of bad input and to signal problems at runtime seems less bad. (I still think it should be an option, not an uncontrollable change in behavior. Detecting problems at compile time is good.)

There will be new problems that "AOT errors as warnings" will bring. For example it will be harder to diagnose mismatches between the compile-time and the runtime environment (e.g. referenced assemblies missing at AOT-time but present at runtime).

@lambdageek lambdageek added this to the 7.0.0 milestone Feb 1, 2022
@lambdageek lambdageek removed the untriaged New issue has not been triaged by the area owner label Feb 1, 2022
vargaz added a commit to vargaz/runtime that referenced this issue Feb 1, 2022
This can be used to avoid aborting the AOT process if a loader
error occurs. The methods which fail to load will not be AOTed and
the failures will happen at runtime.

Related:
dotnet#63654
@lambdageek
Copy link
Member

I will update this issue to be a tracking issue for the whole user experience but some of the things we will need to do include:

  • Add an aot compiler option to allow some class of errors to be AOT-time warnings and to defer the actual error to runtime (initially as our generic "Attempting to JIT compile method... while running in aot-only mode") [mono] Add an 'allow-loader-errors' AOT option. #64640
  • Adjust mono AOT compiler errors and warnings to emit the prefix that MSBuild uses to capture warnings/errors from Exec tasks
  • Add allow-errors option to the MonoAOTCompiler task
  • Enable 'allow-errors' mode in CI, re-enable disabled tests
  • For cases where it makes sense, make Mono AOT produce the same warning/error codes as NativeAOT
  • Add some support for specific execution-time AOT failure errors (either by emitting stub method bodies, or adding some error table to the AOT images, or something else)

Does this make sense? /cc @vargaz @marek-safar

vargaz added a commit that referenced this issue Feb 6, 2022
This can be used to avoid aborting the AOT process if a loader
error occurs. The methods which fail to load will not be AOTed and
the failures will happen at runtime.

Related:
#63654
@lambdageek
Copy link
Member

/cc @SamMonoRT @steveisok

@SamMonoRT SamMonoRT modified the milestones: 7.0.0, 8.0.0 Aug 4, 2022
@LeVladIonescu LeVladIonescu self-assigned this May 19, 2023
@jandupej jandupej removed this from the 8.0.0 milestone May 19, 2023
@jandupej jandupej added this to the Future milestone May 19, 2023
@jandupej
Copy link
Member

Some of the disabled unit tests mentioned here are being addressed in #91261

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests