Failure when building self-hosted NativeAOT compiler changes with 8.0 Preview 2 SDK #83695

Closed
mthalman opened this issue Mar 20, 2023 · 34 comments · Fixed by #86423 or #87226

@mthalman
Member

The changes in #81205 have caused a failure when attempting to use .NET 8 Preview 2 SDK to build the Preview 3 source. This is a required scenario for .NET's Source-Build. This was identified in the changes to update the source-build configuration so that it references the Preview 2 SDK: dotnet/installer#15851.

The error that occurs is for the ILCompiler project:

EXEC : error : Failed to load assembly 'System.CommandLine' [/vmr/src/runtime/artifacts/source-build/self/src/src/coreclr/tools/aot/ILCompiler/ILCompiler.csproj]
      Internal.TypeSystem.TypeSystemException+FileNotFoundException: Failed to load assembly 'System.CommandLine'
         at Internal.TypeSystem.ThrowHelper.ThrowFileNotFoundException(ExceptionStringID, String) in /_/src/coreclr/tools/Common/TypeSystem/Common/ThrowHelper.cs:line 35
         at Internal.TypeSystem.ResolutionFailure.Throw() in /_/src/coreclr/tools/Common/TypeSystem/Common/ResolutionFailure.cs:line 105
         at Internal.TypeSystem.Ecma.EcmaModule.GetObject(EntityHandle, NotFoundBehavior ) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaModule.cs:line 412
         at Internal.TypeSystem.Ecma.EcmaModule.GetType(EntityHandle) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaModule.cs:line 359
         at Internal.TypeSystem.Ecma.EcmaType.InitializeBaseType() in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaType.cs:line 163
         at Internal.TypeSystem.Ecma.EcmaType.ComputeTypeFlags(TypeFlags) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaType.cs:line 199
         at Internal.TypeSystem.TypeDesc.InitializeTypeFlags(TypeFlags) in /_/src/coreclr/tools/Common/TypeSystem/Common/TypeDesc.cs:line 135
         at Internal.TypeSystem.Ecma.EcmaType.ComputeTypeFlags(TypeFlags) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaType.cs:line 247
         at Internal.TypeSystem.TypeDesc.InitializeTypeFlags(TypeFlags) in /_/src/coreclr/tools/Common/TypeSystem/Common/TypeDesc.cs:line 135
         at Internal.TypeSystem.MetadataFieldLayoutAlgorithm.ComputeInstanceLayout(DefType, InstanceLayoutKind) in /_/src/coreclr/tools/Common/TypeSystem/Common/MetadataFieldLayoutAlgorithm.cs:line 41
         at Internal.TypeSystem.DefType.ComputeInstanceLayout(InstanceLayoutKind) in /_/src/coreclr/tools/Common/TypeSystem/Common/DefType.FieldLayout.cs:line 436
         at ILCompiler.CompilerTypeSystemContext.EnsureLoadableTypeUncached(TypeDesc) in /_/src/coreclr/tools/Common/Compiler/CompilerTypeSystemContext.Validation.cs:line 137
         at Internal.TypeSystem.LockFreeReaderHashtable`2.CreateValueAndEnsureValueIsInTable(TKey) in /_/src/coreclr/tools/Common/TypeSystem/Common/Utilities/LockFreeReaderHashtable.cs:line 562
         at ILCompiler.DependencyAnalysis.EETypeNode..ctor(NodeFactory, TypeDesc) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/EETypeNode.cs:line 85
         at ILCompiler.DependencyAnalysis.NodeFactory.CreateConstructedTypeNode(TypeDesc) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/NodeFactory.cs:line 542
         at Internal.TypeSystem.LockFreeReaderHashtable`2.CreateValueAndEnsureValueIsInTable(TKey) in /_/src/coreclr/tools/Common/TypeSystem/Common/Utilities/LockFreeReaderHashtable.cs:line 562
         at ILCompiler.DependencyAnalysis.NodeFactory.ConstructedTypeSymbol(TypeDesc) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/NodeFactory.cs:line 614
         at ILCompiler.DependencyAnalysis.ReflectionInvokeMapNode.AddDependenciesDueToReflectability(DependencyList&, NodeFactory, MethodDesc) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/ReflectionInvokeMapNode.cs:line 54
         at ILCompiler.MetadataManager.GetDependenciesDueToReflectability(DependencyList&, NodeFactory, MethodDesc) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/MetadataManager.cs:line 399
         at ILCompiler.DependencyAnalysis.ReflectedMethodNode.GetStaticDependencies(NodeFactory) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/ReflectedMethodNode.cs:line 39
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependenciesImpl(DependencyNodeCore`1) in /_/src/coreclr/tools/aot/ILCompiler.DependencyAnalysisFramework/DependencyAnalyzer.cs:line 182
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependencies(DependencyNodeCore`1) in /_/src/coreclr/tools/aot/ILCompiler.DependencyAnalysisFramework/DependencyAnalyzer.cs:line 222
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ProcessMarkStack() in /_/src/coreclr/tools/aot/ILCompiler.DependencyAnalysisFramework/DependencyAnalyzer.cs:line 257
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ComputeMarkedNodes() in /_/src/coreclr/tools/aot/ILCompiler.DependencyAnalysisFramework/DependencyAnalyzer.cs:line 308
         at ILCompiler.ILScanner.ILCompiler.IILScanner.Scan() in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ILScanner.cs:line 140
         at ILCompiler.Program.<Run>g__RunScanner|3_0(<>c__DisplayClass3_0&) in /_/src/coreclr/tools/aot/ILCompiler/Program.cs:line 438
         at ILCompiler.Program.Run() in /_/src/coreclr/tools/aot/ILCompiler/Program.cs:line 418
         at ILCompiler.ILCompilerRootCommand.<>c__DisplayClass203_0.<.ctor>b__0(InvocationContext) in /_/src/coreclr/tools/aot/ILCompiler/ILCompilerRootCommand.cs:line 272

An analysis of the error and its symptoms is given here: dotnet/installer#15851 (comment). As stated in the analysis, the behavior is sporadic for some reason. One notable symptom is that BadImageFormatException is thrown when attempting to load certain assemblies in the ComputeManagedAssembliesToCompileToNative build task.

It's been confirmed that reverting the changes in #81205 resolves the issue. We should consider reverting this change or getting a fix in to address this in the Preview 3 timeframe.

cc @MichalStrehovsky

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 20, 2023
@ghost

ghost commented Mar 20, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Author: mthalman
Assignees: -
Labels:

untriaged, area-NativeAOT-coreclr

Milestone: -

@agocke
Member

agocke commented Mar 20, 2023

It's been confirmed that reverting the changes in #81205 resolves the issue. We should consider reverting this change or getting a fix in to address this in the Preview 3 timeframe.

Reverting would be pretty bad -- we're planning on relying on this change to do more testing with the Pri0 coreclr tests. Otherwise the AOT compile is too slow.

How can we service the old compiler? I think we should start by getting rid of the catch and seeing the actual exception, with message.

@MichalStrehovsky
Member

If we need to unblock source build, adding a NativeAotSupported=false under SourceBuild here would unblock:

<NativeAotSupported Condition="'$(TargetOS)' != 'windows' and '$(TargetOS)' != 'linux' and '$(TargetOS)' != 'osx'">false</NativeAotSupported>
<NativeAotSupported Condition="'$(TargetArchitecture)' != 'x64'">false</NativeAotSupported>

I think this is some kind of build race caused by how source build is set up. I'm not aware of seeing this before. We do build NativeAOT compiler with NativeAOT in dotnet/runtime builds and CI and haven't seen this.

The ComputeManagedAssembliesToCompileToNative task is another remnant from CoreRT/runtimelab, when we could not make changes to the SDK targets to accommodate the compiler. If we address #67080 and similar by moving more of this logic into the SDK, the task can potentially be deleted. The SDK knows what assemblies we need to compile - we shouldn't have to recompute that in a task.

@mthalman
Member Author

I think we should start by getting rid of the catch and seeing the actual exception, with message.

The exception message is "Image is too small". Seems to come from here:

@agocke
Member

agocke commented Mar 21, 2023

?? Ok this must be some sort of race condition where the binaries are still being written when we read them. Agreed with @MichalStrehovsky -- let's try to fix this authoring.

@agocke agocke added this to the 8.0.0 milestone Mar 23, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Mar 23, 2023
@MichaelSimons
Member

Gentle ping. What is the status/eta on fixing this? It is affecting source-build.

@agocke
Member

agocke commented May 16, 2023

@sbomer Could you also look at this? This seems to be some incredibly strange race condition that results in a partially-written file being read by another task. From @MichalStrehovsky's comment, it sounds like we might be able to just avoid some complexity here entirely by modifying the SDK.

@MichalStrehovsky
Member

The ILCompiler that we're building now has a dependency on the System.CommandLine NuGet. The LKG version of ILCompiler is failing because the reference is bad.

How does source build restore the NuGets that the repo build depends on? I don't think this is fixable from the runtime repo. The problem is with the NuGet that source build is providing to the repo build system. When the task is running, the NuGets are not fully written to disk. Does source build do something with how NuGets are used? Does it run the restore as usual, or is there something special going on?

@MichalStrehovsky
Member

Before #81205, the partially restored NuGets would still be a problem (we would likely sometimes construct an ILCompiler that doesn't work because we'd blindly copy the partial assembly), but I doubt any NativeAOT testing is happening in source build, so this would never be detected.

@MichaelSimons
Member

MichaelSimons commented May 17, 2023

How does source build restore the NuGets that the repo build depends on?

The only intended difference between a repo build and source-build is the NuGet feeds. For source-build we have to build everything from source, so any dependencies are restored from local feeds that consist of artifacts built earlier in the dependency tree or from the N-1 build (necessary to break circular dependencies).

@jkotas
Member

jkotas commented May 17, 2023

How are the contributing repo dependencies tracked by source build?

We need to guarantee that https://github.com/dotnet/dotnet/tree/main/src/command-line-api finishes building before it gets used by https://github.com/dotnet/dotnet/tree/main/src/runtime . What guarantees that?

@sbomer
Member

sbomer commented May 17, 2023

Maybe runtime.proj is missing a <RepositoryReference Include="command-line-api" /> here https://github.com/dotnet/dotnet/blob/eb24b2bf58a85e34199f319b60c30551e65984c5/repo-projects/runtime.proj#L50-L55?

@MichaelSimons
Member

Maybe runtime.proj is missing a <RepositoryReference Include="command-line-api" /> here https://github.com/dotnet/dotnet/blob/eb24b2bf58a85e34199f319b60c30551e65984c5/repo-projects/runtime.proj#L50-L55?

I agree this dependency should be expressed but it won't actually change the build order as command-line-api is built before runtime as expressed in the vmr's top level proj file.

@MichaelSimons
Member

Bottom line is the command-line-api is the first product repo in the build.

@jkotas
Member

jkotas commented May 18, 2023

Could you please share links to recent build logs with the failure? (The build logs shared above have been deleted.)

What are the steps that can reproduce this? I have tried https://github.com/dotnet/dotnet#building a few times and it always works fine for me.

@sbomer
Member

sbomer commented May 18, 2023

I was just able to repro it with these changes: https://github.com/dotnet/dotnet/compare/main...sbomer:dotnet:tmp?expand=1.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label May 18, 2023
jkotas added a commit to jkotas/runtime that referenced this issue May 18, 2023
We expect the implementation of these assemblies to come as part of msbuild.

Fixes dotnet#83695
@jkotas
Member

jkotas commented May 18, 2023

This is a regression introduced by #79783.

Before this change, the only assemblies copied next to ILCompiler.Build.Tasks.dll are System.Collections.Immutable.dll and System.Reflection.Metadata.dll.

After this change, the build is also copying over System.Buffers.dll, System.Memory.dll and System.Numerics.Vectors.dll. The problem is that these are reference assemblies that fail spectacularly when loaded for execution.

It is causing a problem in the source build environment only. It is not causing a problem in the regular shipping build environment because these assemblies end up being loaded from somewhere else, so the broken reference assemblies never come into the picture.

#86423 is the proposed fix.
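
As a rough illustration of the distinction above: reference assemblies are commonly marked with System.Runtime.CompilerServices.ReferenceAssemblyAttribute, and that marker can be read from metadata without loading the assembly for execution. The sketch below uses System.Reflection.Metadata; the class and method names (ReferenceAssemblyCheck, IsReferenceAssembly) are hypothetical and are not part of ILCompiler.Build.Tasks. It also assumes the attribute is present, which is not guaranteed for every reference assembly.

```csharp
using System;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper, for illustration only.
public static class ReferenceAssemblyCheck
{
    // Returns true when the assembly-level attributes include
    // System.Runtime.CompilerServices.ReferenceAssemblyAttribute.
    public static bool IsReferenceAssembly(string path)
    {
        using FileStream stream = File.OpenRead(path);
        using var peReader = new PEReader(stream);
        if (!peReader.HasMetadata)
            return false;

        MetadataReader reader = peReader.GetMetadataReader();
        if (!reader.IsAssembly)
            return false;

        foreach (CustomAttributeHandle handle in reader.GetAssemblyDefinition().GetCustomAttributes())
        {
            CustomAttribute attribute = reader.GetCustomAttribute(handle);

            // Only constructors referenced from another assembly are inspected here.
            if (attribute.Constructor.Kind != HandleKind.MemberReference)
                continue;

            MemberReference ctor = reader.GetMemberReference((MemberReferenceHandle)attribute.Constructor);
            if (ctor.Parent.Kind != HandleKind.TypeReference)
                continue;

            TypeReference type = reader.GetTypeReference((TypeReferenceHandle)ctor.Parent);
            if (reader.StringComparer.Equals(type.Name, "ReferenceAssemblyAttribute") &&
                reader.StringComparer.Equals(type.Namespace, "System.Runtime.CompilerServices"))
            {
                return true;
            }
        }

        return false;
    }

    public static void Main(string[] args)
    {
        // Pass one or more assembly paths on the command line.
        foreach (string path in args)
            Console.WriteLine($"{path}: reference assembly = {IsReferenceAssembly(path)}");
    }
}
```

A check along these lines could be pointed at the files copied next to ILCompiler.Build.Tasks.dll to see which of them are reference assemblies.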

@MichaelSimons
Member

I have tried https://github.com/dotnet/dotnet#building a few times and it always works fine for me.

@jkotas - You must remove the workaround patch in the VMR - https://github.com/dotnet/installer/blob/main/src/SourceBuild/patches/runtime/0002-Revert-switch-to-self-hosted-NativeAOT-compiler.patch

@jkotas
Member

jkotas commented May 18, 2023

Yep, I figured that out based on Sven's hint. Based on earlier comments, I assumed that this is a race condition that will reproduce intermittently, and I did not expect there to be a patch for it. It turns out that this is not a race condition. It reproduces 100% of the time with the workaround patch removed.

jkotas added a commit that referenced this issue May 18, 2023
* Exclude System.* reference assemblies in ILCompiler.Build.Tasks

We expect the implementation of these assemblies to come as part of msbuild.

Fixes #83695

* Do not include any System.* assemblies as part of the task

* Delete custom resolver
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label May 18, 2023
@jkotas
Member

jkotas commented May 18, 2023

@MichaelSimons The fix is checked in. Could you please take care of deleting the workaround from source build?

@MichaelSimons
Member

Will do - Thanks for investigating and resolving the runtime issue.

@MichaelSimons
Member

@jkotas - I am seeing the same failure as before when trying to remove the source build workaround - dotnet/installer#16466. I did verify the runtime changes flowed in last week - dotnet/dotnet@cda4753#diff-531a689fcb6b7793148534b23232ba5733d6afe2ca51c1a31119defd88090bfd.

@jkotas
Member

jkotas commented May 23, 2023

@MichaelSimons Your PR was submitted against stale main that did not have the fix. I have merged current main into your PR and it is all green now.

@MichaelSimons
Member

@MichaelSimons Your PR was submitted against stale main that did not have the fix. I have merged current main into your PR and it is all green now.

Thanks for updating my PR. That confuses me though, because azdo syncs in latest when running PR validation.

@jkotas
Member

jkotas commented May 23, 2023

azdo syncs in latest when running PR validation.

I do not think that installer repo is configured to sync when running PR validation.

@MichaelSimons
Member

Source build internal CI which runs several distro variants is failing in multiple legs with the original symptoms.

Failing Legs
CentOSStream8_Offline_MsftSdk_x64
CentOSStream9_Offline_MsftSdk_x64
Ubuntu2004_Offline_MsftSdk_x64

Note: There are several other legs failing, but they are caused by different issues.

@MichaelSimons MichaelSimons reopened this May 23, 2023
@jkotas
Member

jkotas commented May 23, 2023

The failure is:

    EXEC : error : Failed to load assembly 'System.CommandLine' [/vmr/src/runtime/artifacts/source-build/self/src/src/coreclr/tools/aot/ILCompiler/ILCompiler.csproj]
      Internal.TypeSystem.TypeSystemException+FileNotFoundException: Failed to load assembly 'System.CommandLine'
         at Internal.TypeSystem.ThrowHelper.ThrowFileNotFoundException(ExceptionStringID, String) + 0x30
         at Internal.TypeSystem.ResolutionFailure.Throw() + 0xfe
         at Internal.TypeSystem.Ecma.EcmaModule.GetObject(EntityHandle, NotFoundBehavior) + 0x9b
         at Internal.TypeSystem.Ecma.EcmaModule.GetType(EntityHandle) + 0x2c
         at Internal.TypeSystem.Ecma.EcmaType.InitializeBaseType() + 0x81
         at Internal.TypeSystem.Ecma.EcmaType.ComputeTypeFlags(TypeFlags) + 0x48
         at Internal.TypeSystem.TypeDesc.InitializeTypeFlags(TypeFlags) + 0x1e
         at Internal.TypeSystem.Ecma.EcmaType.ComputeTypeFlags(TypeFlags) + 0x1b4
         at Internal.TypeSystem.TypeDesc.InitializeTypeFlags(TypeFlags) + 0x1e
         at Internal.TypeSystem.MetadataFieldLayoutAlgorithm.ComputeInstanceLayout(DefType, InstanceLayoutKind) + 0x170
         at Internal.TypeSystem.DefType.ComputeInstanceLayout(InstanceLayoutKind) + 0x60
         at ILCompiler.CompilerTypeSystemContext.EnsureLoadableTypeUncached(TypeDesc) + 0x450
         at Internal.TypeSystem.LockFreeReaderHashtable`2.CreateValueAndEnsureValueIsInTable(TKey) + 0x14
         at ILCompiler.DependencyAnalysis.EETypeNode..ctor(NodeFactory, TypeDesc) + 0x14f
         at ILCompiler.DependencyAnalysis.NodeFactory.CreateConstructedTypeNode(TypeDesc) + 0x81
         at Internal.TypeSystem.LockFreeReaderHashtable`2.CreateValueAndEnsureValueIsInTable(TKey) + 0x14
         at ILCompiler.DependencyAnalysis.NodeFactory.ConstructedTypeSymbol(TypeDesc) + 0x77
         at ILCompiler.DependencyAnalysis.ReflectionInvokeMapNode.AddDependenciesDueToReflectability(DependencyNodeCore`1.DependencyList&, NodeFactory, MethodDesc) + 0x88
         at ILCompiler.MetadataManager.GetDependenciesDueToReflectability(DependencyNodeCore`1.DependencyList&, NodeFactory, MethodDesc) + 0x65
         at ILCompiler.DependencyAnalysis.ReflectedMethodNode.GetStaticDependencies(NodeFactory) + 0x6a
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependenciesImpl(DependencyNodeCore`1) + 0x35
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependencies(DependencyNodeCore`1) + 0x39
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ProcessMarkStack() + 0xb1
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ComputeMarkedNodes() + 0x54
         at ILCompiler.ILScanner.ILCompiler.IILScanner.Scan() + 0x19
         at ILCompiler.Program.<Run>g__RunScanner|3_0(Program.<>c__DisplayClass3_0&) + 0x117
         at ILCompiler.Program.Run() + 0x1637
         at ILCompiler.ILCompilerRootCommand.<>c__DisplayClass206_0.<.ctor>b__0(InvocationContext context) + 0x201
    /vmr/src/runtime/artifacts/source-build/self/package-cache/microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/build/Microsoft.NETCore.Native.targets(270,5): error MSB3073: The command ""/vmr/src/runtime/artifacts/source-build/self/package-cache/runtime.linux-x64.microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/tools/ilc" @"/vmr/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/ILCompiler/x64/Release/native/ilc.ilc.rsp"" exited with code 1. [/vmr/src/runtime/artifacts/source-build/self/src/src/coreclr/tools/aot/ILCompiler/ILCompiler.csproj]

Notice that this is running the .NET 8.0 Preview 4 build from package-cache (package-cache/runtime.linux-x64.microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5). The fix is present in .NET 8.0 Preview 5+ only. The failure should go away once this part of the source build is upgraded to .NET 8.0 Preview 5+.

@jkotas
Member

jkotas commented May 23, 2023

The source build seems to be combining live and prebuilt binaries in a way that is not present in the regular build. Is that intentional? It is the reason why this fails in source build only while the regular build is fine. An alternative way to address this issue sooner may be to make source build closer to how the regular build works.

@MichaelSimons
Member

Can you point me to the indicators that source build seems to be combining live and prebuilt binaries in a way that is not present in the regular build? I don't think this is intentional.

@jkotas
Member

jkotas commented May 24, 2023

Can you point me to the indicators that source build seems to be combining live and prebuilt binaries in a way that is not present in the regular build?
Source build internal CI which runs several distro variants is failing in multiple legs with the original symptoms.

Notice that the CentOSStream8_Online_MsftSdk_x64 build worked fine, but CentOSStream8_Offline_MsftSdk_x64 build failed with the System.CommandLine error.

The relevant part of the CentOSStream8_Online_MsftSdk_x64 build log:

ComputeManagedAssembliesToCompileToNative
    Assembly = /vmr/src/runtime/artifacts/source-build/self/package-cache/microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/build/../tools/netstandard/ILCompiler.Build.Tasks.dll
    Assembly loaded during TaskRun (Build.Tasks.ComputeManagedAssembliesToCompileToNative): System.Reflection.Metadata, Version=6.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a (location: /vmr/src/runtime/artifacts/source-build/self/package-cache/microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/tools/netstandard/System.Reflection.Metadata.dll, MVID: 245bce55-05ea-4d04-adfa-9faa17b201fd, AppDomain: [Default])
    Assembly loaded during TaskRun (Build.Tasks.ComputeManagedAssembliesToCompileToNative): System.Collections.Immutable, Version=6.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a (location: /vmr/src/runtime/artifacts/source-build/self/package-cache/microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/tools/netstandard/System.Collections.Immutable.dll, MVID: 39e02189-e4dd-4b54-8fe2-17d6ac2c6f64, AppDomain: [Default])

The relevant part of the CentOSStream8_Offline_MsftSdk_x64 build log:

ComputeManagedAssembliesToCompileToNative
    Assembly = /vmr/src/runtime/artifacts/source-build/self/package-cache/microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/build/../tools/netstandard/ILCompiler.Build.Tasks.dll
    Assembly loaded during TaskRun (Build.Tasks.ComputeManagedAssembliesToCompileToNative): System.Reflection.Metadata, Version=8.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a (location: /vmr/src/runtime/artifacts/source-build/self/package-cache/microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/tools/netstandard/System.Reflection.Metadata.dll, MVID: 1c82603b-e988-482a-913f-67e66508d167, AppDomain: [Default])
    Assembly loaded during TaskRun (Build.Tasks.ComputeManagedAssembliesToCompileToNative): System.Collections.Immutable, Version=8.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a (location: /vmr/src/runtime/artifacts/source-build/self/package-cache/microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/tools/netstandard/System.Collections.Immutable.dll, MVID: 41f1c2d3-a8f9-4ab4-8eda-52a9000d6cf3, AppDomain: [Default])

The assembly version is Version=6.0.0.0 in the successful build and Version=8.0.0.0 in the failing build. Moreover, the Version=8.0.0.0 assembly is a reference assembly rather than an implementation assembly, which is what breaks the build. Why does the offline build end up with a completely different binary in the preview.4 bits?

@mthalman
Member Author

mthalman commented Jun 6, 2023

Ok, I've tracked down the cause of the use of inconsistent versions. It's due to the bootstrapping logic having a casing typo for the package name. This causes two different names of the Microsoft.DotNet.ILCompiler package (with the same version) to exist in the local package feed. There's apparently some non-determinism in what NuGet will give you in that case which leads to the unpredictable build results. I've fixed this in my testing.
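
NuGet treats package IDs as case-insensitive, so two feed entries that differ only in casing refer to the same package. Here's a minimal sketch of how such collisions could be spotted in a list of package file names. The names PackageCasingCheck and FindCasingCollisions are hypothetical, and the lower-cased file name in the sample is an invented example, since the actual typo isn't shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class PackageCasingCheck
{
    // Groups names case-insensitively and keeps only the groups that contain
    // more than one distinct spelling.
    public static List<string[]> FindCasingCollisions(IEnumerable<string> packageFileNames)
    {
        return packageFileNames
            .GroupBy(name => name, StringComparer.OrdinalIgnoreCase)
            .Select(group => group
                .Distinct(StringComparer.Ordinal)
                .OrderBy(name => name, StringComparer.Ordinal)
                .ToArray())
            .Where(variants => variants.Length > 1)
            .ToList();
    }

    public static void Main()
    {
        // Sample input; the second entry is an assumed casing variant.
        var names = new[]
        {
            "Microsoft.DotNet.ILCompiler.8.0.0-preview.4.23259.5.nupkg",
            "microsoft.dotnet.ilcompiler.8.0.0-preview.4.23259.5.nupkg",
            "Example.Package.1.0.0.nupkg",
        };

        foreach (string[] variants in FindCasingCollisions(names))
            Console.WriteLine("Casing collision: " + string.Join(" vs ", variants));
    }
}
```

In a real check, the input would come from enumerating the local feed directory, for example with Directory.EnumerateFiles.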

However, there is another related failure that we missed earlier through all the noise of the build errors:

Unhandled Exception: System.CommandLine.CommandLineException: Target OS 'centos' is not supported
  at System.CommandLine.Helpers.GetTargetOS(String) + 0x48d
  at System.CommandLine.Argument`1.<>c__DisplayClass5_0.<.ctor>b__1(ArgumentResult argumentResult, Object& value) + 0x20
  at System.CommandLine.Parsing.ArgumentResult.Convert(Argument) + 0x1c6
  at System.CommandLine.Parsing.ArgumentResult.GetArgumentConversionResult() + 0x1c
  at System.CommandLine.Parsing.ParseResultVisitor.ValidateAndConvertArgumentResult(ArgumentResult) + 0x70
  at System.CommandLine.Parsing.ParseResultVisitor.ValidateAndConvertOptionResult(OptionResult) + 0x230
  at System.CommandLine.Parsing.ParseResultVisitor.Stop() + 0x4c
  at System.CommandLine.Parsing.Parser.Parse(IReadOnlyList`1, String) + 0xdb
  at ILCompiler.Program.Main(String[] args) + 0x142
  at ilc!<BaseAddress>+0xf508d5
/tmp/MSBuildTemproot/tmpcb14709e86a14624b5eea8b5646a9ecc.exec.cmd: line 2: 27181 Aborted                 (core dumped) "/vmr/src/runtime/artifacts/source-build/self/package-cache/runtime.linux-x64.microsoft.dotnet.ilcompiler/8.0.0-preview.4.23259.5/tools/ilc" @"/vmr/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/ILCompiler/x64/Release/native/ilc.ilc.rsp"

This occurs when using a source-built SDK to build the VMR. In my test scenario, the RID of the SDK is centos.8. The following logic gets that RID and trims it to set the value of the target OS that is passed to ilc:

<_targetOS>$(RuntimeIdentifier.SubString(0, $(RuntimeIdentifier.LastIndexOf('-'))))</_targetOS>
<_indexOfPeriod>$(_targetOS.IndexOf('.'))</_indexOfPeriod>
<_targetOS Condition="'$(_indexOfPeriod)' &gt; -1">$(_targetOS.SubString(0, $(_indexOfPeriod)))</_targetOS>
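MSBuild property functions call the corresponding .NET string methods, so the three property assignments above can be sketched in plain C# as follows. The class and method names are hypothetical, and the full RID "centos.8-x64" in the usage note is an assumption: the thread only names the RID as centos.8.

```csharp
using System;

public static class TargetOsTrim
{
    // Mirrors the three MSBuild assignments to _targetOS step by step.
    // Assumes the RID contains a '-' before the architecture; without one,
    // LastIndexOf returns -1 and Substring throws.
    public static string FromRuntimeIdentifier(string runtimeIdentifier)
    {
        // 1. Drop the architecture suffix: everything from the last '-' onward.
        string targetOS = runtimeIdentifier.Substring(0, runtimeIdentifier.LastIndexOf('-'));

        // 2. Look for a version separator in what remains.
        int indexOfPeriod = targetOS.IndexOf('.');

        // 3. If there is one, drop the version suffix as well.
        if (indexOfPeriod > -1)
        {
            targetOS = targetOS.Substring(0, indexOfPeriod);
        }

        return targetOS;
    }
}
```

With this sketch, "linux-x64" becomes "linux", while "centos.8-x64" becomes "centos", which is the value that ilc then rejects.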

This fails in the following code because there's no corresponding case for handling a centos value:

public static TargetOS GetTargetOS(string token)
{
    if (string.IsNullOrEmpty(token))
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
            return TargetOS.Windows;
        else if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
            return TargetOS.Linux;
        else if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
            return TargetOS.OSX;
        else if (RuntimeInformation.IsOSPlatform(OSPlatform.FreeBSD))
            return TargetOS.FreeBSD;

        throw new NotImplementedException();
    }

    return token.ToLowerInvariant() switch
    {
        "linux" => TargetOS.Linux,
        "win" or "windows" => TargetOS.Windows,
        "osx" => TargetOS.OSX,
        "freebsd" => TargetOS.FreeBSD,
        "maccatalyst" => TargetOS.MacCatalyst,
        "iossimulator" => TargetOS.iOSSimulator,
        "ios" => TargetOS.iOS,
        "tvossimulator" => TargetOS.tvOSSimulator,
        "tvos" => TargetOS.tvOS,
        _ => throw new CommandLineException($"Target OS '{token}' is not supported")
    };
}

Here's a link to the binlog (internal link) which comes from this build (internal link).

@jkotas - Can you provide your input here? Clearly this code isn't comprehensive with respect to the available RIDs. But it seems broken that it should have to account for everything. Is there a better way to do this?

This issue will delay our ability to remove the NativeAotSupported patch. We need to first release an 8.0 preview version that contains a fix for this target OS issue. Then, in the following preview version, we can remove the patch. The two can't be done in the same release because we need to support a workflow that allows you to build Preview N using the SDK of Preview N-1.

@agocke
Member

agocke commented Jun 6, 2023

I think the RID logic in the targets is broken -- instead of using RuntimeIdentifier for the _targetOS property, that should probably be the portable RID. I forget offhand what the correct property is for that. I'll have to dig it out of the build

@jkotas
Member

jkotas commented Jun 6, 2023

instead of using RuntimeIdentifier for the _targetOS property, that should probably be the portable RID. I forget offhand what the correct property is for that. I'll have to dig it out of the build

Do you mean the NETCoreSdkPortableRuntimeIdentifier property? It is not exactly what we need here. NETCoreSdkPortableRuntimeIdentifier describes the host, but we need the target.

The crossgen2 task has some complicated logic for this that walks the RID graph here: https://github.com/dotnet/sdk/blob/dc8c847472b827dae15b3f865a5615fd7717b83e/src/Tasks/Microsoft.NET.Build.Tasks/ResolveReadyToRunCompilers.cs#L68-L73 . Is there a simpler way to do it?
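For readers unfamiliar with the idea, walking the RID graph roughly means following a distro-specific RID through its fallbacks until a RID the tool supports is reached. The sketch below shows only that general idea. The parent map is a hypothetical, heavily simplified stand-in for the real RID graph, all names are made up for illustration, and this is not the logic in ResolveReadyToRunCompilers.cs.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class RidGraphWalk
{
    // Hypothetical, simplified fallback edges; the real RID graph is far larger.
    public static readonly Dictionary<string, string[]> Parents = new Dictionary<string, string[]>
    {
        ["centos.8-x64"] = new[] { "centos-x64" },
        ["centos-x64"] = new[] { "rhel-x64" },
        ["rhel-x64"] = new[] { "linux-x64" },
        ["linux-x64"] = new[] { "unix-x64" },
        ["unix-x64"] = new[] { "any" },
        ["any"] = Array.Empty<string>(),
    };

    // Breadth-first walk from the given RID toward more generic RIDs.
    // Returns the first RID found in the supported set, or null if none is reachable.
    public static string? FindNearestSupported(string rid, ISet<string> supported)
    {
        var queue = new Queue<string>();
        var seen = new HashSet<string>();
        queue.Enqueue(rid);

        while (queue.Count > 0)
        {
            string current = queue.Dequeue();
            if (!seen.Add(current))
                continue; // already visited

            if (supported.Contains(current))
                return current;

            if (Parents.TryGetValue(current, out string[]? parents))
            {
                foreach (string parent in parents)
                    queue.Enqueue(parent);
            }
        }

        return null;
    }
}
```

Under these assumptions, starting from "centos.8-x64" with a supported set of "linux-x64", "win-x64" and "osx-x64" resolves to "linux-x64", from which a target OS of linux could be derived instead of centos.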

it will delay our ability to remove the NativeAotSupported patch (dotnet/installer#15872). We need to first release an 8.0 preview version that contains a fix for this target OS issue.

The complete removal of this patch is going to depend on enabling native AOT for source build - tracked by dotnet/source-build#1215 (comment) . If you would really like to see this patch gone, we can check it in to dotnet/runtime - we have a number of similar conditions throughout the repo. It does not hurt anything.

@mthalman
Member Author

mthalman commented Jun 7, 2023

The complete removal of this patch is going to depend on enabling native AOT for source build - tracked by dotnet/source-build#1215 (comment) . If you would really like to see this patch gone, we can check it in to dotnet/runtime - we have a number of similar conditions throughout the repo. It does not hurt anything.

I think that makes sense to do then. We're making a push to eliminate all patches for source-build.

jkotas added a commit to jkotas/runtime that referenced this issue Jun 7, 2023
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jun 7, 2023
jkotas added a commit that referenced this issue Jun 7, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 7, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jul 8, 2023