Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSBuild.exe spuriously dies with BadImageFormatException on BuildEnvironmentHelper.get_Instance() #6970

Closed
jzabroski opened this issue Oct 18, 2021 · 15 comments
Labels
bug closed-by-bot Closed due to lack of activity needs-more-info Issues that need more info to continue investigation. stale For issues that haven't had activity in some time. triaged

Comments

@jzabroski
Copy link

Issue Description

I periodically get MSBuild failures. I previously reported a separate issue that was closed due to lack of reproducibility. However, this time, the Windows Event Log dump seems more helpful than previously. See Actual Behavior below.

Steps to Reproduce

I have no idea. Feeling a bit like Steve Martin in THE JERK right now just opening this ticket, but it needs fixing.

Expected Behavior

No Crashes

Actual Behavior

MSBuild.exe crashes. Windows Event Log has the following error messages:

Error	10/18/2021 7:28:28 PM	Application Error	1000	(100)
Faulting application name: MSBuild.exe, version: 16.10.2.30804, time stamp: 0xe420bb9d
Faulting module name: KERNELBASE.dll, version: 10.0.14393.4350, time stamp: 0x606eb12a
Exception code: 0xe0434352
Fault offset: 0x000dc602
Faulting process id: 0x4f18
Faulting application start time: 0x01d7c477dac329c3
Faulting application path: C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\MSBuild\Current\Bin\MSBuild.exe
Faulting module path: C:\Windows\System32\KERNELBASE.dll
Report Id: e3f0efcd-1087-488f-a7be-497ecb67a096
Faulting package full name: 
Faulting package-relative application ID: 
Error	10/18/2021 7:28:28 PM	.NET Runtime	1026	None
Application: MSBuild.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.BadImageFormatException
   at Microsoft.Build.Shared.BuildEnvironmentHelper.get_Instance()
   at Microsoft.Build.Utilities.Traits.get_Instance()
   at Microsoft.Build.CommandLine.MSBuildApp.Execute(System.String)
   at Microsoft.Build.CommandLine.MSBuildApp.Main()
Error	10/18/2021 7:27:58 PM	Application Error	1000	(100)
Faulting application name: MSBuild.exe, version: 16.10.2.30804, time stamp: 0xe420bb9d
Faulting module name: KERNELBASE.dll, version: 10.0.14393.4350, time stamp: 0x606eb12a
Exception code: 0xe0434352
Fault offset: 0x000dc602
Faulting process id: 0x3e60
Faulting application start time: 0x01d7c477c8e27647
Faulting application path: C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\MSBuild\Current\Bin\MSBuild.exe
Faulting module path: C:\Windows\System32\KERNELBASE.dll
Report Id: 89447d74-5d51-4095-8f31-0aeae7ea35ae
Faulting package full name: 
Faulting package-relative application ID: 
Error	10/18/2021 7:27:58 PM	.NET Runtime	1026	None
Application: MSBuild.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.BadImageFormatException
   at Microsoft.Build.Shared.BuildEnvironmentHelper.get_Instance()
   at Microsoft.Build.Utilities.Traits.get_Instance()
   at Microsoft.Build.CommandLine.MSBuildApp.Execute(System.String)
   at Microsoft.Build.CommandLine.MSBuildApp.Main()

Analysis

It might be inside BuildEnvironmentHelper.Instance property. There are 5 places inside MSBuildApp that call this helper Instance property, see https://github.com/dotnet/msbuild/blob/bbb9655b007be6d079985f3a7ec14f5d82a18f64/src/MSBuild/XMake.cs

Versions & Configurations

MSBuild.exe, version: 16.10.2.30804

Attach a binlog

How do I attach a binlog if most often the process that crashes is a background process created by VS Tools? https://docs.microsoft.com/en-us/visualstudio/msbuild/obtaining-build-logs-with-msbuild?view=vs-2019 is actually not at all helpful in this regard.

@jzabroski jzabroski added bug needs-triage Have yet to determine what bucket this goes in. labels Oct 18, 2021
@jzabroski
Copy link
Author

jzabroski commented Oct 18, 2021

Actually, it looks like any line here could cause this issue:

private static BuildEnvironment Initialize()
{
// See https://github.com/Microsoft/msbuild/issues/1461 for specification of ordering and details.
var possibleLocations = new Func<BuildEnvironment>[]
{
TryFromEnvironmentVariable,
TryFromVisualStudioProcess,
TryFromMSBuildProcess,
TryFromMSBuildAssembly,
TryFromDevConsole,
TryFromSetupApi,
TryFromAppContextBaseDirectory
};
foreach (var location in possibleLocations)
{
var env = location();
if (env != null)
return env;
}
// If we can't find a suitable environment, continue in the 'None' mode. If not running tests,
// we will use the current running process for the CurrentMSBuildExePath value. This is likely
// wrong, but many things use the CurrentMSBuildToolsDirectory value which must be set for basic
// functionality to work.
//
// If we are running tests, then the current running process may be a test runner located in the
// NuGet packages folder. So in that case, we use the location of the current assembly, which
// will be in the output path of the test project, which is what we want.
string msbuildExePath;
if (s_runningTests())
{
msbuildExePath = typeof(BuildEnvironmentHelper).Assembly.Location;
}
else
{
msbuildExePath = s_getProcessFromRunningProcess();
}
return new BuildEnvironment(
BuildEnvironmentMode.None,
msbuildExePath,
runningTests: s_runningTests(),
runningInVisualStudio: false,
visualStudioPath: null);
}
private static BuildEnvironment TryFromEnvironmentVariable()
{
var msBuildExePath = s_getEnvironmentVariable("MSBUILD_EXE_PATH");
return msBuildExePath == null
? null
: TryFromMSBuildExeUnderVisualStudio(msBuildExePath, allowLegacyToolsVersion: true) ?? TryFromStandaloneMSBuildExe(msBuildExePath);
}
private static BuildEnvironment TryFromVisualStudioProcess()
{
if (!NativeMethodsShared.IsWindows)
return null;
var vsProcess = s_getProcessFromRunningProcess();
if (!IsProcessInList(vsProcess, s_visualStudioProcess)) return null;
var vsRoot = FileUtilities.GetFolderAbove(vsProcess, 3);
string msBuildExe = GetMSBuildExeFromVsRoot(vsRoot);
return new BuildEnvironment(
BuildEnvironmentMode.VisualStudio,
msBuildExe,
runningTests: false,
runningInVisualStudio: true,
visualStudioPath: vsRoot);
}
private static BuildEnvironment TryFromMSBuildProcess()
{
var msBuildExe = s_getProcessFromRunningProcess();
if (!IsProcessInList(msBuildExe, s_msBuildProcess)) return null;
// First check if we're in a VS installation
if (NativeMethodsShared.IsWindows &&
Regex.IsMatch(msBuildExe, $@".*\\MSBuild\\{CurrentToolsVersion}\\Bin\\.*MSBuild(?:TaskHost)?\.exe", RegexOptions.IgnoreCase))
{
return new BuildEnvironment(
BuildEnvironmentMode.VisualStudio,
msBuildExe,
runningTests: false,
runningInVisualStudio: false,
visualStudioPath: GetVsRootFromMSBuildAssembly(msBuildExe));
}
// Standalone mode running in MSBuild.exe
return new BuildEnvironment(
BuildEnvironmentMode.Standalone,
msBuildExe,
runningTests: false,
runningInVisualStudio: false,
visualStudioPath: null);
}
private static BuildEnvironment TryFromMSBuildAssembly()
{
var buildAssembly = s_getExecutingAssemblyPath();
if (buildAssembly == null) return null;
// Check for MSBuild.[exe|dll] next to the current assembly
var msBuildExe = Path.Combine(FileUtilities.GetFolderAbove(buildAssembly), "MSBuild.exe");
var msBuildDll = Path.Combine(FileUtilities.GetFolderAbove(buildAssembly), "MSBuild.dll");
// First check if we're in a VS installation
var environment = TryFromMSBuildExeUnderVisualStudio(msBuildExe);
if (environment != null)
{
return environment;
}
// We're not in VS, check for MSBuild.exe / dll to consider this a standalone environment.
string msBuildPath = null;
if (FileSystems.Default.FileExists(msBuildExe)) msBuildPath = msBuildExe;
else if (FileSystems.Default.FileExists(msBuildDll)) msBuildPath = msBuildDll;
if (!string.IsNullOrEmpty(msBuildPath))
{
// Standalone mode with toolset
return new BuildEnvironment(
BuildEnvironmentMode.Standalone,
msBuildPath,
runningTests: s_runningTests(),
runningInVisualStudio: false,
visualStudioPath: null);
}
return null;
}
private static BuildEnvironment TryFromMSBuildExeUnderVisualStudio(string msbuildExe, bool allowLegacyToolsVersion = false)
{
string msBuildPathPattern = allowLegacyToolsVersion
? $@".*\\MSBuild\\({CurrentToolsVersion}|\d+\.0)\\Bin\\.*"
: $@".*\\MSBuild\\{CurrentToolsVersion}\\Bin\\.*";
if (NativeMethodsShared.IsWindows &&
Regex.IsMatch(msbuildExe, msBuildPathPattern, RegexOptions.IgnoreCase))
{
string visualStudioRoot = GetVsRootFromMSBuildAssembly(msbuildExe);
return new BuildEnvironment(
BuildEnvironmentMode.VisualStudio,
GetMSBuildExeFromVsRoot(visualStudioRoot),
runningTests: s_runningTests(),
runningInVisualStudio: false,
visualStudioPath: visualStudioRoot);
}
return null;
}
private static BuildEnvironment TryFromDevConsole()
{
if (s_runningTests())
{
// If running unit tests, then don't try to get the build environment from MSBuild installed on the machine
// (we should be using the locally built MSBuild instead)
return null;
}
// VSINSTALLDIR and VisualStudioVersion are set from the Developer Command Prompt.
var vsInstallDir = s_getEnvironmentVariable("VSINSTALLDIR");
var vsVersion = s_getEnvironmentVariable("VisualStudioVersion");
if (string.IsNullOrEmpty(vsInstallDir) || string.IsNullOrEmpty(vsVersion) ||
vsVersion != CurrentVisualStudioVersion || !FileSystems.Default.DirectoryExists(vsInstallDir)) return null;
return new BuildEnvironment(
BuildEnvironmentMode.VisualStudio,
GetMSBuildExeFromVsRoot(vsInstallDir),
runningTests: false,
runningInVisualStudio: false,
visualStudioPath: vsInstallDir);
}
private static BuildEnvironment TryFromSetupApi()
{
if (s_runningTests())
{
// If running unit tests, then don't try to get the build environment from MSBuild installed on the machine
// (we should be using the locally built MSBuild instead)
return null;
}
Version v = new Version(CurrentVisualStudioVersion);
var instances = s_getVisualStudioInstances()
.Where(i => i.Version.Major == v.Major && FileSystems.Default.DirectoryExists(i.Path))
.ToList();
if (instances.Count == 0) return null;
if (instances.Count > 1)
{
// TODO: Warn user somehow. We may have picked the wrong one.
}
return new BuildEnvironment(
BuildEnvironmentMode.VisualStudio,
GetMSBuildExeFromVsRoot(instances[0].Path),
runningTests: false,
runningInVisualStudio: false,
visualStudioPath: instances[0].Path);
}
private static BuildEnvironment TryFromAppContextBaseDirectory()
{
// Assemblies compiled against anything older than .NET 4.0 won't have a System.AppContext
// Try the base directory that the assembly resolver uses to probe for assemblies.
// Under certain scenarios the assemblies are loaded from spurious locations like the NuGet package cache
// but the toolset files are copied to the app's directory via "contentFiles".
var appContextBaseDirectory = s_getAppContextBaseDirectory();
if (string.IsNullOrEmpty(appContextBaseDirectory)) return null;
// Look for possible MSBuild exe names in the AppContextBaseDirectory
return s_msBuildExeNames
.Select((name) => TryFromStandaloneMSBuildExe(Path.Combine(appContextBaseDirectory, name)))
.FirstOrDefault(env => env != null);
}
private static BuildEnvironment TryFromStandaloneMSBuildExe(string msBuildExePath)
{
if (!string.IsNullOrEmpty(msBuildExePath) && FileSystems.Default.FileExists(msBuildExePath))
{
// MSBuild.exe was found outside of Visual Studio. Assume Standalone mode.
return new BuildEnvironment(
BuildEnvironmentMode.Standalone,
msBuildExePath,
runningTests: s_runningTests(),
runningInVisualStudio: false,
visualStudioPath: null);
}
return null;
}

Yikes. How is Microsoft testing this huge dependency tree? It seems impossible. What human being understands what this does?

It of course would be great first step if the Try methods reported their own exceptions, rather than reporting BuildEnvironment singleton failed to create via a cryptic TypeLoadInitializationException - but I dont understand why this manifests as a BadImageFormatException. Have not studied the code close enough to figure that out yet.

@jzabroski
Copy link
Author

It seems BadImageFormatException can happen if a bad assembly is loaded. It would seem to therefore make some sense if MSBuild were to be improved to read the assembly metadata and determine if the types needed are from the right assembly, but I have not really written such code myself, so that is just a working theory. In any case, it seems this probing logic is rather unintelligent and could be made more debuggable.

@Forgind
Copy link
Member

Forgind commented Oct 27, 2021

That explanation seems plausible. I'm assuming it was originally written that way to try to have neat code, but it would be cleaner (in my opinion) like:

BuildEnvironment env = TryFromEnvironmentVariable() ??
                           TryFromVisualStudioProcess() ??
                           TryFromMSBuildProcess() ??
                           TryFromDevConsole() ??
                           TryFromSetupApi() ??
                           TryFromAppContextBaseDirectory();

if (env is not null)
{
    return env;
}
...

If you also have get_Instance check whether it was already initialized and return if so, else Initialize(), I think it would have the added benefit of giving you a more complete call stack, which should help with figuring out what the real root of this problem is. It might be worth trying that with your repro.

@jzabroski
Copy link
Author

jzabroski commented Oct 27, 2021

I was wondering if just having a static constructor on BuildEnvironment that does nothing important would also avoid the bad image format exception and a more useful exception. But that is a gray area in my knowledge of how the runtime handles these. To be honest, I try not to ever write code that can fail in a ctor. The design pattern I was taught in my 20s was "create, set, use".

@jzabroski
Copy link
Author

@Forgind I thought about your proposal a bit more, and I suspect that re-writing it that way won't solve the issue, but it may change the windows event log exception that is logged into something more actionable. It would also make the code more readable.

That said, if it were me, I would seek to write this code in such a way that de-coupled candidate paths from side effects. One possible solution would be to write the beginning of the method this way:

        private static BuildEnvironment Initialize()
        {
            // See https://github.com/Microsoft/msbuild/issues/1461 for specification of ordering and details.
            var env = TryFromEnvironmentVariable() ??
                           TryFromVisualStudioProcess() ??
                           TryFromMSBuildProcess() ??
                           TryFromDevConsole() ??
                           TryFromSetupApi() ??
                           TryFromAppContextBaseDirectory();

            if (env != null)
                 return env;

            // If we can't find a suitable environment, continue in the 'None' mode. If not running tests,
            // we will use the current running process for the CurrentMSBuildExePath value.  This is likely
            // wrong, but many things use the CurrentMSBuildToolsDirectory value which must be set for basic
            // functionality to work.
            //
            // If we are running tests, then the current running process may be a test runner located in the
            // NuGet packages folder.  So in that case, we use the location of the current assembly, which
            // will be in the output path of the test project, which is what we want.

            string msbuildExePath;
            if (s_runningTests())
            {
                msbuildExePath = typeof(BuildEnvironmentHelper).Assembly.Location;
            }

Separately, the following is a bit of duplicate code compared to other parts of code in the same file:

// Check for MSBuild.[exe|dll] next to the current assembly
var msBuildExe = Path.Combine(FileUtilities.GetFolderAbove(buildAssembly), "MSBuild.exe");
var msBuildDll = Path.Combine(FileUtilities.GetFolderAbove(buildAssembly), "MSBuild.dll");

// Look for possible MSBuild exe names in the AppContextBaseDirectory
return s_msBuildExeNames
.Select((name) => TryFromStandaloneMSBuildExe(Path.Combine(appContextBaseDirectory, name)))
.FirstOrDefault(env => env != null);

Although, I must say, it's strange that on lines 295-298, the method is saying it's searching for MSBuild.exe, but it's actually searching for both MSBuild.exe and MSBuild.dll. I don't know - this code is Gone Wild.

What do you think? CC @rainersigwald

As you can see in #6404 I have been having this problem quite a bit, dating back to at least May 3rd, 2021.

@jzabroski
Copy link
Author

I also think that these lines should be their own method, so that the Initialize thing can be made even clearer/simpler:

// If we can't find a suitable environment, continue in the 'None' mode. If not running tests,
// we will use the current running process for the CurrentMSBuildExePath value. This is likely
// wrong, but many things use the CurrentMSBuildToolsDirectory value which must be set for basic
// functionality to work.
//
// If we are running tests, then the current running process may be a test runner located in the
// NuGet packages folder. So in that case, we use the location of the current assembly, which
// will be in the output path of the test project, which is what we want.
string msbuildExePath;
if (s_runningTests())
{
msbuildExePath = typeof(BuildEnvironmentHelper).Assembly.Location;
}
else
{
msbuildExePath = s_getProcessFromRunningProcess();
}
return new BuildEnvironment(
BuildEnvironmentMode.None,
msbuildExePath,
runningTests: s_runningTests(),
runningInVisualStudio: false,
visualStudioPath: null);

        private static BuildEnvironment TryFromNoneMode()
        {
            // If we can't find a suitable environment, continue in the 'None' mode. If not running tests,
            // we will use the current running process for the CurrentMSBuildExePath value.  This is likely
            // wrong, but many things use the CurrentMSBuildToolsDirectory value which must be set for basic
            // functionality to work.
            //
            // If we are running tests, then the current running process may be a test runner located in the
            // NuGet packages folder.  So in that case, we use the location of the current assembly, which
            // will be in the output path of the test project, which is what we want.

            string msbuildExePath;
            if (s_runningTests())
            {
                msbuildExePath = typeof(BuildEnvironmentHelper).Assembly.Location;
            }
            else
            {
                msbuildExePath = s_getProcessFromRunningProcess();
            }

            return new BuildEnvironment(
                BuildEnvironmentMode.None,
                msbuildExePath,
                runningTests: s_runningTests(),
                runningInVisualStudio: false,
                visualStudioPath: null);
        }

which would then make your initialization code read like this:

        private static BuildEnvironment Initialize()
        {
            // See https://github.com/Microsoft/msbuild/issues/1461 for specification of ordering and details.
            return TryFromEnvironmentVariable() ??
                           TryFromVisualStudioProcess() ??
                           TryFromMSBuildProcess() ??
                           TryFromDevConsole() ??
                           TryFromSetupApi() ??
                           TryFromAppContextBaseDirectory() ??
                           TryFromNoneMode();
        }

@jzabroski
Copy link
Author

Separately, I do think it's a bit weird 'running tests' is not an environment mode, and is assigned to the wonky state of None, but whatever.

@Forgind
Copy link
Member

Forgind commented Nov 1, 2021

Those refactorings look good to me. As you said, they wouldn't resolve the problem, but they should make it a good bit easier to figure out what exactly the problem is.

This code, like most code, was written piecemeal, so I'm sure there are places where comments don't quite line up or where we do duplicate work. I'd be a little careful around duplicate work just because I seem to remember there was some "duplicate work" I found recently that we stopped doing. It promptly broke, so we started doing it again. That said, I'm always happy to not do unnecessary work.

I am wondering about the "TryFromNoneMode" name...I'm not sure what a better name would be, but it doesn't feel quite right, since you're not trying to use None mode the same way you're looking in an environment variable, MSBuild process, etc. "TryConstructInDefaultMode"?

@jzabroski
Copy link
Author

None Mode is very much just as magical as all this other stuff.

@jzabroski
Copy link
Author

TryFromNoneMode could be named ContinueInNoneMode, as that is what the original comments suggest:

If we can't find a suitable environment, continue in the 'None' mode.

@Forgind
Copy link
Member

Forgind commented Nov 2, 2021

I'm not too worried about the name. If you just use it to help with debugging, it doesn't really matter what it is; if you submit it as a PR, I'm sure someone else will have an opinion 🙂

@jzabroski
Copy link
Author

Happened again with the latest Visual Studio.

Windows Event Log entries were:

Application: MSBuild.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.BadImageFormatException
   at Microsoft.Build.Shared.BuildEnvironmentHelper.get_Instance()
   at Microsoft.Build.Utilities.Traits.get_Instance()
   at Microsoft.Build.CommandLine.MSBuildApp.Execute(System.String)
   at Microsoft.Build.CommandLine.MSBuildApp.Main()
Faulting application name: MSBuild.exe, version: 16.11.1.47101, time stamp: 0xa5031179
Faulting module name: KERNELBASE.dll, version: 10.0.14393.4350, time stamp: 0x606eb12a
Exception code: 0xe0434352
Fault offset: 0x000dc602
Faulting process id: 0x51f0
Faulting application start time: 0x01d7d0e715245d9f
Faulting application path: C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\MSBuild\Current\Bin\MSBuild.exe
Faulting module path: C:\Windows\System32\KERNELBASE.dll
Report Id: 7f2151a4-1fc9-43ce-9957-88c51a965c22
Faulting package full name: 
Faulting package-relative application ID: 

@rainersigwald
Copy link
Member

The callstack is not very helpful here. Can you try getting a memory dump of the crash (perhaps with procdump.exe as a postmortem debugger?

@benvillalobos benvillalobos added the needs-more-info Issues that need more info to continue investigation. label Nov 11, 2021
@ghost ghost removed the needs-triage Have yet to determine what bucket this goes in. label Nov 11, 2021
@ghost ghost added the stale For issues that haven't had activity in some time. label Dec 12, 2021
@ghost
Copy link

ghost commented Dec 12, 2021

This issue is marked as stale because feedback has been requested for 30 days with no response. Please respond within 14 days or this issue will be closed due to inactivity.

@ghost ghost closed this as completed Dec 26, 2021
@ghost
Copy link

ghost commented Dec 26, 2021

This issue was closed due to inactivity. If you can still reproduce this bug, please comment with the requested information, detailed steps to reproduce the problem, or any other notes that might help in the investigation.

@ghost ghost added the closed-by-bot Closed due to lack of activity label Dec 26, 2021
@AR-May AR-May added the triaged label Feb 21, 2024
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug closed-by-bot Closed due to lack of activity needs-more-info Issues that need more info to continue investigation. stale For issues that haven't had activity in some time. triaged
Projects
None yet
Development

No branches or pull requests

5 participants