-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSBuild.exe spuriously dies with BadImageFormatException on BuildEnvironmentHelper.get_Instance() #6970
Comments
Actually, it looks like any line here could cause this issue: msbuild/src/Shared/BuildEnvironmentHelper.cs Lines 74 to 315 in bbb9655
Yikes. How is Microsoft testing this huge dependency tree? It seems impossible. What human being understands what this does? It of course would be great first step if the Try methods reported their own exceptions, rather than reporting BuildEnvironment singleton failed to create via a cryptic TypeLoadInitializationException - but I dont understand why this manifests as a BadImageFormatException. Have not studied the code close enough to figure that out yet. |
It seems BadImageFormatException can happen if a bad assembly is loaded. It would seem to therefore make some sense if MSBuild were to be improved to read the assembly metadata and determine if the types needed are from the right assembly, but I have not really written such code myself, so that is just a working theory. In any case, it seems this probing logic is rather unintelligent and could be made more debuggable. |
That explanation seems plausible. I'm assuming it was originally written that way to try to have neat code, but it would be cleaner (in my opinion) like: BuildEnvironment env = TryFromEnvironmentVariable() ??
TryFromVisualStudioProcess() ??
TryFromMSBuildProcess() ??
TryFromDevConsole() ??
TryFromSetupApi() ??
TryFromAppContextBaseDirectory();
if (env is not null)
{
return env;
}
... If you also have get_Instance check whether it was already initialized and return if so, else Initialize(), I think it would have the added benefit of giving you a more complete call stack, which should help with figuring out what the real root of this problem is. It might be worth trying that with your repro. |
I was wondering if just having a static constructor on BuildEnvironment that does nothing important would also avoid the bad image format exception and a more useful exception. But that is a gray area in my knowledge of how the runtime handles these. To be honest, I try not to ever write code that can fail in a ctor. The design pattern I was taught in my 20s was "create, set, use". |
@Forgind I thought about your proposal a bit more, and I suspect that re-writing it that way won't solve the issue, but it may change the windows event log exception that is logged into something more actionable. It would also make the code more readable. That said, if it were me, I would seek to write this code in such a way that de-coupled candidate paths from side effects. One possible solution would be to write the beginning of the method this way: private static BuildEnvironment Initialize()
{
// See https://github.com/Microsoft/msbuild/issues/1461 for specification of ordering and details.
var env = TryFromEnvironmentVariable() ??
TryFromVisualStudioProcess() ??
TryFromMSBuildProcess() ??
TryFromDevConsole() ??
TryFromSetupApi() ??
TryFromAppContextBaseDirectory();
if (env != null)
return env;
// If we can't find a suitable environment, continue in the 'None' mode. If not running tests,
// we will use the current running process for the CurrentMSBuildExePath value. This is likely
// wrong, but many things use the CurrentMSBuildToolsDirectory value which must be set for basic
// functionality to work.
//
// If we are running tests, then the current running process may be a test runner located in the
// NuGet packages folder. So in that case, we use the location of the current assembly, which
// will be in the output path of the test project, which is what we want.
string msbuildExePath;
if (s_runningTests())
{
msbuildExePath = typeof(BuildEnvironmentHelper).Assembly.Location;
} Separately, the following is a bit of duplicate code compared to other parts of code in the same file: msbuild/src/Shared/BuildEnvironmentHelper.cs Lines 181 to 183 in bbb9655
msbuild/src/Shared/BuildEnvironmentHelper.cs Lines 295 to 298 in bbb9655
Although, I must say, it's strange that on lines 295-298, the method is saying it's searching for MSBuild.exe, but it's actually searching for both MSBuild.exe and MSBuild.dll. I don't know - this code is Gone Wild. What do you think? CC @rainersigwald As you can see in #6404 I have been having this problem quite a bit, dating back to at least May 3rd, 2021. |
I also think that these lines should be their own method, so that the Initialize thing can be made even clearer/simpler: msbuild/src/Shared/BuildEnvironmentHelper.cs Lines 95 to 119 in bbb9655
private static BuildEnvironment TryFromNoneMode()
{
// If we can't find a suitable environment, continue in the 'None' mode. If not running tests,
// we will use the current running process for the CurrentMSBuildExePath value. This is likely
// wrong, but many things use the CurrentMSBuildToolsDirectory value which must be set for basic
// functionality to work.
//
// If we are running tests, then the current running process may be a test runner located in the
// NuGet packages folder. So in that case, we use the location of the current assembly, which
// will be in the output path of the test project, which is what we want.
string msbuildExePath;
if (s_runningTests())
{
msbuildExePath = typeof(BuildEnvironmentHelper).Assembly.Location;
}
else
{
msbuildExePath = s_getProcessFromRunningProcess();
}
return new BuildEnvironment(
BuildEnvironmentMode.None,
msbuildExePath,
runningTests: s_runningTests(),
runningInVisualStudio: false,
visualStudioPath: null);
} which would then make your initialization code read like this: private static BuildEnvironment Initialize()
{
// See https://github.com/Microsoft/msbuild/issues/1461 for specification of ordering and details.
return TryFromEnvironmentVariable() ??
TryFromVisualStudioProcess() ??
TryFromMSBuildProcess() ??
TryFromDevConsole() ??
TryFromSetupApi() ??
TryFromAppContextBaseDirectory() ??
TryFromNoneMode();
} |
Separately, I do think it's a bit weird 'running tests' is not an environment mode, and is assigned to the wonky state of None, but whatever. |
Those refactorings look good to me. As you said, they wouldn't resolve the problem, but they should make it a good bit easier to figure out what exactly the problem is. This code, like most code, was written piecemeal, so I'm sure there are places where comments don't quite line up or where we do duplicate work. I'd be a little careful around duplicate work just because I seem to remember there was some "duplicate work" I found recently that we stopped doing. It promptly broke, so we started doing it again. That said, I'm always happy to not do unnecessary work. I am wondering about the "TryFromNoneMode" name...I'm not sure what a better name would be, but it doesn't feel quite right, since you're not trying to use None mode the same way you're looking in an environment variable, MSBuild process, etc. "TryConstructInDefaultMode"? |
None Mode is very much just as magical as all this other stuff. |
|
I'm not too worried about the name. If you just use it to help with debugging, it doesn't really matter what it is; if you submit it as a PR, I'm sure someone else will have an opinion 🙂 |
Happened again with the latest Visual Studio. Windows Event Log entries were:
|
The callstack is not very helpful here. Can you try getting a memory dump of the crash (perhaps with |
This issue is marked as stale because feedback has been requested for 30 days with no response. Please respond within 14 days or this issue will be closed due to inactivity. |
This issue was closed due to inactivity. If you can still reproduce this bug, please comment with the requested information, detailed steps to reproduce the problem, or any other notes that might help in the investigation. |
Issue Description
I periodically get MSBuild failures. I previously reported a separate issue that was closed due to lack of reproducibility. However, this time, the Windows Event Log dump seems more helpful than previously. See Actual Behavior below.
Steps to Reproduce
I have no idea. Feeling a bit like Steve Martin in THE JERK right now just opening this ticket, but it needs fixing.
Expected Behavior
No Crashes
Actual Behavior
MSBuild.exe crashes. Windows Event Log has the following error messages:
Analysis
It might be inside BuildEnvironmentHelper.Instance property. There are 5 places inside MSBuildApp that call this helper Instance property, see https://github.com/dotnet/msbuild/blob/bbb9655b007be6d079985f3a7ec14f5d82a18f64/src/MSBuild/XMake.cs
Versions & Configurations
MSBuild.exe, version: 16.10.2.30804
Attach a binlog
How do I attach a binlog if most often the process that crashes is a background process created by VS Tools? https://docs.microsoft.com/en-us/visualstudio/msbuild/obtaining-build-logs-with-msbuild?view=vs-2019 is actually not at all helpful in this regard.
The text was updated successfully, but these errors were encountered: