[build] Shutdown/kill any build servers at the end of the build. #21315

Merged
rolfbjarne merged 1 commit into main from dev/rolf/dotnet-shutdown-build-server-at-end
Sep 30, 2024

Conversation

@rolfbjarne
Member

@rolfbjarne rolfbjarne commented Sep 27, 2024

This is a log from our bots; note the 14-minute gap just before the timing results are printed:

[...]
2024-09-27T07:34:00.3958920Z Making install in dotnet
2024-09-27T07:34:01.7633820Z Validated file permissions for Xamarin.Mac.
2024-09-27T07:34:01.7800150Z Validated file permissions for Xamarin.iOS.
2024-09-27T07:34:01.7825300Z
2024-09-27T07:34:01.7872490Z 	Xamarin.iOS has not been installed into your system by 'make install'
2024-09-27T07:34:01.7918570Z 	In order to set the currently built Xamarin.iOS as your system version,
2024-09-27T07:34:01.7965090Z 	execute 'make install-system'.
2024-09-27T07:34:01.7987920Z
2024-09-27T07:34:01.8034290Z 	Xamarin.Mac has not been installed into your system by 'make install'
2024-09-27T07:34:01.8080260Z 	In order to set the currently built Xamarin.Mac as your system version,
2024-09-27T07:34:01.8126200Z 	execute 'make install-system'.
2024-09-27T07:34:01.8148530Z
2024-09-27T07:48:22.3100850Z
2024-09-27T07:48:22.3102130Z real	15m26.160s
2024-09-27T07:48:22.3102800Z user	1m4.044s
2024-09-27T07:48:22.3103270Z sys	0m18.379s
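
The gap mentioned above can be checked from the two timestamps on either side of it. This is a small sketch, not part of the PR; the helper name `parse_log_timestamp` is made up for illustration, and the two timestamps are copied from the log.

```python
from datetime import datetime


def parse_log_timestamp(stamp: str) -> datetime:
    """Parse a timestamp such as '2024-09-27T07:48:22.3100850Z'."""
    # The log uses 7 fractional digits; %f accepts at most 6, so drop the
    # trailing 'Z' and the extra digit before parsing.
    whole, fraction = stamp.rstrip("Z").split(".")
    return datetime.strptime(f"{whole}.{fraction[:6]}", "%Y-%m-%dT%H:%M:%S.%f")


# Last line printed before the pause, and first line printed after it.
last_output = parse_log_timestamp("2024-09-27T07:34:01.8148530Z")
next_output = parse_log_timestamp("2024-09-27T07:48:22.3100850Z")

gap = next_output - last_output
minutes, seconds = divmod(gap.total_seconds(), 60)
print(f"idle gap: {int(minutes)}m{seconds:.1f}s")
```

The result is a little over 14 minutes, which accounts for most of the 15m26s reported as `real` time.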

What happens is this:

  • We're using parallel make, which starts a jobserver managed through file descriptors. These file descriptors must be closed in all subprocesses before make realizes it's done.
  • Any 'dotnet build' might start a build server.
  • The build server does not close any file descriptors it may have inherited when daemonizing itself.
  • Thus the build server (which will still be alive after we're done building here) might hold open a file descriptor that make is waiting for.
  • The proper fix is to make the build server close its file descriptors.
  • The interim workaround is to shut down the build server instead.

This will save 10-15 minutes at the end of every build in the bots.
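
The mechanism in the list above can be reproduced on a small scale with a plain pipe. The sketch below is a stand-in, not GNU Make's jobserver or the dotnet build server: the reader plays the role of make, the pipe plays the role of the jobserver descriptors, and a sleeping child process plays the role of a server that outlives the build. The function name `seconds_until_eof` is hypothetical, and `pass_fds` is POSIX-only.

```python
import os
import subprocess
import sys
import time


def seconds_until_eof(child_lifetime: float, inherit: bool) -> float:
    """Return how long a reader waits for EOF on a pipe.

    If `inherit` is true, a child process keeps a copy of the pipe's write
    end open for `child_lifetime` seconds, the way a lingering server would.
    """
    read_fd, write_fd = os.pipe()
    # Python closes extra descriptors in children by default; pass_fds opts
    # this one back in to mimic a process that inherits it.
    pass_fds = (write_fd,) if inherit else ()
    child = subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({child_lifetime})"],
        pass_fds=pass_fds,
    )
    os.close(write_fd)  # the parent's own copy is closed right away

    start = time.monotonic()
    # os.read returns b"" (EOF) only once every copy of the write end is closed.
    data = os.read(read_fd, 1)
    elapsed = time.monotonic() - start

    os.close(read_fd)
    child.wait()
    assert data == b""
    return elapsed


if __name__ == "__main__":
    print(f"child inherits the pipe: {seconds_until_eof(1.0, inherit=True):.2f}s")
    print(f"child does not inherit:  {seconds_until_eof(1.0, inherit=False):.2f}s")
```

When the child inherits the descriptor, the reader is stuck for about as long as the child lives; otherwise it sees EOF almost immediately. On the bots the "child" is a build server that stays alive for many minutes.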

@vs-mobiletools-engineering-service2
Collaborator

📚 [CI Build] Artifacts 📚

Artifacts were not provided.

Pipeline on Agent
Hash: [PR build]

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build] Tests on macOS X64 - Mac Sonoma (14) passed 💻

All tests on macOS X64 - Mac Sonoma (14) passed.

Pipeline on Agent
Hash: d74d61c20d441ecdcbf1c19809314a15c22bce06 [PR build]

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build] Tests on macOS M1 - Mac Big Sur (11) passed 💻

All tests on macOS M1 - Mac Big Sur (11) passed.

Pipeline on Agent
Hash: d74d61c20d441ecdcbf1c19809314a15c22bce06 [PR build]

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build] Tests on macOS M1 - Mac Monterey (12) passed 💻

All tests on macOS M1 - Mac Monterey (12) passed.

Pipeline on Agent
Hash: d74d61c20d441ecdcbf1c19809314a15c22bce06 [PR build]

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build] Tests on macOS M1 - Mac Ventura (13) passed 💻

All tests on macOS M1 - Mac Ventura (13) passed.

Pipeline on Agent
Hash: d74d61c20d441ecdcbf1c19809314a15c22bce06 [PR build]

@vs-mobiletools-engineering-service2
Collaborator

✅ API diff for current PR / commit

.NET (empty diffs)
  • iOS: (empty diff detected)
  • tvOS: (empty diff detected)
  • MacCatalyst: (empty diff detected)
  • macOS: (empty diff detected)

✅ API diff vs stable

.NET (No breaking changes)

ℹ️ Generator diff

Generator Diff: vsdrops (html), vsdrops (raw diff), gist (raw diff) - Please review changes

Pipeline on Agent
Hash: d74d61c20d441ecdcbf1c19809314a15c22bce06 [PR build]

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build] Windows Integration Tests passed 💻

All Windows Integration Tests passed.

Pipeline on Agent
Hash: d74d61c20d441ecdcbf1c19809314a15c22bce06 [PR build]

@vs-mobiletools-engineering-service2
Collaborator

🚀 [CI Build] Test results 🚀

Test results

✅ All tests passed on VSTS: test results.

🎉 All 97 tests passed 🎉

Tests counts

✅ cecil: All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (iOS): All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (MacCatalyst): All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (macOS): All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (Multiple platforms): All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (tvOS): All 1 tests passed. Html Report (VSDrops) Download
✅ framework: All 2 tests passed. Html Report (VSDrops) Download
✅ fsharp: All 4 tests passed. Html Report (VSDrops) Download
✅ generator: All 1 tests passed. Html Report (VSDrops) Download
✅ interdependent-binding-projects: All 4 tests passed. Html Report (VSDrops) Download
✅ introspection: All 4 tests passed. Html Report (VSDrops) Download
✅ linker: All 40 tests passed. Html Report (VSDrops) Download
✅ monotouch (iOS): All 7 tests passed. Html Report (VSDrops) Download
✅ monotouch (MacCatalyst): All 7 tests passed. Html Report (VSDrops) Download
✅ monotouch (macOS): All 8 tests passed. Html Report (VSDrops) Download
✅ monotouch (tvOS): All 7 tests passed. Html Report (VSDrops) Download
✅ msbuild: All 2 tests passed. Html Report (VSDrops) Download
✅ xcframework: All 4 tests passed. Html Report (VSDrops) Download
✅ xtro: All 1 tests passed. Html Report (VSDrops) Download

Pipeline on Agent
Hash: d74d61c20d441ecdcbf1c19809314a15c22bce06 [PR build]

@rolfbjarne rolfbjarne changed the title from "Shutdown/kill any build servers at the end of the build." to "[build] Shutdown/kill any build servers at the end of the build." Sep 27, 2024
@rolfbjarne rolfbjarne marked this pull request as ready for review September 27, 2024 13:32
@rolfbjarne rolfbjarne merged commit 5470132 into main Sep 30, 2024
@rolfbjarne rolfbjarne deleted the dev/rolf/dotnet-shutdown-build-server-at-end branch September 30, 2024 22:23
rolfbjarne added a commit that referenced this pull request Oct 1, 2024
rolfbjarne added a commit that referenced this pull request Oct 1, 2024
rolfbjarne added a commit that referenced this pull request Oct 2, 2024
dalexsoto added a commit that referenced this pull request Mar 17, 2026
Parallel make (e.g. 'make all -j8', 'make world') has been hanging
for a while at the end of the build. This is a long-standing issue
(#13355) that has been patched three times (#15407, #21315, #22300)
without fully fixing the root cause.

The problem: when using parallel make, GNU Make uses a jobserver with
pipe-based file descriptors to coordinate sub-makes. The dotnet CLI
can start background build servers (MSBuild server, Roslyn/VBCSCompiler)
that inherit these file descriptors but never close them. Make then
keeps waiting for those file descriptors to close, thinking there are
still active jobs. That only happens once the servers exit, which they
typically do after about 10 minutes without activity.

The previous workaround attempted to shut down and force-kill dotnet
processes after the build via a 'shutdown-build-server' target. This
approach was unreliable because:
- The shutdown ran from a double-colon all-hook:: rule with no
  prerequisites, so with -j it could execute in parallel with (or
  before) the actual build, killing nothing.
- Build servers started by later subdirectories (e.g. tests/) after
  the dotnet/ shutdown were never killed.
- The process-matching regex pattern might not match all server
  processes.

Ideally this would be fixed when launching the build servers, by
making them not inherit handles. Unfortunately this is currently not
possible (dotnet/runtime#13943), although this might change in the
not-so-distant future (dotnet/runtime#123959).
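
To illustrate what "not inheriting handles" means at the descriptor level, here is a sketch using Python's `subprocess` module on POSIX. It says nothing about how the dotnet CLI launches its servers; it only shows that a per-descriptor inheritable flag decides whether a spawned process ends up holding the pipe. `PROBE` and `child_sees_fd` are names invented for this example.

```python
import os
import subprocess
import sys

# Child program: report whether the given descriptor number refers to an
# inherited pipe in the child process.
PROBE = (
    "import os, stat, sys\n"
    "try:\n"
    "    mode = os.fstat(int(sys.argv[1])).st_mode\n"
    "    print('inherited' if stat.S_ISFIFO(mode) else 'not inherited')\n"
    "except OSError:\n"
    "    print('not inherited')\n"
)


def child_sees_fd(inheritable: bool) -> str:
    """Spawn a child and report whether it received the pipe's write end."""
    read_fd, write_fd = os.pipe()
    try:
        # Pipes created by Python are non-inheritable by default; flip the
        # flag explicitly so both cases can be compared.
        os.set_inheritable(write_fd, inheritable)
        result = subprocess.run(
            [sys.executable, "-c", PROBE, str(write_fd)],
            close_fds=False,  # let the inheritable flag alone decide
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("inheritable flag set:  ", child_sees_fd(True))
    print("inheritable flag clear:", child_sees_fd(False))
```

A launcher that clears the flag (or closes stray descriptors before exec) would leave a long-lived server with no copy of make's pipe, which is the fix the paragraph above describes as currently unavailable.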

The workaround: disable build servers entirely via environment variables
in Make.config:
- DOTNET_CLI_USE_MSBUILD_SERVER=0: prevents the MSBuild server
  https://learn.microsoft.com/en-us/visualstudio/msbuild/msbuild-server
  https://github.com/dotnet/msbuild/blob/main/documentation/MSBuild-Server.md
- UseSharedCompilation=false: prevents the Roslyn compiler server
  (VBCSCompiler)
  dotnet/roslyn#27975
- MSBUILDDISABLENODEREUSE=1: prevents MSBuild node reuse
  https://github.com/dotnet/msbuild/wiki/MSBuild-Tips-&-Tricks
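
The commit sets these variables in Make.config, which is not shown here. As a rough stand-in for the same idea outside of make, the sketch below launches a command with the three variables added to its environment. The variable names and values are taken from the list above; the helper names `build_env` and `run_without_build_servers` are hypothetical.

```python
import os
import subprocess

# Variable names and values as listed in the commit message.
NO_BUILD_SERVER_ENV = {
    "DOTNET_CLI_USE_MSBUILD_SERVER": "0",
    "UseSharedCompilation": "false",
    "MSBUILDDISABLENODEREUSE": "1",
}


def build_env(base=None):
    """Return a copy of the environment with the three variables applied."""
    env = dict(os.environ if base is None else base)
    env.update(NO_BUILD_SERVER_ENV)
    return env


def run_without_build_servers(command):
    """Run a build command, for example ["dotnet", "build"], with that environment."""
    return subprocess.run(command, env=build_env(), check=True)
```

Setting the variables once at the top level means every sub-make and every dotnet invocation underneath it sees them, so no single call site has to remember to pass them.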

This eliminates the root cause - no background servers means no
inherited file descriptors means no hang. The shutdown-build-server
target and its invocations are removed as they are no longer needed.

Additionally, 'make world' now prints the installed workloads at the
end of the build for visibility.

Build without changes:
> make world  2149.57s user 258.32s system 107% cpu 37:30.19 total

Build with changes:
> make world  2242.74s user 286.38s system 354% cpu 11:52.55 total

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
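
The two timing lines in the commit message above can be compared with a little arithmetic. This is only a worked calculation over the numbers quoted there; the helper name `wall_seconds` is made up for the example.

```python
def wall_seconds(stamp: str) -> float:
    """Convert a 'MM:SS.ss' or 'HH:MM:SS.ss' wall-clock string to seconds."""
    total = 0.0
    for part in stamp.split(":"):
        total = total * 60 + float(part)
    return total


# Wall-clock totals quoted in the commit message.
before = wall_seconds("37:30.19")
after = wall_seconds("11:52.55")

saved = before - after
speedup = before / after

# CPU time (user + system) for each run, also from the commit message.
cpu_before = 2149.57 + 258.32
cpu_after = 2242.74 + 286.38

print(f"wall time saved: {saved:.2f}s (about {saved / 60:.1f} min)")
print(f"speedup: {speedup:.2f}x")
print(f"CPU seconds: {cpu_before:.2f} before, {cpu_after:.2f} after")
```

The wall-clock time drops by roughly 25.6 minutes (about 3.2x faster) while total CPU time stays in the same range, which is consistent with the earlier build spending most of the difference waiting rather than working.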