Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace RAR cache BinaryFormatter serialization with Bond #3868

Closed
wants to merge 15 commits into from

Conversation

ccastanedaucf
Copy link
Contributor

@ccastanedaucf ccastanedaucf commented Oct 20, 2018

The SystemState cache in ResolveAssemblyReference currently uses the built-in .NET BinaryFormatter to serialize the cache to disk. The serialization format produced by BinaryFormatter is:

  1. Slow
  2. Not compact

This modifies SystemState to serialize it's cache via Bond. Note the other caches deriving from StateFileBase have not been touched; I don't seem to have a repo that utilizes these and I don't think they're as important. However, those classes much simpler data structures than SystemState so converting them to the new serialization would be fairly straightforward.

Perf trace of '''ResolveAssemblyReference''' (left is this PR, right is master):
rar

Specifically the deserialization times here:
rar2

Just for completeness these are the results I get from running the perf suite, but I'm assuming these all start from a clean run since the numbers don't change. Especially since, as seen above, I get pretty massive gains locally.

DesignTimeBuild Time (ms) (src\msbuild.A vs src\msbuild.chcasta)

Test Overall Significant δ Value
DotnetConsoleProject 👌 no 29.1193 -> 29.222 (0.353%)
DotnetWebProject yes 203.5494 -> 200.6334 (-1.433%)
DotnetMvcProject 👌 no 204.846 -> 206.0219 (0.574%)
Picasso ::ok_hand: no 1460.9434 -> 1459.6714 (-0.087%)
SmallP2POldCsproj 🔴 yes 64.3028 -> 64.5305 (0.354%)
SmallP2PNewCsproj 👌 no 537.1404 -> 537.7936 (0.122%)
LargeP2POldCsproj yes 10404.0744 -> 10387.3153 (-0.161%)
OrchardCore yes 43264.529 -> 42965.3284 (-0.692%)

Tried to break up the commits as best as possible, but some notes since overall this is a fairly large PR (although much of it is just boilerplate/copy-paste):

Some Bond code is copied in the first commit. Bond only exposes its OutputStream/InputStream wrappers on .NET Framework since internally it depends on a .NET Framework-only unsafe assembly, requiring you to either write your own, or... copy-pasta it. Not the greatest solution, but using the safe version of that assembly seems to work perfectly fine on .NET Core.

Bond is much stricter about what you're allowed serialize into, and doesn't really have a mechanism for more complex type mappings. The result is... a ton of ugly conversion boilerplate.

This guy is mostly a copy-pasta of StateFileBase. Know it's not ideal to have these two essentially serving the same purpose.

Need a better location to asynchronously initialize the static Bond Serializer/Deserializer, and early enough to finish init'ing before RAR executes. Last commit was a bit of a hack to get it working in the perf tests.

/// <summary>
/// Implements IInputStream on top of System.Stream
/// </summary>
internal class InputStream : InputBuffer
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bond only exposes its OutputStream/InputStream wrappers on .NET Framework since internally it depends on a .NET Framework-only unsafe assembly, requiring you to either write your own, or... copy-pasta it. Not the greatest solution, but using the safe version of InputBuffer/OutputBuffer seems to work perfectly fine on .NET Core.

I'm also thinking "Add Bond" should be a separate PR? This is already big as it is.


namespace Microsoft.Build.Tasks
{
internal class StateFileCache<T> where T : StateFileCachePayload, new()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly a copy-pasta of StateFileBase. Know it's not ideal to have these two essentially serving the same purpose, but the only way to remove that class is to convert other deriving classes to Bond. Not sure if that's worth it.

@@ -344,6 +345,8 @@ public static BuildManager DefaultBuildManager
/// <exception cref="InvalidOperationException">Thrown if a build is already in progress.</exception>
public void BeginBuild(BuildParameters parameters)
{
System.Threading.Tasks.Task.Run(() => { ResolveAssemblyReference.InitializeSerializers(); });
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hack to ensure this executes early enough in the perf tests. Need a better location to asynchronously initialize the static Bond Serializer/Deserializer, and early enough to finish init'ing before RAR executes.

{
internal sealed partial class SystemState
{
private class SystemStateCachePayloadExtractor
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I like this Hydrator/Extractor naming scheme

@@ -28,6 +28,7 @@

<ItemGroup>
<ProjectReference Include="..\Framework\Microsoft.Build.Framework.csproj" />
<ProjectReference Include="..\Tasks\Microsoft.Build.Tasks.csproj" />
Copy link
Contributor

@dfederm dfederm Oct 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this dependency seems suspicious. My understanding of the projects are:

  • Microsoft.Build.Framework - Interfaces and classes shared by MSBuild and Tasks
  • Microsoft.Build - Core build logic. Depends only on Framework
  • Microsoft.Build.Tasks - Various tasks. Depends only on Framework
  • Microsoft.Build.Tasks.Utilities - Optional helpful utilities for task writers

Adding this dependency seems to break that pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree, this was just a quick hack me and @cdmihai came up with for perf tests to make sure the serialization initializers are called early enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. A lot of the "shared" code which is compiled into multiple assemblies is to avoid exactly this issue :( We'll need to look at a way to avoid this. And clearly we can't have this as "shared" code because it would then have to be initialized per assembly.

@ccastanedaucf
Copy link
Contributor Author

ccastanedaucf commented Oct 24, 2018

Well this is definitely a head scratcher. Debugging why parallel builds via /m are failing at the Csc task when attempting to build WebLarge, and strangely this exact reference

/reference:C:\Users\chcasta\.nuget\packages\netstandard.library\2.0.3\build\netstandard2.0\ref\netstandard.dll

is omitted from some command line args passed into Csc during an incremental parallel build (first build always succeeds), but not a single-proc build. And it's always omitted from exactly 5 different projects, which change every build. And a failed multi-proc build causes all following single-proc builds to fail. And subsequent single-proc builds still shuffle exactly 5 random projects to omit netstandard.dll. Which doesn't make a ton of sense because those are building projects in the same order each build.

Then just replacing https://github.com/Microsoft/msbuild/blob/e2e670bbe25a8979efdc96f0deb64d08019e82f3/src/Tasks/StateFileCache.cs#L40-L42 and https://github.com/Microsoft/msbuild/blob/e2e670bbe25a8979efdc96f0deb64d08019e82f3/src/Tasks/StateFileCache.cs#L86-L89 with BinaryFormatter just fixes everything, so that rules out anything with the SystemStateCachePayload conversion logic. So there's something very funky going on here.

@rainersigwald
Copy link
Member

As with the RAR-as-a-service prototype, let's close this since we haven't been able to land it yet. Hopefully we can come back to it when chasing #2015 again in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants