SDK targets for post-build transformations #2583

Open
sbomer opened this issue Oct 13, 2018 · 4 comments

sbomer commented Oct 13, 2018

Numerous tools exist today which perform some transformation on IL or other files that are deployed with a published app. Examples include:

  • ILLink, which does tree-shaking on IL assemblies.
  • Native dependency trimming in ILLink.Tasks, which removes some native files from the output based on references from IL assemblies.
  • Crossgen in ILLink.Tasks, which can compile IL to Ready-To-Run images with native code.
  • Fody, an extensible tool that has many plugins that do IL -> IL transformations.
  • ILMerge and ILRepack, which merge multiple IL assembly files into one.
  • CoreRT, which compiles IL to object files and uses a native linker to produce an executable.
  • Single exe, currently being investigated, which may need to be able to merge the entire publish set into a single file.
  • Installer technology like ClickOnce, which probably needs to do the same.
  • Runtime package store (I believe this is being deprecated), which filters files from being included in the publish output based on a manifest.

These tools all need to run at similar places during the build, and many of the tools can usefully be used independently. It is desirable for developers to have the final say over which tools run during a build, allowing them to weigh tradeoffs and make the right decision for their apps. I'll be focusing on the linker, but it's worth keeping these other tools in mind.

Some of these tools only provide commandline executables, leaving the developer to write MSBuild logic or scripts to run the tool during a build. Others attempt to provide full integration with the SDK, providing tasks and targets that can be turned on or off with a simple boolean property.
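
As a rough illustration of the "simple boolean property" style of integration, a tool's targets file might gate its target on an opt-in property. This is only a sketch: `EnableMyTool` and `RunMyTool` are hypothetical names, and the `Message` task stands in for whatever the tool actually does.

```xml
<Project>
  <!-- Hypothetical tool targets file; EnableMyTool and RunMyTool are illustrative names. -->
  <PropertyGroup>
    <!-- Default to off unless the project (or command line) opts in. -->
    <EnableMyTool Condition="'$(EnableMyTool)' == ''">false</EnableMyTool>
  </PropertyGroup>

  <!-- Runs only when the property is true; otherwise the target is skipped. -->
  <Target Name="RunMyTool"
          BeforeTargets="ComputeFilesToPublish"
          Condition="'$(EnableMyTool)' == 'true'">
    <Message Importance="high" Text="MyTool would run here." />
  </Target>
</Project>
```

A project would then opt in with `<EnableMyTool>true</EnableMyTool>` or `-p:EnableMyTool=true` on the command line.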

This proposal is for the SDK to provide a supported, documented way for such tools to make themselves a part of the build, reducing the amount of duplicated effort it takes for tool authors to ship tools that are properly integrated with the rest of the build logic.

This may involve coordination with MSBuild, since some of the relevant targets belong to MSBuild itself. See this issue on MSBuild transformations for details.

Problems with the existing approach

Some of the existing tools (ILLink, CoreRT, Crossgen) ship with their own targets that hook into the SDK to run at the right place. They use BeforeTargets to run just before ComputeFilesToPublish, where they rewrite ResolvedAssembliesToPublish, IntermediateAssembly, and other items to point to the transformed assemblies:

https://github.com/mono/linker/blob/7c11deffc05005b6b60eeccb80a7cf133e24c007/corebuild/integration/ILLink.Tasks/ILLink.Tasks.targets#L91-L117
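
The general shape of this hook can be sketched as follows. This is not the content of the linked file; it is a simplified illustration in which `MyTransform`, `_TransformInput`, `_TransformOutput`, and the `transformed` directory are hypothetical, and the `Copy` task stands in for a real rewriting step.

```xml
<Project>
  <!-- Sketch only: MyTransform, _TransformInput and _TransformOutput are hypothetical names. -->
  <Target Name="MyTransform" BeforeTargets="ComputeFilesToPublish">
    <ItemGroup>
      <!-- Capture the files the SDK currently plans to publish.
           A real tool would filter this set first (for example, managed assemblies only). -->
      <_TransformInput Include="@(ResolvedAssembliesToPublish)" />
      <!-- Map each input to a path under the intermediate directory.
           Item transforms carry the original item metadata along.
           Note: inputs that share a file name would collide in this flat layout. -->
      <_TransformOutput Include="@(_TransformInput->'$(IntermediateOutputPath)transformed\%(Filename)%(Extension)')" />
    </ItemGroup>

    <!-- Placeholder for the tool itself; a real tool would rewrite the files here. -->
    <Copy SourceFiles="@(_TransformInput)" DestinationFiles="@(_TransformOutput)" />

    <ItemGroup>
      <!-- Swap the publish inputs so later targets see the transformed files. -->
      <ResolvedAssembliesToPublish Remove="@(_TransformInput)" />
      <ResolvedAssembliesToPublish Include="@(_TransformOutput)" />
    </ItemGroup>
  </Target>
</Project>
```

Because the rewrite happens by mutating an item that other targets also read, anything that captured `ResolvedAssembliesToPublish` earlier in the build keeps the pre-transform values.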

The existing targets are factored in a way that makes transparently rewriting these inputs difficult. These are some of the issues that come up:

Downstream targets implicitly rely on invariants broken by the linker

Upstream targets capture pre-transform values of these ItemGroups

Dependencies between the publish set and other targets are not expressed

Duplicated work to understand inputs to ComputeFilesToPublish

SDK changes break tools that rely on this approach in subtle ways

  • The contents of the related ItemGroups aren't documented anywhere. In the past, subtle changes in the SDK have resulted in breaks for the linker. This makes it especially difficult for third parties to author tools that work well with the build. For example, a change in package asset resolution broke the way we get the paths to reference assemblies (see the comment on linker#286, "Fatal error in IL Linker with latest SDK"). We should work together to establish a contract and include a test in the SDK to ensure that it is maintained.

Duplicate work to handle common build concerns

  • Each tool independently has to consider where the intermediate output goes for incremental builds, and how to ensure that these outputs are correctly cleaned up on clean builds. This may just be a matter of documenting conventions used by the rest of the SDK.
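
To make the incremental-build and clean concerns concrete, here is one way a tool might handle them today. It is a sketch under assumptions, not an SDK convention: `MyTransform`, `CleanMyTransform`, and the `transformed` directory are hypothetical, and `Copy` again stands in for the real tool.

```xml
<Project>
  <!-- Sketch only: target names and the "transformed" directory are hypothetical. -->

  <!-- Inputs/Outputs let MSBuild skip the target when the outputs are newer than the inputs. -->
  <Target Name="MyTransform"
          BeforeTargets="ComputeFilesToPublish"
          Inputs="@(IntermediateAssembly)"
          Outputs="@(IntermediateAssembly->'$(IntermediateOutputPath)transformed\%(Filename)%(Extension)')">
    <!-- Placeholder for the tool; writes its output under the intermediate directory. -->
    <Copy SourceFiles="@(IntermediateAssembly)"
          DestinationFolder="$(IntermediateOutputPath)transformed\" />
  </Target>

  <!-- Remove the tool's intermediate output whenever the project is cleaned. -->
  <Target Name="CleanMyTransform" AfterTargets="Clean">
    <RemoveDir Directories="$(IntermediateOutputPath)transformed\" />
  </Target>
</Project>
```

Every tool currently has to reinvent some version of this, which is the duplication a documented convention would remove.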

Cooperative phase ordering

Proposal

We should work together to come up with a contract that defines a single place where the linker and similar rewriting tools can hook in, without having to work around unrelated logic elsewhere in the SDK and MSBuild. As a start, here are the current behaviors I'm aware of, and some requirements for the contract to be useful to each tool. Note that there are many similarities between these tools.

ILLink

Current behavior

  • Before ComputeFilesToPublish, picks out from ResolvedAssembliesToPublish the managed assemblies that aren't resources. These and the IntermediateAssembly are linker inputs.
  • Computes platform libraries from the deps.json file and the RuntimeIdentifier and TargetFramework. Includes System.Private.CoreLib.dll as a special case (because it ships in the "native" directory of the runtime package, while everything else comes from "lib").
  • Links, with additional inputs (root descriptors, root assemblies).
  • Obtains the linker output managed assemblies and debug symbols. Needs to jump through some hoops to attach original item metadata to the output files so that they can be correctly processed by the later SDK logic.
  • Rewrites the inputs ResolvedAssembliesToPublish, IntermediateAssembly, and _DebugSymbolsIntermediatePath with the linked outputs.
  • Hacks _PublishConflictPackageFiles to include managed assemblies that were removed, so that they will be kept out of the generated deps.json.
  • Works around the reference assembly issue described above.

Requirements

  • Needs to operate on the set of IL files to be published, whether deploying standalone or self-contained.
  • Produces a subset of the input files, possibly with modified contents (some code may have been removed or marked with attributes).
  • Needs to know which assemblies are part of the platform so that it can implement different behavior for these as a heuristic.
  • In the future, it may need to operate on additional inputs like xaml that are relevant for reflection analysis heuristics.
  • Updates should be reflected in deps.json.

Native dependency trimming (part of ILLink.Tasks)

Current behavior

  • After linking, gets list of native assemblies from ResolvedAssembliesToPublish.
  • Has some hard-coded native dependencies which it always keeps.
  • Scans linked assemblies for references to native assemblies.
  • These are added back when rewriting ResolvedAssembliesToPublish.
  • Hacks _PublishConflictPackageFiles to include native files that were removed, so that they will be kept out of the generated deps.json.

Requirements

  • Should run after illink if illink was used.
  • Inputs include IL assemblies, and the native assemblies that will be published with the application.
  • It needs to be able to modify the set of native dependencies that will be published.
  • Updates should be reflected in deps.json.

Crossgen

Current behavior

  • Before ComputeFilesToPublish, picks out from ResolvedAssembliesToPublish the managed assemblies that aren't resources. Since this is after linking, ResolvedAssembliesToPublish has already been rewritten. This and IntermediateAssembly (also rewritten by the linker) are inputs to crossgen.
  • Compute the "platform" from the deps.json (shared with linker). This is just all managed assemblies from ResolvedAssembliesToPublish and IntermediateAssembly.
  • Scan any inputs in case they are already crossgen'd. Don't want to re-crossgen these.
  • Run crossgen.
  • Rewrite ResolvedAssembliesToPublish and IntermediateAssembly.

Requirements

  • Should run after illink if illink was used.
  • Inputs include all IL assemblies that will be published.
  • It needs to be able to rewrite the input to produce ReadyToRun images.
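
The "should run after illink" requirements above suggest some form of explicit phase ordering. One possible shape, using MSBuild's usual `DependsOnTargets` property pattern, is sketched below. Every target and property name here is hypothetical, and the `Message` tasks stand in for the real tools.

```xml
<Project>
  <!-- Sketch only: all target and property names below are hypothetical. -->
  <PropertyGroup>
    <!-- Ordered list of transformation phases; a project could reorder or trim it. -->
    <PublishTransformDependsOn>
      _LinkPhase;
      _NativeTrimPhase;
      _CrossgenPhase
    </PublishTransformDependsOn>
  </PropertyGroup>

  <!-- Single hook point: runs the phases in the order listed above. -->
  <Target Name="RunPublishTransforms"
          BeforeTargets="ComputeFilesToPublish"
          DependsOnTargets="$(PublishTransformDependsOn)" />

  <!-- Each phase is skipped unless its own opt-in property is true. -->
  <Target Name="_LinkPhase" Condition="'$(RunLinkPhase)' == 'true'">
    <Message Importance="high" Text="Link phase would run here." />
  </Target>

  <Target Name="_NativeTrimPhase" Condition="'$(RunNativeTrimPhase)' == 'true'">
    <Message Importance="high" Text="Native dependency trimming would run here." />
  </Target>

  <Target Name="_CrossgenPhase" Condition="'$(RunCrossgenPhase)' == 'true'">
    <Message Importance="high" Text="Crossgen phase would run here." />
  </Target>
</Project>
```

With this layout, each tool can be enabled independently while the relative order stays fixed in one place.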

ILC

I'm less aware of the details of this tool, but here's my understanding:

Current behavior

  • Before ComputeFilesToPublish, processes ResolvedAssembliesToPublish to pick out the managed assemblies that are referenced by ilc subject to certain constraints, and also a set of assemblies to skip.
  • After native linking, rewrites IntermediateAssembly to point to the optimized binary.

Requirements

  • Needs to change the publish output to be the AOT-optimized native executable.

Other tools to consider

It may be helpful to keep in mind Fody, ILMerge, ILRepack, single exe, installer technology, and any tools that filter the publish set, like the runtime store, because they could all benefit from a documented set of behavior around publish, enabling them to work well with the .NET SDK out of the box.

@nguerrera @zamont @swaroop-sridhar @morganbr @jeffschwMSFT

@morganbr

Thanks for putting this together, @sbomer. Having a supported way to find and change all of the files that would be included in publishing (including all managed assemblies and even miscellaneous files in the app) would make constructing these tools much more straightforward. Phase ordering will also help a lot -- I could easily imagine users wanting to run ILLink, followed by CrossGen, Single exe tooling, or ILC, followed by installer packaging, and that ordering is very important. Of course, they might only want to run one or two of those steps as well.

I'd also emphasize the importance of being able to run tools in the build phase instead of publish -- some of these tools, such as ILLink, can significantly change application behavior, so developers should be able to debug their app in the same way it will ship.

nguerrera commented Oct 13, 2018

Cc @peterhuene

@sbomer This is a great write up and exactly what I was hoping to see next after our chat. Thanks!

@nguerrera
Contributor

Cc @tmat

@nguerrera
Contributor

I'd also emphasize the importance of being able to run tools in the build phase instead of publish

Good point. We are also working to effectively make publish an optional step for common cases in 3.0. Build output will include NuGet dependencies and be xcopyable to other machines.

I'm hoping we can achieve this while reducing code duplication between build and publish (for example, today they have separate paths for deps.json generation). Ideally this would mean that tools like these can write code once that can easily be configured to run on build or only on publish.
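
For illustration, "write once, run on build or only on publish" could look something like the sketch below with today's MSBuild. The names `RunTransform`, `TransformOnBuild`, `_TransformOnBuild`, and `_TransformOnPublish` are hypothetical, and the choice of `Build` and `ComputeFilesToPublish` as hook points is an assumption rather than a proposed contract.

```xml
<Project>
  <!-- Sketch only: RunTransform, TransformOnBuild and the wrapper targets are hypothetical. -->

  <!-- The tool's logic lives in one target with no hook of its own. -->
  <Target Name="RunTransform">
    <Message Importance="high" Text="Transform would run here." />
  </Target>

  <!-- When TransformOnBuild is true, run as part of every build. -->
  <Target Name="_TransformOnBuild"
          AfterTargets="Build"
          DependsOnTargets="RunTransform"
          Condition="'$(TransformOnBuild)' == 'true'" />

  <!-- Otherwise, run only when publishing. -->
  <Target Name="_TransformOnPublish"
          BeforeTargets="ComputeFilesToPublish"
          DependsOnTargets="RunTransform"
          Condition="'$(TransformOnBuild)' != 'true'" />
</Project>
```

When a wrapper's condition is false, MSBuild does not run its `DependsOnTargets`, so `RunTransform` executes from exactly one of the two hook points.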

4 participants