Add Dockerfiles for Linux .NET Composite Images #4343

Closed
Tracked by #47240
ivdiazsa opened this issue Jan 20, 2023 · 41 comments

@ivdiazsa
Member

ivdiazsa commented Jan 20, 2023

Describe the Problem

We have been working on .NET composites where, by bundling together the most needed assemblies, we are able to deliver a lower startup time, and overall better performance when running .NET apps. Our next step is to have the official Docker images recurrently built, alongside the existing .NET Linux images, so that customers can consume them.

Describe the Solution

We currently build them by hand, but those builds can only be used for internal testing. We would like to have them officially published on Docker Hub: https://hub.docker.com/_/microsoft-dotnet

Additional Context

In order to achieve this purpose, we would like to add the corresponding Dockerfiles in this repo, which will pull the composites from the ASP.NET repo artifacts, and use them in a similar fashion as the currently existing Dockerfiles.

@ivdiazsa
Member Author

@mangod9 @richlander Here's the issue we can use to track progress and discussions to get the .NET composite images, and Dockerfiles, published to customers.

@ivdiazsa
Member Author

Adding @trylek as well as he has been a key contributor to this project.

@tmds
Member

tmds commented Jan 28, 2023

> We have been working on .NET composites where, by bundling together the most needed assemblies

Is a .NET composite a single file that bundles parts of different assemblies?

> we are able to deliver a lower startup time, and overall better performance

What causes it to start faster and achieve better performance?

@trylek
Member

trylek commented Jan 28, 2023

The .NET composite image bundles together the ready-to-run native code from all component assemblies. The managed assemblies themselves remain on disk but they are basically pure MSIL. (Everything can then be bundled into one file using the Single EXE .NET feature though.) The better startup and to a certain extent performance (so far we've mostly been focusing on startup) stems from the fact that a composite image can be treated as a "versioning bubble". The limitation that it always naturally needs to be shipped / serviced as a whole lets us do the following:

  • JIT can drop ready-to-run type / field layout compatibility checks at assembly boundaries;
  • We can inline more code across assemblies within the versioning bubble;
  • We can directly encode field offsets for types from a different assembly within the versioning bubble;
  • We can compile a larger set of generic instantiations over types from all assemblies within the versioning bubble.

So far we haven't been focusing too much on steady state performance as that's supposed to be optimized by runtime tiered JITting. The problem with steady state performance is that many perf-critical constructs in general JIT and in the framework foundations e.g. cleaning up local variable area on the stack in method prologs or string manipulations can be much better optimized in the presence of various instruction set extensions (e.g. AVX2 on x64) while by default we strive to generate code compatible with all CPU variants i.e. using only the base instruction set.

Crossgen2 does include options to declare what instruction set extensions are guaranteed to be present at runtime and in their presence it can generate the better optimized code right away but we tend to see that as a niche scenario that's only worth the effort where super high performance is of bigger value than compatibility.

@tmds
Member

tmds commented Jan 30, 2023

If I understand correctly, the composite image is a single file that provides the functionality otherwise split in many shared framework dlls in an optimized way?

Are we building these composites as part of source-build?

Are they included in the shared framework folders (like Microsoft.NETCore.App/7.0.2)? Or where do they live?

@ivdiazsa
Member Author

> Are we building these composites as part of source-build?

Yes, the artifact that we'll use for the Docker image already includes everything built. In other words, the Dockerfile's code here won't look very different from say the normal existing Runtime one.

> Are they included in the shared framework folders (like Microsoft.NETCore.App/7.0.2)? Or where do they live?

That's right.

@mthalman mthalman removed the untriaged label Feb 1, 2023
@mthalman mthalman added this to the .NET 8 milestone Feb 1, 2023
@ivdiazsa
Member Author

The composite Dockerfile would look as follows:

ARG REPO=mcr.microsoft.com/dotnet/runtime-deps

# Installer image
FROM amd64/buildpack-deps:jammy-curl AS installer

# Retrieve Composite .NET Runtime
RUN dotnet_version=7.0.2 \
    && curl -fSL --output dotnet.tar.gz https://dotnetcli.azureedge.net/dotnet/aspnetcore/Runtime/$dotnet_version/aspnetcore-runtime-$dotnet_version-linux-x64-composite.tar.gz \
    && mkdir -p /dotnet \
    && tar -oxzf dotnet.tar.gz -C /dotnet \
    && rm dotnet.tar.gz


# .NET Runtime Base Image
FROM $REPO:7.0.2-jammy-amd64

# .NET Runtime and ASP.NET Versions
ENV DOTNET_VERSION=7.0.2
ENV ASPNET_VERSION=7.0.2

COPY --from=installer ["/dotnet", "/usr/share/dotnet"]

RUN ln -s /usr/share/dotnet/dotnet /usr/bin/dotnet

Here it's using .NET 7 for testing, but the purpose is to release it for .NET 8.

@richlander
Member

I was thinking about the layering.

We were talking about: sdk -> aspnet -> runtime-deps; runtime -> runtime-deps

I am guessing we'll be getting layer sharing (in an absolute byte sense) by: sdk -> runtime -> runtime-deps; aspnet -> runtime-deps

We should be able to determine this through some quick measurements. I'm guessing that the runtime layer is larger than the aspnet layer.

@mthalman
Member

> I am guessing we'll be getting layer sharing (in an absolute byte sense) by: sdk -> runtime -> runtime-deps; aspnet -> runtime-deps

Today, the SDK Dockerfile includes only the SDK "stuff", implicitly filtering out the runtime and ASP.NET Core files. You're suggesting that the SDK Dockerfile now include the ASP.NET Core files in that logic and be based directly on the runtime tag? What's the argument for doing that versus keeping the SDK installation the same and basing it on the aspnet tag as we already do today? Naively, I would think you'd get the benefit of the composite installation in SDK scenarios if it were based on aspnet. So why not do that?

@mthalman
Member

[Triage]
It's still possible to have a separate runtime composite image in addition to the standalone aspnet composite image. So we should consider doing that and have consistency in all the runtime-based images.

For the SDK, we should favor ASP.NET Core scenarios and have the SDK based on aspnet. There's a tradeoff associated with this because someone that would need to pull the runtime image would have to reacquire those bits. There are likely to be more cases where people are accessing the aspnet image than the runtime image so it makes more sense to favor aspnet.

flowchart TD
    A[runtime-deps] --> B[runtime]
    A --> C[aspnet]
    C --> D[sdk]
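
To make the tradeoff in this layering concrete, here is a minimal Python sketch that walks the base-image chain from the flowchart above and lists which images two tags have in common. The `PARENT` map just transcribes the flowchart; the helper names are illustrative and not part of any dotnet-docker tooling.

```python
# Base image of each tag, transcribed from the flowchart above.
PARENT = {
    "runtime-deps": None,
    "runtime": "runtime-deps",
    "aspnet": "runtime-deps",
    "sdk": "aspnet",
}


def chain(image):
    """Return the image followed by all of its base images."""
    result = []
    while image is not None:
        result.append(image)
        image = PARENT[image]
    return result


def shared(a, b):
    """Images whose layers both tags can reuse."""
    return [name for name in chain(a) if name in chain(b)]


print(chain("sdk"))              # ['sdk', 'aspnet', 'runtime-deps']
print(shared("sdk", "aspnet"))   # ['aspnet', 'runtime-deps']
print(shared("sdk", "runtime"))  # ['runtime-deps']
```

Someone who already has sdk and then pulls runtime only reuses the runtime-deps layers, which is the "reacquire those bits" cost mentioned above.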

@lbussell lbussell self-assigned this Mar 8, 2023
@mthalman
Member

This test may be affected and need to compensate for this change:

public async Task VerifyDotnetFolderContents(ProductImageData imageData)

@lbussell
Contributor

I did some work on this and had a few findings:

  1. The aspnetcore repo constructs the aspnet composite tarball based on the runtime version in its eng/Version.Details.xml file.
  2. The SDK tarball is constructed the same way but from the installer repo.
  3. In daily (nightly?) builds, there is no guarantee that the runtime version referenced by both aspnetcore and SDK will be the same, since the runtime version info flows into the SDK along many different paths.
  4. With the layering approach described by @mthalman above (SDK layered on top of the aspnet composite runtime image), the SDK will fail to run if the runtime version shipped in the aspnet composite tarball is older than the runtime version shipped in the SDK tarball.
  5. Even if we don't layer images and simply extract the SDK and aspnet composite tarballs into their own respective images, an app built by an SDK with a newer runtime will fail to run on an aspnetcore image with an older runtime.
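
The failure condition in findings 4 and 5 could be checked up front with something like the following Python sketch. The version strings are made up for illustration, and the comparison assumes both versions have the same shape (as two daily builds of the same release would); it is not a full SemVer comparison.

```python
import re


def version_key(version):
    """Split a version such as '8.0.0-preview.4.23177.3' into comparable parts.

    Assumption: both versions being compared have the same number of
    dot/dash separated parts and the same prerelease label.
    """
    return tuple(int(p) if p.isdigit() else p for p in re.split(r"[.-]", version))


def is_coherent(aspnet_runtime, sdk_runtime):
    """Both tarballs reference exactly the same runtime version."""
    return aspnet_runtime == sdk_runtime


def sdk_can_run(aspnet_runtime, sdk_runtime):
    """Finding 4: the SDK fails if the composite carries an older runtime."""
    return version_key(aspnet_runtime) >= version_key(sdk_runtime)


# Hypothetical daily-build versions, not taken from any real build.
print(sdk_can_run("8.0.0-preview.4.23180.1", "8.0.0-preview.4.23177.3"))  # True
print(sdk_can_run("8.0.0-preview.4.23170.2", "8.0.0-preview.4.23177.3"))  # False
```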

On the bright side, we should have coherent runtime versions for previews, servicing releases, and GA.

Just brainstorming some potential workarounds from the dotnet-docker side:

  1. Defer updates to SDK and Runtime images until aspnetcore has the correct runtime version so that they all stay in-line
  2. Find a workaround or runtime configuration option that allows us to target the "older" runtimes when building in the SDK image
  3. Disable builds/testing of SDK image when runtime versions aren't coherent. This is not ideal.
  4. In the .NET Docker tooling, recognize when we have incoherent runtime versions and find some way to account for it with potentially multiple SDK image builds, etc.

From outside the dotnet-docker space:

  1. It's a little far-fetched, but could we potentially build the aspnet composite images from dotnet/installer or using dotnet/installer's dependency graph?

An ideal solution would be able to avoid the hard coherency requirement and build the images all the time for testing.

@mangod9
Member

mangod9 commented Mar 29, 2023

Thanks for the info @lbussell. Tagging @trylek and @ivdiazsa here, since they have been thinking about possible options here too.

@trylek
Member

trylek commented Mar 30, 2023

I have spent some time looking into this. The first problem I see is that it seems to me that the aspnetcore-runtime-composite-ver-os-arch.tar.gz is organized inconsistently; I have yet to prove if it's the only cause of the docker build failure I'm able to easily repro locally. From what I see, the combined ASP.NET + runtime full composite image is located under shared/Microsoft.NETCore.App/ver/full-composite.r2r.dll but it's only accompanied by the runtime framework assemblies, not the ASP.NET assemblies; these reside under shared/Microsoft.AspNetCore.App/ver/*.dll but these are apparently the default one-by-one R2R-compiled ASP.NET assemblies without the composite image.

It would be great if someone on @dotnet/crossgen-contrib (like @davidwrighton, who I believe originally implemented parts of this logic) could confirm my findings, but for now I believe we must resolve this first; I think it's quite likely that this inconsistency is the root of the problem that Logan and Matt are hitting. Wherever the full composite is put, it needs to be next to all the ASP.NET and framework assemblies because these get rewritten in the compilation process to acquire the forwarding header to the composite image.

Either we put it in Microsoft.AspNetCore.App so that Microsoft.NETCore.App would continue working the same and, when SDK (the dotnet app) publishes some ASP.NET app, it must make sure that the Microsoft.AspNetCore.App bits stomp over the Microsoft.NETCore.App, or we put it directly in Microsoft.NETCore.App and Microsoft.AspNetCore.App would be basically empty indicating that in this mode all from ASP.NET has been included with the runtime.

@mangod9
Member

mangod9 commented Mar 30, 2023

I wonder, if the composite layout is somehow flawed, why doesn't the issue repro consistently for the asp.net image? From what @lbussell describes, it only happens when applied to the sdk image?

@davidwrighton
Member

So, this isn't something I built, but I can lay out what we need to do.

  1. We need to put the composite next to the System.Private.CoreLib.dll so that it will be loaded when that assembly is loaded (which is always first)
  2. We need to update the various ASP.NET assemblies and Runtime assemblies when we do a composite R2R build.
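
The first requirement can be verified against a tarball listing with a small check like the one below. This is only a sketch: the file name full-composite.r2r.dll is the one mentioned earlier in this thread, the "8.0.0" folder is a stand-in for the real version, and the function name is made up.

```python
import posixpath

COMPOSITE_NAME = "full-composite.r2r.dll"  # name used in this thread
CORELIB_NAME = "System.Private.CoreLib.dll"


def composite_is_next_to_corelib(paths):
    """Return True if every composite image shares a folder with CoreLib."""
    normalized = [posixpath.normpath(p) for p in paths]
    composite_dirs = {posixpath.dirname(p) for p in normalized
                      if posixpath.basename(p) == COMPOSITE_NAME}
    corelib_dirs = {posixpath.dirname(p) for p in normalized
                    if posixpath.basename(p) == CORELIB_NAME}
    # An archive without any composite image fails the check as well.
    return bool(composite_dirs) and composite_dirs <= corelib_dirs


# Hypothetical listings; "8.0.0" stands in for the real version folder.
good = [
    "./shared/Microsoft.NETCore.App/8.0.0/System.Private.CoreLib.dll",
    "./shared/Microsoft.NETCore.App/8.0.0/full-composite.r2r.dll",
    "./shared/Microsoft.AspNetCore.App/8.0.0/Microsoft.AspNetCore.dll",
]
bad = [
    "./shared/Microsoft.NETCore.App/8.0.0/System.Private.CoreLib.dll",
    "./shared/Microsoft.AspNetCore.App/8.0.0/full-composite.r2r.dll",
]
print(composite_is_next_to_corelib(good))  # True
print(composite_is_next_to_corelib(bad))   # False
```

For a real archive, the path list could come from `tarfile.open(path).getnames()` in the standard library.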

@trylek
Member

trylek commented Mar 30, 2023

Well, for the non-composite ASP.NET image it doesn't matter, the framework and ASP.NET assemblies are separate and each is put in their respective folders. It's only the combined composite that ties these two groups together.

@trylek
Member

trylek commented Mar 30, 2023

@davidwrighton - Thanks for your feedback, so are you saying that having the full-composite.r2r.dll in Microsoft.NETCore.App is fine, we just need to make sure that the Microsoft.NETCore.App and Microsoft.AspNetCore.App assemblies get updated with their rewritten component versions? I guess that makes sense to me, in that case for non-ASP.NET apps the full composite would still be used but the ASP.NET part would be kind of dormant as there would be no way to get to those assemblies even though their precompiled native code will reside in the composite image. Once you confirm I'm understanding you correctly, I'll start working on a PR fixing this in the aspnetcore repo. I don't think it makes sense to continue investigating the container construction before this is fixed.

@mangod9
Member

mangod9 commented Mar 30, 2023

@ivdiazsa can confirm, but I believe the work that was done in the asp.net repo did what @davidwrighton lists above. Are you noticing that the updated asp.net and framework assemblies are not copied over?

@trylek
Member

trylek commented Mar 30, 2023

Hmm, I think you're right, I probably messed up the folder names when analyzing the different tarball flavors. I'll continue looking into what's wrong with the Dockerfile then.

@trylek
Member

trylek commented Mar 30, 2023

@lbussell - I have uploaded your Dockerfile with a bunch of my fixes here:

https://gist.github.com/trylek/76e2a0a146fcee9391302473033c888d

It seems to me that it lets me build the container just fine, could you please give it a shot? I'll be happy to elaborate on each of the changes I've made.

@MichaelSimons
Member

@lbussell - can you share your Dockerfile so that I and others can see the changes @trylek made?

@trylek
Member

trylek commented Mar 30, 2023

This is the original script Logan shared with me:

https://gist.github.com/lbussell/45f157e1f7beaec0f3e073793a52b1f3

@MichaelSimons
Member

@trylek - What is the scope of your proposal? Is this a workaround or something we would ship? The change in the aspnet-composite layer that overrides the Microsoft.NETCore.App from the runtime image isn't something we would want to ship. This would negate any size wins the composite provides.

@trylek
Member

trylek commented Mar 30, 2023

Hmm, I'm not sure I'm getting this. I believe the overall purpose of this entire effort is to optimize startup by using the combined ASP.NET+runtime composite coming from the aspnetcore repo. It's indeed what I'm expecting to be shipped at some point. What am I missing?

@trylek
Member

trylek commented Mar 30, 2023

@MichaelSimons - Just as a caveat, one somewhat confusing aspect of the script, probably caused by a series of copy & paste, is that multiple tarballs get downloaded under the name dotnet.tar.gz, that's not the actual name of the archive this represents, you need to look at the complete command-line to see the original URL.

@trylek
Member

trylek commented Mar 30, 2023

Other than that, for tracking and discussing changes it would be ideal to make something like a PR we could use to discuss changes to the Dockerfile; in practice, as I understood from Ivan, the actual Dockerfile is getting generated from some meta-format, I'm not yet sufficiently familiar with the related logic to see whether it would be helpful to discuss these changes directly in context of the meta-dockerfile, ideally directly as a PR against the dotnet-docker repo.

@MichaelSimons
Member

The following line in the aspnet-composite layer raises red flags to me.

COPY --from=aspnet-composite-installer ["/shared/Microsoft.NETCore.App", "/usr/share/dotnet/shared/Microsoft.NETCore.App"]

What are you expecting this to do assuming we have a coherent build?

@MichaelSimons
Member

> Other than that, for tracking and discussing changes it would be ideal to make something like a PR we could use to discuss changes to the Dockerfile; in practice, as I understood from Ivan, the actual Dockerfile is getting generated from some meta-format, I'm not yet sufficiently familiar with the related logic to see whether it would be helpful to discuss these changes directly in context of the meta-dockerfile, ideally directly as a PR against the dotnet-docker repo.

I think it would be helpful to have a draft PR to review. We don't have to worry about the templating infra at this point. We could just modify one slice of Dockerfiles - e.g the 8.0 bookworm amd64 Dockerfiles (runtime-deps, runtime, aspnet, and sdk). Once we have an agreed upon Dockerfile(s), @lbussell can update the templates accordingly and regenerate the full set of impacted Dockerfiles.

@lbussell
Contributor

lbussell commented Mar 30, 2023

@trylek Thanks, I've taken a look as well. For the most part I agree with Michael. I would be concerned with duplicating the Runtime bits or having two versions of Microsoft.NETCore.App if we layer the aspnet composite image on top of the runtime image.

Apologies for creating a confusingly large Dockerfile, I was trying to structure it in a way that you can get a one-line repro of the issue, without multiple Dockerfiles and a script.

I will put together a draft PR that will make the changes easier to reason about.

@MichaelSimons
Member

> What are you expecting this to do assuming we have a coherent build?

This is a leading question 😃. If I understand what is happening here correctly, the aspnet-composite image is carrying multiple copies of the shared framework because Docker utilizes a union file system. This is something we would not want to do and I postulate would negate the value of using composite images.

@trylek
Member

trylek commented Mar 30, 2023

Well, according to my understanding we pull down the combined ASP.NET+runtime composite image and use it to populate the Microsoft.NETCore.App and Microsoft.AspNetCore.App folders that are affected by the composite ReadyToRun compilation - the individual assemblies need to be updated to their rewritten versions containing the forwarding ReadyToRun header with the name of the composite image and the Microsoft.NETCore.App folder needs to receive the new full-composite.r2r.dll image.

@trylek
Member

trylek commented Mar 30, 2023

My current understanding is that, in the absence of a coherent build, we end up with two versions of Microsoft.NETCore.App, one used by the SDK and another one used by ASP.NET. The dotnet SDK app as such is self-contained, it has all runtime assemblies it uses in its folder, the only thing it needs from Microsoft.NETCore.App is the native runtime i.e. [lib]coreclr.[dll/so] and a dozen similar shared libraries. With the coherent build, these two versions just collapse into one.

@trylek
Member

trylek commented Mar 30, 2023

Another way to approach this would be to put [lib]coreclr.[dll|so] and such directly into the "dotnet" app folder, that would make "dotnet" completely self-contained. I have no idea why we're not doing it this way.

@MichaelSimons
Member

> With the coherent build, these two versions just collapse into one.

Yes, I understand that. But because the individual assemblies need to be updated to their rewritten versions containing the forwarding ReadyToRun header with the name of the composite image, and because Docker uses a union file system, the image still carries multiple copies. For this reason, the aspnet-composite image cannot be based on the runtime image.

@trylek
Member

trylek commented Mar 30, 2023

Hmm, perhaps I don't understand Docker sufficiently to answer this. Semantically the aspnet-composite image includes its own version of the runtime framework so if I was building such a thing locally, I would either skip the runtime installation or let aspnet-composite installation stomp over it. I defer to more qualified experts how to express this behavior in terms of the Dockerfile.

@lbussell
Contributor

I opened the draft PR at #4532.

@MichaelSimons
Member

> I would either skip the runtime installation or let aspnet-composite installation stomp over it.

With a union file system, you can have the semantics of stomping but your image layers will contain the original files that were "stomped". I'll try to explain this. First thing to note is that Dockerfile instructions equate to image layers.

Layer 1
File 1 (v1)
File 2 (v1)

Layer 2
File 2 (v2)
File 3 (v1)

When you pull an image with these two layers, you will pull the entire contents of each individual layer, which is:
File 1 (v1)
File 2 (v1)
File 2 (v2)
File 3 (v1)

When the layers are extracted on disk, the resulting filesystem will contain:
File 1 (v1)
File 2 (v2)
File 3 (v1)

If small image sizes are a goal with your Docker image in order to achieve quick cold startups, you want to avoid situations where you ship multiple versions of files.
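
The two-layer example above can be replayed in a few lines of Python. This is only a toy model of the idea (a layer as a name-to-version mapping), not how Docker stores layers; the function names are made up for illustration.

```python
# Each layer maps a file name to the version stored in that layer.
layer1 = {"File 1": "v1", "File 2": "v1"}
layer2 = {"File 2": "v2", "File 3": "v1"}


def pulled_entries(layers):
    """Everything downloaded: every file of every layer, duplicates included."""
    return [(name, version) for layer in layers for name, version in layer.items()]


def merged_filesystem(layers):
    """What is visible after extraction: later layers shadow earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


layers = [layer1, layer2]
print(len(pulled_entries(layers)))     # 4 entries are transferred
print(len(merged_filesystem(layers)))  # 3 files are visible
```

The gap between the two numbers is the "stomped" File 2 (v1): it is still downloaded and stored even though it is never visible in the final filesystem.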

@lbussell
Contributor

Please see #4538 (comment) for the latest updates on the experimentation with producing the composite tarball out of the installer repo.

@mthalman
Member

This has been merged to main and published with the 8.0 Preview 5 release.
