Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a new layer for loader classes #20529

Closed
nebhale opened this issue Mar 15, 2020 · 13 comments
Closed

Create a new layer for loader classes #20529

nebhale opened this issue Mar 15, 2020 · 13 comments
Assignees
Labels
type: enhancement A general enhancement
Milestone

Comments

@nebhale
Copy link
Member

nebhale commented Mar 15, 2020

This one is an observation for discussion rather than a requirement for improvement. At this time, Spring Boot's default OCI image layering strategy describes four layers:

  1. dependencies
  2. snapshot-dependencies
  3. resources
  4. application

There is, however, an implicit fifth layer containing (in a typical case):

  • META-INF/MANIFEST.MF
  • META-INF/maven/**
  • BOOT-INF/classpath.idx
  • BOOT-INF/layers.idx
  • org/springframework/boot/loader/**

My observation is that this final implicit layer mixes lifecycles. The contents of META-INF/MANIFEST.MF, META-INF/maven/**, and BOOT-INF/classpath.idx are broadly application-specific and change for every tagged version of an application while BOOT-INF/layers.idx and org/springframework/boot/loader/** are much more stable, and generally would be duplicate even across applications within an enterprise (given consistent versions of Boot).

The concrete result of this is that every single version of an application using this layering would end up with a layer that isn't de-duplicatable but contains (at the current time) 364K of duplicated files and 12K of unique files. I think we can all agree that this isn't the biggest inefficiency in a typical enterprise deployment system, but there is room for a big win both in cumulative storage and cumulative data transfer (i.e. deployment speed) if these files can be de-duped in a registry and cached at edge nodes.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Mar 15, 2020
@mbhave
Copy link
Contributor

mbhave commented Mar 16, 2020

At the moment, if layertools is used, the list above ends up in the application layer. The loader classes are indeed not tied to the application and we did discuss moving them to a separate layer (maybe dependencies?).

@nebhale
Copy link
Member Author

nebhale commented Mar 16, 2020

Yeah, you're absolutely right (I misread some documentation). Upon re-examination with the proper layout, I think that they only change to my observation would be that the duplication probably wouldn't matter if the application layer was vastly larger than ~346K most of the time. Do you have any feel for how large the application-contributed classes would be in a typical Spring Boot application.

@mbhave
Copy link
Contributor

mbhave commented Mar 16, 2020

Good question. I don't but someone else on the team might. Flagging for team attention so that we can decide if we want to do anything about this.

@mbhave mbhave added the for: team-attention An issue we'd like other members of the team to review label Mar 16, 2020
@wilkinsona
Copy link
Member

wilkinsona commented Mar 17, 2020

I can offer some anecdata.

Sagan's sagan-site has over 800KB of content that would go in the application layer. If we moved BOOT-INF/classes/templates by default into the resources layer it drops to ~500KB. I don't think of Sagan as being particularly huge in real-world terms.

I suspect plenty of apps would have more than ~346KB in the application layer. That said, we'd also have plenty of apps where it's almost empty as pretty much everything's either in the resources layer or one of the dependencies layers.

It feels intuitive to me for the loader classes to go in the dependencies layer by default. That's where other Boot jars will go and the loader classes will change at the same frequency as those jars. We probably also need to consider some changes to the Maven XML and Gradle DSL to allow users to move the loader classes to a different layer if they wish.

@nebhale
Copy link
Member Author

nebhale commented Mar 17, 2020

The dependencies layer feels right to me as well as the loader is tied strongly to the version of Boot, which strongly influences the rate of change of the dependencies layer.

Note, the loader is particularly troublesome for Boot's current layering mechanism because it builds layers by moving files and those are immovable. So, if this turns out to be a bear to implement/configure it probably won't be worth it.

@mbhave
Copy link
Contributor

mbhave commented Mar 17, 2020

We probably also need to consider some changes to the Maven XML and Gradle DSL to allow users to move the loader classes to a different layer if they wish.

The Maven XML and Gradle DSL determines where something ends up in the layered jar. I'm not sure if we can do anything about the loader classes in that case since they are at the root of the jar rather than nested in BOOT-INF/layers.

@wilkinsona
Copy link
Member

We going to place everything in /org/springframework/boot in a new spring-boot-loader layer by default. It will be sandwiched between the default dependencies and snapshot-dependencies layers by default. If the user customises the layers and does not define a spring-boot-loader layer we'll create our own fallback layer that will be the upper-most layer.

@mbhave mbhave added type: enhancement A general enhancement and removed for: team-attention An issue we'd like other members of the team to review status: waiting-for-triage An issue we've not yet triaged labels Mar 18, 2020
@mbhave mbhave added this to the 2.3.x milestone Mar 18, 2020
@mbhave mbhave changed the title OCI Image Layering Strategy Could (Possibly?) Be Better Create a new layer for loader classes when using layertools extract Mar 18, 2020
@nebhale
Copy link
Member Author

nebhale commented Mar 18, 2020

@wilkinsona How will that work given the classpath implications?

@wilkinsona
Copy link
Member

There shouldn't be any classpath implications. The stuff in /org/springframework/boot will remain in the root of the jar. It going into the spring-boot-loader layer will only be done at extraction time, much the same as is done today where it goes in the application layer.

@wilkinsona
Copy link
Member

wilkinsona commented Mar 20, 2020

We talked about this today and concluded that we'd like the index file to contain more information about where the content of each layer can be found. Rather than the hardcoded assumption that the content of a layer named application, for example, will be found in BOOT-INF/layers/application we'd like the index file to convey the information.

Such an index might look something like this for the currently proposed default layers:

dependencies BOOT-INF/layers/dependencies
spring-boot-loader org/springframework/boot
snapshot-dependencies BOOT-INF/layers/snapshot-dependencies
application

The top-most layer (the bottom layer in the index) does not specify a location for its content. This implies that it contains everything not included in any of the preceding layers. A limitation of this is that it means that it's only the top-most layer that can capture everything not included in any of the other layers but we think that is sufficient for our current purposes.

In a customized scenario, the index may look like this:

all-the-dependencies BOOT-INF/layers/all-the-dependencies
application

In this case the loader will be included in the application layer alongside everything else in the jar that wasn't located beneath BOOT-INF/layers/all-the-dependencies as there is no layer that captures the content of org/springframework/boot.

@nebhale
Copy link
Member Author

nebhale commented Mar 23, 2020

@wilkinsona At the risk of it becoming a bit complicated, you might take a look at the slicing definition we have for CNBs.

Effectively:

  • Each layer can have an arbitrary list of paths, including wildcards (we use Golang pattern syntax, but antpaths make more sense for you)
  • Any path not accounted for falls into a catch-all layer

So a pseudo example, (excuse the YAML):

dependencies:
  - BOOT-INF/lib/*-[^SNAPSHOT].jar
  - org/springframework/boot/**
snapshot-dependencies:
  - BOOT-INF/lib/*-SNAPSHOT.jar
resources:
  - META-INF/resources/**
  - resources/**
  - static/**
  - public/**
application:
  # - ** (implicit)

One of the benefits to a more complex design like this is that slicing no longer requires relocation. If boot created a resolved version of this kind of index (e.g. listing every single JAR path rather than using wildcards), iterating over the list would give you enough information to slice in place with a Dockerfile.

@wilkinsona
Copy link
Member

Thanks, @nebhale. We'd like the index file to be in a format that can be easily written and parsed by Java code with no dependencies and parsed by Go code in the buildpack. I'm not sure that TOML meets those requirements. If we can agree upon a format that does while also conveying, useful information, then we should definitely consider it.

One of the benefits to a more complex design like this is that slicing no longer requires relocation

We're a bit torn on this one in the Boot team. It came up recently and I initially found the idea very appealing but Phil (IIRC) raised some concerns about the size of the file, particularly for jars with lots of entries. Perhaps we should do some work to quantify that. I suspect it would zip quite efficiently.

@wilkinsona wilkinsona added the for: team-attention An issue we'd like other members of the team to review label Mar 24, 2020
@mbhave
Copy link
Contributor

mbhave commented Mar 25, 2020

I'm confused. Why would we need those rules in the layer.idx file, @nebhale? We would've already used similar rules to create the layered jar. Once we have the layered jar, the buildpack doesn't need to know how those layers were created, does it?

@philwebb philwebb added this to the 2.3.0.M4 milestone Apr 2, 2020
@mbhave mbhave removed the for: team-attention An issue we'd like other members of the team to review label Apr 3, 2020
@philwebb philwebb changed the title Create a new layer for loader classes when using layertools extract Create a new layer for loader classes Apr 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

5 participants