Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The build plan should lose entries as they are provided by each /bin/build #24

Closed
sclevine opened this issue Oct 30, 2018 · 12 comments
Closed
Assignees

Comments

@sclevine
Copy link
Member

sclevine commented Oct 30, 2018

Note: This proposal has been edited. Comments below it may be outdated. See the edit history for previous versions.

If a buildpack’s /bin/build provides a dependency specified in the build plan, then the buildpack should be able to remove that build plan entry so that it is not provided to the next buildpack’s /bin/build. Otherwise, the same dependency may unintentionally be provided multiple times.

Advantages

  • This makes it possible for buildpacks to provide basic/generic versions of dependencies that more specialized buildpacks could provide if added to the group.
  • This makes the provider key unnecessary in almost all cases.
  • This makes it easy for operators to override buildpack dependencies by placing custom buildpacks in front of the group. (This is a current CF buildpack feature called "manifest overrides.")
  • The interface changes below make the /bin/build and /bin/detect interfaces much more consistent.
  • The interface changes below allow for a new type of post-build metadata collection that could be used for security scanning/auditing.
  • The interface changes below make build-time environment variables available during detection.
  • New functionality (removing entries from the build plan) is entirely optional

Interface Changes


Modify the input and output interface for detection:

Executable: /bin/detect <platform[A]>, Working Dir: <app[AR]>

Input Description
/dev/stdin Merged plan from previous detections (TOML)
<platform>/env/ User-provided environment variables for build
<platform>/plan/ Contributions to the build plan
<platform>/# Platform-specific extensions
Output Description
[exit status] Pass (0), fail (100), or error (1-99, 101+)
/dev/stdout Logs (info)
/dev/stderr Logs (warnings, errors)

To contribute plan entries, buildpacks write entries to <platform>/plan/<name>. Entries do not need to have a top-level <name> key.


Modify the input interface for build:

Executable: /bin/build <platform[A]> <cache[EC]> <launch[EI]>, Working Dir: <app[AI]>

Input Description
/dev/stdin Build plan from detection (TOML)
<platform>/env/ User-provided environment variables for build
<platform>/plan/ Contributions to the build plan
<platform>/# Platform-specific extensions

To remove plan entries, buildpacks write empty files to <platform>/plan/<name>. Entries do not need to have a top-level <name> key. This file could be metadata about the exact dependency provided, e.g., checksums not available during build.

@sclevine sclevine changed the title The build plan should lose entries as they claimed by each /bin/build The build plan should lose entries as they are provided by each /bin/build Oct 30, 2018
@dsyer
Copy link

dsyer commented Oct 30, 2018

An example would help (as usual).

@sclevine
Copy link
Member Author

Here's a simple example that ignores the launch and build keys, and also ignores how the buildpacks would mark a dependency as "provided":

Detect Phase

my.buildpack.one:
/bin/detect outputs:

[my-dep-one]
...

my.buildpack.two:
/bin/detect outputs:

[my-dep-two]
...
[my-dep-three]
...

Build Phase

Assuming my.buildpack.two is requesting my-dep-two from my.buildpack.one, and the other requested layers are provided by the buildpack requesting them.

my.buildpack.one:
/bin/build receives:

[my-dep-one]
...
[my-dep-two]
...
[my-dep-three]
...

/bin/build provides my-dep-one and my-dep-two.

my.buildpack.two:
/bin/build receives:

[my-dep-three]
...

/bin/build provides my-dep-tree.

@dsyer
Copy link

dsyer commented Oct 30, 2018

Assuming my.buildpack.two is requesting my-dep-two from my.buildpack.one

How is it doing that? Surely my.buildpack.two should not know about the other buildpack - it only cares that my-dep-two is provided by something.

@nebhale
Copy link
Contributor

nebhale commented Oct 30, 2018

This proposal is problematic as there is not a a 1:1 correspondence between the entries in the build plan and the layers put down on disk. For example, when a buildpack requests[openjdk-jre], the provider of that dependency might put down not only /openjdk-jre, but /memory-calculator and /jvmkill. A second is example is that a buildpack might put [jvm-application] into the build plan, which is used by other downstream buildpacks both during detect and build phases, but no layer is ever created for it. A final example is a buildpack request [npm] during detect, but that same buildpack providing a /node_modules layer.

This feels like an attempt to tighten the contract to prevent what will be a rare occurrence when it should just be clearly stated the buildpack groups (which are quite lightweight, so you can have many of them) should be created in such a way as to remove the possibility of dependency provisioning collision. Tightening the contract here, without any evidence that this is a wide-spread problem is reducing our ability to innovate and create new patterns that do not already exist.

@sclevine
Copy link
Member Author

This proposal is problematic as there is not a a 1:1 correspondence between the entries in the build plan and the layers put down on disk. For example, when a buildpack requests [openjdk-jre], the provider of that dependency might put down not only /openjdk-jre, but /memory-calculator and /jvmkill

Buildpacks would request all three of those things instead of just openjdk-jre.

A second is example is that a buildpack might put [jvm-application] into the build plan, which is used by other downstream buildpacks both during detect and build phases, but no layer is ever created for it.

This is covered by:

"Have a separate, special build plan notation for dependencies provided in the app directory."

A final example is a buildpack request [npm] during detect, but that same buildpack providing a /node_modules layer.

That buildpack would request node_modules, as the current v3 node buildpacks do.

This feels like an attempt to tighten the contract to prevent what will be a rare occurrence when it should just be clearly stated the buildpack groups (which are quite lightweight, so you can have many of them) should be created in such a way as to remove the possibility of dependency provisioning collision.

Without this feature, it would be difficult to implement manifest dependency overrides, which are a current feature of the CF buildpacks. Additionally, the current implementation puts the onus on operators and developers to know about every layer name their chosen buildpacks provide.

That said, I think Option B satisfies all of your constraints and would solve the problem without tightening the contract.

@nebhale
Copy link
Contributor

nebhale commented Oct 30, 2018

I don’t believe that it should be a requirement that buildpacks requesting a dependency have internal knowledge of all of the dependencies that another buildpack will provide. For example, a riff buildpack only needs to request that a JRE be present. It should *not * need to know that some JRE providers will provide an additional memory calculator and jvmkiller whereas other JRE providers will not. Making all provided dependencies explicit in the build plan creates a very tight coupling where two buildpacks essentially can only depend on one another with deep internal knowledge rather than allowing buildpacks be polymorphic dependency providers that can transparently substitute in for one another.

Also the requirement for both Option A and B

Fail the build if the build plan is not fully consumed.

is problematic as some build plan entries are never meant to be consumed. At the moment [jvm-application], [node-application], and others like these are used as markers that later buildpacks such as the riff buildpack use to make decisions during their own processing. Removing them, or requiring them to be removed and preventing them from being visible during the entire execution of the build phase constrains the flexibility of the buildpack implementations

@sclevine
Copy link
Member Author

What if, at least for now, we condense this proposal down to:

A buildpack may erase any number of dependencies from the build plan in /bin/build if it chooses, indicating that subsequent /bin/builds should not consider that dependency. (No strong opinions about how /bin/build would specify the names of the dependencies to remove.)

I would still prefer a tighter contract, because I think the current contract is somewhat difficult for new buildpack authors to reason about, but this would solve the problem that I'm trying to solve. It would also be an additive, non-breaking change.

@dsyer
Copy link

dsyer commented Oct 30, 2018

A buildpack may erase any number of dependencies from the build plan in /bin/build if it chooses

It's news to me that a build plan contains dependencies. Is that explicit? The first reference I can find to "dependency" is in the "build" phase, and it's a bit vague there (doesn't mention anything to do with the build plan).

@nebhale
Copy link
Contributor

nebhale commented Oct 30, 2018

@sclevine Definitely a better choice. How would such a behavior be implemented since stdout is already occupied for logging?

@nebhale
Copy link
Contributor

nebhale commented Oct 30, 2018

@dsyer You've hit upon one of the assumptions I was arguing against here. The build plan was originally conceived as a way to create an inventory of dependencies contributed by buildpacks, conveniently allowing tools to observe and report on them. However, the design is much more generic than that, acting as a whiteboard pattern that both communicates more than just dependencies and doesn't necessary report all dependencies at the same time. My vote is for this looser design to continue (it's been very helpful for my buildpacks) rather than restricting it to the original idea.

@sclevine
Copy link
Member Author

@nebhale @dsyer I really like where this ended up. Even ignoring the original intent of this change, the detection interface is much more consistent with build. Let me know what you think!

@nebhale
Copy link
Contributor

nebhale commented Oct 31, 2018

I'm ready to see a spec PR for the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants