proposal on caching #412

Closed
wants to merge 11 commits into from
105 changes: 105 additions & 0 deletions docs/proposals/caching.adoc
@@ -0,0 +1,105 @@
= Proposal: Caching

== Purpose

This proposal aims to support caching of dependencies
(https://github.com/opendevstack/ods-pipeline/issues/147) in order to speed up build times.

An earlier version also proposed a workspace caching scheme.
While workspace caching can provide a large benefit for some languages such as
Python and Node, it also introduces complexity and could hide dependency issues.
As a consequence, this proposal focuses only on a global cache.
A workspace caching proposal could be made at a later time.

== Background

Currently there is a single PVC per project which is shared by all build pipelines.
As a consequence only a single build at a time should run for a project. At the moment due to #394 parallel builds can happen, but with #407 and #160 solved, it will be "one build at a time for a repository".

The PVC is mounted as the workspace in all build tasks.

The `ods-start` task wipes the PVC at the beginning of each build so that no data persists between builds.

Each running Tekton task mounts the workspace PVC, which typically takes a noticeable amount of time. As a consequence, it makes sense not to introduce additional tasks.

== Solution

In the initial implementation, one PVC per repo will be used instead of a single PVC per project.
henrjk marked this conversation as resolved.

The same PVC is used for the workspace (where the repo is checked out) and the global cache.

The global cache enables build tasks to store cached files for at least the following purposes:

1. Dependency caching. Languages whose dependencies are available as packaged artifacts in Nexus may not benefit much from caching dependencies. Unless there is a clear indication of a large performance win, languages should not use the global cache if they can use Nexus. Go is a language that is currently not supported by Nexus. Another case where a large performance win could be expected is `pnpm` or `yarn`, which can install dependencies very efficiently by referencing the artifacts on the same file system. In other words, there is no need to unpack or copy the dependencies from a global file system cache.

2. Cache prior build output to support build skipping (see proposal at https://github.com/opendevstack/ods-pipeline/pull/423). Usage of the global cache for this purpose should work analogously to dependency caching. It is not elaborated on further here because it is covered in the mentioned proposal.

3. There may be other usages, for example the Gradle wrapper, that could benefit from a global cache.

The following new parameter is introduced to build tasks supporting dependency caching:

* `cache-dependencies`: a boolean defaulting to `false`. Only if set to `true` will the task use dependency caching.

In addition to the build task parameter, build tasks also receive the following:

* File `.ods/cache-deps-parent-dir` contains an absolute path, without a trailing '/', to an existing directory (no whitespace or newlines). If the file `.ods/cache-deps-parent-dir` does not exist, the task must not use dependency caching. This allows dependency caching to be switched off dynamically, for example with a special tag in a git commit message.
Member


I would propose to start without disabling via commit message. Do we need this file then at all?

Member Author


Strictly speaking we wouldn't need it. However without it the caching locations would need to be hardcoded, which I would rather not have.


Cleanup will spare directories below `$(cache-deps-parent-dir)`, so build tasks must keep their cached files in a subdirectory they create, for example `$(cache-deps-parent-dir)/go-mods/`, where `go-mods` is what this proposal calls the technology-name.
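
The rules above for deciding whether and where to cache could be sketched as follows. This is only an illustration under stated assumptions, not the actual task implementation: the function names `parseCacheParentDir` and `depsCacheDir`, the `workspace` argument, and passing the `cache-dependencies` parameter as a Go `bool` are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// cacheParentFile is the file named in this proposal.
const cacheParentFile = ".ods/cache-deps-parent-dir"

// parseCacheParentDir validates the file content: an absolute path
// without a trailing '/' and without whitespace or newlines.
func parseCacheParentDir(content string) (string, error) {
	if content == "" {
		return "", errors.New("empty path")
	}
	if strings.ContainsAny(content, " \t\r\n") {
		return "", errors.New("path contains whitespace or newlines")
	}
	if !filepath.IsAbs(content) {
		return "", errors.New("path is not absolute")
	}
	if strings.HasSuffix(content, "/") {
		return "", errors.New("path has a trailing '/'")
	}
	return content, nil
}

// depsCacheDir returns the directory a task should cache into, or ""
// when dependency caching must not be used (hypothetical helper).
func depsCacheDir(workspace string, cacheDependencies bool, technology string) (string, error) {
	if !cacheDependencies {
		return "", nil // parameter cache-dependencies is false
	}
	raw, err := os.ReadFile(filepath.Join(workspace, cacheParentFile))
	if errors.Is(err, os.ErrNotExist) {
		return "", nil // file absent: caching is switched off for this run
	}
	if err != nil {
		return "", err
	}
	parent, err := parseCacheParentDir(string(raw))
	if err != nil {
		return "", fmt.Errorf("%s: %w", cacheParentFile, err)
	}
	if info, err := os.Stat(parent); err != nil || !info.IsDir() {
		return "", fmt.Errorf("%s: %q is not an existing directory", cacheParentFile, parent)
	}
	// Each task keeps its files in its own technology-name subdirectory.
	dir := filepath.Join(parent, technology)
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	return dir, nil
}

func main() {
	dir, err := depsCacheDir(".", true, "go-mods")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	if dir == "" {
		fmt.Println("dependency caching disabled")
		return
	}
	fmt.Println("caching dependencies in", dir)
}
```

Treating a missing file as "caching disabled" rather than as an error matches the rule that the file's absence switches caching off for a run.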

Build tasks supporting caching will also be adjusted to

- Log available disk space at the beginning of the build.

- Log the duration in seconds of build commands which may be long running.

- Log available disk space at the end of the build and also the space delta.
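
As a rough illustration of the three logging steps above, a build task could do something like the following. This sketch assumes a Linux or other Unix container where `syscall.Statfs` is available. The helper names `availableBytes`, `spaceDelta` and `timed` are hypothetical, and `go version` only stands in for a real build command.

```go
package main

import (
	"log"
	"os/exec"
	"syscall"
	"time"
)

// availableBytes returns the space available to unprivileged users on the
// file system containing path (Unix only, via statfs).
func availableBytes(path string) (uint64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, err
	}
	return uint64(st.Bavail) * uint64(st.Bsize), nil
}

// spaceDelta returns the signed change in available bytes. A negative
// value means the build consumed disk space.
func spaceDelta(before, after uint64) int64 {
	return int64(after) - int64(before)
}

// timed runs a command and logs how long it took in seconds.
func timed(name string, args ...string) error {
	start := time.Now()
	err := exec.Command(name, args...).Run()
	log.Printf("%s took %.1fs", name, time.Since(start).Seconds())
	return err
}

func main() {
	before, err := availableBytes(".")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("available disk space before build: %d bytes", before)

	// Placeholder for a potentially long-running build command.
	if err := timed("go", "version"); err != nil {
		log.Printf("build command failed: %v", err)
	}

	after, err := availableBytes(".")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("available disk space after build: %d bytes (delta %+d)",
		after, spaceDelta(before, after))
}
```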


=== ods-start and cleanup

Build tasks supporting caching can count on:

* cache cleanups not occurring while they are running
* cache cleanups not being left partially completed.

On the other hand build tasks must not assume that the cache is still available from a prior build.

The dependency cache is somewhat special in that it typically does not make sense to remove individual files.

Instead if disk space issues arise one should manually:

* Increase the PVC space of the associated repository or
* Recreate the PVC (TODO: should this be done automatically, for example on a randomized bi-weekly schedule?)
Member


I'd propose to keep it manual for now

Member Author


Makes sense to me as well. A first version should do it this way.


However, in general, cleanup of the PVC does happen during `ods-start`.

The cleanup strategy described here is subject to change even in patch versions and specific details must not be relied on.

The following cache locations on the PVC are used:

- `+./.ods-cache/deps/<technology-name>/+` for dependency caches of a particular build technology. The technology-name would be defined by the build script and unknown to `ods-start`.

- `+./.ods-cache/<not-deps>/**+` reserved for future usages for example to enable build skipping. For each new supported location `ods-start` may delete files as it makes sense for the purpose.

Before cache cleanup, `ods-start` removes all files not in the cache locations above. +
Next, all entries directly underneath `+./.ods-cache/deps/+` which are not directories are deleted. This prevents tasks from forgetting to define and use a `<technology-name>`.
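
The two cleanup steps just described might look roughly like this. It is a minimal sketch, not the real `ods-start` code: the function names `cleanWorkspace` and `demo` are hypothetical, and the sketch assumes that everything under `.ods-cache` other than stray files in `deps` is left untouched, since the proposal explicitly leaves the strategy open to change.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// cleanWorkspace removes everything in root except .ods-cache, then
// removes non-directory entries directly under .ods-cache/deps.
func cleanWorkspace(root string) error {
	entries, err := os.ReadDir(root)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if e.Name() == ".ods-cache" && e.IsDir() {
			continue // cache locations are spared in this step
		}
		if err := os.RemoveAll(filepath.Join(root, e.Name())); err != nil {
			return err
		}
	}
	deps := filepath.Join(root, ".ods-cache", "deps")
	entries, err = os.ReadDir(deps)
	if errors.Is(err, os.ErrNotExist) {
		return nil // no dependency cache yet
	}
	if err != nil {
		return err
	}
	for _, e := range entries {
		if e.IsDir() {
			continue // <technology-name> directories are kept
		}
		if err := os.Remove(filepath.Join(deps, e.Name())); err != nil {
			return err
		}
	}
	return nil
}

// demo builds a throwaway workspace, cleans it, and returns the files left.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "ws")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, d := range []string{"src", ".ods-cache/deps/go-mods"} {
		if err := os.MkdirAll(filepath.Join(root, d), 0o755); err != nil {
			return nil, err
		}
	}
	files := []string{"src/main.go", ".ods-cache/deps/stray.txt", ".ods-cache/deps/go-mods/mod.zip"}
	for _, f := range files {
		if err := os.WriteFile(filepath.Join(root, f), []byte("x"), 0o644); err != nil {
			return nil, err
		}
	}
	if err := cleanWorkspace(root); err != nil {
		return nil, err
	}
	var left []string
	err = filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			rel, _ := filepath.Rel(root, p)
			left = append(left, filepath.ToSlash(rel))
		}
		return nil
	})
	return left, err
}

func main() {
	left, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("files left after cleanup:", left)
}
```

In the demo workspace, only the file inside the `go-mods` technology directory survives. The checked-out sources and the stray file directly under `deps` are removed.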

== Pro

* Enables dependency caching.

* A global file cache also paves the way for other usages such as caching builds efficiently to enable build skipping.

* Opt-in to dependency caching makes this a conscious choice from ods-pipeline users.

* By only supporting dependency caching in build tools where there is a big benefit we reduce complexity in other cases.

* Not implementing file-based cleanup for the dependency cache keeps complexity low and increases performance, as deleting a lot of files can take a long time compared to recreating a PVC.

== Con

* By not making dependency cleanup a responsibility of `ods-start`, users will have to take manual action if disk space runs out. Here an ability to plug in an interleaved cleanup process could make sense.

* Using a file `.ods/cache-deps-parent-dir` makes it possible to skip the dependency cache, for example via a special tag in a git commit message, but seems otherwise odd. If we find a way to bind the parameters to the task run more explicitly, that would be better for traceability. Is there a better way to pass dynamic parameters from `ods-start` or the event listener to the build tasks?

* Languages which don't have a mature way to share dependencies globally do not benefit from this approach. For these, a workspace caching capability would be needed.