Skip to content

Latest commit

 

History

History
86 lines (46 loc) · 6.5 KB

20220225-caching.md

File metadata and controls

86 lines (46 loc) · 6.5 KB

Caching

Date: 2022-02-25

Last Modified: 2022-03-14 - adding build-task area (see ADR caching-build-tasks for details)

Status

Accepted

Context

ods-pipeline runs a number of Tekton tasks which have their working directory mounted to the same PVC. Currently there is one such PVC created per repository, which is shared across pipelines.

Prior to this ADR ods-start deleted all contents in the working directory which is on the root of the PVC. As a consequence no caching was possible despite having a PVC in place. One key benefit of not supporting caching is that caching has the potential to hide issues and change build outcomes if not implemented correctly.

So far the following usages of caching were considered:

  1. Global dependency caching. Languages that use dependencies as packaged artifacts in Nexus may not benefit much from caching dependencies. Unless there is a clear indication that there is a large performance win, languages should not use the global cache if they can use Nexus. A language that currently is not supported by Nexus is go. Another example where a large performance win could be expected are pnpm or yarn which can install dependencies by referencing the artifacts on the same file system in a very efficient way. In other words there is no need to unpack/copy the dependencies from a global file system cache.

  2. Workspace caching. In some languages such as Python and JavaScript/npm one can achieve a larger build acceleration by caching the workspace. For example for Python this would mean that the virtual env would be cached which makes subsequent installs of dependencies almost instant. However caching workspaces introduces more complexities. A key concern is to ensure that builds continue to work if there is a fresh build. Normally this is done by the continuous integration build. With workspace caching the question is how to continue to ensure this?

  3. For repos with multiple build tasks one frequently could skip builds if nothing has changed. This can be enabled by storing a build tasks output in the cache. This is now supported with a dedicated cache area named build-task. Details are described in ADR caching-build-tasks

There may also be other usages for example for the gradle wrapper that could perhaps benefit from a global cache.

Related proposals:

Decision

Tekton tasks have considerable overhead to start up. To avoid increasing the number of tasks to run all caching related code is done in existing tasks.

The cache is placed on the workspace PVC. This has the following benefits:

  • It enables build technologies which reference files in the cache using hard links such as pnpm.
  • Mounting takes a noticeable time. By not mounting another PVC dedicated to caching no additional delay is introduced.

Design decisions:

  • ods-start is changed to spare files folder .ods-cache/ of the working directory when removing content from previous builds. As a reminder the tasks working directory . is on the root of the PVC.

  • ods-start is aware of which areas below .ods-cache are used for which purpose and implements a file based clean up policy, but only if appropriate.

  • Other areas below .ods-cache are reserved for future use and thus may only be used for experimentation.

At this time the only supported cache area is for global dependencies, further outlined in the section below.

In general the tool-set specific build tasks utilize the cache. They always can count on:

  • cache cleanups not occurring while they are running
  • cleanups will not be partially completed

On the other hand build tasks must not assume that the cache is still available from a prior build.

Decisions regarding the Global Dependency Cache

We refer to the caching described here as global dependency caching to differentiate it from workspace caching. A workspace cache wouldn't be shareable across build pipelines. An workspace cache enables caching a Python virtual environment for example.

Decisions for global dependency caching:

  • The global dependency cache if supported by a tool-set is always on.

  • Build tasks may cache dependencies below ./.ods-cache/deps/<technology-name>/ where <technology-name> is reserved when an ods supported build task adopts that name. For example for the go language the technology name is gomod.

  • There is no file level cleanup implemented in ods-start for global dependency caching. The reason is that it makes little sense to remove individual files unless one would be aware of all existing consumers. ods-start ensures however that there are no stray files in folder .ods-cache/deps/.

If disk space runs out one shall manually

  • Increase the PVC space of the associated repository or
  • Recreate the PVC to get rid of historic content no longer needed.

Support for global dependency caching is added to ods-build-go as go does is not supported by Nexus and also benefits by reducing access to the outside network.

Consequences

If new caching areas beyond dependency caching are supported, ods-start will need to be adjusted.

Languages which benefit from global dependency caching can now utilize this. Languages which are not supported by Nexus such as go this is a big plus. For other languages we can add support as it is determined to be useful over time.

Most languages use Nexus to store their dependencies. For these it must be carefully evaluated whether introducing a file level cache is worthwhile. Some build tools always implicitly use a cache which is somewhere relative to the HOME directory. So instead of using that area placing dependencies on the PVC could also makes things slower. Build tools for which caching can be switched off entirely may also not benefit a lot from caching. So in summary this should be carefully measured before adding support for global dependency caching in individual build tasks.

By not implementing file based cleanup for the global dependency cache the complexity remains low. In addition this increases performance as deleting a lot of files can be expensive and take considerable time.