jenkins-pipeline-cache-plugin

A cloud native file cache for Jenkins pipelines. The files are stored in an S3 bucket. The functionality is very similar to the cache feature provided by GitHub Actions.

Motivation

The primary goal is to provide a file cache for so-called hot agent nodes. Those nodes are started on demand when an execution is scheduled by Jenkins and killed after the execution is finished (e.g. by using the kubernetes-plugin or nomad-plugin). This is fine but also has some drawbacks, some of which can be solved by having a file cache in place (e.g. to cache build dependencies, statistical data for code analysis, or whatever data you want to be present for the next build execution).

Installation

  • Download the latest version (see releases)
  • Complete the installation via Manage Jenkins -> Manage Plugins -> Advanced -> Upload Plugin

For automated installations via plugins.txt you can use an entry like the one below:

jenkins-pipeline-cache::https://github.com/j3t/jenkins-pipeline-cache-plugin/releases/download/0.2.0/jenkins-pipeline-cache-0.2.0.hpi

Configuration

  • Go to Manage Jenkins -> Configure System -> Cache Plugin
  • Set Username (aka S3-Access-Key)
  • Set Password (aka S3-Secret-Key)
  • Set Bucket
  • Set Region
  • Click Test connection

The plugin requires the following permissions in S3 for the bucket (a policy sketch follows the list):

  • s3:HeadObject
  • s3:GetObject
  • s3:ListBucket
  • s3:PutObject
  • s3:DeleteObject - Only if the CleanupTask is activated (threshold > 0)
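
For AWS, a minimal sketch of an IAM policy for the user whose access key is configured above could look like the one below. The bucket name my-jenkins-cache is a placeholder, and note that in AWS IAM the HeadObject call is covered by the s3:GetObject permission:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListCacheBucket",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::my-jenkins-cache"
        },
        {
            "Sid": "ReadWriteCacheObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::my-jenkins-cache/*"
        }
    ]
}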

Usage

Below you can find an example where the local Maven repository of the spring-petclinic project is cached.

node {
    git(url: 'https://github.com/spring-projects/spring-petclinic', branch: 'main')
    cache(path: "$HOME/.m2/repository", key: "petclinic-${hashFiles('**/pom.xml')}") {
        sh './mvnw package'
    }
}

The path parameter points to the local Maven repository, and the key parameter is the project name followed by a dash and the hash sum of all Maven POMs.

The hashFiles method is optional but can be helpful to generate more precise keys. The idea is to collect all files which have an impact on the cache and then create a hash sum from them (e.g. hashFiles('**/pom.xml') creates one hash sum over all Maven POMs in the workspace).
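
Keys can also combine several inputs. For example, a branch-specific key could look like the sketch below (env.BRANCH_NAME is only set in multibranch pipeline jobs):

// one cache per branch and POM state
cache(path: "$HOME/.m2/repository", key: "petclinic-${env.BRANCH_NAME}-${hashFiles('**/pom.xml')}") {
    sh './mvnw package'
}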

When the job is executed, the plugin first tries to restore the path from the cache by using the given key. Then the inner step gets executed, and if this was successful and the cache does not exist yet, the path gets cached.
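
Since cache is a block-scoped step, it should also work inside the steps block of a declarative pipeline. A sketch of the same build as above:

pipeline {
    agent any
    stages {
        stage('build') {
            steps {
                git(url: 'https://github.com/spring-projects/spring-petclinic', branch: 'main')
                // restore the cache, run the build, then store the cache on success
                cache(path: "$HOME/.m2/repository", key: "petclinic-${hashFiles('**/pom.xml')}") {
                    sh './mvnw package'
                }
            }
        }
    }
}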

Below you can find a complete list of the cache step parameters:

  • path (required) - Path to the directory which should be cached (absolute or relative to the workspace). Example: $HOME/.m2/repository - caches the local Maven repository.
  • key (required) - Identifier which is assigned to the cache. Example: maven-4f98f59e877ecb84ff75ef0fab45bac5
  • restoreKeys (optional) - Additional keys which are used when the cache gets restored. The plugin tries to resolve them in the defined order (key first, then the restoreKeys); if that was not successful, the latest cache whose key starts with one of them is restored. Example: ['maven-', 'petclinic-'] - restores the latest cache whose key starts with maven- or petclinic- if key does not exist.
  • includes (optional) - Ant-style pattern applied to the path to filter the files which are included. Default: **/* (includes all files). Example: **/*.xml or **/*.xml,**/*.html (see the Ant documentation for more details on patterns).
  • excludes (optional) - Ant-style pattern applied to the path to filter the files which are excluded. Default: no files are excluded. Example: see includes.
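
A sketch combining the optional parameters from the list above (the keys and patterns are only examples):

node {
    git(url: 'https://github.com/spring-projects/spring-petclinic', branch: 'main')
    // restore the exact key first, otherwise fall back to the latest cache
    // whose key starts with 'petclinic-' or 'maven-'; Maven's .lastUpdated
    // marker files are not archived
    cache(path: "$HOME/.m2/repository",
          key: "petclinic-${hashFiles('**/pom.xml')}",
          restoreKeys: ['petclinic-', 'maven-'],
          excludes: '**/*.lastUpdated') {
        sh './mvnw package'
    }
}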

Storage providers

Any S3-compatible storage provider should work. MinIO is supported first class, because all the integration tests are executed against MinIO.

In order to use an alternative provider, you probably have to change the Endpoint parameter (e.g. http://localhost:9000 for a local MinIO instance).

  • Go to Manage Jenkins -> Configure System -> Cache Plugin
  • Update the Endpoint parameter
  • Click Test connection

Cleanup

You can define a threshold in megabytes if you want to limit the total cache size. If the value is > 0, the plugin checks the threshold every hour and removes the least recently used items from the cache until the total cache size is smaller than the threshold again (LRU eviction). For example, a threshold of 10240 keeps the total cache size below 10 GB.

  • Go to Manage Jenkins -> Configure System -> Cache Plugin
  • Update the Threshold parameter

Disclaimer

Anyone who can create or execute build jobs basically also has access to all caches. The 'attacker' just needs a way to execute the plugin, and they need to know the key which is assigned to a particular cache. There is no list available of all the keys, but the build logs contain them. The plugin guarantees that the same key is not created twice and also that an existing key is not replaced, but it does not guarantee that a restored cache was not manipulated by someone else who has access to the S3 bucket, for example.

As general advice, sensitive data, or data which cannot be restored from somewhere else or regenerated, should not be stored in caches. It should also be no big deal, besides a longer build, if a cache has been deleted (e.g. by accident, by the cleanup task, by a data crash or ...).

Pitfalls

  • the hashFiles step expects an Ant-style pattern relative to the workspace as parameter
  • the includes/excludes parameters must be Ant-style patterns relative to the path
  • the cache is not stored if the key already exists or the inner step has failed (e.g. unit-test failures)
  • existing files are replaced but not removed when the cache gets restored
  • the plugin creates a tar archive from the path and stores it as an S3 object
  • the S3 object contains metadata
    • CREATED - Unix time in ms when the cache was created
    • LAST_ACCESS - Unix time in ms when the cache was last accessed
