Download and static caching as a feature (#165)

Fixes #157 (filed by me)

## What this does
* Makes pip's download caching a "first-class citizen" by exposing it directly as one of this plugin's options.  This "fixes" several attempts at using the pip cache, specifically in Docker, and simplifies the feature: the user only has to enable it, not specify a folder.  In a future MR, I'd highly suggest enabling this by default.
* Adds a new type of caching called "static caching", which caches the outputs of this plugin.  This greatly speeds up every build as long as the feature is enabled and your requirements.txt file does not change.  In a future MR, I'd highly suggest enabling this by default as well.
* The pip download and static caches are shared between all projects of the same user through an [appdir](https://www.npmjs.com/package/appdirectory) cache folder when packaging your service.  This helps _especially_ on projects that rely heavily on Docker (Win/Mac) for deployment or development, on pip modules that need to compile every time, and on projects with long or complex requirements.txt files.  It also helps when the same requirements.txt is used on multiple projects (common in team environments).

## Implementation details
* When either cache is enabled, this plugin now caches those requirements (download or static) to an "appdir" cache folder (per the [appdirectory](https://www.npmjs.com/package/appdirectory) node module).
* When this feature is NOT enabled, nothing changes.
* Injection happens directly from the new cached requirements directory via a symlink created in `.serverless`, or in `.serverless/functionname` when deploying individually.
* Because that symlink lives inside the .serverless folder and points at the static cache, you still "know" where your cache is (for both individually and non-individually packaged functions).
* The requirements.txt "generator" was improved to remove comments and empty lines and to sort the list of items before the file is used (or its md5 sum is checked).  This produces more md5 matches between projects whose requirements files differ only in comments and the like.
* A new command, cleanCache, was added to flush the download/static cache.  Invoke it with `serverless requirements cleanCache`.  It clears all items, including both the download and static caches.
* A handful of new tests were created for various edge conditions I found during this refactoring.  Some are based on bugs other people found while using this plugin with some combination of options; others are not directly related to this merge's intent and are just part of my stream of work/consciousness.  Sorry, the tests take a lot longer to run now that there are many more of them.
* A UID bug fix related to docker + pip (seen on a few other bugs) was implemented, from @cgrimal.
* The following new configurable custom options were added to this plugin...

Variable Name | Value | Description
--- | --- | ---
useStaticCache | `false/true` | Default: false.  Enables or disables the static cache.  After some testing I would like to make this default: true, as it will greatly help everyone and there's no reason not to enable it.  Making this default: true may also help weed out issues faster.  I'll gladly step up to quickly fix any bugs people have with it, since I'm now well acquainted with the code.
useDownloadCache | `false/true` | Default: false.  Enables or disables the pip download cache.  This was previously the "example" code using a pipEnvExtraCmd to specify a local folder to cache downloads to.  It does not require a cache location to be set; if none is specified, it uses an appdirs.usercache() location.
cacheLocation | `<path>` | Default: [appdirectory](https://www.npmjs.com/package/appdirectory).userCache(appName: serverless-python-requirements).  Lets the user specify where the caches (both static and download) are stored.  This is useful for advanced setups such as a cache shared globally between users, or CI/CD build servers on shared storage where multiple build machines use one cache to speed up builds.  An example would be to mount a shared NFS store on all your CI/CD runners at `/mnt/shared` and set this value to `/mnt/shared/sls-py-cache`.
staticCacheMaxVersions | `<integer>` | Default: 0.  Restricts the maximum number of caches kept in the cache folder.  Setting it to 0 means there is no maximum.  This is useful for build/CI/CD machines that have limited disk space and should not (potentially) cache hundreds or thousands of versions indefinitely.  That said, I would be disturbed if a project had hundreds of changes to its requirements.txt file.
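
The requirements "generator" change described above (strip comments and empty lines, sort, then take an md5 sum) can be illustrated with a minimal sketch. The function names `normalizeRequirements` and `requirementsCacheKey` are hypothetical and chosen for illustration only; this is not the code in `lib/pip.js`, and it only handles whole-line comments.

```javascript
const crypto = require('crypto');

/**
 * Normalize the text of a requirements file so that cosmetic differences
 * (whole-line comments, blank lines, ordering) do not change the hash.
 * Hypothetical helper for illustration, not the plugin's actual function.
 */
function normalizeRequirements(text) {
  return text
    .split(/\r?\n/)
    .map(line => line.trim())
    // Drop blank lines and whole-line comments
    .filter(line => line.length > 0 && !line.startsWith('#'))
    .sort()
    .join('\n');
}

/** md5 hex digest of the normalized requirements, usable as a cache key. */
function requirementsCacheKey(text) {
  return crypto
    .createHash('md5')
    .update(normalizeRequirements(text))
    .digest('hex');
}

// Two files that differ only in comments, blank lines and ordering
const a = '# web stack\nflask==1.0.2\n\nboto3\n';
const b = 'boto3\nflask==1.0.2\n';
console.log(requirementsCacheKey(a) === requirementsCacheKey(b)); // true
```

Hashing the normalized text rather than the raw file is what lets two projects with cosmetically different requirements files land on the same static cache entry.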

## TODO
- [X] Feature Implementation
- [X] BUG: Deploying single-functions fails (Packaging works, but fails because of #161 )
- [X] Code Styling / Linting
- [X] Test to be sure Pipfile / generated requirements.txt still works
- [X] Tested a bunch on Mac / Linux with and without Docker
- [X] Adding Tests for Download Cache
- [X] Make sure zip feature still works
- [X] Ensure all existing tests pass
- [X] Adding Tests for static cache
- [X] Updating README.md to inform users how to use it
- [X] Make sure dockerSsh works
- [X] Implement error when trying to use --cache-dir with dockerizePip (won't work)
- [X] Implement suggestion when trying to use --cache-dir without dockerizePip
- [x] Test on Windows
- [x] Iterate through any feedback
- [x] Rebase with master constantly, awaiting merge...  :)
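
The two `--cache-dir` items in the list above (an error with dockerizePip, a suggestion without it) could look roughly like the following. This is a sketch under assumptions: the function name `checkCacheDirUsage` and the message texts are hypothetical, and the plugin's real check may be structured differently.

```javascript
/**
 * Hypothetical check, not the plugin's actual code: look for a manual
 * --cache-dir in pipCmdExtraArgs and either fail or return a suggestion.
 */
function checkCacheDirUsage(options) {
  const extraArgs = options.pipCmdExtraArgs || [];
  const usesCacheDir = extraArgs.some(
    arg => arg === '--cache-dir' || arg.startsWith('--cache-dir=')
  );
  if (!usesCacheDir) {
    return null; // Nothing to report
  }
  if (options.dockerizePip) {
    // Per the TODO item, this combination is treated as an error
    throw new Error(
      '--cache-dir cannot be used with dockerizePip; set useDownloadCache: true instead.'
    );
  }
  // Without Docker the flag can stay, but the new option is suggested
  return 'Consider replacing --cache-dir with useDownloadCache: true.';
}

console.log(checkCacheDirUsage({ pipCmdExtraArgs: ['--compile'] })); // null
```

Returning the suggestion as a string keeps the sketch free of any logger dependency; a caller would pass it to whatever logging facility it uses.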

Replaces #162
AndrewFarley authored and dschep committed Sep 8, 2018
1 parent 01c1fb1 commit 137f8e1b1e10579a2b8db88a49c335329a3307dd
Showing with 658 additions and 95 deletions.
  1. +3 −0 .gitignore
  2. +26 −11 README.md
  3. +21 −14 index.js
  4. +31 −1 lib/clean.js
  5. +5 −2 lib/docker.js
  6. +313 −49 lib/pip.js
  7. +108 −0 lib/shared.js
  8. +5 −2 package.json
  9. +146 −16 test.bats
@@ -42,3 +42,6 @@ admin.env
#PYTHON STUFF
*.py[co]
__pycache__

#NODE STUFF
package-lock.json
@@ -140,25 +140,40 @@ custom:
```

## Extra Config Options
### extra pip arguments
You can specify extra arguments to be passed to pip like this:
### Caching
You can enable two kinds of caching with this plugin, both of which are currently DISABLED by default. First, a download cache that caches the downloads pip needs to compile the packages. Second, what we call "static caching", which caches the output of pip after it has compiled everything for your requirements file. Since requirements.txt files rarely change, you will often see large speed improvements when enabling the static cache feature. These caches are shared between all your projects if no custom cacheLocation is specified (see below).

_**Please note:** This has replaced the previously recommended usage of "--cache-dir" in the pipCmdExtraArgs_
```yaml
custom:
pythonRequirements:
dockerizePip: true
pipCmdExtraArgs:
- --cache-dir
- .requirements-cache
useDownloadCache: true
useStaticCache: true
```
_Additionally, in future versions of this plugin, both caching features will probably be enabled by default._

When using `--cache-dir` don't forget to also exclude it from the package.
### Other caching options...
There are two additional options related to caching. You can specify where on your system this plugin stores its caches with the `cacheLocation` option. By default, the location is determined automatically from your username and OS via the [appdirectory](https://www.npmjs.com/package/appdirectory) module. Additionally, you can limit how many static caches are kept with `staticCacheMaxVersions`, as a simple way to limit the disk space used for caching. This is DISABLED (set to 0) by default. Example:
```yaml
custom:
pythonRequirements:
useStaticCache: true
useDownloadCache: true
cacheLocation: '/home/user/.my_cache_goes_here'
staticCacheMaxVersions: 10
```
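
As a rough illustration of what a `staticCacheMaxVersions` limit implies, here is a minimal sketch that picks which cache entries to remove. It assumes the oldest entries (by modification time) are evicted first, which is an assumption and not confirmed by this document; the function name `selectCachesToRemove` and the entry shape are hypothetical as well.

```javascript
/**
 * Hypothetical helper: given cache entries shaped like
 * { name: string, mtimeMs: number }, return the names that exceed the limit.
 * Eviction by oldest modification time is an assumption of this sketch.
 */
function selectCachesToRemove(entries, maxVersions) {
  // 0 (the default) or a missing value means "no maximum"
  if (!maxVersions || maxVersions <= 0) {
    return [];
  }
  // Copy before sorting so the caller's array is left untouched
  const newestFirst = [...entries].sort((x, y) => y.mtimeMs - x.mtimeMs);
  return newestFirst.slice(maxVersions).map(entry => entry.name);
}

const entries = [
  { name: 'cache-a', mtimeMs: 100 },
  { name: 'cache-b', mtimeMs: 300 },
  { name: 'cache-c', mtimeMs: 200 }
];
console.log(selectCachesToRemove(entries, 2)); // [ 'cache-a' ]
```

Keeping the selection separate from the actual file removal makes the policy easy to test without touching the disk.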

### Extra pip arguments
You can specify extra arguments [supported by pip](https://pip.pypa.io/en/stable/reference/pip_install/#options) to be passed to pip like this:
```yaml
package:
exclude:
- .requirements-cache/**
custom:
pythonRequirements:
pipCmdExtraArgs:
- --compile
```


### Customize requirements file name
[Some `pip` workflows involve using requirements files not named
`requirements.txt`](https://www.kennethreitz.org/essays/a-better-pip-workflow).
@@ -350,4 +365,4 @@ zipinfo .serverless/xxx.zip
improved pip cache support when using docker.
* [@dee-me-tree-or-love](https://github.com/dee-me-tree-or-love) - the `slim` package option
* [@alexjurkiewicz](https://github.com/alexjurkiewicz) - [docs about docker workflows](#native-code-dependencies-during-build)

* [@andrewfarley](https://github.com/andrewfarley) - Implemented download caching and static caching
@@ -12,7 +12,7 @@ const {
const { injectAllRequirements } = require('./lib/inject');
const { installAllRequirements } = require('./lib/pip');
const { pipfileToRequirements } = require('./lib/pipenv');
const { cleanup } = require('./lib/clean');
const { cleanup, cleanupCache } = require('./lib/clean');

BbPromise.promisifyAll(fse);

@@ -39,6 +39,10 @@ class ServerlessPythonRequirements {
dockerSsh: false,
dockerImage: null,
dockerFile: null,
useStaticCache: false,
useDownloadCache: false,
cacheLocation: false,
staticCacheMaxVersions: 0,
pipCmdExtraArgs: [],
noDeploy: [
'boto3',
@@ -115,6 +119,11 @@ class ServerlessPythonRequirements {
install: {
usage: 'install requirements manually',
lifecycleEvents: ['install']
},
cleanCache: {
usage:
'Removes all items in the pip download/static cache (if present)',
lifecycleEvents: ['cleanCache']
}
}
}
@@ -128,6 +137,11 @@ class ServerlessPythonRequirements {
return args[1].functionObj.runtime.startsWith('python');
};

const clean = () =>
BbPromise.bind(this)
.then(cleanup)
.then(removeVendorHelper);

const before = () => {
if (!isFunctionRuntimePython(arguments)) {
return;
@@ -155,13 +169,13 @@ class ServerlessPythonRequirements {

const invalidateCaches = () => {
if (this.options.invalidateCaches) {
return BbPromise.bind(this)
.then(cleanup)
.then(removeVendorHelper);
return clean();
}
return BbPromise.resolve();
};

const cleanCache = () => BbPromise.bind(this).then(cleanupCache);

this.hooks = {
'after:package:cleanup': invalidateCaches,
'before:package:createDeploymentArtifacts': before,
@@ -172,16 +186,9 @@ class ServerlessPythonRequirements {
this.serverless.cli.generateCommandsHelp(['requirements']);
return BbPromise.resolve();
},
'requirements:install:install': () =>
BbPromise.bind(this)
.then(pipfileToRequirements)
.then(addVendorHelper)
.then(installAllRequirements)
.then(packRequirements),
'requirements:clean:clean': () =>
BbPromise.bind(this)
.then(cleanup)
.then(removeVendorHelper)
'requirements:install:install': before,
'requirements:clean:clean': clean,
'requirements:cleanCache:cleanCache': cleanCache
};
}
}
@@ -1,6 +1,8 @@
const BbPromise = require('bluebird');
const fse = require('fs-extra');
const path = require('path');
const glob = require('glob-all');
const { getUserCachePath } = require('./shared');

BbPromise.promisifyAll(fse);

@@ -29,4 +31,32 @@ function cleanup() {
);
}

module.exports = { cleanup };
/**
* Clean up static cache, remove all items in there
* @return {Promise}
*/
function cleanupCache() {
const cacheLocation = getUserCachePath(this.options);
if (fse.existsSync(cacheLocation)) {
if (this.serverless) {
this.serverless.cli.log(`Removing static caches at: ${cacheLocation}`);
}

// Only remove cache folders that we added, just in case someone accidentally sets an odd
// static cache location, so we don't remove a bunch of personal files
const promises = [];
glob
.sync([path.join(cacheLocation, '*slspyc/')], { mark: true, dot: false })
.forEach(file => {
promises.push(fse.removeAsync(file));
});
return BbPromise.all(promises);
} else {
if (this.serverless) {
this.serverless.cli.log(`No static cache found`);
}
return BbPromise.resolve();
}
}

module.exports = { cleanup, cleanupCache };
@@ -49,8 +49,11 @@ function findTestFile(servicePath) {
if (fse.pathExistsSync(path.join(servicePath, 'serverless.json'))) {
return 'serverless.json';
}
if (fse.pathExistsSync(path.join(servicePath, 'requirements.txt'))) {
return 'requirements.txt';
}
throw new Error(
'Unable to find serverless.yml or serverless.yaml or serverless.json for getBindPath()'
'Unable to find serverless.{yml|yaml|json} or requirements.txt for getBindPath()'
);
}

@@ -154,7 +157,7 @@ function getDockerUid(bindPath) {
'stat',
'-c',
'%u',
'/test/.serverless'
'/bin/sh'
];
const ps = dockerCommand(options);
return ps.stdout.trim();