New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: drastically improved dev dependency exclusion performance #5574

Merged
merged 1 commit into from Dec 7, 2018

Conversation

Projects
None yet
3 participants
@MacMcIrish
Copy link

MacMcIrish commented Dec 6, 2018

What did you implement:

Closes #5568

How did you implement it:

Instead of passing the complete pattern list into Globby, a complete project file list is loaded into memory and then resolved from the patterns using micromatch. The use of micromatch here is because glob itself uses minimatch, which is why we're seeing such a significant performance difference.

How can we verify it:

Run sls package on a project with a large file count with many dev dependencies. The difference in timing is ~30s compared to ~15 minutes.

Todos:

  • Write tests
  • Write documentation
  • Fix linting errors
  • Make sure code coverage hasn't dropped
  • Provide verification config / commands / resources
  • Enable "Allow edits from maintainers" for this PR
  • Update the messages below

Is this ready for review?: YES
Is it a breaking change?: NO

@dschep

This comment has been minimized.

Copy link
Member

dschep commented Dec 6, 2018

Awesome @MacMcIrish! Thanks for the PR. Do you have a good package.json to test with or does it happen with pretty much any set of dependencies with yarn?

@MacMcIrish

This comment has been minimized.

Copy link

MacMcIrish commented Dec 6, 2018

@dschep of course :)
I'll come up with a small project that will recreate this, and I'll get back to you. As far as I know, this will happen for any project that has a not insignificant amount of both dependencies and devDependencies, so hopefully I can get to it soon.

I've just made a couple more changes to the PR after investigating globby's syntax to make sure there isn't any unexpected changes here (outside of what tests currently exist)

@MacMcIrish MacMcIrish force-pushed the MacMcIrish:master branch from 18bef3b to 0f21357 Dec 6, 2018

@MacMcIrish MacMcIrish force-pushed the MacMcIrish:master branch from 0f21357 to 23c37b7 Dec 6, 2018

@MacMcIrish MacMcIrish changed the title feat: drastically improved glob matching performance when resolving files from patterns feat: drastically improved dev dependency exclusion performance Dec 6, 2018

@MacMcIrish

This comment has been minimized.

Copy link

MacMcIrish commented Dec 6, 2018

@dschep I've got a snippet here to reproduce, using serverless@1.34.1 using a group of our public dependencies:

#!/bin/sh
rm -rf testProject
sls create --template aws-nodejs --path testProject 
(cd testProject \
	&& echo 'yarn-offline-mirror "./offline/package"' > .yarnrc \
	&& yarn add app-root-path@2.1.0 \
	    aws-sdk@2.371.0 \
	    aws-sdk-wrap@1.1.1 \
	    glob@7.1.3 \
	    lambda-serverless-api@1.32.7 \
	    lodash.get@4.4.2 \
	    object-deep-contain@1.0.1 \
	    object-hash@1.3.1 \
	    object-rewrite@2.1.0 \
	    object-scan@2.1.0 \
	    serverless@1.34.1 \
	    s3-cached@2.1.2 \
	    yaml-boost@1.8.1 \
    && yarn add --dev @babel/cli@7.2.0 \
	    @babel/core@7.2.0 \
	    @babel/plugin-proposal-object-rest-spread@7.2.0 \
	    @babel/plugin-transform-flow-comments@7.2.0 \
	    @babel/preset-flow@7.0.0 \
	    @babel/register@7.0.0 \
	    babel-eslint@10.0.1 \
	    chai@4.2.0 \
	    coveralls@3.0.2 \
	    js-gardener@1.34.2 \
	    lambda-tdd@2.1.0 \
	    lru-cache@5.1.1 \
	    micromatch@^3.1.10 \
	    minimist@1.2.0 \
	    mocha@5.2.0 \
	    nock@10.0.3 \
	    nyc@13.1.0 \
	    semantic-release@15.12.4 \
	&& echo -e 'package:\n  exclude:\n    - offline/**' >> serverless.yml \
	&& time sls package)

Outputs:
sls package 582.59s user 3.70s system 102% cpu 9:34.26 total

Whereas using my branch outputs:
~/local/path/bin/serverless package 23.99s user 2.21s system 141% cpu 18.511 total

@simlu

This comment has been minimized.

Copy link

simlu commented Dec 6, 2018

We would love to see this merged in and released quickly as it really hinders our ability to deploy quickly and clogging up ci. Cheers!

@dschep

This comment has been minimized.

Copy link
Member

dschep commented Dec 7, 2018

Thanks for the snippet @MacMcIrish. I'll test it tomorrow. We'll release this soon @simlu :shipit:

@dschep dschep self-assigned this Dec 7, 2018

@dschep
Copy link
Member

dschep left a comment

Looks & works great @MacMcIrish! It doesn't seem we have much test coverage of resolveFilePathsFromPatterns. Would you mind adding some? It's probably best to use os.tmpdir() and set this.serverless.config.servicePath to the created temp dir then fill it with some empty files&dirs. I'm happy to do it, but it might take longer 😉

@@ -4,6 +4,7 @@ const BbPromise = require('bluebird');
const path = require('path');
const globby = require('globby');
const _ = require('lodash');
const nanomatch = require('nanomatch');

This comment has been minimized.

@dschep

dschep Dec 7, 2018

Member

Wow. @micromatch publishes micromatch, nanomatch picomatch, and anymatch. That's not confusing at all 🙄

This comment has been minimized.

@simlu

simlu Dec 7, 2018

We had our fun with glob and *match yesterday :) Globby uses fast glob. Fast glob only uses default wildcard pattern according to their docs. However internally they use micromatch, not nanomatch. So they might actually expose the extended syntax that it provides over nanomatch.

We could start using micromatch instead of nanomatch here. I have to check what fast glob does internally. Maybe that's a test we could be adding. This whole thing is so confusing haha... Would have loved to fix it downstream, but honestly don't have the time right now and this is pressing.

Somewhat unrelated but interesting:

  • Globby seems so calls fast glob for each pattern which results in many file scans. Who wrote that and thought it was performant?! All we did is replicate that functionality in memory.
  • Still not entirely sure what the difference between picomatch and nanomatch really is. No dependencies for picomatch?
  • We also have npm glob which provides its own custom syntax and uses minimatch (which is apparently slow).
@simlu

This comment has been minimized.

Copy link

simlu commented Dec 7, 2018

@dschep There is honestly a lot of test coverage already testing this (test were testing globby functionality, now testing our code), which is why we didn't add more tests. I'm confused what else you want tested here? Are you just asking because we didn't add tests in this pr?

@dschep

This comment has been minimized.

Copy link
Member

dschep commented Dec 7, 2018

Oh, I failed to find them then @simlu, I expected them in lib/plugins/package/lib/packageService.test.js. Mind pointing me to them since you seem to know where they are? :)

@MacMcIrish

This comment has been minimized.

Copy link

MacMcIrish commented Dec 7, 2018

@dschep There are a few integration tests that check this in zipService.test.js, namely
'should exclude with globs'
'should re-include files using ! glob pattern'
'should re-include files using include config'
'should throw error if no files are matched'
'#zipFiles()'

Please let me know if that is sufficient :)

@dschep

dschep approved these changes Dec 7, 2018

Copy link
Member

dschep left a comment

Yup. That's more than enough. I should've looked harder.

probably gonna :shipit: sometime next week

@dschep dschep merged commit 2606bb2 into serverless:master Dec 7, 2018

1 of 2 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@MacMcIrish

This comment has been minimized.

Copy link

MacMcIrish commented Dec 7, 2018

Awesome, cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment