Skip to content
This repository has been archived by the owner on Sep 7, 2020. It is now read-only.

Reduced load time #12

Open
marcelobern opened this issue Oct 18, 2018 · 23 comments
Open

Reduced load time #12

marcelobern opened this issue Oct 18, 2018 · 23 comments

Comments

@marcelobern
Copy link

@pimterry as I experienced lambda-git load times of up to 7 secs @ 128MB, I have been looking for alternate ways to have lambda-git work so we can reduce its load time.

Wanted to check if you have any similar work underway.

If not, would you want me to open a PR in case I get anywhere?

@pimterry
Copy link
Owner

@marcelobern no, I'm not working on that. I'd definitely be open to pull requests if you have a way to speed things up.

@marcelobern
Copy link
Author

@pimterry will do.

BTW, I submitted a PR for serverless and lambda-git is in there ;-)

@marcelobern
Copy link
Author

@pimterry here is the first result of running the new and existing codes (not a proper performance comparison though).

Existing code (initial run):

314d0872-d572-11e8-a157-6d279cbe4e9b { status: 'lambda-git: loaded' }
Duration: 4869.99 ms Billed Duration: 4900 ms Memory Size: 128 MB Max Memory Used: 45 MB

Existing code (subsequent run):

519d4717-d572-11e8-bb59-df8701e3c6ac { status: 'lambda-git: already loaded' }
Duration: 0.69 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 46 MB

New code (initial run):

5c96a4bf-d570-11e8-b46f-212ccdc20a1b { status: 'lambda-git: loaded' }
Duration: 2449.90 ms Billed Duration: 2500 ms Memory Size: 128 MB Max Memory Used: 29 MB

New code (subsequent run):

72b5f226-d570-11e8-bb41-77fbb4a850ec { status: 'lambda-git: already loaded' }
Duration: 10.40 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 29 MB

I will work on cleaning up the code and looking into some better performance comparison for us to check if we can sustain this 49% reduction in run time ;-)

Cheers!

@marcelobern
Copy link
Author

marcelobern commented Oct 24, 2018

@pimterry through a lambci thread we have access to a git 2.13.5 tar file.

Please let me know if you would like me to go ahead and slap git 2.13.5 to the PR or if you prefer to have it stay @ 2.4.3

@pimterry
Copy link
Owner

That all sounds excellent! Where does that performance improvement come from?

Regardless, I would love to see a PR with those changes so I can take a look and get this in. Updating to a newer build of git too sounds great to me 👍.

@marcelobern
Copy link
Author

marcelobern commented Oct 25, 2018

@pimterry great, will do!

The performance improvement comes from a little "sleight of hands". Rather than untar at runtime, that is all done before packaging lambda-git. We than have to solve the problem of size of the package (i.e. get rid of "duplicate" files), which is where I have been focusing my effort.

At runtime we then just re-create the original file structure with links, which are much faster to create that the current untar approach. BTW, I already tested it with the newer git 2.13.5 and it worked perfectly as well! So we should be able to upgrade git in a straight forward way from now on ;-)

I have been tinkering with the runtime code to optimize it and my results last night (using artillery for testing at "scale") show a load time of 900-1300 ms using link creation versus 7300-9700ms using the untar approach. That was for a lambda with 128MB RAM.

As for the PR, I believe it is worth packaging this link based approach on its own then folks with need for other binaries can benefit from it as well. So I am in the process of creating a new npm package and will submit the lambda-git PR using this npm package. This will make the change to lambda-git minimal.

@pimterry
Copy link
Owner

Very neat, that makes sense. Very nice work, I'm looking forward to the PR 👍

@marcelobern
Copy link
Author

@pimterry I am about 95% done with the PR.

As I ran the test, the second test Should install the Git binary to a given directory fails.

I am not familiar with bats so after a few hours of checking code and the test case I decided to run the same test cases using mocha/chai and they all pass.

Please let me know if you have any suggestions on how you would like me to proceed.

In the mean time feel free to check the new modules I will be using:

@pimterry
Copy link
Owner

pimterry commented Nov 6, 2018

Great! The test issue is surprising though. Bats is very simple, and it really shouldn't break unless something has changed. Once you've got everything else working, can you push the code & failing bats tests to a branch somewhere, so I can take a look?

I'm definitely open to converting them to something more substantial like Mocha & Chai, but I'd just like to know why they're failing first, so we can confirm the failure isn't actually a real problem.

@marcelobern
Copy link
Author

marcelobern commented Nov 12, 2018

@pimterry code is ready for the PR (except for the bats testing).

You can check what is going on in Travis CI. Some of the failures are bats related some are compatibility issues.

Please let me know if you want me to start the PR as is or if you have any additional suggestions/asks.

BTW, you can run the performance benchmark yourself if you want to ;-)

Have a great week!

@marcelobern
Copy link
Author

OK, bats testing is now:

  • Working for Node 6-9
  • Failing for Node 4 & 5 in all test cases - I am not overly concerned as these Node versions are no longer supported. Your take?
  • Failing only test case # 3 for Node 10 & 11 - even though I inspected and the variables are in there

Will give it another go tomorrow but if I cannot figure it out I will submit the PR as is (unless I hear something else from you).

@pimterry
Copy link
Owner

All sounds very promising. Yes, if the code is anywhere near ready to review, you should definitely open a PR, and we can discuss details from there.

On the specifics:

  • Very happy to drop support for Node 4 & 5.
  • We do need to work out what's up with Node 10 & 11 though, and fix that (or fix the tests).

@marcelobern
Copy link
Author

marcelobern commented Nov 14, 2018

Submitted PR #13.

I ended up changing default test from bats to mocha before the PR, as mocha tests are passing while bats is failing for Nodejs 10 & 11 (test case 3 only).

And by moving to mocha we now have code coverage information as well ;-)

Please add coveralls to the repo so the coveralls badge in README.md will work.

@marcelobern
Copy link
Author

I have been thinking about the changes and based on the fact that an existing folder is not re-written, we might want to bump the version to 1.0.0.

If that is the case it might be worth considering other behavior changes (e.g. resolve/reject values, moving config.js parameters to index.js) which I avoided including thinking thisa might be a version 0.2.0

Please let me know your thoughts.

@ryanblock
Copy link
Collaborator

Hey @marcelobern, good news! We're gonna un-deprecate this package, and I'm stoked to be helping maintain it moving forward.

Reopened your PR from last year – there's a lot of good in there. I was hoping we could work together on it, if you're into it.

It is a really large, sweeping change and if you're amenable, I'd like to figure out with you the best approach to breaking some of those changes up into smaller bits, and sequencing them.

Seems like there's a few themes going on there:

  • Core changes (impl bin-minify & lambda-bin, implementing minPack)
  • Git updates (2.4.3 to new stuff)
  • Testing (changing up test suite, adding coverage, etc.)
  • Housekeeping (changelog, package.json files, etc.)

Please let me know if you'd like to dig into that!

@marcelobern
Copy link
Author

@ryanblock what is the rationale to un-deprecate the package?

I am ok to break down the changes in smaller chunks but to be honest, with what I know today I would probably do a few things differently.

In the end, I would try to leverage git-lambda-layer as much as possible to avoid re-inventing the wheel.

@ryanblock
Copy link
Collaborator

ryanblock commented Feb 16, 2019

Legit question, @pimterry asked the same thing, why not layers: layers are indeed rad!

At Begin we have a couple of reasons for preferring a package like this for a while to come. First being that we generally we take a more generic approach to code sharing in our open core, Architect.

I do expect layers support to land in Architect soon (there's a PR open), but Lambda also has a low cap on layers (5 per Lambda), which forces some scarcity. We'd rather leave those opportunities open for userland and/or use cases that absolutely require them.

So anything that doesn’t absolutely need to be a layer for us right now we'll try to keep as a normal dep – and that includes this here package. Hope that helps!

As an aside, kudos to @pimterry as even despite git-lambda-layer, this package has continued to grow in usage. Probably a decent sign!

@marcelobern
Copy link
Author

@ryanblock thanks for sharing the rationale.

Based on what I have learned in the past few months, I would go in a different route from the original PR, as I believe the fastest "load" time would be to have the git binaries already uncompressed and symlinked. The binaries could live under bin and at load time just setup the git env variables correctly. While I believe this will work, I have not tested this approach on AWS yet.

So here is how I suggest to approach:

  • Leverage git-lambda-layer to create the git binaries from source and place them under bin folder.
  • Change the index.js file to only create the env variables. This would be a breaking change as the API currently supports changing the installation path, so would need to change major version accordingly.
  • Test that all works in AWS.
  • Update the unit tests, documentation, etc.

As you seem to have a specific pace/approach for the way changes were to be applied, please let me know what your suggestion might be. For example, we could test the uncompressed approach on AWS without bumping the git version. This would still lead to a major version change and changes to index.js.

If you prefer to use the original PR "as is", please let me know what would be the ideal way for me to have the changes broken down so it fits your goals.

@ryanblock
Copy link
Collaborator

Thanks for breaking that down!

How are you thinking about leveraging git-lambda-layer, exactly? For example, git-lambda-layer's layer.zip appears to do relative path symlinking, so in theory we have the ability to extract it wherever we like.

Thus it seems like we can avoid the breaking change (in theory) – that said, I am struggling to see why anyone would absolutely need to define a custom installation path on their Lambda (which by the way, they're not getting in the layer version!), so that sounds like a reasonable breaking change if necessary.

Seems like the first four things you mentioned would be a great start!

@marcelobern
Copy link
Author

Exactly, if you take layer.zip and unzip it to bin you can upload it to lambda by zipping the module (including bin).

By doing so, you do not have to unzip at runtme and further reduce the load time.

Are you looking to have the 4 items in a single PR? I thought you commented that there were too many changes in the PR...

@ryanblock
Copy link
Collaborator

ryanblock commented Feb 20, 2019

Hmm, I feel like perhaps I'm missing something here or misunderstanding something. What do you mean by:

If you take layer.zip and unzip it to bin you can upload it to lambda by zipping the module (including bin).

Aside, whether it's packaged as a layer, or in a node module, something is getting unzipped to somewhere to disk at runtime. In layer-land, the Lambda is handling that for us; here, we're handling it for ourselves.

Part of what I'm saying is that I think we should just perform the unzip of current layer.zip to whatever fs path the user specifies (defaulting to /tmp/git), which means in theory we don't have a breaking change.

@ryanblock
Copy link
Collaborator

Oh and one thing, we can remove the tar-fs / zipping dependency here, let's just shell out to unzip, which we already get for free from the Lambda container.

@ryanblock
Copy link
Collaborator

ryanblock commented May 16, 2019

↑ bummer news, as of AWS Linux 2, we no longer appear to have access to unzip, gzip, tar, and other (de)compression commands. ಠ_ಠ

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants