
Why are requests so slow? #134

Closed

ztolley opened this issue Sep 21, 2017 · 33 comments

@ztolley commented Sep 21, 2017

I installed the package globally, created a simple template and JS file to say hello (copied from the samples), and started it with sam local start-api.

After the first start, which downloads the Docker images, I can fire a request at the server from a browser or the command line.

It immediately says it is invoking index.handler, then waits about 10 seconds before printing START, END and REPORT.

What's the gap between Invoke and START, and what can be done to bring it down? Every request that follows the first has the same delay.

@sanathkr (Contributor)

Are you using a Java JAR for the code? It unzips the JAR on every invoke, which might be the problem. We need to implement caching of the unzipped folder to make it faster.

@ztolley (Author) commented Sep 22, 2017

No, just a hello-world JS function:

exports.handler = (event, context, callback) => {
  console.log('Handler start')

  callback(null, {
      statusCode: 200,
      body: 'Hello world.'
  })
}

@sanathkr (Contributor)

What OS are you on? And what version of Docker?

From what I observed, most of the startup latency is in spinning up Docker itself. We need a better mechanism to cache and reuse containers so you don't take this penalty.

@ztolley (Author) commented Sep 23, 2017

Ubuntu 17.04 and the current release version of Docker. Does it start a new container for each request?

@PaulMaddox (Contributor)

Hi @ztolley,

We do create a new container for each invocation - we deliberately chose not to re-use containers to keep the implementation simple.

An overhead of 10 seconds is not normal, though. This should take under 1 second on most machines. For example, a full end-to-end request/response for a NodeJS function on my machine (through sam local start-api) takes about 0.9 seconds:

$ time curl localhost:3000
hello
curl localhost:3000  0.00s user 0.00s system 0% cpu 0.970 total

If you change the Timeout property on your Lambda function in your SAM template, does it change how long you're waiting for a response?

@jugglingcats commented Oct 2, 2017

@PaulMaddox I have the same issue as @ztolley. If anything mine is worse...

$ time curl localhost:3000
real    0m20.640s
user    0m0.000s
sys     0m0.004s

Similar delay on each invocation. I notice that during most of this time docker ps shows nothing, so perhaps there is some issue caching the docker container (just a guess)?

I'm also using a simple node lambda function. It runs fine after the delay.

I'm running Ubuntu 16.04.3 LTS in a VirtualBox VM on Windows.

$ sudo docker version
Client:
 Version:      17.09.0-ce
 API version:  1.32
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:42:18 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.09.0-ce
 API version:  1.32 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:40:56 2017
 OS/Arch:      linux/amd64
 Experimental: false

I'm building aws-sam-local from source:

$ sudo /home/alfie/go/bin/aws-sam-local --version
sam version snapshot

I'll try with the release version now...

EDIT: same results with v0.2.2

EDIT: sudo docker run hello-world is very fast

@jugglingcats

I increased the docker daemon debug level and can see the first pull of the image. This only happens on the first invocation.

Oct 02 18:53:33 ubuntu dockerd[988]: time="2017-10-02T18:53:33.935204589+01:00" level=debug msg="Trying to pull lambci/lambda from https://registry-1.docker.io v2"
Oct 02 18:53:35 ubuntu dockerd[988]: time="2017-10-02T18:53:35.837557589+01:00" level=debug msg="Pulling ref from V2 registry: lambci/lambda:nodejs4.3"

On each invocation there is nothing in the docker log until 20s later when I see:

Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.851466966+01:00" level=debug msg="Calling POST /v1.30/containers/create"
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.852087120+01:00" level=debug msg="form data: {\"AttachStderr\":false,\"AttachStdin\":false,\"AttachStdout\":fa...
Oct 02 19:05:01 ubuntu dockerd[988]: \":\"\"},\"Memory\":0,\"MemoryReservation\":0,\"MemorySwap\":0,\"MemorySwappiness\":null,\"NanoCpus\":0,\"NetworkMode\":\"\",\"OomKillDi...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.875103569+01:00" level=debug msg="container mounted via layerStore: /var/lib/docker/overlay2/d805b257ab3a3388a...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.881657771+01:00" level=debug msg="Calling POST /v1.30/containers/314eaea03124b393812ee787375aad3e4cbd32c493d72...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.884345881+01:00" level=debug msg="container mounted via layerStore: /var/lib/docker/overlay2/d805b257ab3a3388a...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.884690306+01:00" level=debug msg="Assigning addresses for endpoint jovial_bardeen's interface on network bridg...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.884840668+01:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, <nil>, map[])"
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.889042145+01:00" level=debug msg="Assigning addresses for endpoint jovial_bardeen's interface on network bridg...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.907086603+01:00" level=debug msg="Programming external connectivity on endpoint jovial_bardeen (e2e147fbfd8dd4...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.909861288+01:00" level=debug msg="EnableService 314eaea03124b393812ee787375aad3e4cbd32c493d7282896a8dfbd32708f...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.909881413+01:00" level=debug msg="EnableService 314eaea03124b393812ee787375aad3e4cbd32c493d7282896a8dfbd32708f...
Oct 02 19:05:01 ubuntu dockerd[988]: time="2017-10-02T19:05:01.989433717+01:00" level=debug msg="sandbox set key processing took 30.706067ms for container 314eaea03124b39381...
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.022468704+01:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"start-containe...
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.027110772+01:00" level=debug msg="libcontainerd: event unhandled: type:\"start-container\" id:\"314eaea03124b3...
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.027819518+01:00" level=debug msg="Calling POST /v1.30/containers/314eaea03124b393812ee787375aad3e4cbd32c493d72...
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.028139111+01:00" level=debug msg="attach: stdout: begin"
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.028454941+01:00" level=debug msg="attach: stderr: begin"
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.134642968+01:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"3...
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.134985982+01:00" level=debug msg="attach: stdout: end"
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.135364281+01:00" level=debug msg="Revoking external connectivity on endpoint jovial_bardeen (e2e147fbfd8dd4d6e...
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.135768914+01:00" level=debug msg="DeleteConntrackEntries purged ipv4:0, ipv6:0"
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.136201835+01:00" level=debug msg="attach: stderr: end"
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.137003682+01:00" level=debug msg="Calling POST /v1.30/containers/314eaea03124b393812ee787375aad3e4cbd32c493d72...
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.164579263+01:00" level=debug msg="Releasing addresses for endpoint jovial_bardeen's interface on network bridg...
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.164619106+01:00" level=debug msg="ReleaseAddress(LocalDefault/172.17.0.0/16, 172.17.0.2)"
Oct 02 19:05:02 ubuntu dockerd[988]: time="2017-10-02T19:05:02.179511426+01:00" level=debug msg="Calling DELETE /v1.30/containers/314eaea03124b393812ee787375aad3e4cbd32c493d...

You can see that this whole sequence takes under 1s, so it seems Docker is not the culprit. Not sure where to look next?

@jugglingcats

I also tried running the hello-world sample using invoke with similar results:

$ sudo time ~/aws-sam-local/sam local invoke --event event.json
2017/10/02 19:18:07 Successfully parsed template.yaml
2017/10/02 19:18:07 Connected to Docker 1.32
2017/10/02 19:18:07 Fetching lambci/lambda:nodejs6.10 image for nodejs6.10 runtime...
nodejs6.10: Pulling from lambci/lambda
Digest: sha256:7eb4ced6a15ae3c30effc4ec0cd3aabb2bd57c9a8330b37920c3d5d722d81083
Status: Image is up to date for lambci/lambda:nodejs6.10
2017/10/02 19:18:09 Invoking index.handler (nodejs6.10)
START RequestId: abadca01-4ed0-18a5-4f7a-5a0cd2af55e1 Version: $LATEST
2017-10-02T18:18:30.101Z        abadca01-4ed0-18a5-4f7a-5a0cd2af55e1    LOG: Name is Bob
END RequestId: abadca01-4ed0-18a5-4f7a-5a0cd2af55e1

"Hello Bob"
REPORT RequestId: abadca01-4ed0-18a5-4f7a-5a0cd2af55e1  Duration: 4.73 ms       Billed Duration: 0 ms   Memory Size: 0 MB       Max Memory Used: 28 MB


0.04user
0.00system
0:23.39elapsed
0%CPU (0avgtext+0avgdata 16268maxresident)k
0inputs+0outputs (0major+1688minor)pagefaults 
0swaps

@m5 (Contributor) commented Oct 5, 2017

@ztolley, @jugglingcats, do you have AWS credentials configured? If not, see if adding them speeds up your execution time.

Before each execution, sam attempts to gather credentials. If no local credentials are found, it falls back to querying the EC2 instance metadata service, which can take a very long time to fail if you are not running on an EC2 instance.
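
For anyone landing here, a minimal sketch of one way to supply local credentials (the key values and region below are placeholders; use real credentials if your function actually calls AWS services):

# Option A: configure a default profile once (written to ~/.aws/credentials)
aws configure

# Option B: export credentials just for this shell session
export AWS_ACCESS_KEY_ID=AKIAEXAMPLEKEY        # placeholder
export AWS_SECRET_ACCESS_KEY=exampleSecretKey  # placeholder
export AWS_DEFAULT_REGION=eu-west-1            # placeholder
sam local start-api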

@jugglingcats

Thanks @m5, that fixed it for me! Worth adding to the docs.

@sanathkr (Contributor) commented Oct 5, 2017

Looks like this issue is resolved by supplying credentials. I will close the ticket.

sanathkr closed this as completed Oct 5, 2017
@dannymcpherson

Thanks for this thread... I had 6-7s total times, and adding the AWS creds environment variables brought the total time down to about 1.3s. I would love to see the response times be even lower, but honestly, I'll take aws sam local in any form. Incredibly helpful.

@aravinds03

I am facing the same issue. Every time the URL is hit, it says "Decompressing ..jar" and waits 5 to 8 seconds. I have also set the ENV variables, but it is still slow.

@jebberjeb

@aravinds03 The way I worked around this was to extract my jar to a temp dir, then point CodeUri at that temp dir.

Decompressing each time I make an API call is also pretty slow for me since my jar file is pretty large.
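
For illustration, a rough sketch of that workaround (the jar path and output folder are made-up names; adjust to your own build):

# unzip the built jar once, so sam local doesn't have to decompress it on every request
mkdir -p build/exploded
unzip -o target/my-function.jar -d build/exploded
# then set CodeUri to build/exploded/ in template.yaml and restart sam local start-api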

@aravinds03

I tried that as well: I extracted the jar and ran with CodeUri pointing to the folder. The requests are still too slow. If I upload the jar to Lambda, requests hitting the AWS endpoint are much faster than my local box.

I realized that the "decompressing jar" step is not what's time consuming. Something else is taking the time on my local box.

@jebberjeb

@aravinds03 That's true. There is a lot more going on. Each request involves spinning up a new container (they are not reused) and then running a Java process. For me, since my Lambda is Clojure, it also involves loading the Clojure runtime. This all adds up to my requests taking several seconds at a minimum. At this point, I am more or less just living with that limitation.

@dannymcpherson commented Dec 4, 2017 via email

@jamescgibson

Just adding to this issue for future searchers: If you configured your credentials with a profile, as of sam v0.2.2, you need to pass --profile <profile_name> to use them.

This allowed me to get the speedups described in this thread.
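
For example (the profile name here is a placeholder):

sam local start-api --profile my-dev-profile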

@aravinds03

@jamescgibson, the requests are still slow even after setting credentials and passing the --profile argument. Requests to the actual API Gateway endpoint are faster than hitting the local server.

@orzechowskid

Maybe it's obvious in hindsight, but sam couldn't find my credentials, even with the --profile argument, until I placed that argument before the function identifier:

$ sam local invoke --profile=my-profile -e test-event.json MyFunction

@pkasson commented Mar 9, 2018

I find that missing AWS credentials is not the cause of the delay I (and others, it seems) are seeing. It's clear that each invocation decompresses the JAR and then cleans up after the invocation.

I saw a post where someone used a folder in the CodeUri property but could not get that working. Is there a way to point to a folder that has already been decompressed/unzipped and keep it "warm", essentially?

Clearly, there needs to be a way to keep the Lambda warm between invocations. I would be willing to assist if given direction on this.

@marczis commented Mar 14, 2018

Setting AWS credentials solved it for me, but I think this is still worth some investigation: why does SAM Local look for credentials at all? Some developers in larger organizations may not have access to the AWS environment.

@zhawei commented Apr 2, 2018

I test Lambda by running SAM Local on CentOS 7 with Vagrant; ~/.aws/config and ~/.aws/credentials were configured, but it always took about 30s. I tried every suggestion from the other issues, but none worked.
Once I set the AWS credentials for root instead of the sudoer account, it worked; the run time decreased by about 20s.
So if you meet this problem, try copying ~/.aws to root's home directory, or execute "aws configure" as root, and make sure root's ~/.aws is configured before running sam local.
It looks like sam (when run with sudo) uses root's home path as the config path instead of the sudoer's home path.
I hope this comment is helpful to others trapped by this problem.
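
Roughly, something like this (a sketch assuming root's home directory is /root and that sam is being run with sudo):

sudo mkdir -p /root/.aws
sudo cp ~/.aws/config ~/.aws/credentials /root/.aws/
# or run "sudo aws configure" so the root account gets its own config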

@rchatsiri

I hit the same problem @zhawei described when running aws-sam-local as a normal user in my local environment. I switched from the normal user to the root user, and responses from the binary running under aws-sam-local came back much faster than with the normal user.

@wishdevs

In my case, skipping the Docker image pull helped: before, 8s; after, 1.5s.

sam local start-api --skip-pull-image

@gregaeq commented Jun 1, 2018

I couldn't find it in the sam-local docs, but following the AWS SAM docs, you can increase the RAM available to your local function too.

Took me a while to find this answer, so I'll leave this here.

In your template.yml

Properties:
    MemorySize: 256

You can view the actual memory usage too using docker stats: invoke your sam-local request and you'll see the container spin up!

@hendore commented Jul 21, 2018

Although I wasn't suffering delays as bad as in other reports, each request was taking over 2s; setting the --skip-pull-image flag as recommended by @wishdevs brought that down to about 1s 👍

@kevanpng commented Oct 11, 2018

I did an experiment with container reuse, just with a Python Lambda, developing on Ubuntu 16.04. In summary, spinning up the Docker container only adds about a second, so the container-reuse feature may not be worth building. Link to my code: https://github.com/kevanpng/aws-sam-local.

For a fixed query, both my colleague and I see about a 4s invocation time on sam local (his is a Windows machine). With the profile flag and container reuse, it goes down to about 2.5s on my Ubuntu machine.

My colleague is running on a Mac, and when he tried the same query with Lambda reuse and the profile flag, it still took 11-14 seconds.

Maybe Docker is just slow on Mac?

@skinofstars

For those finding this issue first, also useful is the --warm-containers EAGER flag.

Ref #239
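
For example (this flag is available in newer versions of the SAM CLI):

sam local start-api --warm-containers EAGER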

@bjorn-ove

I had this exact issue today, but did not find any indication of what caused it. It turns out the fix in #159 only applies to Go, so the Python environment does not warn about it. Also, being able to run without credentials is important to me, as I want to be certain nothing affects what is in the cloud.

The problem is that the credentials cache is basically broken. It tries to get the credentials and then stores them for use in later invocations. However, if no credentials are returned, it keeps retrying on every invocation because it cannot tell the difference between "not initialized" and "not found".

I fixed this in a fork, but I don't know if the solution breaks something else, so I didn't post it as a pull request. In case someone else has the capacity to do so, here is my fix: bjorn-ove@b9e5afe

@ssousa33-OS

(quoting @bjorn-ove's comment above)

@BearOve, has this solution been stable for you since the time of this post?

@bjorn-ove

Yes, no issues related to this since then.

@Gefyaqiilah

I followed @skinofstars's suggestion to use the --warm-containers flag, but with the value LAZY, and the response time has decreased.
