Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aws lambda runtime support #101

Closed
wants to merge 18 commits into from

Conversation

NickSeagull
Copy link

@NickSeagull NickSeagull commented Jun 20, 2019

This PR is a continuation of #82 , I decided to start from scratch because master had advanced a lot.

These changes allow writing handlers with the v2 version of the runtime.

This version packages the runtime together with the user's project, so it is much faster than before:

  • Cold starts reduced from ~120ms to ~60ms
  • Execution time reduced from ~12ms to ~0.9ms

cloudwatch times

Note: These benchmarks were done with a Hello World lambda function


Regarding integration tests, I have modified the integration tests so they test using the runtime. Everything works except local invocation, as I'm having a weird error going on:

{
  "errorType": "Runtime.ExitError",
  "errorMessage": "RequestId: 52fdfc07-2182-154f-163f-5f0f9a621d72 Error: fork/exec /var/task/bootstrap: exec format error"
}

Copy link

@axman6 axman6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks too simple to be true... :) I don't know the internals well enough to be able to comment much further than the suggestion I've put but thought I would set the ball rolling. @koterpillar thoughts on the PR?

integration-test/run.sh Show resolved Hide resolved
Copy link
Contributor

@koterpillar koterpillar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how simple this is. I hope I'll get time to test some of our existing projects with this soon (or maybe @axman6?).

I note that the README, documentation and changelog are unchanged, we'll need to work on that - I think I have a good understanding of how the new plugin works, let me know if you need help with this.

events:
- http:
path: hello/{name}
method: get

subdir:
handler: subdir/subtest.main
handler: src/SubDir/SubModule.handler
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original version supported multiple Stack projects. If this is removed, there's no reason to include this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually yes, because this is what the runtime dispatcher uses for knowing which handler to use.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't for the life of me get this to work, I keep getting errors about not being able to find an executable named handler.

@@ -17,6 +17,7 @@ dependencies:
- amazonka-core
- amazonka-kinesis
- amazonka-s3
- aws-lambda-haskell-runtime >= 2.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting that this has to be in stable Stackage before this PR can be merged.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, working on it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As soon as this gets accepted, we will have it on LTS-13.25 commercialhaskell/lts-haskell#236

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -299,15 +297,15 @@ class ServerlessPlugin {

const res = this.runStack(directory, buildCommand);

// Copy the executable to the destination directory
// Copy the executable and runtime to the destination directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the runtime copied - is this comment left around from some removed changes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks! 03b49e1

integration-test/run.sh Outdated Show resolved Hide resolved
@koterpillar
Copy link
Contributor

I am having problems running the integration tests on my machine. sls invoke local test fails with:

> /var/task/bootstrap: error while loading shared libraries: libpcre.so.1: cannot open shared object file: No such file or directory

Does the invoke local expect binaries compiled for Lambda or local machine native ones?

@NickSeagull
Copy link
Author

I just pushed a fix to 1cf14c3 , the proper executable wasn't being copied. Try the invoke local now :)

@@ -303,11 +301,12 @@ class ServerlessPlugin {
const stackInstallRoot = this.runStackOutput(
directory,
[
'--docker',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already a mechanism for disabling Docker (search the file for localRun and docker.skip), please use that instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, it always skips it, that's why I had to manually add it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you talking about the integration tests? They also test "disable Docker completely" mode, not sure how that plays with the new invoke local.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new invoke local relies on always running Docker, so it won't work with that.

Still, I'm talking about running sls invoke local as it is. With the --docker flag, or without it, the result is the same one, the wrong executable gets copied (the one that is not built with Docker).

@NickSeagull
Copy link
Author

I'm currently trying to fix the tests, but there are two main issues:

  • aws-lambda-haskell-runtime is now on LTS-13.27, but not in any resolver before that. This means that the tests with LTS-8/9/10/11/12 are failing now, unless we add the runtime to the extra-deps
  • On the other hand, there is no fpco/stack-build:lts-13.27 image, only 13.25, so it the runtime is not available yet on the docker for the tests.

I wanted to solve this with the extra-deps field, but only for the tests, as the users will use 13.27+ . Is that okay for you?

@koterpillar
Copy link
Contributor

That shouldn't be a problem, the image will be added soon enough. You can override the resolver for now, and remove 12- tests.

@NickSeagull
Copy link
Author

I'm trying to fix Travis in order to make this PR mergeable, but some jobs are failing with this message:

W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: https://packagecloud.io/github/git-lfs/ubuntu trusty InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 6B05F25D762E3157
W: Failed to fetch https://packagecloud.io/github/git-lfs/ubuntu/dists/trusty/InRelease  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 6B05F25D762E3157
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/trusty-updates/main/binary-amd64/Packages  Writing more data than expected (1177583 > 1177297)
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/trusty-updates/main/binary-i386/Packages.bz2  Hash Sum mismatch
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/trusty-updates/universe/binary-amd64/Packages.bz2  Hash Sum mismatch
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/trusty-updates/universe/binary-i386/Packages  Writing more data than expected (504209 > 504177)
W: Some index files failed to download. They have been ignored, or old ones used instead.

Any idea what can I do in order to fix this?

@koterpillar
Copy link
Contributor

I've restarted everything. Sorry, I don't think there's a way to do anything on Travis if you're not a member of the repository.

@koterpillar
Copy link
Contributor

Although if all the jobs have failed, you can do a trivial amend and force push to start them all again.

Just to see if it fixes Travis' tests
@koterpillar
Copy link
Contributor

Looks like whatever process Stack had for the Docker image has failed: commercialhaskell/stack#4945

I'm wondering about the quality of --docker support, if it's not there we might need to implement the required juggling ourselves.

@NickSeagull
Copy link
Author

@koterpillar from my experience, --docker doesn't have always the latest image tagged by it's resolver, and rather its tagged as latest which is an issue, because stack expects the resolver...

@axman6
Copy link

axman6 commented Jul 23, 2019

FYI, it looks like the docker images for lts-13.26-29 are now available.

@axman6
Copy link

axman6 commented Jul 23, 2019

I'm trying to test this out and failing when going through the README - when running sls invoke local -f <mypackage>-func, I get the following:

sls invoke local -f hsQRBot-func
Serverless: Building handler hsQRBot-func with Stack...
hsQRBot-exe: Error initializing, variable not set: AWS_LAMBDA_RUNTIME_API
CallStack (from HasCallStack):
  error, called at src/Aws/Lambda/Runtime.hs:79:3 in aws-lambda-haskell-runtime-2.0.1-DcFSjyCpB4GH8Qlcjgkmv7:Aws.Lambda.Runtime

Any idea what's going on?

@NickSeagull
Copy link
Author

This is weird, because the runtime process expects that variable to be set, which is where the runtime then gets the events and posts the results, it is weird, because when I was working on the PR it was working 🤔

@axman6
Copy link

axman6 commented Jul 23, 2019

Edit: Looks like I probably am not using the updated serverless-haskell npm module.

I decided to bypass the local invocation and see if I could run things on AWS, and now I'm running into even more fun problems. I can upload the handler (use the example project as my template) with

functions:
  hello:
    handler: serverless-haskell-example.bootstrap

but once I try to use sls invoke -f hello I get

"Handler serverless-haskell-example.bootstrap does not exist on project"
 
  Error --------------------------------------------------
 
  Invoked function failed
 
     For debugging logs, run again after setting the "SLS_DEBUG=*" environment variable.
 
  Get Support --------------------------------------------
     Docs:          docs.serverless.com
     Bugs:          github.com/serverless/serverless/issues
     Issues:        forum.serverless.com
 
  Your Environment Information ---------------------------
     Operating System:          darwin
     Node Version:              12.6.0
     Serverless Version:        1.48.2
     Enterprise Plugin Version: 1.3.2
     Platform SDK Version:      2.0.4
 sls deploy list functions
Serverless: Listing functions and their last 5 versions:
Serverless: -------------
Serverless: hello: $LATEST, 1

@axman6
Copy link

axman6 commented Jul 23, 2019

Looks like we're definitely going to need some updates to the README with this PR.

@mputz86
Copy link

mputz86 commented Mar 24, 2020

Hi all, is the state of this PR the failing integration tests on travis or are there other reasons?
For example: Realized that, for me, there is an issue with a non-docker run in the integration tests since serverless-offline always chooses to use docker as runtime if it faces an unknown runtime. And fixing this in this repo to always use docker leads to different output (i.e. missing any logs :/, i.e. putStrLn). So, i.e., I dont have the missing-lib error from travis.

Anyway, this is just a side note, the real question is: What is the state here and what has to be done to get here some progress :D ?

@axman6
Copy link

axman6 commented Mar 24, 2020

My understanding is that this current PR is a significant change to the way that the project is structured and operates. I don't think any of us at SEEK have had the time to sit down and work through the implications of those changes and make sure everything's working properly. We're using this in production for several services, many of which would probably benefit from the reduced latency and resource requirements, but just don't have time to make the changes - maybe at our next hackathon we'll get together and sort it out.

@koterpillar koterpillar mentioned this pull request Apr 21, 2020
6 tasks
@koterpillar
Copy link
Contributor

Closed by #130, turns out we didn't have to change the interface this much.

@koterpillar koterpillar closed this May 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants