
Update documentation on Lambda tests and handler #3

Closed
t-pascal opened this issue Jan 11, 2016 · 13 comments

@t-pascal

I found it difficult to get the code started because I was missing a few helpful pieces that could be addressed in the README file. I would be happy to submit a pull request, but it seems easy enough to describe here.

  1. The correct handler to use in the lambda function is "index.handler". This is the default, but it's not obvious what it should be in the README deployment section.
  2. (kinesis) When configuring the "test" section in the lambda tab, it is useful to mention that the sample event data template should be Kinesis, and that the "eventSourceARN" key should be set to the correct (or test) kinesis ARN.

If you do not, you will see errors like:

```
TypeError: Cannot call method 'split' of undefined
    at Object.exports.getStreamName (/var/task/index.js:163:33)
    at exports.handler (/var/task/index.js:393:28)
```

Lastly, if the tag is set incorrectly on the Kinesis stream, you will see an error similar to:

```
Delivery Stream undefined does not exist in region us-east-1
```
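For context on the first error: the forwarder derives the delivery stream name by splitting the record's `eventSourceARN`, so a test event without that key fails immediately. The repository's actual code is Node.js; this is only a hypothetical Python sketch of the same logic:

```python
def get_stream_name(event_source_arn):
    # An ARN such as arn:aws:kinesis:us-east-1:123456789012:stream/my-stream
    # carries the stream name after the final "/". If the test event omits
    # eventSourceARN, the lookup fails -- the Python analogue of the
    # "Cannot call method 'split' of undefined" TypeError shown above.
    if event_source_arn is None:
        raise TypeError("eventSourceARN is missing from the event record")
    return event_source_arn.split("/")[-1]
```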

@IanMeyers
Contributor

Hello,

Thanks for the feedback. We have a new version of the streams-to-firehose forwarder that will support deaggregation of Kinesis Data written by Kinesis Producer Library, and I will ensure your points are addressed in that.

Regards,

Ian

@IanMeyers IanMeyers self-assigned this Jan 11, 2016
@IanMeyers
Contributor

For the last error - it will be quite unwieldy to validate the Delivery Stream Name rather than letting the PutRecordBatch request do this, so I expect this will stay the same.

Thx,

Ian

@t-pascal
Author

Thanks for your very quick response.

To be clear, I was only suggesting some changes to the documentation in the README file, not any code changes. The code works great as it stands, and I have no problems there. The issue for me was that getting the code running successfully required a few hints, and those hints might help others in a similar situation get up and running more smoothly.

Do you want to close this issue or leave it open?

@t-pascal
Author

I will close this, because I believe it is being properly addressed. I have found that the nodejs code performance is too slow for our use and have migrated to using some python code based on the python kinesis example. Performance is nearly 10x better. Thanks for the useful repository example, however, in getting the project started!

@IanMeyers IanMeyers reopened this Jan 23, 2016
@IanMeyers
Contributor

I'd love to hear about the performance metrics you observed, and whether your Python code is functionally the same or a model more tailored to your specific data.

@t-pascal
Author

Our code is a drop-in replacement for yours. The only "customisation" is that our records are text so we batch separate lines into single records with newline delimiters for Firehose. It would not work for binary data at present. We also did not implement any features around formatting the records. It is about 30 lines of python based on the python kinesis lambda example in the AWS console. We rewrote it in Python because that is a native language for us, and I was unable to figure out where the potential bottleneck was in the nodejs code.
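A minimal sketch of that newline-joining approach, under an assumed record shape (this is not our actual code; the function name and limits shown are illustrative):

```python
import base64

def build_firehose_records(kinesis_records, max_record_bytes=1_000_000):
    """Combine decoded Kinesis text records into newline-delimited
    Firehose records, keeping each combined record below Firehose's
    roughly 1 MB per-record limit."""
    out, current, current_size = [], [], 0
    for record in kinesis_records:
        line = base64.b64decode(record["kinesis"]["data"]).decode("utf-8") + "\n"
        size = len(line.encode("utf-8"))
        if current and current_size + size > max_record_bytes:
            out.append({"Data": "".join(current)})
            current, current_size = [], 0
        current.append(line)
        current_size += size
    if current:
        out.append({"Data": "".join(current)})
    return out

# The resulting records would then be sent in groups (up to 500 records /
# 4 MB per call) via boto3's firehose put_record_batch.
```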

With your nodejs lambda function, we started with a Kinesis input batch size of 5000 but quickly ran into the 4MB firehose output batch size limit. We dropped the Kinesis batch size to 2000 and had several hundred invocation errors per hour (60 second timeouts!). Execution times were averaging 10-20 seconds, and memory utilisation was between 150MB-200MB. More concerning was that the Kinesis iterator age hovered in the 70K-85K second (one day) range. We were never able to catch up with the incoming 20K-40K records per second flow into Kinesis.

Using our python lambda function and a batch size of 2000, we were able to consume a whole day of Kinesis messages in several hours, as well as keep current with incoming flows. The iterator age hovers around 20-30 seconds, which is very acceptable. Execution times are between 1 and 2 seconds using 45MB-50MB memory. Occasionally we get timeouts at 10s, perhaps a few invocation errors per hour at peak. We have seen between 1K-2K invocations per second with 900-2000 records per invocation.

It will take us several weeks to finalise and document and get approvals for public distribution and open source. I will let you know when we do.

@IanMeyers
Contributor

Thanks for your note. I've done some testing and found a small issue with the batch sizing calculation which has now been improved, but other than that the performance looks to be ~1 second Firehose writes for batches of 500 records @ 10k. If you want to revisit testing this, consider increasing the amount of RAM, and feel free to drop me a note on meyersi@amazon.com to further investigate.

@t-pascal
Author

Once again, I appreciate your rapid responses. I am currently experiencing the following error with the kinesis test event:

```json
{
  "Records": [
    {
      "eventID": "shardId-000000000000:49545115243490985018280067714973144582180062593244200961",
      "eventVersion": "1.0",
      "kinesis": {
        "partitionKey": "partitionKey-3",
        "data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0IDEyMy4=",
        "kinesisSchemaVersion": "1.0",
        "sequenceNumber": "49545115243490985018280067714973144582180062593244200961"
      },
      "invokeIdentityArn": "arn:aws:iam::EXAMPLE",
      "eventName": "aws:kinesis:record",
      "eventSourceARN": "arn:aws:kinesis:us-east-1:11111111:stream/my-stream",
      "eventSource": "aws:kinesis",
      "awsRegion": "us-east-1"
    }
  ]
}
```
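As a sanity check, the `data` field in that sample event is plain base64-encoded text and decodes cleanly, so the input itself is not malformed:

```python
import base64

payload = base64.b64decode("SGVsbG8sIHRoaXMgaXMgYSB0ZXN0IDEyMy4=")
print(payload.decode("utf-8"))  # Hello, this is a test 123.
```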

Returns:

```
START RequestId: 6f0ae07d-c5e3-11e5-8be9-13f0adb67535 Version: $LATEST
2016-01-28T17:20:32.023Z 6f0ae07d-c5e3-11e5-8be9-13f0adb67535 TypeError: Argument must be a string
    at Object.exports.getBatchRanges (/var/task/index.js:149:23)
    at Object.exports.processTransformedRecords (/var/task/index.js:231:25)
    at /var/task/index.js:369:15
    at /var/task/node_modules/async/lib/async.js:52:16
    at /var/task/node_modules/async/lib/async.js:363:13
    at /var/task/node_modules/async/lib/async.js:52:16
    at done (/var/task/node_modules/async/lib/async.js:248:21)
    at /var/task/node_modules/async/lib/async.js:44:16
    at /var/task/node_modules/async/lib/async.js:360:17
    at /var/task/index.js:356:10
END RequestId: 6f0ae07d-c5e3-11e5-8be9-13f0adb67535
REPORT RequestId: 6f0ae07d-c5e3-11e5-8be9-13f0adb67535 Duration: 427.46 ms Billed Duration: 500 ms Memory Size: 512 MB Max Memory Used: 36 MB
Process exited before completing request
```

And, just in terms of testing, DynamoDB test events do the same. I am investigating if I'm doing anything wrong.

@t-pascal
Author

Sorry for the formatting problems above. I am using version 1.2.0 with the zip file located in dist in this repository.

@IanMeyers
Contributor

Sorry about that - please resync - there was a bug in the Buffer deserialisation after transformation that my tests missed.

@t-pascal
Author

Amazing: your latest code now performs comparably to the Python rewrite. Plus it seems you have added deaggregation (http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-kpl-consumer-deaggregation.html), which we were going to implement ourselves. I will test that and report back when I have more findings.

For now, we are having a lot of success with unaggregated records (we disabled aggregation). Memory utilization for batches of 2000 records is about 54MB. Execution times are between 200 and 600 ms which is very good. I think we have a winner!

Many, many thanks for your help. Writing the python lambda function was educational but now we do not have to support that codebase.

@IanMeyers
Contributor

That's great - really glad to hear it.

@t-pascal
Author

Further testing with deaggregation shows that the code works (it was not working in 1.1.0). We are now using this Lambda function in the POC. We are taking 1K aggregated records per second from Kinesis and pushing 3K newline-joined records per second into Firehose. Total bytes to S3 is about 3.5 MB/s. Many thanks.
