Local streaming consumer implementation of AWS Kinesis with AWS Lambda functions via Docker
Switch branches/tags
Nothing to show
Clone or download
Latest commit e679386 Mar 11, 2018
Permalink
Failed to load latest commit information.
docker Readme and cleanup details Mar 10, 2018
gradle/wrapper Code deployed Mar 10, 2018
src/main/java/com/gosmarten/mockcloud Code deployed Mar 10, 2018
.gitignore Code deployed Mar 10, 2018
README.md readme detail Mar 10, 2018
build.gradle Code deployed Mar 10, 2018
gradlew Code deployed Mar 10, 2018
gradlew.bat Code deployed Mar 10, 2018
settings.gradle Code deployed Mar 10, 2018

README.md

AWS Kinesis-lambda streaming local mock

Intro

AWS offers the cool possibility to consume from Kinesis streams in real time in a serverless fashion via AWS Lambda. However, it can become extremely annoying to have to deploy a Lambda function into AWS and have a Kinesis stream (e.g. queue) up & running just to test code. This repo attempts to solve this exact problem. It contains base code useful to reproduce locally what happens behind the scenes when AWS is plugging AWS lambda functions to consume from a Kinesis stream.

In order to do so, we are using open sourced Docker images:

Last but not least, this project is implementated in Java. However, stay tunned, as we have planned to extend it for Python as well.

Running environment

Start docker env

Though we are big fans of docker compose, we have rather chosen to implement bootstrapping docker containers via bash script for two main reasons:

a) We wanted to give developers the flexibility to choose which Dockers to start. For example, Java consumer implementation requires using a local DynamoDB, whereas Python doesn't; b) We wanted to have the flexibility to automate additional functionality, such as creating Kinesis streams and DynamoDB tables, listing them after boot, creating local AWS CLI profile, etc.

To bootstrap all containers:

cd docker/bootstrapEnv
bash +x runDocker.sh

If you check the runDocker.sh bash script, you will see that it will:

a) start DynamoDB docker b) setup locally a fake AWS profile to interact with local Kinesis c) start Kinesis container d) create a new kinesis stream

Note that we are not persisting any data from these containers, so every time you start any Docker it will be "brand new".

Also relevant to point out is that we are creating the stream with three shards. This in WS Kinesis terminology means the queue partitioning, and also how one would improve read/write capacity of a given stream. However, in reality this is completly mocked, since we are running a single docker container, which "pretends" to have X amount of partitions.

Publish fake data to Kinesis stream

To help you getting started, we provided yet another bash script that pushes mock data (json encoding) to the same Kinesis stream previously created.

To start producing events:

cd docker/common
bash +x generateRecordsKinesis.sh

This will continuously publish events to the Kinesis stream until you decide to stop it (for example by hitting Ctrl+C). Note also that we are randomly publishing to any of the 3 available shards. For future reference, bsides adapting our mock event data towards your requirements, the specification of partition key might also be something you want to enforce depending on how your producers are publishing data into Kinesis.

Start consuming from kinesis stream

At this point, you have everything to test the execution of a Lambda function. We have provided an example of a basic Lambda - com.gosmarten.mockcloud.localrunner.TestLambda - that just prints each event. To actually test it running, you need to run com.gosmarten.mockcloud.localrunner.RawEventsProcessorLambdaRunner class. This class continuously iterates over each stream shard and pulls for data, which it will then pass to our lambda as a collection of Records.

How to test your own Lambda functions

We finally reached the moment where you can test your own Lambda. To do so, simply pass it to a new instance of the class "com.gosmarten.mockcloud.consumer.KinesisConsumerProcessorManager", such as our previous example:

final KinesisConsumerProcessorManager<Void> recordProcessorFactory =
                new KinesisConsumerProcessorManager<>(TestLambda::new);
        recordProcessorFactory
                .setAwsProfile(AWS_PROFILE)
                .runWorker(STREAM_NAME, APP_NAME);

Instead of "TestLambda", specify your own. And ... that's it, happy coding!