AWS Kinesis-Lambda streaming local mock
AWS offers the cool possibility of consuming from Kinesis streams in real time, in a serverless fashion, via AWS Lambda. However, having to deploy a Lambda function into AWS and keep a Kinesis stream (essentially a queue) up and running just to test code quickly becomes annoying. This repo aims to solve exactly that problem: it contains base code to reproduce locally what happens behind the scenes when AWS plugs a Lambda function into a Kinesis stream as its consumer.
In order to do so, we are using open sourced Docker images:
- vsouza/kinesis-local for Kinesis Streams emulation
- dwmkerr/dynamodb for DynamoDB emulation, required by the Kinesis Client Library (KCL)
Last but not least, this project is implemented in Java. However, stay tuned, as we plan to extend it to Python as well.
Start the Docker environment
Though we are big fans of Docker Compose, we chose instead to bootstrap the Docker containers via a bash script, for two main reasons:
a) We wanted to give developers the flexibility to choose which containers to start. For example, the Java consumer implementation requires a local DynamoDB, whereas the Python one does not.
b) We wanted the flexibility to automate additional functionality, such as creating Kinesis streams and DynamoDB tables, listing them after boot, and creating a local AWS CLI profile.
To bootstrap all containers:
```shell
cd docker/bootstrapEnv
bash +x runDocker.sh
```
If you check the runDocker.sh bash script, you will see that it:
a) starts the DynamoDB container
b) sets up a fake local AWS profile to interact with the local Kinesis
c) starts the Kinesis container
d) creates a new Kinesis stream
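For reference, setting up the fake profile usually amounts to writing dummy entries into the local AWS config files. The profile name and values below are illustrative, not the script's exact output (the local containers never validate credentials):

```ini
# ~/.aws/credentials -- dummy keys, never validated by the local containers
[mock-kinesis]
aws_access_key_id = FAKE_ACCESS_KEY
aws_secret_access_key = FAKE_SECRET_KEY

# ~/.aws/config
[profile mock-kinesis]
region = us-east-1
output = json
```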
Note that we are not persisting any data from these containers, so every time you start a container it will be "brand new".
Also worth pointing out: we create the stream with three shards. In AWS Kinesis terminology, shards are how a stream is partitioned, and adding shards is how you scale the read/write capacity of a given stream. In this setup, however, the sharding is completely mocked: we are running a single Docker container, which merely "pretends" to have that many partitions.
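For intuition on how shards relate to partition keys: real Kinesis routes a record to a shard by taking the MD5 hash of its partition key and matching it against each shard's slice of the 128-bit hash-key space. The sketch below reproduces that mapping in plain Java; the class and method names are ours, not part of this repo:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ShardMapper {
    // Kinesis hashes partition keys into a 128-bit key space split across shards.
    private static final BigInteger HASH_SPACE = BigInteger.ONE.shiftLeft(128);

    static int shardFor(String partitionKey, int shardCount) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            BigInteger hash = new BigInteger(1, digest); // unsigned 128-bit value
            // Each shard owns an equal slice of the hash space.
            return hash.multiply(BigInteger.valueOf(shardCount))
                    .divide(HASH_SPACE).intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }
}
```

With three shards, each slice covers a third of the hash space, which is why random partition keys spread load roughly evenly.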
Publish fake data to the Kinesis stream
To help you get started, we provide yet another bash script that pushes mock data (JSON-encoded) to the Kinesis stream created previously.
To start producing events:
```shell
cd docker/common
bash +x generateRecordsKinesis.sh
```
This will continuously publish events to the Kinesis stream until you decide to stop it (for example by hitting Ctrl+C). Note also that we publish randomly to any of the three available shards. For future reference, besides adapting the mock event data to your requirements, you may also want to control the partition key, depending on how your producers publish data into Kinesis.
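If you would rather generate events from Java than from bash, a minimal generator along these lines produces comparable payloads. The event fields and the "pk-N" partition-key scheme are illustrative assumptions, not what generateRecordsKinesis.sh actually emits:

```java
import java.util.Random;
import java.util.UUID;

class MockEventGenerator {
    private static final Random RANDOM = new Random();

    // Hypothetical event shape -- adapt the fields to your producers' payloads.
    static String nextEvent() {
        return String.format("{\"id\":\"%s\",\"timestamp\":%d,\"value\":%d}",
                UUID.randomUUID(), System.currentTimeMillis(), RANDOM.nextInt(100));
    }

    // A random partition key spreads records across the available shards.
    static String nextPartitionKey() {
        return "pk-" + RANDOM.nextInt(3);
    }
}
```

Pair each nextEvent() payload with nextPartitionKey() in a PutRecord call against the local Kinesis endpoint.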
Start consuming from the Kinesis stream
At this point, you have everything you need to test the execution of a Lambda function. We have provided an example of a basic Lambda - com.gosmarten.mockcloud.localrunner.TestLambda - that just prints each event. To actually test it, run the com.gosmarten.mockcloud.localrunner.RawEventsProcessorLambdaRunner class. This class continuously iterates over each stream shard and polls for data, which it then passes to the Lambda as a collection of Records.
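In outline, the contract the runner exercises looks like the sketch below. The interface and class names here are hypothetical stand-ins for the real com.gosmarten.mockcloud classes:

```java
import java.util.List;

// Hypothetical stand-in for the contract the runner exercises: it hands each
// batch of records pulled from a shard to the lambda under test.
interface MockLambda {
    void handle(List<String> records);
}

// Analogue of TestLambda: simply prints every event it receives.
class PrintingLambda implements MockLambda {
    @Override
    public void handle(List<String> records) {
        records.forEach(System.out::println);
    }
}
```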
How to test your own Lambda functions
We have finally reached the moment where you can test your own Lambda. To do so, simply pass it to a new instance of com.gosmarten.mockcloud.consumer.KinesisConsumerProcessorManager, as in our previous example:
```java
final KinesisConsumerProcessorManager<Void> recordProcessorFactory =
        new KinesisConsumerProcessorManager<>(TestLambda::new);
recordProcessorFactory.setAwsProfile(AWS_PROFILE).runWorker(STREAM_NAME, APP_NAME);
```
Instead of TestLambda, specify your own. And that's it: happy coding!