Chapter 5 of Scalable Data Streaming with Amazon Kinesis covers Amazon Kinesis Data Firehose (KDF). The JSON configuration files and Python code for Chapter 5 in this repository accompany the chapter's "Use Case example: Bikeshare Station data pipeline with KDF" section. Follow the directions in the chapter to run the example.
- AWS CLI v2: https://aws.amazon.com/cli/
- Python 3.8 or later
- station_addresses.csv - Contains the address data for SmartCity bike stations in CSV format.
- loadDynamoDBStationAddresses.py - Python code to load the address data into an Amazon DynamoDB table.
- TrustPolicyForLambda.json - Contains the trust policy for the role used with the Lambda transform in the KDF delivery stream.
- KDFSmartCityLambdaPolicy.json - Contains the IAM policy for the role used with the Lambda transform in the KDF delivery stream.
- KDFLookupAddressTransform.py - The Lambda function that looks up station address data and adds it to incoming records in KDF.
- CreateLambdaKDFLookupAddressTransform.json - Contains the configuration to create the Lambda function.
- SmartCityGlueTable.json - Contains the configuration to create the AWS Glue table, whose schema KDF uses to convert incoming data to Apache Parquet format.
- TrustPolicyForFirehose.json - Contains the trust policy for the role used with KDF.
- KDFSmartCityDeliveryStreamPolicy.json - Contains the IAM policy for the role used with KDF.
- KDFCreateDeliveryStreamSmartCityBikes.json - Contains the configuration to create the KDF delivery stream.
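
For orientation, the shape of a KDF Lambda transform such as KDFLookupAddressTransform.py can be sketched as below. This is a minimal illustration, not the code shipped in this repository: the `station_id` and `address` field names, the `STATION_ADDRESSES` dictionary (an in-memory stand-in for the DynamoDB table), and the `lookup_address` helper are all assumptions made for the example. Only the event and response layout (`records`, `recordId`, `result`, base64-encoded `data`) follows the Firehose data transformation contract.

```python
import base64
import json

# Hypothetical in-memory stand-in for the DynamoDB station address table.
STATION_ADDRESSES = {"100": "1 Example Street"}


def lookup_address(station_id):
    """Return the address for a station, or None if it is unknown (illustrative helper)."""
    return STATION_ADDRESSES.get(str(station_id))


def lambda_handler(event, context):
    """Enrich each Firehose record with a station address."""
    output = []
    for record in event["records"]:
        try:
            # Firehose delivers each payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
            # "station_id" and "address" are assumed field names.
            payload["address"] = lookup_address(payload.get("station_id"))
            data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except ValueError:
            # Undecodable or malformed records are returned unchanged and flagged as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every incoming `recordId` must appear in the response with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`; the sketch never drops records, so only the first and last values are used here.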