A simple tool for replaying S3 file creation Lambda invocations. This is useful for backfilling or replaying data on real-time ETL pipelines whose transformations run in Lambdas triggered by S3 file creation events.
Steps:
- Collect inputs from user
- Scan S3 for filenames that need to be replayed
- Batch S3 files into payloads for Lambda invocations
- Spawn workers to handle individual Lambda invocations/retries
- Process the work queue, keeping track of progress in a file in case of interrupts
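The batching step above can be sketched in a few lines. The function name and default batch size here are illustrative, not the tool's actual internals:

```python
def batch_keys(keys, batch_size=10):
    """Split a list of S3 keys into batches of at most batch_size,
    one batch per Lambda invocation payload."""
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]

# 25 keys with a batch size of 10 yield 3 payloads: 10, 10, and 5 keys.
keys = [f"objects/dt=2020-01-02-08-00/part-{n}.json" for n in range(25)]
batches = batch_keys(keys)
```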
- The first step is to set up a Python 3 venv to hold our dependencies:

```
./setup-venv.sh
```
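For reference, a minimal setup-venv.sh might look like the sketch below; the repo's actual script may differ (for example, it may also pip-install boto3 or a requirements file):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of setup-venv.sh -- the real script may differ.
set -euo pipefail

VENV_DIR="${1:-venv}"            # optional custom venv location

python3 -m venv "$VENV_DIR"      # create the virtualenv with the system python3

echo "venv created: activate it with 'source $VENV_DIR/bin/activate'"
```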
- Run the command and follow the prompts:

```
python s3-lambda-replay.py
```
Run the help command for a full list of available command line options:

```
python s3-lambda-replay.py --help
```
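The flags used in the example below could be wired up roughly as follows. Only -b, -p, and -l appear in this README; the long option names and help strings are guesses, and the real script likely accepts more options:

```python
import argparse

def build_parser():
    # Sketch of the tool's known flags; long names are assumptions.
    parser = argparse.ArgumentParser(
        description="Replay S3 file creation events into a Lambda")
    parser.add_argument("-b", "--bucket", required=True,
                        help="S3 bucket to scan for files to replay")
    parser.add_argument("-p", "--prefixes", required=True,
                        help="comma-separated list of S3 key prefixes")
    parser.add_argument("-l", "--lambda-name", required=True,
                        help="name of the Lambda function to invoke")
    return parser

args = build_parser().parse_args(
    ["-b", "my-bucket", "-p", "a/,b/", "-l", "my-function"])
prefixes = args.prefixes.split(",")  # multiple prefixes come in one flag
```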
Note that we must escape the $ in $LATEST so the shell does not expand it. This example is also included in the file run_replay.sh:
```
python3 s3-lambda-replay.py \
  -b gamesight-collection-pipeline-us-west-2-prod \
  -p twitch/all/chatters/\$LATEST/objects/dt=2020-01-02-08-00/,twitch/all/chatters/\$LATEST/objects/dt=2020-01-02-09-00/ \
  -l gstrans-prod-twitch-all-chatters
```
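For each batch, the tool needs to hand the Lambda a payload shaped like a real S3 trigger event so the handler can process it unchanged. A sketch of that construction, using the minimal subset of the S3 event schema most handlers read (the real tool may populate more fields):

```python
import json

def make_payload(bucket, keys):
    """Build an S3-event-shaped JSON payload for one Lambda invocation."""
    return json.dumps({
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket},
                    "object": {"key": key},
                },
            }
            for key in keys
        ]
    })

payload = make_payload(
    "gamesight-collection-pipeline-us-west-2-prod",
    ["twitch/all/chatters/$LATEST/objects/dt=2020-01-02-08-00/part-0.json"])
# With boto3, this payload would then be passed to
# lambda_client.invoke(FunctionName=..., Payload=payload).
```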
Using the command line also allows us to quickly include prefixes that don't fall on / boundaries. For example, the prefix twitch/all/chatters/\$LATEST/objects/dt=2020-01-02-1 matches all records between 10:00 and 20:00 on 2020-01-02, and twitch/all/chatters/\$LATEST/objects/dt=2020-01-02 matches every object from that day.
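This works because S3 prefix filtering is plain string prefix matching, so a partial prefix like dt=2020-01-02-1 matches hours 10 through 19. A small illustration with hypothetical keys:

```python
prefix = "twitch/all/chatters/$LATEST/objects/dt=2020-01-02-1"
keys = [
    "twitch/all/chatters/$LATEST/objects/dt=2020-01-02-09-00/part-0.json",
    "twitch/all/chatters/$LATEST/objects/dt=2020-01-02-10-00/part-0.json",
    "twitch/all/chatters/$LATEST/objects/dt=2020-01-02-19-00/part-0.json",
    "twitch/all/chatters/$LATEST/objects/dt=2020-01-02-20-00/part-0.json",
]
matched = [k for k in keys if k.startswith(prefix)]
# only the 10:00 and 19:00 keys match
```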