This project demonstrates an event-driven image pipeline: when a user uploads an image to an S3 bucket, a Lambda function creates a thumbnail, stores it in a second S3 bucket, extracts metadata from the image, and sends the metadata to an SQS queue. A second Lambda function pulls messages from the queue and writes the metadata to a DynamoDB table. Failed jobs land in a separate dead-letter queue (DLQ) for manual review, and a CloudWatch alarm fires if the DLQ receives any messages.
Event-driven architecture is a design pattern in which events trigger actions: instead of services calling each other directly, they communicate through events.
- S3: object storage service (stores files in the cloud); S3 events can trigger Lambda functions
- Lambda: run code without managing servers
- SQS: managed message queue service
- Dead Letter Queue (DLQ): where failed messages go
- DynamoDB: NoSQL database (key-value and document store)
"Serverless" doesn't mean no servers:
- Servers still exist
- You just don't manage them
- The provider handles provisioning, scaling, and patching
Good for: event-driven workloads, variable traffic, microservices, rapid development.
Bad for: long-running processes (Lambda caps out at 15 minutes), predictable constant load, WebSockets (Fargate is a better fit), large-file processing (use ECS).
Events - things that happened in the past (a user uploaded an image)
Producers - services that generate events (S3 produces the upload event)
Consumers - services that respond to events (Lambda functions)
Queues - middleware that stores and routes events between producers and consumers (SQS)
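Concretely, in this pipeline S3 is the producer and the image-processor Lambda is the consumer. A minimal sketch of how a consumer reads a producer's event (the event shape is the standard S3 notification format; the sample names are illustrative, not the project's actual buckets):

```python
import urllib.parse

def parse_s3_event(event):
    """Extract (bucket, key) pairs from an S3 event notification.

    S3 delivers one or more records per event, and keys arrive
    URL-encoded (spaces become '+'), so decode them before use.
    """
    objects = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects

# Shape of the event S3 (the producer) sends to Lambda (the consumer):
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "upload-bucket"},
                "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
```

The image-processor handler would call something like `parse_s3_event(event)` first, then fetch each object from S3 to thumbnail it.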
- Why SQS is used between lambda functions
- Decoupling: If DynamoDB is temporarily down, messages queue up instead of failing.
- Retry logic: SQS automatically retries failed messages.
- Asynchronous processing: Image processing doesn't wait for DB writes.
- Why two lambda functions?
- Single Responsibility: Each function has one job
- Independent scaling: image processing and DB writes scale separately.
- Easier debugging and cost efficient
- Why DLQ:
- Resilience: Failed messages aren't lost
- Monitoring: Alerts on failures
- Manual review: Investigate problematic images
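The hand-off between the two Lambda functions can be sketched as follows (the queue URL and metadata fields are illustrative assumptions, not the project's actual schema):

```python
import json

def build_metadata_message(key, width, height, size_bytes):
    """image_processor side: serialize metadata for SQS (bodies are strings)."""
    return json.dumps({
        "key": key,
        "width": width,
        "height": height,
        "size_bytes": size_bytes,
    })

def parse_metadata_message(body):
    """db_writer side: turn the SQS body back into a dict for DynamoDB."""
    return json.loads(body)

if __name__ == "__main__":
    # Sending side. If db_writer or DynamoDB is down, the message simply
    # waits in the queue and is retried; after maxReceiveCount failed
    # attempts, SQS moves it to the DLQ instead of dropping it.
    import boto3
    sqs = boto3.client("sqs")
    sqs.send_message(
        QueueUrl="https://sqs.eu-west-2.amazonaws.com/123456789012/image-metadata",  # illustrative
        MessageBody=build_metadata_message("photos/cat.jpg", 800, 600, 123456),
    )
```

Because the body is plain JSON, messages pulled from the DLQ can be inspected by eye in the SQS console during a manual review.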
- Terraform
- AWS CLI
- Python 3.11
- Git
git clone https://github.com/wegoagain-dev/event-driven-image-pipeline.git
cd terraform

You will need to install the dependencies for each Lambda function, because Lambda functions are deployed as zip files that must include their dependencies alongside the code.

cd into each function's directory and run the Docker command below (it ensures the dependencies are installed for Python 3.11 on Linux):
cd image_processor
docker run --rm -v "$(pwd)":/var/task -w /var/task \
--platform linux/amd64 \
--entrypoint pip \
public.ecr.aws/lambda/python:3.11 \
install -r requirements.txt -t .

cd ../db_writer
docker run --rm -v "$(pwd)":/var/task -w /var/task \
--platform linux/amd64 \
--entrypoint pip \
public.ecr.aws/lambda/python:3.11 \
install -r requirements.txt -t .

The project's variables are defined in terraform/variables.tf. You can customise them for your deployment by editing terraform.tfvars.example and renaming it to terraform.tfvars.

Go to the terraform directory:
# Initialize Terraform
terraform init
# Preview changes
terraform plan
# Deploy
terraform apply
# Save outputs (optional)
terraform output > outputs.txt
# When finished, destroy the infrastructure
terraform destroy

(If any of the commands below don't work, run terraform output, copy the actual resource name, and paste it in place of the $(terraform output -raw ...) substitution.)
# Upload a test image
aws s3 cp your-image.jpg s3://$(terraform output -raw upload_bucket_name)/
# List generated thumbnails
aws s3 ls s3://$(terraform output -raw thumbnail_bucket_name)/thumbnails/
# Tail the image processor's logs
aws logs tail /aws/lambda/$(terraform output -raw image_processor_function_name) --follow
# Scan the stored metadata
aws dynamodb scan --table-name $(terraform output -raw dynamodb_table_name)
open $(terraform output -raw dashboard_url) # Mac
xdg-open $(terraform output -raw dashboard_url) # Linux
Common Issues
- Lambda Timeout. Symptom: Lambda times out after 60 seconds. Solutions:
  - Increase the timeout in lambda.tf
  - Optimize image processing
  - Increase memory (more memory also means more CPU)
- Out of Memory. Symptom: Lambda crashes with "Runtime exited with error". Solutions:
  - Increase the memory allocation
  - Process smaller batches
- DynamoDB Throttling. Symptom: ProvisionedThroughputExceededException. Solutions:
  - Switch to on-demand billing
  - Increase provisioned capacity
  - Implement exponential backoff
- SQS Messages Stuck. Symptom: messages remain in the queue. Solutions:
  - Check the Lambda event source mapping
  - Verify IAM permissions
  - Check the visibility timeout
- S3 Trigger Not Working. Symptom: Lambda is not invoked on upload. Solutions:
  - Verify the S3 event notification configuration
  - Check the Lambda permission that allows S3 to invoke it
  - Ensure the file extension matches the filter
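The exponential-backoff fix for DynamoDB throttling can be sketched as below (the retry counts, delays, and helper names are illustrative; boto3 also ships configurable retry modes that can handle this for you):

```python
import random
import time

def backoff_delays(max_retries=5, base=0.1, cap=5.0):
    """Exponentially growing delay ceilings: base, 2*base, 4*base, ... capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]

def with_backoff(operation, is_retryable, max_retries=5):
    """Retry `operation` on throttling errors, sleeping a jittered delay."""
    for attempt, delay in enumerate(backoff_delays(max_retries)):
        try:
            return operation()
        except Exception as exc:
            if not is_retryable(exc) or attempt == max_retries - 1:
                raise
            time.sleep(random.uniform(0, delay))  # full jitter spreads retries out
```

In the db_writer Lambda this would wrap the DynamoDB write, e.g. `with_backoff(lambda: table.put_item(Item=item), lambda e: "ProvisionedThroughputExceededException" in str(e))`.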
- Set up SNS topic for alarm notifications
- Add GitHub integration
- API Gateway for direct uploads
