sepulworld/serverless-aws-emr-boilerplate

Serverless AWS EMR Boilerplate

This is a Serverless Framework boilerplate that demonstrates several ways to provision AWS EMR on demand, using the following triggers:

sns_to_emr

The SNS message body carries the input and output data locations for the EMR step to run.

  • The message body must be a comma-separated string of arguments to pass to the EMR step, for example:
"s3://silvermullet-data-bucket/input/,s3://silvermullet-data-bucket/output/"

See launch_emr_via_sns folder
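
As a rough illustration of the message format above, the following sketch splits the SNS message body into step arguments. The function name `parse_sns_step_args` is hypothetical and is not taken from the launch_emr_via_sns folder; the `Records[0].Sns.Message` path is the standard shape of an SNS event delivered to Lambda. It assumes exactly two arguments (input, then output).

```python
def parse_sns_step_args(event):
    """Return [input_path, output_path] from an SNS-triggered Lambda event.

    Hypothetical helper: assumes the message body is a comma-separated
    string with exactly two entries, optionally wrapped in double quotes.
    """
    message = event["Records"][0]["Sns"]["Message"]
    # Drop surrounding whitespace and any literal double quotes.
    cleaned = message.strip().strip('"')
    args = [part.strip() for part in cleaned.split(",") if part.strip()]
    if len(args) != 2:
        raise ValueError("expected 'input,output' but got: %r" % message)
    return args
```

The returned list could then be passed as the argument list of the EMR step.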

api_gateway_to_emr

Triggered by an API Gateway GET request that carries 'input' and 'output' query parameters for the EMR step to work with. https://docs.aws.amazon.com/lambda/latest/dg/eventsources.html#eventsources-api-gateway-request

curl --header 'X-Api-Key: YOUR_API_KEY_SERVERLESS_CREATES' \
'https://serverlessendpoint.aws.com/launch_emr_wordcount?input=s3://silvermullet-data-bucket/input/&output=s3://silvermullet-data-bucket/output/'

See launch_emr_via_api_gateway folder
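
A minimal sketch of how a handler might read the two query parameters, assuming the Lambda proxy integration (where API Gateway places them under `queryStringParameters`). The names `extract_io_params` and `handler` are illustrative and not taken from the launch_emr_via_api_gateway folder; the EMR launch itself is left as a placeholder comment.

```python
import json


def extract_io_params(event):
    """Return (input_path, output_path) from an API Gateway proxy event."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    missing = [name for name in ("input", "output") if not params.get(name)]
    if missing:
        raise ValueError("missing query parameter(s): " + ", ".join(missing))
    return params["input"], params["output"]


def handler(event, context):
    """Hypothetical Lambda entry point that validates the request."""
    try:
        input_path, output_path = extract_io_params(event)
    except ValueError as exc:
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    # Placeholder: the EMR job flow would be launched here.
    body = {"input": input_path, "output": output_path}
    return {"statusCode": 200, "body": json.dumps(body)}
```

Returning a 400 response for a missing parameter keeps a malformed request from starting a cluster.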

EMR configuration notes

The EMR job flow in this example uses EMR instance fleets.
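
For orientation, an instance fleet configuration for boto3's `run_job_flow` has roughly the shape below. This is a sketch only: the helper name `build_instance_fleets`, the instance types, and the capacities are assumptions for illustration and are not the values used in this repository.

```python
def build_instance_fleets(master_type="m5.xlarge",
                          core_types=("m5.xlarge", "m4.xlarge"),
                          core_capacity=2):
    """Build the 'Instances' argument for run_job_flow using instance fleets.

    All defaults are illustrative assumptions, not this project's settings.
    """
    return {
        "InstanceFleets": [
            {
                "Name": "master-fleet",
                "InstanceFleetType": "MASTER",
                "TargetOnDemandCapacity": 1,
                "InstanceTypeConfigs": [{"InstanceType": master_type}],
            },
            {
                "Name": "core-fleet",
                "InstanceFleetType": "CORE",
                "TargetOnDemandCapacity": core_capacity,
                # Several candidate types let EMR choose among them.
                "InstanceTypeConfigs": [
                    {"InstanceType": t, "WeightedCapacity": 1}
                    for t in core_types
                ],
            },
        ],
        # Shut the cluster down once its steps have finished.
        "KeepJobFlowAliveWhenNoSteps": False,
    }
```

The resulting dictionary would be passed as `Instances=build_instance_fleets()` to `boto3.client("emr").run_job_flow(...)`, alongside the cluster name, release label, roles, and steps.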

Requirements

Serverless.com framework

serverless.yml.example contains items that must be updated to match your use case. For example, replace <your_code_bucket> in serverless.yml with your S3 bucket.

This serverless.yml.example is a boilerplate example and is meant to provide a central place for use cases.