AWS Lambda map-reduce stack to create COG and mosaic-json

developmentseed/cogeo-watchbot

cogeo-watchbot

Convert files to COGs and create mosaics at scale using AWS Lambda

What is this

This repo hosts the code for a serverless architecture that creates Cloud Optimized GeoTIFFs (COGs) and Mosaic-JSON at scale using a map-reduce-like model.

Map-Reduce events

  1. Start job (distribute tasks)
  2. Run processing tasks in parallel (e.g. COG creation)
  3. Run a summary task (e.g. Create mosaic-json)
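
The three events above can be sketched locally in a few lines of Python. This is only an illustration of the control flow, not the code deployed by this stack: a thread pool stands in for the parallel Lambda invocations, and the names `start_job`, `process_task`, `summarize`, `run_job` and the `_cog.tif` output naming are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def start_job(sources):
    """Step 1: turn a list of inputs into independent tasks."""
    return [{"task_id": i, "source": src} for i, src in enumerate(sources)]


def process_task(task):
    """Step 2: placeholder for per-file work such as COG creation."""
    # Hypothetical output naming; no file is actually converted here.
    return task["source"].rsplit(".", 1)[0] + "_cog.tif"


def summarize(outputs):
    """Step 3: placeholder for the summary step (e.g. building a mosaic)."""
    return {"count": len(outputs), "files": sorted(outputs)}


def run_job(sources, workers=4):
    """Run the three steps in order; the pool plays the role of the map phase."""
    tasks = start_job(sources)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(process_task, tasks))
    # The summary only runs once every task has returned.
    return summarize(outputs)
```

Calling `run_job(["a.tif", "b.tif"])` runs both placeholder tasks in the pool and then builds one summary from their outputs.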

Note: This work was inspired by the awesome ecs-watchbot.

Architecture

Serverless?

Not really. To run a map-reduce-like model we need a fast and reliable database to store the job status. We use AWS ElastiCache Redis for this part, so the stack is not fully serverless.
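
To see why a shared, fast store matters, here is a minimal sketch of one way job status could be tracked: a per-job counter of remaining tasks, where the worker that brings it to zero knows it can trigger the summary task. This is an assumption for illustration, since the data layout this stack actually keeps in Redis is not described here. The class `JobStatusStore` is hypothetical and uses an in-memory dict with a lock in place of Redis, where an atomic command such as `DECR` could play the same role.

```python
import threading


class JobStatusStore:
    """In-memory stand-in for a shared job-status database (hypothetical)."""

    def __init__(self):
        self._remaining = {}
        self._lock = threading.Lock()

    def start(self, job_id, n_tasks):
        """Record how many tasks the job was split into."""
        with self._lock:
            self._remaining[job_id] = n_tasks

    def task_done(self, job_id):
        """Decrement the counter; return True only for the last task."""
        with self._lock:
            self._remaining[job_id] -= 1
            return self._remaining[job_id] == 0
```

With three tasks, `task_done` returns `False` twice and `True` on the third call, which is the signal to start the summary step.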

Deploy

Requirements

  • serverless
  • docker
  • AWS account
  1. Install and configure serverless
# Install and configure serverless (https://serverless.com/framework/docs/providers/aws/guide/credentials/)
$ npm install serverless -g
  2. Create VPC and Redis Database
$ cd services/redis
$ sls deploy --region us-east-1
  3. Create Lambda package
$ make build
  4. Create Bucket (optional)

We need to create a bucket to store the COGs and mosaic-json. The bucket must be created before the lambda deploy.

$ aws s3api create-bucket --bucket my-bucket --region us-east-1
  5. Deploy the Watchbot Serverless stack
$ sls deploy --stage production --bucket my-bucket --region us-east-1

How To

Example

  1. Get a list of files you want to convert
$ aws s3 ls s3://spacenet-dataset/spacenet/SN5_roads/test_public/AOI_7_Moscow/PS-RGB/ --recursive | awk '{print "https://spacenet-dataset.s3.amazonaws.com/"$NF}' > list_moscow.txt

Note: we use the https://spacenet-dataset.s3.amazonaws.com prefix so that we don't have to add an IAM role for this bucket.

  2. Use scripts/create_job.py
$ pip install rio-cogeo
$ cd scripts/
$ cat ../list_moscow.txt | python -m create_job - \
   -p webp \
   --co blockxsize=256 \
   --co blockysize=256 \
   --op overview_level=6 \
   --op overview_resampling=bilinear > test.json
  3. Validate JSON (optional)
$ jsonschema -i test.json schema.json
  4. Upload to S3 and start processing
$ aws s3 cp test.json s3://my-bucket/jobs/spacenet_moscow.json
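
The `awk` one-liner used to build the file list can also be written in Python, which may be easier to adapt when the listing needs filtering. This is a sketch only: the function name `to_https_urls` is hypothetical, and like `$NF` in `awk` it takes the last whitespace-separated field of each line, so it would mis-handle object keys that contain spaces.

```python
def to_https_urls(ls_lines, bucket):
    """Turn `aws s3 ls --recursive` output lines into HTTPS object URLs."""
    prefix = f"https://{bucket}.s3.amazonaws.com/"
    urls = []
    for line in ls_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Like awk's $NF: the key is the last whitespace-separated field.
        urls.append(prefix + fields[-1])
    return urls
```

Passing the lines of the `aws s3 ls` output together with the bucket name `spacenet-dataset` yields the same URLs that the shell command writes to list_moscow.txt.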
