dynamodump: DynamoDB Backups and Restores


Background

This is a fork of dynamodbdump that aims to complete some of its TODOs.

What is it?

This tool performs a backup of a given DynamoDB table and pushes it to a given folder in S3, in a format compatible with AWS Data Pipeline.

It can also restore a backup from S3 to a given table, whether the backup was created by this tool or by AWS Data Pipeline.

Why create this tool?

Using AWS Data Pipeline to back up DynamoDB tables spawns EMR clusters, which can take some time to start. For small tables, that means paying for roughly 20 minutes of EMR runtime for just a few seconds of actual backup work, which makes no sense.

This tool can run from the command line, in a Docker container, or as a Kubernetes cron job, allowing you to leverage your existing infrastructure without additional costs.
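
As a sketch, a containerized backup could look like the following (the image name altostack/dynamodump is an assumption here; substitute whatever image you build from this repository):

# NOTE: altostack/dynamodump is a hypothetical image name
docker run --rm \
  -e AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY \
  altostack/dynamodump backup \
  -t table-name \
  -o eu-west-1 \
  -b bucket-name \
  -f some/folder \
  -d us-east-1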

How to use it?

With the command-line

To build dynamodump, use the make build command, or fetch the dependencies manually (with glide or go get) and then run go build.
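
For example, from a checkout of the repository:

# Option 1: via the Makefile
make build

# Option 2: fetch dependencies, then build manually
glide install   # or: go get ./...
go build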

Then you can use the dynamodump binary you just built to start a backup or restore.

Backup

Usage:

Usage:
  dynamodump backup [flags]

Flags:
  -s, --dynamo-table-batch-size int        Max number of records to read from the Dynamo table at once. Environment variable: DYN_DYNAMO_TABLE_BATCH_SIZE (default 1000)
  -w, --dynamo-table-batch-wait-time int   Number of milliseconds to wait between batches. Environment variable: DYN_WAIT_TIME (default 100)
  -t, --dynamo-table-name string           Name of the Dynamo table to actions. Environment variable: DYN_DYNAMO_TABLE_NAME (required)
  -o, --dynamo-table-region string         AWS region of the Dynamo table. Environment variable: DYN_DYNAMO_TABLE_REGION (required)
  -h, --help                               help for backup
  -f, --s3-bucket-folder-name string       Path inside the S3 bucket where to put actions. Environment variable: DYN_S3_BUCKET_FOLDER_NAME (required)
  -p, --s3-bucket-folder-name-suffix       Adds an autogenerated suffix folder named using the UTC date in the format YYYY-mm-dd-HH24-MI-SS to the provided S3 folder. Environment variable: DYN_S3_BUCKET_NAME_SUFFIX
  -b, --s3-bucket-name string              Name of the S3 bucket where to put the actions. Environment variable: DYN_S3_BUCKET_NAME (required)
  -d, --s3-bucket-region string            AWS region of the s3 Bucket. Environment variable: DYN_S3_BUCKET_REGION (required)

Example:

export AWS_PROFILE=awesome-profile
./dynamodump backup \
  -t table-name \
  -o eu-west-1 \
  -b bucket-name \
  -f some/folder \
  -d us-east-1
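
Each flag can also be set through its environment variable (listed in the flags above), so the same backup can be configured like this:

export AWS_PROFILE=awesome-profile
export DYN_DYNAMO_TABLE_NAME=table-name
export DYN_DYNAMO_TABLE_REGION=eu-west-1
export DYN_S3_BUCKET_NAME=bucket-name
export DYN_S3_BUCKET_FOLDER_NAME=some/folder
export DYN_S3_BUCKET_REGION=us-east-1
./dynamodump backup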

Todo

  • Cross Region Support (DynamoDB Table and S3 Bucket can be in different AWS regions)
  • Cross Account Support (DynamoDB Table and S3 Bucket can be in different AWS accounts)
  • Flag to force a restore even if the _SUCCESS file is absent (with a warning that the data may not be accurate)
  • Ability to define S3 StorageClass of backed up files
  • Ability to backup all DynamoDB Tables (Based on AWS Tags)
  • Ability to discover DynamoDB Tables (Based on AWS Tags)
  • Switch logging to logrus
  • Integrate https://goreleaser.com/
  • Migrate to https://github.com/spf13/cobra & https://github.com/spf13/viper

Contributing to the project

Anybody is more than welcome to open a PR to contribute to the project. Minimal testing and an explanation of the problem being solved will be asked for, but that's just for sanity purposes.

We're friendly people; we won't bite if the code isn't done the way we like! :)

If you want to contribute but aren't sure where to start, we maintain a list of ideas we want to explore in the Todo section; that's a great place to start!