24 changes: 24 additions & 0 deletions .eslintrc
@@ -0,0 +1,24 @@
{
"extends": "airbnb-base",
"rules": {
"func-names": ["error", "never"],
"indent": ["error", 4],
"semi": ["error", "never"],
"brace-style": ["error", "stroustrup"],
"no-restricted-syntax": [
"error",
"ForInStatement",
"LabeledStatement",
"WithStatement"
],
"comma-dangle": ["error", "never"],
"no-unused-expressions": 0,
"class-methods-use-this": 0,
"import/no-extraneous-dependencies": [2, {}],
"no-param-reassign": 0,
"prefer-template": 0,
"no-console": 0,
"arrow-parens": 0,
"arrow-body-style": 0
}
}
20 changes: 20 additions & 0 deletions .travis.yml
@@ -0,0 +1,20 @@
language: node_js
node_js:
- '6.10'
addons:
postgresql: '9.4'
after_success: npm run coverage
before_deploy:
- npm run deploy
cache:
directories:
- node_modules
deploy:
provider: releases
skip_cleanup: true
api_key:
secure: f44oSsSYiUDdT8SeFZRchsAdQrCePt/IqVdKuR8aVkbODatX7hkAOgt/VoTqfDM/Pb2tsfue+3+xO0IOLJSZxLBzb8wysfxhtooip8BVWn8qI9o5B4N6Z6LbpRW5822anilD77MuQetYsiH4ZCObt0ZFsgphuJEMMqVEYTUffP/4mZhPneQUjCZUsZbFT7WbIf8aXOuGOYr6KubgVu0IK/bqH2Fc7VbtqTAoatLa3iJXrBX3YFOpV2peqBraCpo2CI7hWR3ma47k++1NM/NI40LE52LJ/81A2QQG1WQrKrC3u9l54RAKE3OvmU6Lm1o+Ikm7NadcwaR/dV/zUYeVoz57zJJWS0LXwuYwEju0A1loeJpobfZlWG8kBiEveR206x3oOvuNMIyUvWnhGddY5y7GnNqtMn414cxaxcWGRSEdH0+j+dU+JHxCCENX3cLyQNBgGWFVnggljxcx5UgOBsJdjOmuuEEsHITzpdArVggXXgOmEddFtgzlNUtE+kjMOjCbS0EGn81kH+nUT9M5Rrcx7VBQdPG8wZxeYH6o8vHft0bKGO5L2xoktiokQvmNpvuVBOnuwe3c6u3rgsAhKYufxdTqkSqvcDTw7bOSBUSuyErvBViWKOUwjZPTyXfUnSsZj2T/H+gD221TTGOOJBFIcaHCl73RtoV8H4IlRIk=
file: dist/pgdump-aws-lambda.zip
on:
branch: general-improvements
tags: true
72 changes: 45 additions & 27 deletions README.md
@@ -1,42 +1,60 @@
# pgdump-aws-lambda

[![Build Status](https://travis-ci.org/jameshy/pgdump-aws-lambda.svg?branch=master)](https://travis-ci.org/jameshy/pgdump-aws-lambda)
# Overview

A simple AWS Lambda function that runs pg_dump and streams the output to s3.

Using AWS, you can schedule it to run periodically.


# Instructions

1. Create an AWS lambda function using the [zip](https://github.com/jameshy/pgdump-aws-lambda/releases/download/v0.0.2/pgdump-aws-lambda.zip) as "function package".
2. Add a "CloudWatch Events - Schedule" trigger.
3. In the 'rule', setup your schedule and configure the lamba input to something like:
```json
{
"PGHOST": "database.myserver.com",
"PGUSER": "my-user",
"PGPASSWORD": "my-password",
"PGDATABASE": "my-database-name",
"S3_BUCKET": "my-s3-backup-bucket",
"SUBKEY": "production"
}
```
An AWS Lambda function that runs pg_dump and streams the output to s3.

It can be configured to run periodically using CloudWatch events.

# Quick start

1. Create an AWS lambda function:
- Runtime: Node.js 6.10
- Code entry type: Upload a .ZIP file
([pgdump-aws-lambda.zip](https://github.com/jameshy/pgdump-aws-lambda/releases/download/v0.0.2/pgdump-aws-lambda.zip))
- Configuration -> Advanced Settings
- Timeout = 5 minutes
- Select a VPC and security group (must be suitable for connecting to the target database server)
2. Create a CloudWatch rule:
- Event Source: Fixed rate of 1 hour
- Targets: Lambda Function (the one created in step #1)
- Configure input -> Constant (JSON text) and paste your config, e.g.:
```json
{
"PGDATABASE": "oxandcart",
"PGUSER": "staging",
"PGPASSWORD": "uBXKFecSKu7hyNu4",
"PGHOST": "database.com",
"S3_BUCKET" : "my-db-backups",
"ROOT": "hourly-backups"
}
```

Note: you can test the lambda function using the "Test" button and providing a config like the one above.

**AWS Lambda functions have a maximum execution time of 5 minutes, so your backup must complete in less time than that.**

# File Naming

This function will store your backup with the following s3 key:

s3://${S3_BUCKET}/${SUBKEY}/YYYY-MM-DD/YYYY-MM-DD@HH-mm-ss.backup
s3://${S3_BUCKET}/${ROOT}/YYYY-MM-DD/YYYY-MM-DD@HH-mm-ss.backup
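
For example, with the quick-start config above (`S3_BUCKET` = `my-db-backups`, `ROOT` = `hourly-backups`), a dump taken on 20 May 2017 at 13:01:22 would be stored at roughly:

    s3://my-db-backups/hourly-backups/2017-05-20/2017-05-20@13-01-22.backup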

# PostgreSQL version compatibility

This script uses the `pg_dump` utility from PostgreSQL 9.6.2.

# Loading your own `pg_dump` binary
1. spin up Amazon AMI image on EC2 (since the lambda function will run
1. Spin up an Amazon AMI image on EC2 (since the lambda function will run
on an Amazon AMI image, which is based on CentOS, using it has the
best chance of being compatiable)
2. install postgres as normal (current default version is 9.5, but you can find
packages on the official postgres site for 9.6)
3. run `scp -i YOUR-ID.pem ec2-user@AWS_IP:/usr/bin/pg_dump ./bin/` and `scp -i YOUR-ID.pem ec2-user@AWS_UP:/usr/lib64/libpq.so.5.8 ./bin/libpq.so.5`
best chance of being compatible)
2. Install PostgreSQL using yum. You can install the latest version from the [official repository](https://yum.postgresql.org/repopackages.php#pg96).
3. Add a new directory for your pg_dump binaries: `mkdir bin/postgres-9.5.2`
4. Copy the binaries:
- `scp -i YOUR-ID.pem ec2-user@AWS_IP:/usr/bin/pg_dump ./bin/postgres-9.5.2/pg_dump`
- `scp -i YOUR-ID.pem ec2-user@AWS_IP:/usr/lib64/libpq.so.5.8 ./bin/postgres-9.5.2/libpq.so.5`
5. When calling the handler, pass the environment variable `PGDUMP_PATH=postgres-9.5.2` to use the binaries in the `bin/postgres-9.5.2` directory (see the example below).

NOTE: the `libpq.so.5.8` filename can be found by running `ls -l /usr/lib64/libpq.so.5`
and looking at where the symlink points.
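
For example, the invocation event might then look like this (the keys other than `PGDUMP_PATH` follow the quick start above; values are placeholders):

```json
{
    "PGDATABASE": "my-database-name",
    "PGUSER": "my-user",
    "PGPASSWORD": "my-password",
    "PGHOST": "database.myserver.com",
    "S3_BUCKET": "my-db-backups",
    "ROOT": "hourly-backups",
    "PGDUMP_PATH": "postgres-9.5.2"
}
```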

Binary file removed bin/libpq.so.5
Binary file not shown.
54 changes: 54 additions & 0 deletions bin/makezip.sh
@@ -0,0 +1,54 @@
#!/bin/bash
set -e

SCRIPT=`readlink -f $0`
SCRIPTPATH=`dirname $SCRIPT`
PROJECTROOT=`readlink -f $SCRIPTPATH/..`
FILENAME="pgdump-aws-lambda.zip"

command_exists () {
type "$1" &> /dev/null ;
}

if ! command_exists zip ; then
echo "zip command not found, try: sudo apt-get install zip"
exit 1
fi


cd $PROJECTROOT

echo "creating bundle.."
# create a temp directory for our bundle
BUNDLE_DIR=$(mktemp -d)
# copy entire app into BUNDLE_DIR
cp -r * $BUNDLE_DIR/

# prune things from BUNDLE_DIR
echo "running npm prune.."
cd $BUNDLE_DIR
# prune dev-dependencies from node_modules
npm prune --production >> /dev/null

rm -rf dist coverage test


# create and empty the dist directory
if [ ! -d $PROJECTROOT/dist ]; then
mkdir $PROJECTROOT/dist
fi
rm -rf $PROJECTROOT/dist/*

# create zip of bundle/
echo "creating zip.."
zip -q -r $FILENAME *
echo "zip -q -r $FILENAME *"
mv $FILENAME $PROJECTROOT/dist/$FILENAME

echo "successfully created dist/$FILENAME"

# remove bundle/
rm -rf $BUNDLE_DIR


cd $PROJECTROOT
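
Note: this script appears to be what `npm run deploy` invokes (see `before_deploy` in `.travis.yml`); it writes the bundle to `dist/pgdump-aws-lambda.zip`, the file Travis then attaches to the GitHub release.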
Binary file removed bin/pg_dump
Binary file not shown.
Binary file added bin/postgres-9.6.2/libpq.so.5
Binary file not shown.
Binary file added bin/postgres-9.6.2/pg_dump
Binary file not shown.
131 changes: 2 additions & 129 deletions index.js
@@ -1,130 +1,3 @@
// string formatting library
const format = require('string-format')
const AWS = require('aws-sdk')
const spawn = require('child_process').spawn
const path = require('path')
const moment = require('moment')
const through2 = require('through2')
const handler = require('./lib/handler')

// enabling "Method mode" for String.format
format.extend(String.prototype)

// configure AWS to log to stdout
AWS.config.update({
logger: process.stdout
})

// config
var PG_DUMP_ENV = {
LD_LIBRARY_PATH: './bin'
}

function uploadToS3(env, readStream, key, cb) {
console.log(format('streaming to s3 bucket={}, key={} region={}', env.S3_BUCKET, key, env.S3_REGION))
var s3Obj = new AWS.S3({params: {
Bucket: env.S3_BUCKET,
Key: key,
ACL: 'bucket-owner-full-control'
}})

s3Obj.upload({Body: readStream})
.send(function(err, data) {
if (err) {
console.log(err.stack)
cb(err)
}
else {
console.log('Uploaded the file at', data.Location)
cb(null)
}
})
}

/*
Invokes bin/pg_dump binary with configured environment variables
streaming the output to s3
*/
exports.handler = function(event, context, cb) {
// using variables from the lambda event, prepare environment variables for pg_dump
var env = Object.assign({}, PG_DUMP_ENV, event)

// use the region provided by the event or default to eu-west-1
env.S3_REGION = env.S3_REGION || 'eu-west-1'

if (!env.PGDATABASE || !env.S3_BUCKET) {
return cb('configuration not found in the event data')
}

// determine the filename for our dump file, using the current date
var timestamp = moment().format('DD-MM-YYYY@HH-mm-ss')
var day = moment().format('YYYY-MM-DD')
var filename = format('{}-{}.backup', env.PGDATABASE, timestamp)

// determine the s3 key (includes directory)
var subkey = env.SUBKEY || ''
var key = path.join(subkey, day, filename)

// spawn pg_dump process
var pgDumpProcess = spawn('./bin/pg_dump', ['-Fc'], {
env: env
})

// capture stderr for printing when pg_dump fails
var stderr = ''
pgDumpProcess.stderr.on('data', (data) => {
stderr += data.toString('utf8')
})

// check for errors when pg_dump finishes
pgDumpProcess.on('close', (code) => {
if (code === 1) {
return cb(new Error('pg_dump process failed: {}'.format(stderr)))
}
if (code === 0 && !pgDumpStarted) {
return cb(new Error('pg_dump didnt send us a recognizable dump (output did not start with PGDMP)'))
}
})

var pgDumpStarted

// check the first few bytes to check we have a valid stream
// then pipe the rest directly to s3
var buffer = through2(function (chunk, enc, callback) {
this.push(chunk)
// if stdout begins with 'PGDMP', we know pg_dump is going strong, so continue with dumping
// we assume that the first chunk is large enough to contain PGDMP under all circumstances
if (!pgDumpStarted && chunk.toString('utf8').startsWith('PGDMP')) {
pgDumpStarted = true
uploadToS3(env, buffer, key, function(err, result) {
if (!err) {
var msg = format('successfully dumped {} to {}', env.PGDATABASE, key)
console.log(msg)
return cb(null, msg)
}
else {
return cb(err, result)
}
})
}
return callback()
})

// pipe pg_dump to buffer
pgDumpProcess.stdout.pipe(buffer)
}

// for testing locally
// PG_DUMP_ENV = {
// PGDATABASE: 'my-database',
// PGUSER: 'postgres',
// PGPASSWORD: 'dev',
// PGHOST: 'localhost',
// LD_LIBRARY_PATH: './lib',
// SUBKEY: 'testing/',
// S3_BUCKET: 'my-database-backups'
// }
// exports.handler(null, null, (err, response) => {
// if (err) {
// console.error(err)
// }
// })
module.exports.handler = handler
48 changes: 48 additions & 0 deletions lib/handler.js
@@ -0,0 +1,48 @@
const utils = require('./utils')
const uploadS3 = require('./upload-s3')
const pgdump = require('./pgdump')

const DEFAULT_CONFIG = {
S3_REGION: 'eu-west-1'
}

module.exports = function (event, context, cb) {
const config = Object.assign({}, DEFAULT_CONFIG, event)

if (!config.PGDATABASE) {
return cb('PGDATABASE not provided in the event data')
}
if (!config.S3_BUCKET) {
return cb('S3_BUCKET not provided in the event data')
}

// determine the path for the database dump
const key = utils.generateBackupPath(
config.PGDATABASE,
config.ROOT
)

// spawn pg_dump process
const pgdumpProcess = pgdump(config)

return pgdumpProcess
.then(readableStream => {
// stream to s3 uploader
return uploadS3(readableStream, config, key)
.then(() => {
cb(null)
})
})
.catch(e => {
console.error(e)
return cb(e)
})
}
// const event = {
// PGDATABASE: 'postgres',
// PGUSER: 'postgres',
// S3_BUCKET: 'oxandcart-db-backups',
// ROOT: 'test'
// }
// module.exports(event, {}, result => {
// console.log('handler finished', result)
// })
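
The helper modules `lib/utils.js`, `lib/upload-s3.js` and `lib/pgdump.js` are not included in this diff. Based on the key-building logic removed from `index.js`, `utils.generateBackupPath` presumably does something like the following sketch (illustrative only, not the actual implementation):

```js
// Illustrative sketch only -- lib/utils.js is not part of this diff.
const path = require('path')
const moment = require('moment')

module.exports.generateBackupPath = function (databaseName, rootKey) {
    const day = moment().format('YYYY-MM-DD')
    const timestamp = moment().format('DD-MM-YYYY@HH-mm-ss')
    const filename = databaseName + '-' + timestamp + '.backup'
    // e.g. hourly-backups/2017-05-20/my-database-20-05-2017@13-01-22.backup
    return path.join(rootKey || '', day, filename)
}
```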