Zenko Backbeat


OVERVIEW

Backbeat is an engine with a messaging system at its heart. It's part of Zenko, Scality's Open Source Multi-Cloud Data Controller. Learn more about Zenko at Zenko.io.

Backbeat is optimized for queuing metadata updates and dispatching work to long-running background tasks. The core engine can be extended for many use cases through plugins called extensions, listed below.

EXTENSIONS

Asynchronous Replication

This feature replicates objects from one S3 bucket to another S3 bucket in a different geographical region. The extension uses the local Metadata journal as the source of truth and replicates object updates in FIFO order.

DESIGN

Please refer to the Design document (DESIGN.md).

QUICKSTART

This guide assumes the following:

  • You are using macOS
  • brew is installed
  • node is installed (version 6.9.5)
  • npm is installed (version 3.10.10)
  • aws is installed (version 1.11.1)
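If you want to verify these prerequisites before starting, here is a small sketch; the `check_tools` helper is hypothetical, not part of backbeat:

```shell
# Hypothetical helper: report any required tool that is missing from PATH.
check_tools() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "missing: $cmd"
    done
}

# Check the tools this guide relies on.
check_tools brew node npm aws
```

Exact versions can then be confirmed with `node --version`, `npm --version`, and `aws --version`.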

Run kafka and zookeeper

Install kafka and zookeeper

brew install kafka && brew install zookeeper

Make sure you have /usr/local/bin in your PATH env variable (or wherever your homebrew programs are installed):

echo 'export PATH="$PATH:/usr/local/bin"' >> ~/.bash_profile

Start kafka and zookeeper servers

mkdir ~/kafka && \
cd ~/kafka && \
curl http://apache.claz.org/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz | tar xvz && \
sed 's/zookeeper.connect=.*/zookeeper.connect=localhost:2181\/backbeat/' \
kafka_2.11-0.11.0.0/config/server.properties > \
kafka_2.11-0.11.0.0/config/server.properties.backbeat

Start the zookeeper server:

zookeeper-server-start ~/kafka/kafka_2.11-0.11.0.0/config/zookeeper.properties

In a new shell, start the kafka server:

kafka-server-start ~/kafka/kafka_2.11-0.11.0.0/config/server.properties.backbeat

Create a zookeeper node and kafka topic

In a new shell, connect to the zookeeper server, using the /backbeat chroot path:

zkCli -server localhost:2181/backbeat

Create the replication-populator node:

create /replication-populator my_data

You may now exit the zookeeper shell:

quit

Create the backbeat-replication kafka topic:

kafka-topics --create \
--zookeeper localhost:2181/backbeat \
--replication-factor 1 \
--partitions 1 \
--topic backbeat-replication
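To confirm the topic was created, you can list the topics registered under the chroot. A sketch, wrapped in a hypothetical `list_topics` helper (same zookeeper chroot as above):

```shell
# Hypothetical helper: list kafka topics under the /backbeat chroot used above.
list_topics() {
    kafka-topics --list --zookeeper localhost:2181/backbeat
}
```

Running `list_topics` should print backbeat-replication among the results.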

Run Scality Components

Start Vault and Scality S3 servers

Start the Vault server (this requires access to the private Vault repository):

git clone https://github.com/scality/Vault ~/replication/vault && \
cd ~/replication/vault && \
npm i && \
chmod 400 ./tests/utils/keyfile && \
VAULT_DB_BACKEND=MEMORY node vaultd.js

In a new shell, start the Scality S3 server:

git clone https://github.com/scality/s3 ~/replication/s3 && \
cd ~/replication/s3 && \
npm i && \
S3BACKEND=file S3VAULT=scality npm start

Set up replication with backbeat

In a new shell, clone backbeat:

git clone https://github.com/scality/backbeat ~/replication/backbeat && \
cd ~/replication/backbeat && \
npm i

Now, create an account and keys:

VAULTCLIENT=~/replication/backbeat/node_modules/vaultclient/bin/vaultclient && \
$VAULTCLIENT create-account \
--name backbeatuser \
--email dev@null \
--port 8600 >> backbeat_user_credentials && \
$VAULTCLIENT generate-account-access-key \
--name backbeatuser \
--port 8600 >> backbeat_user_credentials && \
cat backbeat_user_credentials

The output will look something like this (it is also stored for reference in the file backbeat_user_credentials):

...
{
    "id": "8CFJQ2Z3R6LR0WTP5VDS",
    "value": "gB53GM7/LpKrm6DktUUarcAOcqHS2tvKI/=CxFxR",
    "createDate": "2017-08-03T00:17:57Z",
    "lastUsedDate": "2017-08-03T00:17:57Z",
    "status": "Active",
    "userId": "038628340774"
}
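Rather than copying the fields by hand, the access key id and secret can be pulled out of the saved file with sed. A sketch using the sample values above; the extraction pattern assumes the JSON layout shown, and `tail -n 1` keeps only the last match in case the file contains earlier "id" fields:

```shell
# Sample credentials file, using the values shown above.
cat > /tmp/backbeat_user_credentials <<'EOF'
{
    "id": "8CFJQ2Z3R6LR0WTP5VDS",
    "value": "gB53GM7/LpKrm6DktUUarcAOcqHS2tvKI/=CxFxR"
}
EOF

# Extract the "id" and "value" fields with sed.
ACCESS_KEY=$(sed -n 's/.*"id": "\([^"]*\)".*/\1/p' /tmp/backbeat_user_credentials | tail -n 1)
SECRET_KEY=$(sed -n 's/.*"value": "\([^"]*\)".*/\1/p' /tmp/backbeat_user_credentials | tail -n 1)
echo "$ACCESS_KEY"
```

The two variables can then be pasted into the `aws configure` prompts below.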

Store the account's credentials using the "id" and "value" fields:

aws configure --profile backbeatuser

The completed prompt should look like:

AWS Access Key ID [None]: 8CFJQ2Z3R6LR0WTP5VDS
AWS Secret Access Key [None]: gB53GM7/LpKrm6DktUUarcAOcqHS2tvKI/=CxFxR
Default region name [None]:
Default output format [None]:

Set up replication on your buckets:

node ~/replication/backbeat/bin/setupReplication.js setup \
--source-bucket source-bucket \
--source-profile backbeatuser \
--target-bucket target-bucket \
--target-profile backbeatuser
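To confirm the replication configuration was applied, the standard get-bucket-replication call can be used. A sketch, wrapped in a hypothetical `show_replication` helper (endpoint and profile as above):

```shell
# Hypothetical helper: print a bucket's replication configuration.
show_replication() {
    aws s3api get-bucket-replication \
        --bucket "$1" \
        --endpoint http://localhost:8000 \
        --profile backbeatuser
}
```

Running `show_replication source-bucket` should print the rules installed by setupReplication.js.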

Run the backbeat queue populator:

npm --prefix ~/replication/backbeat run queue_populator

In a new shell, run the backbeat queue processor:

npm --prefix ~/replication/backbeat run queue_processor

You are now ready to put data on source-bucket and watch it replicate to target-bucket!

Put an object on the source-bucket:

echo 'content to be replicated' > replication_contents && \
aws s3api put-object \
--bucket source-bucket \
--key object-to-replicate \
--body replication_contents \
--endpoint http://localhost:8000 \
--profile backbeatuser

Check the replication status of the object we have just put:

aws s3api head-object \
--bucket source-bucket \
--key object-to-replicate \
--endpoint http://localhost:8000 \
--profile backbeatuser

The object's "ReplicationStatus" should be "PENDING" or, if enough time has passed, "COMPLETED".
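Rather than re-running head-object by hand, the status can be polled until it leaves "PENDING". A sketch, with a hypothetical `wait_for_replication` helper (same endpoint and profile as above):

```shell
# Hypothetical helper: poll an object's ReplicationStatus until it is no
# longer PENDING, then print the final status.
wait_for_replication() {
    bucket="$1"; key="$2"
    while :; do
        status=$(aws s3api head-object \
            --bucket "$bucket" \
            --key "$key" \
            --endpoint http://localhost:8000 \
            --profile backbeatuser \
            --query ReplicationStatus \
            --output text)
        [ "$status" = "PENDING" ] || break
        sleep 2
    done
    echo "$status"
}
```

Running `wait_for_replication source-bucket object-to-replicate` should eventually print COMPLETED.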

Check if the object has been replicated to the target bucket:

aws s3api head-object \
--bucket target-bucket \
--key object-to-replicate \
--endpoint http://localhost:8000 \
--profile backbeatuser

After some time, the object's "ReplicationStatus" should be "REPLICA". 😺

Structure

In our $HOME directory, we now have the following directories:

$HOME
├── kafka
│   └── kafka_2.11-0.11.0.0
└── replication
    ├── backbeat
    ├── s3
    └── vault