Zenko Backbeat

backbeat logo

OVERVIEW

Backbeat is an engine with a messaging system at its heart. It is part of Zenko, Scality's Open Source Multi-Cloud Data Controller. Learn more about Zenko at Zenko.io.

Backbeat is optimized for queuing metadata updates and dispatching work to long-running background tasks. The core engine can be extended for many use cases through modules called extensions, listed below.

EXTENSIONS

Asynchronous Replication

This feature replicates objects from one S3 bucket to another S3 bucket in a different geographical region. The extension uses the local Metadata journal as the source of truth and replicates object updates in FIFO order.

DESIGN

Please refer to the design document, DESIGN.md.

QUICKSTART

This guide assumes the following:

  • You are using macOS
  • brew is installed (https://brew.sh)
  • node is installed (version 6.9.5)
  • npm is installed (version 3.10.10)
  • aws is installed (version 1.11.1)

Run kafka and zookeeper

Install kafka and zookeeper

brew install kafka && brew install zookeeper

Make sure /usr/local/bin (or wherever your Homebrew binaries are installed) is in your PATH environment variable:

echo 'export PATH="$PATH:/usr/local/bin"' >> ~/.bash_profile

Start kafka and zookeeper servers

mkdir ~/kafka && \
cd ~/kafka && \
curl http://apache.claz.org/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz | tar xvz && \
sed 's/zookeeper.connect=.*/zookeeper.connect=localhost:2181\/backbeat/' \
kafka_2.11-0.11.0.0/config/server.properties > \
kafka_2.11-0.11.0.0/config/server.properties.backbeat
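To confirm the sed rewrite took effect, you can grep the generated properties file (a quick sanity check, not part of the original guide; the path assumes the download location used above):

```shell
# The output should be exactly:
# zookeeper.connect=localhost:2181/backbeat
grep '^zookeeper.connect=' \
~/kafka/kafka_2.11-0.11.0.0/config/server.properties.backbeat
```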

Start the zookeeper server:

zookeeper-server-start ~/kafka/kafka_2.11-0.11.0.0/config/zookeeper.properties

In a new shell, start the kafka server:

kafka-server-start ~/kafka/kafka_2.11-0.11.0.0/config/server.properties.backbeat

Create a zookeeper node and kafka topic

In a new shell, connect to the zookeeper server using the /backbeat chroot path:

zkCli -server localhost:2181/backbeat

Create the replication-populator node:

create /replication-populator my_data

You can now exit the zookeeper CLI:

quit

Create the backbeat-replication kafka topic:

kafka-topics --create \
--zookeeper localhost:2181/backbeat \
--replication-factor 1 \
--partitions 1 \
--topic backbeat-replication
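To verify the topic exists, list the topics under the /backbeat chroot and filter for the new name (a sanity check added here, not part of the original guide):

```shell
# Prints the topic name if creation succeeded
kafka-topics --list \
--zookeeper localhost:2181/backbeat \
| grep '^backbeat-replication$'
```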

Run Scality Components

Start Vault and Scality S3 servers

Start the Vault server (this requires access to the private Vault repository):

git clone https://github.com/scality/Vault ~/replication/vault && \
cd ~/replication/vault && \
npm i && \
chmod 400 ./tests/utils/keyfile && \
VAULT_DB_BACKEND=MEMORY node vaultd.js

In a new shell, start the Scality S3 server:

git clone https://github.com/scality/s3 ~/replication/s3 && \
cd ~/replication/s3 && \
npm i && \
S3BACKEND=file S3VAULT=scality npm start

Set up replication with backbeat

In a new shell, clone backbeat:

git clone https://github.com/scality/backbeat ~/replication/backbeat && \
cd ~/replication/backbeat && \
npm i

Now, create an account and keys:

VAULTCLIENT=~/replication/backbeat/node_modules/vaultclient/bin/vaultclient && \
$VAULTCLIENT create-account \
--name backbeatuser \
--email dev@null \
--port 8600 >> backbeat_user_credentials && \
$VAULTCLIENT generate-account-access-key \
--name backbeatuser \
--port 8600 >> backbeat_user_credentials && \
cat backbeat_user_credentials

The output, which is also appended to the file backbeat_user_credentials for reference, will look something like this:

...
{
    "id": "8CFJQ2Z3R6LR0WTP5VDS",
    "value": "gB53GM7/LpKrm6DktUUarcAOcqHS2tvKI/=CxFxR",
    "createDate": "2017-08-03T00:17:57Z",
    "lastUsedDate": "2017-08-03T00:17:57Z",
    "status": "Active",
    "userId": "038628340774"
}

Store the account's credentials using the "id" and "value" fields:

aws configure --profile backbeatuser

The completed prompt should look like:

AWS Access Key ID [None]: 8CFJQ2Z3R6LR0WTP5VDS
AWS Secret Access Key [None]: gB53GM7/LpKrm6DktUUarcAOcqHS2tvKI/=CxFxR
Default region name [None]:
Default output format [None]:
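Alternatively, you can skip the interactive prompt by extracting the two fields from backbeat_user_credentials and writing them into the profile with aws configure set. This is a hypothetical convenience, not part of the original guide; the sed patterns assume the JSON layout shown above:

```shell
# Extract the "id" and "value" fields from the saved credentials file
ACCESS_KEY=$(sed -n 's/.*"id": "\(.*\)",/\1/p' backbeat_user_credentials)
SECRET_KEY=$(sed -n 's/.*"value": "\(.*\)",/\1/p' backbeat_user_credentials)
# Store them under the backbeatuser profile non-interactively
aws configure set aws_access_key_id "$ACCESS_KEY" --profile backbeatuser
aws configure set aws_secret_access_key "$SECRET_KEY" --profile backbeatuser
```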

Set up replication on your buckets:

node ~/replication/backbeat/bin/setupReplication.js setup \
--source-bucket source-bucket \
--source-profile backbeatuser \
--target-bucket target-bucket \
--target-profile backbeatuser
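You can confirm the replication configuration was attached to the source bucket before moving on (a verification step added here; the output shape follows the standard S3 bucket-replication API):

```shell
# Should print a replication rule with "Status": "Enabled"
aws s3api get-bucket-replication \
--bucket source-bucket \
--endpoint http://localhost:8000 \
--profile backbeatuser \
| grep '"Status": "Enabled"'
```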

Run the backbeat queue populator:

npm --prefix ~/replication/backbeat run queue_populator

In a new shell, run the backbeat queue processor:

npm --prefix ~/replication/backbeat run queue_processor

You are now ready to put data on source-bucket and watch it replicate to target-bucket!

Put an object on the source-bucket:

echo 'content to be replicated' > replication_contents && \
aws s3api put-object \
--bucket source-bucket \
--key object-to-replicate \
--body replication_contents \
--endpoint http://localhost:8000 \
--profile backbeatuser

Check the replication status of the object we have just put:

aws s3api head-object \
--bucket source-bucket \
--key object-to-replicate \
--endpoint http://localhost:8000 \
--profile backbeatuser

The object's "ReplicationStatus" should either be "PENDING", or if some time has passed, then it should be "COMPLETED".

Check if the object has been replicated to the target bucket:

aws s3api head-object \
--bucket target-bucket \
--key object-to-replicate \
--endpoint http://localhost:8000 \
--profile backbeatuser

After some time, the object's "ReplicationStatus" should be "REPLICA". 😺
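As a final check, you can download the replica and compare it byte-for-byte with the original file (an extra verification step, not in the original guide; diff exits non-zero if the contents differ):

```shell
# Fetch the replicated object into replicated_copy, then compare
aws s3api get-object \
--bucket target-bucket \
--key object-to-replicate \
--endpoint http://localhost:8000 \
--profile backbeatuser \
replicated_copy && \
diff replication_contents replicated_copy
```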

Structure

In our $HOME directory, we now have the following directories:

$HOME
├── kafka
│   └── kafka_2.11-0.11.0.0
└── replication
    ├── backbeat
    ├── s3
    └── vault