
tutorial about how to set up

Vinllen Chen edited this page May 17, 2019 · 4 revisions

This tutorial explains how to set up and run MongoShake. You can also check out this Chinese tutorial.

If you run into problems, we strongly recommend reading the wiki FAQ first. If this document does not solve your problem, please open an issue.

Usage consists of the following four steps:

  1. download code
  2. compile code, or use the binary in bin
  3. modify the configuration
  4. run

1. download code

   Clone the code with git clone, or download the tar/zip package. Alternatively, download the binary directly and skip the compile step.

2. compile code

   You can compile as described in the Usage section of the README, but the dependent libraries must be downloaded first. Install govendor with go get -u github.com/kardianos/govendor, then run govendor sync to download the dependencies. Note: GOPATH must be configured before downloading dependencies. After the download finishes, run the build.sh script to compile.
   You can also run the collector binary in the bin directory directly, but the binary usually lags behind the code. Pass -version to print the version number and check this. We recommend running a build compiled from source.

3. modify the configuration

   This is usually the most confusing part for users. To allow flexible setups, many configuration items are exposed, but you do not need to manage most of them at first. To synchronize a replica set for the first time, only two items are usually needed: mongo_urls, the address of the source MongoDB, and tunnel.address, the address of the destination MongoDB. Both follow the MongoDB URL style, with the database nodes separated by commas. All other items can keep their defaults. If you have other needs, read the configuration item descriptions below carefully. The following subsections outline how to configure for different requirements.
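
As a rough illustration of the minimal replica-set setup, the sketch below renders the two settings as text. The "key = value" line layout, the helper names, and all hosts and ports are assumptions made for illustration. Check the shipped configuration file for the exact syntax.

```python
def replica_set_url(hosts):
    """Build a MongoDB-style URL; nodes of one replica set are comma-separated."""
    return "mongodb://" + ",".join(hosts)


def minimal_config(source_hosts, target_hosts):
    """Render the two settings needed for a first replica-set sync.

    The "key = value" layout is an assumption about the config file syntax.
    """
    return "\n".join([
        "mongo_urls = " + replica_set_url(source_hosts),
        "tunnel.address = " + replica_set_url(target_hosts),
    ])


# Hypothetical hosts, for illustration only.
print(minimal_config(["10.0.0.1:27017", "10.0.0.2:27017"], ["10.0.1.1:27017"]))
```

Every other item is left out on purpose, since the defaults are expected to work for a first run.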

3.1 how to configure for sharding?

   When the source is a sharded cluster, mongo_urls must contain the address of each shard, separated by semicolons (;). The same applies to tunnel.address. You also need to configure context.storage.url, the address used to store the checkpoint. For a replica set this item is not required, because by default the checkpoint is written to the source MongoDB. The default collection is mongoshake.ckpt_default. For sharding, the address of the source mongos is not known, so you must configure the checkpoint address explicitly: pass in the address of the config server.
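
The separator rules for sharding can be sketched as follows: commas between the nodes of one shard, semicolons between shards. The helper names, the "key = value" layout, and every host below are hypothetical. Only the three item names come from this tutorial.

```python
def replica_set_url(hosts):
    """One shard (or the config server): nodes are comma-separated."""
    return "mongodb://" + ",".join(hosts)


def sharded_urls(shards):
    """Several shards: one URL per shard, joined with semicolons."""
    return ";".join(replica_set_url(hosts) for hosts in shards)


def sharding_config(source_shards, target_addresses, config_server_hosts):
    """Render the three settings discussed for a sharded source.

    The "key = value" layout is an assumption about the config file syntax.
    """
    return "\n".join([
        "mongo_urls = " + sharded_urls(source_shards),
        "tunnel.address = " + sharded_urls(target_addresses),
        "context.storage.url = " + replica_set_url(config_server_hosts),
    ])


# Hypothetical two-shard source, for illustration only.
print(sharding_config(
    source_shards=[["s1a:27017", "s1b:27017"], ["s2a:27017", "s2b:27017"]],
    target_addresses=[["t1:27017"], ["t2:27017"]],
    config_server_hosts=["cfg1:27019", "cfg2:27019"],
))
```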

3.2 how to configure when the tunnel type is tcp/rpc/kafka/file?

   Set tunnel to the type you want, and set tunnel.address to the corresponding address.

3.3 how to configure the full sync?

   Full sync is supported starting with v1.5 and is configured through sync_mode: all means full and incremental sync, document means full sync only, and oplog means incremental sync only.
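
The three sync_mode values can be summarized as a small lookup table. This is only a restatement of the paragraph above. The names SYNC_MODES and phases_for are hypothetical and are not part of the collector.

```python
# Which sync phases each sync_mode value enables, as described in the text.
SYNC_MODES = {
    "all": ("full", "incremental"),
    "document": ("full",),
    "oplog": ("incremental",),
}


def phases_for(sync_mode):
    """Return the phases enabled by a sync_mode value, rejecting unknown values."""
    try:
        return SYNC_MODES[sync_mode]
    except KeyError:
        raise ValueError(f"unknown sync_mode: {sync_mode!r}") from None


for mode in SYNC_MODES:
    print(mode, "->", " + ".join(phases_for(mode)))
```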

3.4 how to enable DDL sync?

   Set replayer.dml_only to false.

3.5 what does replayer.executor mean?

   There are several fields under replayer.executor used for writing:

  • replayer.executor: do not modify this field in the current open-source version.
  • replayer.executor.upsert: if an Update finds no matching document (by _id or unique index), the oplog is applied as an Insert instead.
  • replayer.executor.insert_on_dup_update: if an Insert hits a duplicate key (_id or unique index), the oplog is applied as an Update instead.
  • replayer.conflict_write_to: how write conflicts are handled.
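
The effect of the upsert and insert_on_dup_update switches can be illustrated with a toy model. This is not the collector's code: the target is modeled as a plain dict keyed by _id, unique indexes are ignored, and the "conflict" result is a stand-in for whatever replayer.conflict_write_to selects. The function names are hypothetical.

```python
def apply_insert(store, doc, insert_on_dup_update=False):
    """Apply an Insert; on a duplicate _id, either convert it to an Update or report a conflict."""
    key = doc["_id"]
    if key in store:
        if not insert_on_dup_update:
            return "conflict"
        store[key].update(doc)  # Insert converted to Update
        return "updated"
    store[key] = dict(doc)
    return "inserted"


def apply_update(store, key, fields, upsert=False):
    """Apply an Update; if the document is missing, either convert it to an Insert or report a conflict."""
    if key not in store:
        if not upsert:
            return "conflict"
        store[key] = {"_id": key, **fields}  # Update converted to Insert
        return "inserted"
    store[key].update(fields)
    return "updated"


store = {}
print(apply_insert(store, {"_id": 1, "x": 1}))                             # inserted
print(apply_insert(store, {"_id": 1, "x": 2}))                             # conflict
print(apply_insert(store, {"_id": 1, "x": 2}, insert_on_dup_update=True))  # updated
print(apply_update(store, 2, {"x": 5}))                                    # conflict
print(apply_update(store, 2, {"x": 5}, upsert=True))                       # inserted
```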

3.6 the data read from tcp/rpc/kafka/file is corrupt, how to solve it?

   The data written to the tunnel contains control information, so it is not meant to be read directly. It must be received by the receiver, which strips the control information before the data is passed on for further processing.

For the other configuration fields, please see the comments in the configuration.

4. run

   Run ./collector -conf=collector.conf in the bin directory. To print the log to stdout, add -verbose.
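
If the collector is started from a script, the command line above can be assembled as in the sketch below. The two flags and the bin directory come from this tutorial. The function names and the default arguments are hypothetical, and run_collector is defined but not called here.

```python
import subprocess


def collector_command(conf="collector.conf", verbose=False):
    """Build the argument list for the collector; -verbose also prints the log to stdout."""
    cmd = ["./collector", f"-conf={conf}"]
    if verbose:
        cmd.append("-verbose")
    return cmd


def run_collector(bin_dir="bin", **kwargs):
    """Start the collector from bin_dir and raise if it exits with a non-zero status."""
    return subprocess.run(collector_command(**kwargs), cwd=bin_dir, check=True)


# Show the command without starting anything.
print(" ".join(collector_command(verbose=True)))
```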
