Migrating Sharded Cluster Online

Leif Walsh edited this page Feb 13, 2014 · 2 revisions

Migrating from MongoDB: Sharded cluster (online)

This guide explains how to perform an online data migration of a sharded cluster from MongoDB to TokuMX, converting the existing MongoDB data to TokuMX one shard at a time.

For other migration strategies, start with Migrating from MongoDB and Migrating from MongoDB: Sharded clusters.

  1. Turn off the balancer, and wait for any migrations to complete.

    > sh.setBalancerState(false);
    > while (sh.isBalancerRunning()) {
    ...     print('waiting...');
    ...     sleep(1000);
    ... }
    
  2. Perform an online migration of each shard individually as a replica set, until all shards are caught up. Learn how in Migrating from MongoDB: Replica set (online). These migrations can be done in parallel.

  3. Leave one instance of mongo2toku running from each MongoDB shard to its corresponding TokuMX shard.

  4. Shut down one of the MongoDB config servers. This will not stop application reads and writes, but it will stop splits and chunk migrations from completing.

  5. Perform an offline migration of the config server you just shut down to TokuMX, and import its backup once for each TokuMX config server to bring the set of TokuMX config servers online.

    See Migrating from MongoDB: Sharded cluster (offline, individual shards) for details on how to handle shard and config server hostname changes when migrating MongoDB sharded cluster config data.

  6. Start TokuMX mongos routers on the TokuMX config servers. In the previous step, the config data should have been modified so that the mongos routers can find the TokuMX shards. You can run sh.status() on the TokuMX cluster at this point to verify that everything is working normally.

  7. Pause your application's writes and wait for all mongo2toku processes to report that they are "fully synced".

  8. Redirect your application to the TokuMX mongos routers.

  9. Once your application is running on the TokuMX cluster, you can stop all the mongo2toku processes, the remaining MongoDB config servers, the MongoDB shard servers, and the MongoDB router servers, delete the MongoDB dbpaths, and shut down those machines.
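
As a complement to the sh.status() check in step 6, the following is a minimal sketch of how you might confirm that the migrated config data points at the TokuMX shards before redirecting the application. It assumes the standard `config.shards` collection layout (documents with `_id` and `host` fields). The function name `findMisroutedShards`, the `expectedHosts` map, and the hostnames in the usage comment are hypothetical and not part of TokuMX. The comparison is a plain string match, so it will also flag a host string whose replica set members are merely listed in a different order.

```javascript
// Sketch only: compare the shard entries in the config database against the
// TokuMX hosts you expect. `expectedHosts` is a hypothetical map that you
// supply, keyed by shard _id, with the expected host string as the value.
function findMisroutedShards(configDb, expectedHosts) {
    var problems = [];
    configDb.shards.find().forEach(function (shard) {
        var expected = expectedHosts[shard._id];
        if (expected === undefined) {
            // The config data names a shard you did not list.
            problems.push(shard._id + ": no expected host listed");
        } else if (shard.host !== expected) {
            // The config data still points somewhere else (for example,
            // at the old MongoDB shard).
            problems.push(shard._id + ": config has " + shard.host +
                          ", expected " + expected);
        }
    });
    return problems;
}

// Possible use in a mongo shell connected to a TokuMX mongos router
// (shard name and hostnames below are placeholders):
//
//   var problems = findMisroutedShards(db.getSiblingDB("config"), {
//       shard0000: "rs0/toku-a1.example.com:27017,toku-a2.example.com:27017"
//   });
//   problems.forEach(function (p) { print(p); });
```

An empty result means every shard listed in the config data matches the host string you expected. Anything else is worth resolving before step 8.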

Rolling migration

For a large MongoDB sharded cluster, it can be expensive to run a second cluster of the same size alongside the original while the TokuMX cluster is brought up. As with the online replica set migration, it is possible to do a "rolling migration" for a sharded cluster. To do this, perform a rolling migration of each shard separately, then migrate the config servers and switch over the application once the TokuMX cluster has enough capacity to support it. See Migrating from MongoDB: Replica set (online) for more details on rolling migrations.