[NEW] Support slot-based data migration #412

Closed
ChrisZMF opened this issue Dec 1, 2021 · 2 comments · Fixed by #430
Labels: A-cluster (area cluster), feature (type new feature), major decision (Requires project management committee consensus), release notes

Comments

@ChrisZMF
Contributor

ChrisZMF commented Dec 1, 2021

1 Background

Online data migration is an essential feature for database servers that are deployed as a cluster. Since kvrocks already supports Redis cluster mode (#219), supporting online data migration is the natural next step. Unlike the key-based data migration method of the Redis community, we propose a slot-based data migration method for kvrocks.

2 Implementation

2.1 Data encoding format

To support slot-based migration, we need to encode the slot id into every key so that the data of a slot can be iterated efficiently. The string key and the hash key below illustrate the slot id encoding for simple keys and complex keys respectively.

String key:

    +--------+----+---------+----------+
    | ns_len | ns | slot_id | user_key |
    +--------+----+---------+----------+

Hash key:

    hash metakey
    +--------+----+---------+----------+
    | ns_len | ns | slot_id | user_key |
    +--------+----+---------+----------+
    hash subkey
    +--------+----+---------+--------------+----------+---------+-------+
    | ns_len | ns | slot_id | user_key_len | user_key | Version | field |
    +--------+----+---------+--------------+----------+---------+-------+

As shown above, the slot id is encoded into the prefix of every key. Keys of the same slot share the same prefix, so they are stored adjacently in rocksdb, which makes iterating over the data of a slot efficient. Encoding the slot id into keys is already supported in kvrocks since #291.
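The string key layout above can be sketched as follows. This is only an illustration: the field widths (a 1-byte `ns_len` and a 2-byte big-endian `slot_id`) and the helper names `encode_key` and `slot_prefix` are assumptions made for this example, not taken from the kvrocks source. The slot is computed the way Redis Cluster does it (CRC16 of the key modulo 16384), ignoring hash tags.

```python
import struct


def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant), the checksum Redis Cluster uses for key hashing."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def slot_of(user_key: bytes) -> int:
    # Redis Cluster maps every key to one of 16384 slots (hash tags ignored here).
    return crc16_xmodem(user_key) % 16384


def slot_prefix(ns: bytes, slot: int) -> bytes:
    # Assumed widths: 1-byte ns_len, 2-byte big-endian slot_id.
    return struct.pack(">B", len(ns)) + ns + struct.pack(">H", slot)


def encode_key(ns: bytes, user_key: bytes) -> bytes:
    # | ns_len | ns | slot_id | user_key |
    return slot_prefix(ns, slot_of(user_key)) + user_key
```

Because the slot id is big-endian and sits right after the namespace, sorting the encoded keys of one namespace groups them by slot, so all keys of a slot can be read with a single prefix scan starting at `slot_prefix(ns, slot)`.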

2.2 Slot-based migration brief design

The slot-based migration process consists of the following stages.

  1. Start migrating
  2. Migrating existing data
  3. Migrating incremental data
  4. End migrating

The main process of slot-based migration can be described in the following diagram.
[image: kvrocks data migration]
Figure 1. Slot-based migration process diagram

2.3 Detailed implementation

2.3.1 Details of migrating process

As shown in the process diagram (Figure 1), data migration is triggered by sending a request to the source server. On receiving the request, the source server creates a migration task, which carries out the main steps of the slot-based migration. The stages are described below.

1) Start migrating stage

At this stage, the source server notifies the destination server to prepare for importing data. If the destination server is ready, the source server moves on to the next stage and starts migrating data. Otherwise, the source server stops the migration task.

2) Migrating existing data stage

At this stage, the existing data is migrated. The existing data of kvrocks is defined by a rocksdb snapshot taken at the moment the migration begins. The source server iterates over the snapshot's data for the migrating slot and converts it into Redis commands for the destination server. The constructed commands are sent in a pipeline to improve efficiency.
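As a rough sketch of this stage, the snippet below turns snapshot entries into Redis commands in the RESP wire format and groups them into pipeline batches. The names `encode_resp` and `pipeline_batches`, the `batch_size` parameter, and the use of plain `SET` for every entry are assumptions for illustration; the real task would have to emit the right command for each data type.

```python
from typing import Iterable, Iterator, List, Tuple


def encode_resp(*args: bytes) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        out.append(b"$%d\r\n" % len(arg))
        out.append(arg + b"\r\n")
    return b"".join(out)


def pipeline_batches(
    snapshot: Iterable[Tuple[bytes, bytes]], batch_size: int
) -> Iterator[bytes]:
    """Yield buffers that each hold up to batch_size commands.

    Each buffer can be written to the destination in one go, without
    waiting for a reply between individual commands.
    """
    batch: List[bytes] = []
    for key, value in snapshot:
        # Hypothetical: treat every entry as a string key.
        batch.append(encode_resp(b"SET", key, value))
        if len(batch) >= batch_size:
            yield b"".join(batch)
            batch = []
    if batch:
        yield b"".join(batch)
```

Batching trades a little memory on the source for fewer round trips, which is where the efficiency gain over sending keys one by one comes from.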

3) Migrating incremental data stage

While the existing data is being migrated, the migrating slot still accepts writes. In other words, new data is written during the migration of existing data, and this new data must be migrated to the destination server too. To maintain consistency, writes to the migrating slot are forbidden before the last incremental data is migrated, so that no new data for the target slot can be written to the source server afterwards.
The amount of incremental data may be very large, because migrating the existing data may take a long time. Migrating all of it with writes forbidden would block the slot for a long time. To shorten the write-forbidden period, the incremental data is migrated in two steps.

  • First step: Writes to the slot are not forbidden while incremental data is migrated. This step is repeated until the amount of new data falls below a threshold, or the number of repetitions reaches a threshold.
  • Second step: Writes to the slot are forbidden, and then the remaining new data is migrated.

The incremental data is obtained by iterating over the WAL of rocksdb.
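The two-step procedure can be sketched as the loop below. All names here (`migrate_incremental`, `pending_since`, `send`, `forbid_writes`, `max_pending`, `max_rounds`) are hypothetical, and the WAL is modelled as a simple sequence addressed by a sequence number, which is an assumption made to keep the example self-contained.

```python
from typing import Callable, List


def migrate_incremental(
    pending_since: Callable[[int], List[bytes]],
    send: Callable[[List[bytes]], None],
    forbid_writes: Callable[[], None],
    start_seq: int,
    max_pending: int,
    max_rounds: int,
) -> int:
    """Migrate WAL entries written after start_seq; return the final sequence number."""
    seq = start_seq

    # First step: writes are still allowed. Keep catching up until the
    # backlog is small enough or the round limit is reached.
    for _ in range(max_rounds):
        entries = pending_since(seq)
        if len(entries) < max_pending:
            break
        send(entries)
        seq += len(entries)

    # Second step: forbid writes, then send whatever is left.
    forbid_writes()
    rest = pending_since(seq)
    send(rest)
    return seq + len(rest)
```

The two thresholds bound the write-forbidden window: the final step only has to ship a small backlog, unless the slot is written faster than it can be migrated, in which case the round limit forces the switch.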

4) End migrating stage

The previous stages may succeed or fail. In this stage, the source server notifies the destination server whether the migration task succeeded or failed. If the migration succeeds, both the source server and the destination server update the cluster topology they maintain themselves, and the source server clears the data belonging to the migrated slot. If the migration fails, only the destination server clears the imported data of the target slot.
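The success and failure paths can be summarized with a small model. The `Node` class and `end_migration` function are hypothetical and keep the topology and the data in plain in-memory containers; they only illustrate which side cleans up in which case.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Node:
    node_id: str
    owned_slots: Set[int] = field(default_factory=set)
    # slot -> {key: value}; a stand-in for the slot-prefixed data in rocksdb
    data: Dict[int, Dict[bytes, bytes]] = field(default_factory=dict)


def end_migration(src: Node, dst: Node, slot: int, succeeded: bool) -> None:
    if succeeded:
        # Both sides update their own view of the topology ...
        src.owned_slots.discard(slot)
        dst.owned_slots.add(slot)
        # ... and only now does the source drop the migrated data.
        src.data.pop(slot, None)
    else:
        # Rollback: the source is untouched, the destination discards
        # whatever it imported so far.
        dst.data.pop(slot, None)
```

Since the source deletes nothing until the very end, a failure at any earlier stage leaves it as the sole owner of a complete copy of the slot.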

2.3.2 Supported commands

  • CLUSTERX MIGRATE $slot $dst_nodeid
    $dst_nodeid is the node id of the destination server in the cluster. See #219 ([NEW] Support redis cluster mode) for more details.
  • CLUSTER IMPORT $slot $state
    This is an internal command sent by the source server to notify the destination server to prepare for importing data. Clients cannot use this command directly.

3 Advantages and Disadvantages

3.1 Advantages

  • Efficiency
    Compared with the key-based data migration method, the slot-based method transmits data in a pipeline, which is more efficient.
  • More convenient failure rollback
    Data on the source server is deleted only after all data has been migrated successfully. If any failure happens during migration, the migration task is stopped and the source data is not deleted.
  • Consistency
    Writes to the migrating slot are forbidden while the last piece of incremental data is transmitted, which guarantees data consistency.
  • Asynchronous migration
    Data migration runs in an independent thread and does not prevent the main threads from processing requests.

3.2 Disadvantages

  • Currently, slots can only be migrated one at a time.

4 Extra work

  • Support concurrent data migration.
git-hulk added the A-cluster, major decision, and feature labels on Dec 1, 2021
@ShooterIT
Member

Since we encode the slot id into keys only when cluster mode is enabled, I think we should support slot migration only in cluster mode, and the migrate command should be consistent with the cluster commands.

I prefer to implement slot migration as a CLUSTER subcommand instead of separate commands. See also Redis issue redis/redis#2807 and the Redis cluster v2 project redis/redis#8948; they also want to support this.

@ChrisZMF
Contributor Author

Thanks for your suggestion, it really makes a lot of sense. The implementation of slot-based migration will be modified to adapt to the cluster mode. @ShooterIT

ShooterIT pushed a commit that referenced this issue Jan 27, 2022
A new command, CLUSTERX MIGRATE, is used to migrate slot data. The slot-based migration
process mainly includes the following stages: migrating existing data and migrating
incremental data.

Command format:
CLUSTERX MIGRATE $slot $dst_nodeid
  - $slot is the slot which is to migrate
  - $dst_nodeid is the node id of destination server in the cluster.

We also introduce an internal command CLUSTER IMPORT for importing the migrating
slot data into destination server.

The migration status is shown in the output of the CLUSTER INFO command.

After migrating a slot, you should also use the CLUSTERX SETSLOT command to change the cluster slot
distribution.

For more details, please see #412 and #430