how to use shard function #441

Closed
mitiger opened this issue Mar 2, 2015 · 11 comments · Fixed by #481

mitiger commented Mar 2, 2015

I have started vttablet and vtgate, and everything is OK.
Now I need to use the sharding feature. I created two shards, -64 and 64-, but I don't know how to copy data from keyspace/0 to keyspace/-64 and keyspace/64-.

Could the authors give some shell examples or an introduction on how to use sharding?

Thanks a lot.

aaijazi self-assigned this Mar 2, 2015

Contributor

aaijazi commented Mar 2, 2015

Taking this, with the hope of having a quick document or write-up in a few hours.

Contributor

aaijazi commented Mar 3, 2015

I've started writing a document, but it's a first draft and probably will take a couple of days to finalize.

@mitiger, if you need something faster than that, I can provide you with a minimal set of commands that need to be run. You can also take a look at test/initial_sharding.py:test_resharding to see what commands are run for the end-to-end test.

Author

mitiger commented Mar 3, 2015

@aaijazi, I have read initial_sharding.py and tried it on my server.

I have a question. When I run the command "./vtworker -min_healthy_rdonly_endpoints=1 -cell=test SplitDiff test_keyspace/-80"

the vttablet logs show:

can't read startPosition: error Table '_vt.blp_checkpoint' doesn't exist (errno 1146) during query: SELECT pos, flags FROM _vt.blp_checkpoint WHERE source_shard_uid=0 in selecting from recovery table SELECT pos, flags FROM _vt.blp_checkpoint WHERE source_shard_uid=0

The blp_checkpoint table is not found. When is the blp_checkpoint table supposed to be created?

Author

mitiger commented Mar 3, 2015

@aaijazi, please give me the minimal set of commands, so I can check whether any commands are missing from my demo.
Thanks a lot.

Contributor

aaijazi commented Mar 3, 2015

@mitiger try adding --strategy=-populate_blp_checkpoint to your vtworker command when doing SplitClone:

./vtworker -min_healthy_rdonly_endpoints=1 -cell=test SplitClone --strategy=-populate_blp_checkpoint test_keyspace/0

I'll get you the minimal set of commands in a few minutes.

Author

mitiger commented Mar 3, 2015

@aaijazi, the command
./vtworker -min_healthy_rdonly_endpoints=1 -cell=test SplitClone --strategy=-populate_blp_checkpoint test_keyspace/0

takes a long time and does not seem to succeed.
It keeps printing the following log:
Running:
Copying from: test-0000000003
ETA: 2015-03-03 14:56:01.238102924 +0800 CST
test_table: copy done, copied 200 rows

Is this normal?

Contributor

aaijazi commented Mar 3, 2015

@mitiger: I haven't tested these exact commands myself, but I think they should get you going.

Preparing source shard

  • Make sure you have a column that you can use as a sharding key on the source shard.
  • Indicate to Vitess what that sharding column is, e.g., vtctl SetKeyspaceShardingInfo -force test_keyspace keyspace_id uint64
  • Make sure you have an rdonly tablet in the source shard

Preparing destination shards

  • Reparent each shard, e.g., vtctl ReparentShard -force test_keyspace/-64 <master tablet alias> so that each destination shard has selected a master tablet.
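
A minimal shell sketch of the two preparation steps above, assuming the destination shards are -64 and 64- as in the original question. The tablet aliases in `MASTER_LOW` and `MASTER_HIGH` are hypothetical placeholders, the `run` helper and `DRY_RUN` variable are illustrative and not part of Vitess, and any connection flags your vtctl needs are left out, as they are in the commands above.

```shell
#!/bin/sh
# Illustrative helper: print each command, run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

KEYSPACE="test_keyspace"
# Hypothetical master tablet aliases for the two destination shards.
MASTER_LOW="${MASTER_LOW:-test-0000000100}"
MASTER_HIGH="${MASTER_HIGH:-test-0000000200}"

prepare_shards() {
  # Tell Vitess which column holds the sharding key, and its type.
  run vtctl SetKeyspaceShardingInfo -force "$KEYSPACE" keyspace_id uint64 || return 1
  # Select a master tablet in each destination shard.
  run vtctl ReparentShard -force "$KEYSPACE/-64" "$MASTER_LOW" || return 1
  run vtctl ReparentShard -force "$KEYSPACE/64-" "$MASTER_HIGH" || return 1
}

prepare_shards
```

By default this only prints the commands; setting `DRY_RUN=0` executes them and stops at the first failure.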

Cloning the data

  • Copy the schema onto each destination shard: vtctl CopySchemaShard <source rdonly tablet alias> test_keyspace/-64
  • Run a SplitClone vtworker to stream the data from the source to destinations. Example: vtworker -min_healthy_rdonly_endpoints=1 --cell=test SplitClone --strategy=-populate_blp_checkpoint test_keyspace/0
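
The cloning steps above could be scripted roughly as follows. This is a sketch under assumptions: the `SOURCE_RDONLY` default is just an example alias to replace with your own source rdonly tablet, and `run` / `DRY_RUN` are illustrative helpers rather than anything Vitess provides.

```shell
#!/bin/sh
# Illustrative helper: print each command, run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

KEYSPACE="test_keyspace"
# Example alias only; replace with your source rdonly tablet.
SOURCE_RDONLY="${SOURCE_RDONLY:-test-0000000003}"

clone_data() {
  # Copy the schema to every destination shard before streaming rows.
  for shard in -64 64-; do
    run vtctl CopySchemaShard "$SOURCE_RDONLY" "$KEYSPACE/$shard" || return 1
  done
  # One SplitClone on the source shard streams data to all destinations.
  run vtworker -min_healthy_rdonly_endpoints=1 --cell=test SplitClone \
    --strategy=-populate_blp_checkpoint "$KEYSPACE/0" || return 1
}

clone_data
```

Note that SplitClone is run once, against the source shard, not once per destination.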

Verify data was copied correctly (optional, but recommended)

  • Run a single SplitDiff vtworker: vtworker -min_healthy_rdonly_endpoints=1 --cell=test SplitDiff test_keyspace/-64
  • If the above is successful, repeat the step for the next destination shard. Make sure that you don't run the commands in parallel, but let each shard's diff run sequentially.
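
To keep the diffs strictly sequential, a plain loop that stops at the first failure is enough. A sketch, again assuming destination shards -64 and 64-; `run` and `DRY_RUN` are illustrative helpers, not Vitess features.

```shell
#!/bin/sh
# Illustrative helper: print each command, run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

KEYSPACE="test_keyspace"

diff_shards() {
  # One shard at a time: the next SplitDiff starts only after the
  # previous one has exited successfully.
  for shard in -64 64-; do
    run vtworker -min_healthy_rdonly_endpoints=1 --cell=test SplitDiff "$KEYSPACE/$shard" || {
      echo "SplitDiff failed for $KEYSPACE/$shard; stopping" >&2
      return 1
    }
  done
}

diff_shards
```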

Start serving traffic from destination shard

  • Migrate rdonly traffic: vtctl MigrateServedTypes test_keyspace/0 rdonly
  • If successful, migrate replica traffic: vtctl MigrateServedTypes test_keyspace/0 replica
  • If successful, migrate master traffic: vtctl MigrateServedTypes test_keyspace/0 master
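
The "if successful" ordering above maps onto a loop that aborts as soon as one migration fails, so master traffic is never moved after a failed rdonly or replica migration. A sketch with the same illustrative `run` / `DRY_RUN` helpers:

```shell
#!/bin/sh
# Illustrative helper: print each command, run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

KEYSPACE="test_keyspace"

migrate_traffic() {
  # Least critical traffic first; master goes last.
  for tablet_type in rdonly replica master; do
    run vtctl MigrateServedTypes "$KEYSPACE/0" "$tablet_type" || return 1
  done
}

migrate_traffic
```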

Scrap the source shard

Note: only do these steps if all the above steps were successful.

  • For each tablet in the source shard: vtctl ScrapTablet <source tablet alias>
  • For each tablet in the source shard: vtctl DeleteTablet <source tablet alias>
  • Rebuild serving graph: vtctl RebuildKeyspaceGraph test_keyspace
  • vtctl DeleteShard test_keyspace/0
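
Since these last steps are destructive, it may help to print them before running anything. A sketch that follows the order above; the aliases in `SOURCE_TABLETS` are hypothetical placeholders for the tablets of your source shard, and `run` / `DRY_RUN` are illustrative helpers, not Vitess features.

```shell
#!/bin/sh
# Illustrative helper: print each command, run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

KEYSPACE="test_keyspace"
# Hypothetical aliases; list every tablet of the source shard here.
SOURCE_TABLETS="${SOURCE_TABLETS:-test-0000000001 test-0000000002 test-0000000003}"

scrap_source() {
  for tablet in $SOURCE_TABLETS; do
    run vtctl ScrapTablet "$tablet" || return 1
  done
  for tablet in $SOURCE_TABLETS; do
    run vtctl DeleteTablet "$tablet" || return 1
  done
  run vtctl RebuildKeyspaceGraph "$KEYSPACE" || return 1
  run vtctl DeleteShard "$KEYSPACE/0"
}

scrap_source
```

Review the printed commands first, then rerun with `DRY_RUN=0` once the traffic migration has fully succeeded.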

Contributor

aaijazi commented Mar 3, 2015

@mitiger That looks like fairly normal output for the copy command. What makes you say it won't succeed? How many tables/rows were you expecting? How long did it run for?

You can monitor the status of the copy by viewing the status page in your browser. It should also log/print the status every second.

Author

mitiger commented Mar 3, 2015

@aaijazi, my keyspace has just one table with 200 rows.
This log kept printing for 5 minutes, then I stopped it with Ctrl+C.

Author

mitiger commented Mar 3, 2015

@aaijazi, I found that the table _vt.blp_checkpoint has not been created.
Looking at the source code, it seems blp_checkpoint is created by the CopySchemaShard command?

I think that is the cause.

Contributor

aaijazi commented Mar 3, 2015

@mitiger: if the source shard has the _vt.blp_checkpoint table, then it will be created with the CopySchemaShard command. If the source shard does not have that table, it should be created by the SplitClone command, but only if --strategy=-populate_blp_checkpoint is given.
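
To see which of the two cases applies, one option is to ask MySQL directly whether the table exists. This is only a sketch: it assumes you can reach the tablet's mysqld with the standard `mysql` client, and the connection options (socket, user) are deliberately left for you to put into the hypothetical `MYSQL` variable.

```shell
#!/bin/sh
# MYSQL holds the client command plus whatever connection options your
# tablet's mysqld needs; the bare default is a placeholder assumption.
MYSQL="${MYSQL:-mysql}"

# Succeeds if the _vt database on that server has a blp_checkpoint table.
has_blp_checkpoint() {
  [ -n "$($MYSQL -N -B -e "SHOW TABLES FROM _vt LIKE 'blp_checkpoint'" 2>/dev/null)" ]
}
```

For example, `has_blp_checkpoint && echo present || echo missing`, run against the source master before CopySchemaShard or against a destination master after SplitClone.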

You might want to try deleting the destination tablets, and then trying the sharding steps again (in case your destination tablets are in a bad state).

If that doesn't succeed, please provide a link to the full logs from the failing operation, and I can try to help track down what's going wrong.

notfelineit pushed a commit to planetscale/vitess that referenced this issue Apr 5, 2022
rsajwani pushed a commit to planetscale/vitess that referenced this issue Aug 1, 2022
dbussink pushed a commit that referenced this issue Jan 30, 2023