2.04: volumeServer.evacuate moves one volume only #1534

Closed
onlyjob opened this issue Oct 13, 2020 · 6 comments

@onlyjob

onlyjob commented Oct 13, 2020

volumeServer.evacuate moves only one volume per invocation: it evacuates the first volume successfully, then fails on the next one and stops.

I've started a new volume server and I'm trying to move data from another volume server (which I want to decommission):

> volumeServer.evacuate -node 192.168.0.250:9334 -force
moving volume 2 192.168.0.250:9334 => 192.168.0.204:8080
2020-10-14 00:24:04.230186 I | copying volume 2 from 192.168.0.250:9334 to 192.168.0.204:8080
2020-10-14 00:24:07.278448 I | tailing volume 2 from 192.168.0.250:9334 to 192.168.0.204:8080
2020-10-14 00:24:19.473121 I | deleting volume 2 from 192.168.0.250:9334
2020-10-14 00:24:19.682220 I | moved volume 2 from 192.168.0.250:9334 to 192.168.0.204:8080
error: failed to move volume 19 from 192.168.0.250:9334

> volumeServer.evacuate -node 192.168.0.250:9334 -force
moving volume 4 192.168.0.250:9334 => 192.168.0.204:8080
2020-10-14 00:24:30.661134 I | copying volume 4 from 192.168.0.250:9334 to 192.168.0.204:8080
2020-10-14 00:24:32.954465 I | tailing volume 4 from 192.168.0.250:9334 to 192.168.0.204:8080
2020-10-14 00:24:45.109108 I | deleting volume 4 from 192.168.0.250:9334
2020-10-14 00:24:45.272594 I | moved volume 4 from 192.168.0.250:9334 to 192.168.0.204:8080
error: failed to move volume 3 from 192.168.0.250:9334

Retrying the command volumeServer.evacuate -node 192.168.0.250:9334 -force gives the same result every time: it evacuates one volume successfully, then stops with a "failed to move" error on the next. At one volume per invocation, running the command enough times eventually moved all the volumes off server 192.168.0.250.
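The retry workaround described above could be automated along these lines. This is only a sketch: the regular expressions match the log lines quoted in this report, while `parseEvacuateOutput`, `evacuateUntilDone`, and the `run` hook (which would execute one volumeServer.evacuate and return its output) are hypothetical names, not part of SeaweedFS. What the command prints when nothing is left to move is an assumption here (modelled as empty output).

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Patterns taken from the log lines quoted in the report.
var (
	movedRe  = regexp.MustCompile(`moved volume (\d+) from `)
	failedRe = regexp.MustCompile(`error: failed to move volume (\d+) from `)
)

// evacuateResult summarises the output of one invocation.
type evacuateResult struct {
	Moved  []int // volume ids reported as moved
	Failed []int // volume ids reported as failed
}

// collectIDs extracts the numeric capture group of every match of re in out.
func collectIDs(re *regexp.Regexp, out string) []int {
	var ids []int
	for _, m := range re.FindAllStringSubmatch(out, -1) {
		if id, err := strconv.Atoi(m[1]); err == nil {
			ids = append(ids, id)
		}
	}
	return ids
}

// parseEvacuateOutput (hypothetical helper) classifies the log lines.
func parseEvacuateOutput(out string) evacuateResult {
	return evacuateResult{
		Moved:  collectIDs(movedRe, out),
		Failed: collectIDs(failedRe, out),
	}
}

// evacuateUntilDone calls run until an invocation reports no failure,
// stops early if a run fails without moving anything, and gives up
// after maxRuns invocations.
func evacuateUntilDone(run func() (string, error), maxRuns int) ([]int, error) {
	var moved []int
	for i := 0; i < maxRuns; i++ {
		out, err := run()
		if err != nil {
			return moved, err
		}
		r := parseEvacuateOutput(out)
		moved = append(moved, r.Moved...)
		if len(r.Failed) == 0 {
			return moved, nil // assumption: no failure means the node is drained
		}
		if len(r.Moved) == 0 {
			return moved, fmt.Errorf("no progress: volume %d failed to move", r.Failed[0])
		}
	}
	return moved, fmt.Errorf("gave up after %d runs", maxRuns)
}

func main() {
	// Canned outputs: the first two are abridged from the report; the empty
	// third one is an assumption about a run that finds nothing left to move.
	outputs := []string{
		"I | moved volume 2 from 192.168.0.250:9334 to 192.168.0.204:8080\n" +
			"error: failed to move volume 19 from 192.168.0.250:9334\n",
		"I | moved volume 4 from 192.168.0.250:9334 to 192.168.0.204:8080\n" +
			"error: failed to move volume 3 from 192.168.0.250:9334\n",
		"",
	}
	i := 0
	run := func() (string, error) {
		out := outputs[i]
		i++
		return out, nil
	}
	moved, err := evacuateUntilDone(run, 10)
	fmt.Println(moved, err)
}
```

The "no progress" check matters: if a run fails without moving anything, repeating it would most likely loop forever, so the sketch stops and surfaces the failing volume id instead.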

@chrislusf
Collaborator

I could not reproduce this locally. It would be good to see the output of volume.list before a volumeServer.evacuate execution.

@PeterCxy
Contributor

PeterCxy commented Mar 13, 2021

I can confirm this behavior in my deployment: it always stops after moving one volume, stating that it "failed to move" the next volume.

@PeterCxy
Contributor

One thing to note, though, is that my setup consists of servers of different sizes, so no two servers have the same number of available volumes. This seems to cause issues with volume.balance, and volumeServer.evacuate apparently shares some routines with volume.balance for selecting the best server to move a volume to.
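To make concrete what "selecting the best server" could mean when every server has a different number of free volume slots, here is a purely illustrative sketch. It is not SeaweedFS's actual routine: the `server` type, its fields, `pickDestination`, and the slot counts in `main` are all hypothetical, and ranking by free slots is an assumed policy chosen only for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// server is a hypothetical view of one volume server's capacity.
type server struct {
	Addr      string
	MaxSlots  int // maximum number of volumes the server may hold
	UsedSlots int // volumes it currently holds
}

// free returns the number of volume slots still available.
func (s server) free() int { return s.MaxSlots - s.UsedSlots }

// pickDestination returns the address of the server with the most free
// slots, skipping the source and any server that is already full.
// The boolean is false when no server can take the volume.
func pickDestination(servers []server, source string) (string, bool) {
	candidates := make([]server, 0, len(servers))
	for _, s := range servers {
		if s.Addr != source && s.free() > 0 {
			candidates = append(candidates, s)
		}
	}
	if len(candidates) == 0 {
		return "", false
	}
	// Stable sort keeps the input order among servers with equal free slots.
	sort.SliceStable(candidates, func(i, j int) bool {
		return candidates[i].free() > candidates[j].free()
	})
	return candidates[0].Addr, true
}

func main() {
	// Addresses are from the original report; slot counts are made up.
	servers := []server{
		{Addr: "192.168.0.250:9334", MaxSlots: 30, UsedSlots: 20},
		{Addr: "192.168.0.204:8080", MaxSlots: 8, UsedSlots: 3},
	}
	dst, ok := pickDestination(servers, "192.168.0.250:9334")
	fmt.Println(dst, ok)
}
```

Because the source is excluded and full servers are filtered out before ranking, unequal capacities alone do not prevent a destination from being found in this sketch; whether the real selection code handles that case correctly is exactly the open question in this thread.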

@PeterCxy
Contributor

@chrislusf Could you test the case where every volume server has a different number of available volumes and see if this can be reproduced? Also volume.balance does not seem to behave correctly in this case.

@chrislusf
Collaborator

Please share the output of volume.list.

@PeterCxy
Contributor

PeterCxy commented Mar 13, 2021

@chrislusf I will DM you (on Patreon) with the link to my volume list because 1) it's huge; 2) it contains some of my S3 bucket names that I may not want to make public.

chrislusf added a commit that referenced this issue Mar 15, 2021