After interrupting replication, we get a broken volume ID #1567

Closed
kmlebedev opened this issue Oct 27, 2020 · 1 comment

@kmlebedev
Contributor

Describe the bug
If replication is interrupted, a broken (incomplete) copy of the volume is left behind under the same volume ID.

System Setup

SeaweedFS 2.06

Expected behavior
After replication, the copied volume should be checked against the source volume.

Additional context

  1. Upgrade the SeaweedFS service in k8s, restarting the pods sequentially.
  2. A cronjob then runs volume.fix.replication.
    Because the pods are restarting, it starts fixing arbitrary volume IDs, and the copy is interrupted:
replicating volume 108 010 from 10.1.1.2:8080 to dataNode 10.1.1.5:8080 ...
error: copying from 10.1.1.2:8080 => 10.1.1.5:8080 : rpc error: code = Unavailable desc = transport is closing
  3. The restarts continue and reach DataNode 10.1.1.5:8080; the broken copy of volume 108 stays on that DataNode.
  4. The cronjob runs volume.fix.replication again:
volume 108 replication 010, but over replicated +3
deleting volume 108 from 10.1.1.3:8080 ...
  5. In the end, there are two volumes with ID 108 and different sizes:
      DataNode 10.1.1.5:8080 volume:19/22 active:18 free:3 remote:0
        volume id:108 size:20591935488 collection:"logs" file_count:43959 delete_count:11191 deleted_byte_count:23748792802 read_only:true replica_placement:10 version:3 compact_revision:1 modified_at_second:1603826757
      DataNode 10.1.1.2:8080 volume:19/22 active:19 free:3 remote:0
        volume id:108 size:89633122384 collection:"logs" file_count:43961 delete_count:11189 deleted_byte_count:23748752094 replica_placement:10 version:3 compact_revision:1 modified_at_second:1603762458
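
A divergence like the one above could be detected automatically. The following is only a rough sketch, assuming the listing keeps the `DataNode ...` / `volume id:... size:...` layout shown in this report; `parse_volume_list` and `divergent_volumes` are hypothetical helper names, not part of SeaweedFS.

```python
import re
from collections import defaultdict

# Matches key:value pairs such as size:123 or collection:"logs".
FIELD_RE = re.compile(r'(\w+):("[^"]*"|\S+)')

# The two replicas of volume 108 quoted in this report (trailing fields omitted).
SAMPLE = '''
DataNode 10.1.1.5:8080 volume:19/22 active:18 free:3 remote:0
  volume id:108 size:20591935488 collection:"logs" file_count:43959 read_only:true
DataNode 10.1.1.2:8080 volume:19/22 active:19 free:3 remote:0
  volume id:108 size:89633122384 collection:"logs" file_count:43961
'''

def parse_volume_list(text):
    """Return {volume_id: {data_node: size_in_bytes}} from a listing like SAMPLE."""
    replicas = defaultdict(dict)
    node = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("DataNode "):
            # Remember which node the following volume lines belong to.
            node = line.split()[1]
        elif line.startswith("volume id:") and node is not None:
            fields = dict(FIELD_RE.findall(line))
            replicas[int(fields["id"])][node] = int(fields["size"])
    return dict(replicas)

def divergent_volumes(replicas):
    """Keep only the volume IDs whose replicas report different sizes."""
    return {vid: sizes for vid, sizes in replicas.items()
            if len(set(sizes.values())) > 1}

if __name__ == "__main__":
    for vid, sizes in divergent_volumes(parse_volume_list(SAMPLE)).items():
        print(f"volume {vid} differs across replicas: {sizes}")
```

Comparing sizes alone is a coarse check (replicas that are still being written may differ briefly), so a real check would probably also compare `file_count` or skip volumes that are not read-only.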
@chrislusf
Collaborator

Added a mechanism to avoid incomplete volume files if restarted in the middle.
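
The comment does not describe how the mechanism works. One common way to avoid leaving incomplete files behind, shown here purely as an illustration and not as the actual SeaweedFS change, is to copy into a temporary file and rename it into place only after the copy has finished. `copy_atomically` is a hypothetical name.

```python
import os
import shutil
import tempfile

def copy_atomically(src_path, dst_path):
    """Copy src_path to dst_path so that dst_path is never a partial file.

    Data is written to a temporary file in the destination directory and
    renamed over dst_path only once the copy is complete.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst_path))
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp, open(src_path, "rb") as src:
            shutil.copyfileobj(src, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        # The rename is atomic when both paths are on the same filesystem.
        os.replace(tmp_path, dst_path)
    except BaseException:
        # A failed copy leaves nothing behind under the final name.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process is killed outright, the cleanup branch never runs, so a stale `.tmp` file can remain. It is never mistaken for a finished volume, though, and can be removed on startup.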
