volume balance #1878

Closed
LazyDBA247-Anyvision opened this issue Mar 9, 2021 · 14 comments

Comments

@LazyDBA247-Anyvision
Contributor

Hi,
I am using SeaweedFS 2.30 with the dir.idx option set on the volume servers.
Running volume.balance always fails with the same error, and if we restart the volume server afterwards the volume disappears. Is that because the .vif file is missing?

> lock
> volume.balance
  moving  volume testPerformance2_113 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
  moving  volume testPerformance2_114 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
  moving  volume 101 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
  moving  volume testPerformance1_112 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
> volume.balance --force
  moving  volume 101 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
2021/03/09 07:42:25 copying volume 101 from seaweedfs-volume-0.seaweedfs-volume:8080 to seaweedfs-volume-1.seaweedfs-volume:8080
error: copy volume 101 from seaweedfs-volume-0.seaweedfs-volume:8080 to seaweedfs-volume-1.seaweedfs-volume:8080: rpc error: code = Unknown desc = failed to copy /data/101.vif file: receiving /data/101.vif: rpc error: code = Unknown desc = open /data/101.vif: no such file or directory

Volume server 1 logs (new, empty server):

I0309 07:42:20     1 volume_grpc_client_to_master.go:205] volume server seaweedfs-volume-1.seaweedfs-volume:8080 heartbeat
I0309 07:42:25     1 volume_grpc_client_to_master.go:205] volume server seaweedfs-volume-1.seaweedfs-volume:8080 heartbeat
I0309 07:42:25     1 volume_grpc_copy.go:173] writing to /data/101.dat
I0309 07:42:25     1 volume_grpc_copy.go:173] writing to /idx/101.idx
I0309 07:42:25     1 volume_grpc_copy.go:173] writing to /data/101.vif
I0309 07:42:30     1 volume_grpc_client_to_master.go:205] volume server seaweedfs-volume-1.seaweedfs-volume:8080 heartbeat
I0309 07:42:35     1 volume_grpc_client_to_master.go:205] volume server seaweedfs-volume-1.seaweedfs-volume:8080 heartbeat
I0309 07:42:40     1 volume_grpc_client_to_master.go:205] volume server seaweedfs-volume-1.seaweedfs-volume:8080 heartbeat
I0309 07:42:45     1 disk_location.go:370] dir /data freePercent 18.54% < min 7.00%, isLowDiskSpace: false

Volume server 0 logs (source server):

I0309 07:42:22     1 volume.go:240] CollectStatus volume 110
I0309 07:42:25     1 volume_grpc_admin.go:149] volume mark readonly volume_id:101
I0309 07:42:25     1 volume_grpc_admin.go:164] volume mark writable volume_id:101
I0309 07:42:27     1 volume_grpc_client_to_master.go:205] volume server seaweedfs-volume-0.seaweedfs-volume:8080 heartbeat
I0309 07:42:27     1 volume.go:240] CollectStatus volume 101

image

If any more data is needed, I will gladly provide it.

Thanks!

@kmlebedev
Contributor

Yes, this is expected.
If the operation did not run to completion (there is no .vif file), the volume's .idx file is considered broken.
See commit e207620 and #1709.
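
To check for this condition before running volume.balance, one could scan a volume server's data directory for volumes that have a .dat file but no matching .vif file. The following is a minimal sketch and not part of SeaweedFS: the function name `volumes_missing_vif` is hypothetical, and it assumes the volume files sit directly in one directory and share a base name (as in /data/101.dat and /data/101.vif above).

```python
import os
import tempfile


def volumes_missing_vif(data_dir):
    """Return sorted base names that have a .dat file but no .vif file.

    Assumption: volume files live directly in data_dir and share a
    base name, e.g. 101.dat / 101.vif.
    """
    names = os.listdir(data_dir)
    dat = {n[:-len(".dat")] for n in names if n.endswith(".dat")}
    vif = {n[:-len(".vif")] for n in names if n.endswith(".vif")}
    return sorted(dat - vif)


if __name__ == "__main__":
    # Demo on a throwaway directory: volume 101 lacks its .vif file.
    with tempfile.TemporaryDirectory() as d:
        for name in ("101.dat", "101.idx", "110.dat", "110.idx", "110.vif"):
            open(os.path.join(d, name), "w").close()
        print(volumes_missing_vif(d))
```

Any base name this reports is a volume whose copy would be expected to fail with the "no such file or directory" error shown in the report.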

@LazyDBA247-Anyvision
Contributor Author

volume.list before restarting the volume pods:

> volume.list
Topology volumeSizeLimit:30000 MB hdd(volume:10/40000 active:10 free:39990 remote:0)
  DataCenter DefaultDataCenter hdd(volume:10/40000 active:10 free:39990 remote:0)
    Rack DefaultRack hdd(volume:10/40000 active:10 free:39990 remote:0)
      DataNode seaweedfs-volume-0.seaweedfs-volume:8080 hdd(volume:10/20000 active:10 free:19990 remote:0)
        Disk hdd(volume:10/20000 active:10 free:19990 remote:0)
          volume id:100 size:59610952 file_count:17 version:3 modified_at_second:1615275647 
          volume id:101 size:36711752 file_count:9 version:3 modified_at_second:1615275589 
          volume id:107 size:8 collection:"testPerformance" version:3 compact_revision:1 modified_at_second:1615275515 
          volume id:110 size:21436824 collection:"testPerformance1" file_count:2594 version:3 modified_at_second:1615275533 
          volume id:111 size:21552520 collection:"testPerformance1" file_count:2608 version:3 modified_at_second:1615275533 
          volume id:112 size:20651744 collection:"testPerformance1" file_count:2499 version:3 modified_at_second:1615275533 
          volume id:113 size:29618184 collection:"testPerformance2" file_count:3584 version:3 modified_at_second:1615275610 
          volume id:114 size:30386736 collection:"testPerformance2" file_count:3677 version:3 modified_at_second:1615275610 
          volume id:115 size:30882576 collection:"testPerformance2" file_count:3737 version:3 modified_at_second:1615275610 
          volume id:116 size:30675976 collection:"testPerformance2" file_count:3712 version:3 modified_at_second:1615275610 
        Disk hdd total size:281527272 file_count:22437 
      DataNode seaweedfs-volume-0.seaweedfs-volume:8080 total size:281527272 file_count:22437 
      DataNode seaweedfs-volume-1.seaweedfs-volume:8080 hdd(volume:0/20000 active:0 free:20000 remote:0)
        Disk hdd(volume:0/20000 active:0 free:20000 remote:0)
        Disk hdd total size:0 file_count:0 
      DataNode seaweedfs-volume-1.seaweedfs-volume:8080 total size:0 file_count:0 
    Rack DefaultRack total size:281527272 file_count:22437 
  DataCenter DefaultDataCenter total size:281527272 file_count:22437 
total size:281527272 file_count:22437 

volume.list after restarting both volume servers:

> volume.list
Topology volumeSizeLimit:30000 MB hdd(volume:0/40000 active:0 free:40000 remote:0)
  DataCenter DefaultDataCenter hdd(volume:0/40000 active:0 free:40000 remote:0)
    Rack DefaultRack hdd(volume:0/40000 active:0 free:40000 remote:0)
      DataNode seaweedfs-volume-0.seaweedfs-volume:8080 hdd(volume:0/20000 active:0 free:20000 remote:0)
        Disk hdd(volume:0/20000 active:0 free:20000 remote:0)
        Disk hdd total size:0 file_count:0 
      DataNode seaweedfs-volume-0.seaweedfs-volume:8080 total size:0 file_count:0 
      DataNode seaweedfs-volume-1.seaweedfs-volume:8080 hdd(volume:0/20000 active:0 free:20000 remote:0)
        Disk hdd(volume:0/20000 active:0 free:20000 remote:0)
        Disk hdd total size:0 file_count:0 
      DataNode seaweedfs-volume-1.seaweedfs-volume:8080 total size:0 file_count:0 
    Rack DefaultRack total size:0 file_count:0 
  DataCenter DefaultDataCenter total size:0 file_count:0 
total size:0 file_count:0 

ALL DATA LOST?
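
The two listings can also be compared mechanically. The sketch below is a hypothetical helper, not a SeaweedFS tool: it pulls the per-DataNode volume count out of volume.list text and reports nodes whose count dropped. The regular expression assumes the exact line format shown above (`DataNode <host> hdd(volume:N/M ...`), which may differ in other versions.

```python
import re

# Matches e.g. "DataNode host:8080 hdd(volume:10/20000 active:10 ..."
# Assumption: the format is exactly as printed in the listings above.
DATANODE_RE = re.compile(r"DataNode (\S+) hdd\(volume:(\d+)/(\d+)")


def volume_counts(listing):
    """Map each DataNode address to its reported volume count."""
    counts = {}
    for line in listing.splitlines():
        m = DATANODE_RE.search(line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


def lost_volumes(before, after):
    """Return {node: number of volumes lost} for nodes whose count dropped."""
    b, a = volume_counts(before), volume_counts(after)
    return {n: b[n] - a.get(n, 0) for n in b if a.get(n, 0) < b[n]}
```

Fed the two listings from this comment, `lost_volumes` would report 10 lost volumes on seaweedfs-volume-0 and nothing for seaweedfs-volume-1, which was already empty.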

@kmlebedev
Contributor

We need to understand why the .vif file is missing.

@LazyDBA247-Anyvision
Contributor Author

I am running at log level 4 and there is nothing relevant in the logs of the volume servers or the master.
I can reproduce this and check (with a screenshot) before trying to balance.

@kmlebedev
Contributor

It seems we need more diagnostics for this error, and we need to handle this failure case.

error: copy volume 101 from seaweedfs-volume-0.seaweedfs-volume:8080 to seaweedfs-volume-1.seaweedfs-volume:8080: rpc error: code = Unknown desc = failed to copy /data/101.vif file: receiving /data/101.vif: rpc error: code = Unknown desc = open /data/101.vif: no such file or directory

@LazyDBA247-Anyvision
Contributor Author

I can compile and test. Can you show me where to add more logging?

@kmlebedev
Contributor

It seems to me that this may be caused by the option that stores metadata on the SSD disk.

@LazyDBA247-Anyvision
Contributor Author

I will test without it.

@LazyDBA247-Anyvision
Contributor Author

Disabled the dir.idx setting:
[screenshot of the directory listing]

Same result:

> volume.balance
  moving  volume testPerformance3_548 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
  moving  volume testPerformance3_547 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
  moving  volume 542 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
  moving  volume 543 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
  moving  volume testPerformance4_553 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
  moving  volume testPerformance4_552 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
> volume.balance --force
  moving  volume 542 seaweedfs-volume-0.seaweedfs-volume:8080 => seaweedfs-volume-1.seaweedfs-volume:8080
2021/03/09 08:22:26 copying volume 542 from seaweedfs-volume-0.seaweedfs-volume:8080 to seaweedfs-volume-1.seaweedfs-volume:8080
error: copy volume 542 from seaweedfs-volume-0.seaweedfs-volume:8080 to seaweedfs-volume-1.seaweedfs-volume:8080: rpc error: code = Unknown desc = failed to copy /data/542.vif file: receiving /data/542.vif: rpc error: code = Unknown desc = open /data/542.vif: no such file or directory

@chrislusf
Collaborator

If the directory listing screenshot was taken before the move, do you know why there are no .vif files?

@LazyDBA247-Anyvision
Contributor Author

It was taken before the move, and no, I don't know why.
Another fresh 2.30 install also has no .vif files.

@chrislusf
Collaborator

OK. It seems a recent change introduced a problem where .vif files are not being generated.

@chrislusf
Collaborator

Added a fix that creates the missing .vif files.

@LazyDBA247-Anyvision
Contributor Author

Tested, works now. Thanks!
