Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file handle read 404 Not Found, missung file chunk after upload via filer.copy #2162

Closed
Suika opened this issue Jun 27, 2021 · 8 comments
Closed

Comments

@Suika
Copy link
Contributor

Suika commented Jun 27, 2021

Describe the bug
Continuation of #2154
So, this time I had uploaded 11k files via filer.copy and once again a few files error out.

weed -v 4 -logdir /tmp/wlogs filer.copy * http://localhost:8888/buckets/client_files/ >> /tmp/wlogs/somelog.log

from log:

Log file created at: 2021/06/27 15:47:36
Running on machine: suika
Binary: Built with gc go1.16.5 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss threadid file:line] msg
I0627 15:47:36 23621 config.go:31] Reading : Config File "security" Not Found in "[/snapraid/storage_c2/client_files /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"

From filer.copy:

uploading 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png in 6 chunks ...
uploaded 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png-2 to http://172.27.0.5:8080/187,0671b908f6fb6de1 [4194304,8388608)
uploaded 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png-1 to http://172.27.0.5:8080/187,0671b9057a844c4c [0,4194304)
uploaded 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png-6 to http://172.27.0.5:8080/187,0671b9049ca2abd9 [20971520,22320198)
uploaded 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png-3 to http://172.27.0.5:8080/187,0671b9076770557b [8388608,12582912)
uploaded 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png-4 to http://172.27.0.5:8080/187,0671b909236e4258 [12582912,16777216)
uploaded 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png-5 to http://172.27.0.5:8080/187,0671b90a1e5eea3b [16777216,20971520)
copied f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png => http://localhost:8888/buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png

From mount:

E0627 16:44:08 15798 reader_at.go:146] fetching chunk &{FileId:187,0671b9057a844c4c Offset:0 Size:4194304 LogicOffset:0 ChunkSize:4194304 CipherKey:[] IsGzipped:false}: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 filehandle.go:150] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
W0627 16:44:08 15798 filehandle.go:90] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png 0: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 reader_at.go:146] fetching chunk &{FileId:187,0671b9057a844c4c Offset:0 Size:4194304 LogicOffset:0 ChunkSize:4194304 CipherKey:[] IsGzipped:false}: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 filehandle.go:150] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
W0627 16:44:08 15798 filehandle.go:90] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png 0: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 reader_at.go:146] fetching chunk &{FileId:187,0671b9057a844c4c Offset:0 Size:4194304 LogicOffset:0 ChunkSize:4194304 CipherKey:[] IsGzipped:false}: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 filehandle.go:150] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
W0627 16:44:08 15798 filehandle.go:90] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png 0: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 reader_at.go:146] fetching chunk &{FileId:187,0671b9057a844c4c Offset:0 Size:4194304 LogicOffset:0 ChunkSize:4194304 CipherKey:[] IsGzipped:false}: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 filehandle.go:150] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
W0627 16:44:08 15798 filehandle.go:90] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png 0: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 reader_at.go:146] fetching chunk &{FileId:187,0671b9057a844c4c Offset:0 Size:4194304 LogicOffset:0 ChunkSize:4194304 CipherKey:[] IsGzipped:false}: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 filehandle.go:150] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
W0627 16:44:08 15798 filehandle.go:90] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png 0: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 reader_at.go:146] fetching chunk &{FileId:187,0671b9057a844c4c Offset:0 Size:4194304 LogicOffset:0 ChunkSize:4194304 CipherKey:[] IsGzipped:false}: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
E0627 16:44:08 15798 filehandle.go:150] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found
W0627 16:44:08 15798 filehandle.go:90] file handle read /buckets/client_files/f59/599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png 0: http://172.27.0.5:8080/187,0671b9057a844c4c?readDeleted=true: 404 Not Found

I did not use other tools, I uploaded it in one go. So something is not right. If this is happening now, I wonder if this happens with S3 and other parts. But will only be able to check on the next version.

System Setup

Expected behavior
File to no have errors when uploaded via filer.copy. If needed, double check if the uplaoded chunk is really uploaded.
Sure, will slow down the whole process, but data loss is kinda bad. Tho I will only be able to say it in the next release.

@chrislusf
Copy link
Collaborator

chrislusf commented Jun 27, 2021

To confirm:

  • Are there any updates or deletions at the same time?
  • Are the files copied mostly more than 4MB? What is the average size?
  • Do you have volume server logs?

btw: https://github.com/chrislusf/seaweedfs/releases/dev has the latest dev version. You can use the volume.fsck in weed shell directly without reinstall or restart the servers.

@Suika
Copy link
Contributor Author

Suika commented Jun 28, 2021

Are there any updates or deletions at the same time?

If you mean that another application is uploading files to seaweed, then yes. There is an application that is pushing files via S3 every other second.

Are the files copied mostly more than 4MB? What is the average size?

Highly variable. Range from a few kb to 300mb, but most ranges are between 1 and 6mb.

Do you have volume server logs?

Yes, checked them for any new entreis at the time of the uplaod, but nothing to be found there. That's why I didn't include them. And I mean that there was nothing.

Running volume.fsck right now. Will reply in ~12h.

@Suika
Copy link
Contributor Author

Suika commented Jun 28, 2021

> volume.fsck
volume:5        entries:466072  orphan:4        0.00%   4769883B
volume:11       entries:449755  orphan:18       0.00%   28946002B
volume:7        entries:455311  orphan:8        0.00%   11050412B
volume:32       entries:337661  orphan:21       0.01%   32366239B
volume:14       entries:619639  orphan:32       0.01%   44208469B
volume:28       entries:134106  orphan:2        0.00%   2182196B
volume:29       entries:142691  orphan:4        0.00%   6697815B
volume:41       entries:320197  orphan:2        0.00%   1416169B
volume:6        entries:454435  orphan:7        0.00%   7646521B
volume:8        entries:455980  orphan:10       0.00%   17340541B
volume:174      entries:113130  orphan:1        0.00%   373592B
volume:12       entries:439289  orphan:6        0.00%   5093745B
volume:19       entries:587281  orphan:3        0.00%   4205545B
volume:21       entries:582048  orphan:2        0.00%   3460775B
volume:13       entries:497765  orphan:24       0.00%   33330143B
volume:183      entries:327244  orphan:2        0.00%   4194333B
volume:34       entries:367142  orphan:33       0.01%   52605823B
volume:2        entries:466767  orphan:8        0.00%   9523889B
volume:10       entries:447580  orphan:19       0.00%   20058589B
volume:26       entries:134067  orphan:7        0.01%   13184032B
volume:22       entries:564770  orphan:6        0.00%   11518666B
volume:44       entries:320326  orphan:1        0.00%   2342469B
volume:23       entries:226212  orphan:4        0.00%   8244137B
volume:20       entries:589109  orphan:3        0.00%   3186914B
volume:15       entries:620767  orphan:7        0.00%   8951507B
volume:30       entries:133763  orphan:2        0.00%   4047186B
volume:36       entries:460390  orphan:83       0.02%   127218576B
volume:18       entries:604574  orphan:4        0.00%   4836475B
volume:27       entries:132598  orphan:8        0.01%   12605766B
volume:181      entries:337697  orphan:2        0.00%   8388666B
volume:33       entries:343184  orphan:1        0.00%   883660B
volume:24       entries:136207  orphan:3        0.00%   5935481B
volume:35       entries:360159  orphan:29       0.01%   49237044B
volume:42       entries:320821  orphan:3        0.00%   978574B
volume:40       entries:308500  orphan:33       0.01%   47597002B
volume:39       entries:403129  orphan:28       0.01%   45658789B
volume:4        entries:476477  orphan:7        0.00%   8935252B
volume:48       entries:46649091        orphan:8        0.00%   28766B
volume:37       entries:500886  orphan:83       0.02%   136111080B
volume:175      entries:347687  orphan:24       0.01%   20283307B
volume:17       entries:602162  orphan:2        0.00%   1978152B
volume:38       entries:501009  orphan:90       0.02%   141566157B
volume:31       entries:276014  orphan:21       0.01%   35448827B
volume:25       entries:133963  orphan:3        0.00%   5870478B
volume:16       entries:603781  orphan:7        0.00%   9325909B
volume:184      entries:328584  orphan:1        0.00%   4194333B
volume:9        entries:454736  orphan:7        0.00%   10191176B
volume:3        entries:478338  orphan:3        0.00%   4127425B

Total           entries:98329766        orphan:686      0.00%   1022346487B

@Suika
Copy link
Contributor Author

Suika commented Jun 28, 2021

It would also be nice to know what files are broken, so I can try and see if they can be repaired.

@chrislusf
Copy link
Collaborator

Use the 2.56 version, and run this to see which file has missing chunks:

volume.fsck -findMissingChunksInFilerPath=/ -findMissingChunksInFiler -v

@Suika
Copy link
Contributor Author

Suika commented Jun 28, 2021

Ran the volume.fsck -findMissingChunksInFilerPath=/ -findMissingChunksInFiler -v, greped the log for 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d and it came out empty.
The file 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d.png is still giving me (Input/output error). The logs never show any errors from the directory that contains 599b49d0abe3b569748a15598744d1a648d368bec806bce9e63d10453f94087d0.

Ontop of that, the other application that is uploading data to seaweed via S3 has timestamps in the file nemes. The earliest file has a timestamp 1624800578106, a day ago. Which makes no sense with how some files were missing chunks long before.

Something doesn'T add up.

@Suika
Copy link
Contributor Author

Suika commented Jun 28, 2021

I've run volume.fsck against the directory where the broken file is and it retruned nothing. No error. This is not good. Will hop on slack later on.

@Suika
Copy link
Contributor Author

Suika commented Jul 4, 2021

The files are readable once again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants