Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected out of order reads streaming with block size 32768 #26

Closed
lrowe opened this issue Oct 23, 2015 · 6 comments
Closed

Unexpected out of order reads streaming with block size 32768 #26

lrowe opened this issue Oct 23, 2015 · 6 comments

Comments

@lrowe
Copy link
Contributor

lrowe commented Oct 23, 2015

Trying to dig into why gunzip was slow, I enabled the debug logging while reading a file:

sudo goofys -o allow_other --dir-mode 0555 --file-mode 0444 --stat-cache-ttl 0 --debug_fuse --debug_s3 <bucket> /mnt

When reading with a block size of 131072 (the same as cat) I see 311 responses logged with status 206 Partial Content.

$ time dd if=/mnt/TEST.fastq.gz of=/dev/null bs=131072
12414+1 records in
12414+1 records out
1627202877 bytes (1.6 GB) copied, 61.5945 s, 26.4 MB/s

real    1m1.812s
user    0m0.024s
sys 0m0.644s

When reading with a block size of 32768 (the same as gunzip --stdout) I see 621 responses logged with status 206 Partial Content.

$ time dd if=/mnt/TEST.fastq.gz of=/dev/null bs=32768
49658+1 records in
49658+1 records out
1627202877 bytes (1.6 GB) copied, 75.1237 s, 21.7 MB/s

real    1m15.252s
user    0m0.008s
sys 0m0.804s

The difference in requests seems to be accounted for by 310 log lines reporting:

out of order read: 6 [TEST.fastq.gz] [5242880 5308416]
@kahing
Copy link
Owner

kahing commented Oct 25, 2015

could you upload the log somewhere? I ran some smaller tests and don't see any out of order logs.

out of order read: 6 [TEST.fastq.gz] [5242880 5308416]

indicates that read is requesting offset 5242880 when the expected offset is 5308416 (last byte read was 5308415). Which is very strange given that you are just using dd.

@lrowe
Copy link
Contributor Author

lrowe commented Oct 25, 2015

Here's the log while running dd if=/mnt/TEST.fastq.gz of=/dev/null bs=32768: https://gist.github.com/lrowe/14dc52df1052d33f799c

This is on Ubuntu 15.10 with goofyfs at commit c3de75e

@kahing
Copy link
Owner

kahing commented Oct 25, 2015

I don't have time to look into it in details today, but my initial hunch is that they are innocuous logging statements because they seem to happen at 5MB boundaries (which is the hardcoded range read size).

@lrowe
Copy link
Contributor Author

lrowe commented Oct 26, 2015

I count almost double the number of HTTP requests being made when using 32KB reads, so I don't think it's innocuous. I've redone the tests at commit a8fe0ae reading only the first 33MB so I can paste full logs into the gist: https://gist.github.com/lrowe/275da6f2fc04bfa3bf2f

A search for HTTP/1.1 206 Partial Content shows 7 matches for the 32KB block size and 13 matches for the 128KB block size.

Fuse will read ahead by increasing read sizes up to a maximum of 128KB. In the dd bs=128K case there is no read ahead as the reads are aligned with the maximum read size. In the dd bs=32K case the first read is rounded up to 64KB and then subsequently to 128KB, so its reads are offset around the 5MB boundary. https://gist.github.com/lrowe/275da6f2fc04bfa3bf2f#file-_relevant-log

It looks like there is a race condition because fh.reader is not closed and set to nil when reaching EOF in tryReadAll.

@kahing kahing closed this as completed in 3379a87 Oct 26, 2015
@kahing
Copy link
Owner

kahing commented Oct 26, 2015

I think I fixed the problem but since I had problem reproducing it, could you double check for me? Thanks a lot for your investigations!

kahing added a commit that referenced this issue Oct 26, 2015
@lrowe
Copy link
Contributor Author

lrowe commented Oct 26, 2015

Thanks! Updating to current master I see that the expected number of requests and the 'out of order' warnings are gone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants