
Buffered S3 reader #247

Merged · 11 commits · Dec 28, 2021
Conversation

@belltailjp (Member) commented Dec 26, 2021

This tries to fix #241.

What is done in this PR

The main purpose is to provide buffering in the S3 reader.

As noted in the TODO comment here, _ObjectReader provided too few features to be used with io.BufferedReader.
In this PR I implemented the readall and readinto methods (which BufferedReader calls) in _ObjectReader and made it a subclass of io.RawIOBase.

This is basically the same as how normal filesystem files are handled in Python: open("xxxx", "rb") returns a BufferedReader whose underlying raw stream is a FileIO object (an unbuffered raw stream for local files).
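
For illustration, here is a minimal sketch of that raw-stream pattern. The class name _RangeRawReader and the fetch_range(start, end) callable are assumptions for the example, not pfio's actual _ObjectReader or S3 API; the point is only that an io.RawIOBase subclass implementing readinto (plus seek/tell) is enough for io.BufferedReader to drive it.

import io


class _RangeRawReader(io.RawIOBase):
    """Hypothetical raw reader over a remote object (not pfio's _ObjectReader)."""

    def __init__(self, fetch_range, size):
        # fetch_range(start, end) is an assumed callable returning the bytes
        # of the range [start, end), e.g. backed by an S3 ranged GET.
        super().__init__()
        self._fetch_range = fetch_range
        self._size = size
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        elif whence == io.SEEK_END:
            self._pos = self._size + offset
        return self._pos

    def tell(self):
        return self._pos

    def readinto(self, b):
        # BufferedReader hands us a pre-allocated buffer b; fill it from the
        # current position and return the number of bytes written.
        n = min(len(b), self._size - self._pos)
        if n <= 0:
            return 0
        data = self._fetch_range(self._pos, self._pos + n)
        b[:len(data)] = data
        self._pos += len(data)
        return len(data)


# Wrapping the raw stream is then analogous to what open("xxxx", "rb") does with FileIO:
#   f = io.BufferedReader(_RangeRawReader(fetch, size), buffer_size=32 * 1024 * 1024)

Note that io.RawIOBase already provides a default readall built on top of readinto, which reads from the current position to EOF; that matches the consistency point discussed later in this PR.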

Several other changes are also included.

  • Add a **_ kwargs trap to each filesystem (see the sketch after this list).
    • S3 accepts "buffering" and some other options in open_url and from_url. Similarly, other filesystems may have their own specific arguments. Passing an option meant for one filesystem type to another raises an "unexpected keyword argument" error, which makes open_url and from_url calls filesystem-dependent and hurts pfio's transparency.
    • By adding **_ to __init__ of each filesystem, unsupported arguments are simply ignored.
    • The downsides would be:
      • Code readability: I need to provide sufficient documentation.
      • Debuggability: e.g., a typo in an option name would be silently ignored.
  • Slightly reorganized tests
    • I split some of the existing test cases according to their purpose, so that behavior with and without buffering is tested clearly.
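
A minimal sketch of the **_ trap idea, with hypothetical class names (not pfio's actual filesystem classes):

class LocalLikeFS:
    def __init__(self, root="", **_):
        # Options meant for other filesystems (e.g. buffering for S3)
        # are swallowed by **_ instead of raising TypeError.
        self.root = root


class S3LikeFS:
    def __init__(self, bucket="", buffering=-1, **_):
        self.bucket = bucket
        self.buffering = buffering


# The same call now works regardless of the backend:
local_fs = LocalLikeFS(root="/tmp", buffering=0)   # buffering is ignored
s3_fs = S3LikeFS(bucket="mybucket", buffering=0)   # buffering is honored

The flip side, as noted above, is that a misspelled option name is also silently ignored.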

Usage

>>> with pfio.v2.from_url("s3://<bucket>/") as fs:
...     fs.open("foo.dat")   # returns a BufferedReader (which wraps `_ObjectReader`)
>>> with pfio.v2.open_url("s3://<bucket>/", "rb") as f:
...     f                    # also a BufferedReader

Buffering can be controlled by the buffering option of __init__ (suggested in #247 (comment)):

  • buffering=0: No buffering
  • buffering=-1 (default): Use the default buffer size of 32MiB
  • buffering>0: Use the specified value (integer) as the buffer size

The buffering option can also be passed to from_url and open_url.

>>> with pfio.v2.from_url("s3://<bucket>/", buffering=0) as fs:
...     fs.open("foo.dat")   # returns an `_ObjectReader`
>>> with pfio.v2.open_url("s3://<bucket>/", "rb", buffering=0) as f:
...     f                    # also an `_ObjectReader`

The buffering option has no effect when opening a file in text mode or in write mode.
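
As a rough sketch of this dispatch (an assumption for illustration, not the exact PR code; the helper name is hypothetical), the buffering value could be translated like this when opening a file:

import io

DEFAULT_BUFFER_SIZE = 32 * 1024 * 1024  # 32 MiB, the default chosen in this PR


def _wrap_with_buffer(raw, mode, buffering):
    # buffering only matters for binary reads; text mode and write mode
    # return the stream unchanged, and buffering=0 disables buffering entirely.
    if "b" not in mode or "r" not in mode or buffering == 0:
        return raw
    size = DEFAULT_BUFFER_SIZE if buffering < 0 else buffering
    return io.BufferedReader(raw, buffer_size=size)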

Performance

I confirmed a significant speedup when loading the same test data (a PIL image) as in #241.

pfio 2.0.1 (brought from #241)

>>> %%time
>>> with pfio.v2.open_url('s3://<bucket>/DSC07917.jpg', 'rb') as f:
...     print(PIL.Image.open(f).size)
(2626, 1776)
CPU times: user 642 ms, sys: 56.5 ms, total: 698 ms
Wall time: 2.22 s

pfio 2.0.1 with BytesIO (brought from #241 as well)

>>> %%time
>>> with pfio.v2.open_url('s3://<bucket>/DSC07917.jpg', 'rb') as f:
...     print(PIL.Image.open(io.BytesIO(f.read())).size)
(2626, 1776)
CPU times: user 46.9 ms, sys: 0 ns, total: 46.9 ms
Wall time: 108 ms

This PR

>>> %%time
>>> with pfio.v2.open_url('s3://<bucket>/DSC07917.jpg', 'rb') as f:
...     print(PIL.Image.open(f).size)
(2626, 1776)
CPU times: user 51.5 ms, sys: 0 ns, total: 51.5 ms
Wall time: 115 ms

Thanks to the buffering, it now performs at a speed very close to that of the BytesIO workaround.

Discussion

I set pfio.v2.s3.DEFAULT_BUFFER_SIZE to 32 MiB.

When handling local (POSIX) files, Python uses 8192 bytes as the default buffer size ([io.DEFAULT_BUFFER_SIZE](https://docs.python.org/3/library/io.html#io.DEFAULT_BUFFER_SIZE)).
However, for S3 the per-request HTTP overhead is far larger than for a native filesystem, so the buffer size should be much larger for efficiency. I thought 32 MiB would be a good balance, but it may be better to run some micro-benchmarks.

If you can suggest a specific value or a way to identify the best number, please advise 🙇.

Another fix

_ObjectReader.readall was previously implemented as follows:

    def readall(self):
        self.seek(0)         # <-- seeks back to the head
        return self.read(-1)

However, this behavior (specifically, seeking back to the head as marked above) is inconsistent with normal filesystems:

>>> import os
>>> f = open('hoge.dat', 'rb')   # hoge.dat contains b'0123456789'
>>> print(f.read())
b'0123456789'
>>> f.seek(5, os.SEEK_SET)
5
>>> print('tell() =', f.raw.tell())
tell() = 5
>>> print(f.raw.readall())
b'56789'

This behavior prevented the buffering from working properly; this PR fixes it as well.
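
A FileIO-consistent version reads from the current position only; a minimal sketch of what such a fix could look like (not the exact PR diff):

    def readall(self):
        # No seek(0): read from the current offset to EOF, like FileIO.readall.
        return self.read(-1)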

References

The official Python documentation provides far too little information here. These resources helped me a lot in learning how and what to implement in readinto.

@belltailjp belltailjp changed the title [WIP] Buffering S3 reader [WIP] Buffered S3 reader Dec 26, 2021
@kuenishi kuenishi added this to the 2.1.0 milestone Dec 27, 2021
@kuenishi kuenishi self-requested a review December 27, 2021 07:20
@kuenishi kuenishi added the cat:enhancement Implementation that does not break interfaces. label Dec 27, 2021
@kuenishi (Member) commented:

Thank you for the very thoughtful suggestion backed by real benchmarks. I welcome the idea; it's what I wanted to do, as indicated in the TODO comment.

My suggestion would be to add a buffering keyword argument, as in the open() builtin and the other filesystems, to align the interface with respect to such options, e.g. open_url(path, buffering=-1). See also Local#open and Hdfs#open. The documentation of open() says:

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer.

Thus the default value of 32 MiB would be nice, but I want to leave a means for app developers to configure the buffer size.

@belltailjp force-pushed the make-s3-buffered branch 2 times, most recently from 1a085c6 to 52d5cf5 on December 27, 2021 15:20
@belltailjp (Member, Author): /test

@pfn-ci-bot: Successfully created a job for commit 52d5cf5.

@belltailjp force-pushed the make-s3-buffered branch 2 times, most recently from 7419f2e to b322a63 on December 27, 2021 15:24
@belltailjp (Member, Author): /test

@pfn-ci-bot: Successfully created a job for commit b322a63.

@belltailjp (Member, Author) commented Dec 28, 2021

Although the results heavily depend on the S3 configuration, network conditions, and the content, I ran a small toy single-stream read-throughput benchmark with varying file sizes and buffer sizes on our internal Ozone cluster.

Benchmark code
import os
import pickle
import random
import string
import time

import pfio
import numpy as np


def rand_str(n):
    return ''.join(random.choices(string.ascii_letters + string.digits, k=n))


def bench(path, n_loop=5):
    print("| File size (MiB) | buffer size (MiB) | time (s) | stddev (s) | throughput (MiB/s) |")
    print("|:----|:----|:----|:----|:----|")

    for k in range(-2, 13, 2):
        # approx 2^k MB of pickle data.
        size = int(1024 * (2 ** k))
        data = {rand_str(32): rand_str(1024) for _ in range(size)}
        with pfio.v2.open_url(path, 'wb') as f:
            pickle.dump(data, f)
        with pfio.v2.from_url('s3://<bucket>/') as fs:
            actual_size = fs.stat(os.path.basename(path)).size

        for s in range(11):
            buffer_size = int(1024 * 1024 * (2 ** s))

            times = []
            for _ in range(n_loop):
                with pfio.v2.open_url(path, 'rb', buffering=buffer_size) as f:
                    before = time.time()
                    loaded_data = pickle.load(f)
                    assert list(loaded_data.keys())[0] == list(data.keys())[0]
                    after = time.time()
                    times.append(after - before)
            times = np.array(times)
            throughput = (actual_size / times.mean()) / (1024 ** 2)
            print('| {:.2f} | {:.0f} | {:.3f} | {:.3f} | {:.3f} |'
                  .format(actual_size / (1024 ** 2), buffer_size / (1024 ** 2),
                          times.mean(), times.std(), throughput))


bench("s3://<bucket>/bench.pkl")
The full output of the script:

| File size (MiB) | Buffer size (MiB) | Time (s) | Stddev (s) | Throughput (MiB/s) |
|:----|:----|:----|:----|:----|
| 0.26 | 1 | 0.061 | 0.071 | 4.298 |
| 0.26 | 2 | 0.025 | 0.002 | 10.541 |
| 0.26 | 4 | 0.026 | 0.001 | 10.090 |
| 0.26 | 8 | 0.029 | 0.001 | 8.901 |
| 0.26 | 16 | 0.035 | 0.004 | 7.386 |
| 0.26 | 32 | 0.047 | 0.002 | 5.546 |
| 0.26 | 64 | 0.068 | 0.001 | 3.834 |
| 0.26 | 128 | 0.099 | 0.006 | 2.634 |
| 0.26 | 256 | 0.175 | 0.001 | 1.484 |
| 0.26 | 512 | 0.315 | 0.019 | 0.826 |
| 0.26 | 1024 | 0.590 | 0.038 | 0.441 |
| 1.04 | 1 | 0.110 | 0.010 | 9.498 |
| 1.04 | 2 | 0.039 | 0.002 | 26.376 |
| 1.04 | 4 | 0.038 | 0.001 | 27.619 |
| 1.04 | 8 | 0.038 | 0.001 | 27.375 |
| 1.04 | 16 | 0.044 | 0.002 | 23.856 |
| 1.04 | 32 | 0.058 | 0.005 | 17.891 |
| 1.04 | 64 | 0.076 | 0.005 | 13.619 |
| 1.04 | 128 | 0.106 | 0.003 | 9.785 |
| 1.04 | 256 | 0.179 | 0.002 | 5.797 |
| 1.04 | 512 | 0.324 | 0.011 | 3.207 |
| 1.04 | 1024 | 0.580 | 0.031 | 1.792 |
| 4.16 | 1 | 0.321 | 0.007 | 12.982 |
| 4.16 | 2 | 0.212 | 0.012 | 19.615 |
| 4.16 | 4 | 0.142 | 0.010 | 29.297 |
| 4.16 | 8 | 0.096 | 0.018 | 43.422 |
| 4.16 | 16 | 0.114 | 0.021 | 36.466 |
| 4.16 | 32 | 0.092 | 0.007 | 45.190 |
| 4.16 | 64 | 0.102 | 0.007 | 40.959 |
| 4.16 | 128 | 0.122 | 0.002 | 33.966 |
| 4.16 | 256 | 0.190 | 0.007 | 21.931 |
| 4.16 | 512 | 0.334 | 0.007 | 12.454 |
| 4.16 | 1024 | 0.579 | 0.023 | 7.189 |
| 16.64 | 1 | 1.251 | 0.048 | 13.298 |
| 16.64 | 2 | 0.798 | 0.053 | 20.866 |
| 16.64 | 4 | 0.526 | 0.018 | 31.658 |
| 16.64 | 8 | 0.411 | 0.024 | 40.462 |
| 16.64 | 16 | 0.366 | 0.011 | 45.504 |
| 16.64 | 32 | 0.304 | 0.015 | 54.684 |
| 16.64 | 64 | 0.305 | 0.020 | 54.649 |
| 16.64 | 128 | 0.349 | 0.047 | 47.695 |
| 16.64 | 256 | 0.323 | 0.020 | 51.504 |
| 16.64 | 512 | 0.386 | 0.005 | 43.116 |
| 16.64 | 1024 | 0.645 | 0.043 | 25.818 |
| 66.57 | 1 | 5.125 | 0.120 | 12.989 |
| 66.57 | 2 | 3.048 | 0.107 | 21.842 |
| 66.57 | 4 | 2.059 | 0.041 | 32.334 |
| 66.57 | 8 | 1.632 | 0.074 | 40.787 |
| 66.57 | 16 | 1.291 | 0.074 | 51.580 |
| 66.57 | 32 | 1.315 | 0.118 | 50.609 |
| 66.57 | 64 | 1.227 | 0.082 | 54.256 |
| 66.57 | 128 | 1.140 | 0.085 | 58.408 |
| 66.57 | 256 | 1.138 | 0.039 | 58.519 |
| 66.57 | 512 | 1.217 | 0.052 | 54.716 |
| 66.57 | 1024 | 1.302 | 0.090 | 51.142 |
| 266.29 | 1 | 20.679 | 0.271 | 12.877 |
| 266.29 | 2 | 12.699 | 0.095 | 20.969 |
| 266.29 | 4 | 8.538 | 0.075 | 31.188 |
| 266.29 | 8 | 6.700 | 0.307 | 39.744 |
| 266.29 | 16 | 5.461 | 0.077 | 48.765 |
| 266.29 | 32 | 5.011 | 0.161 | 53.140 |
| 266.29 | 64 | 5.149 | 0.187 | 51.720 |
| 266.29 | 128 | 5.067 | 0.272 | 52.549 |
| 266.29 | 256 | 4.708 | 0.156 | 56.562 |
| 266.29 | 512 | 4.794 | 0.304 | 55.548 |
| 266.29 | 1024 | 4.493 | 0.077 | 59.262 |
| 1065.15 | 1 | 87.949 | 1.321 | 12.111 |
| 1065.15 | 2 | 52.801 | 0.467 | 20.173 |
| 1065.15 | 4 | 35.530 | 0.441 | 29.979 |
| 1065.15 | 8 | 27.057 | 0.421 | 39.367 |
| 1065.15 | 16 | 21.776 | 0.416 | 48.914 |
| 1065.15 | 32 | 20.257 | 0.650 | 52.583 |
| 1065.15 | 64 | 20.440 | 0.476 | 52.111 |
| 1065.15 | 128 | 20.220 | 0.423 | 52.679 |
| 1065.15 | 256 | 19.298 | 0.626 | 55.194 |
| 1065.15 | 512 | 19.216 | 0.646 | 55.431 |
| 1065.15 | 1024 | 19.457 | 1.350 | 54.745 |

[Plot of the benchmark results: throughput vs. buffer size, one curve per file size]

When reading relatively small files (~1 MiB), a relatively small buffer performs best.
For files larger than a few megabytes, a buffer size of 16 MiB or more becomes efficient. Making the buffer as large as possible seems better for very large files in terms of throughput, but the difference is slight. Considering the performance on small files as well as the wasted memory, it is better not to use an excessively large buffer.

In conclusion, a buffer size of around 16 MiB to 64 MiB is the best in my experimental setup.

@belltailjp (Member, Author): /test

@pfn-ci-bot: Successfully created a job for commit 6405725.

@belltailjp belltailjp changed the title [WIP] Buffered S3 reader Buffered S3 reader Dec 28, 2021
@kuenishi (Member) left a review comment:

The code is awesome and looks very good... but please fix the CI failure from the annoying isort check.

@belltailjp (Member, Author): /test

@pfn-ci-bot: Successfully created a job for commit af6f2f5.

@kuenishi kuenishi merged commit 0f59be9 into pfnet:master Dec 28, 2021
@kuenishi (Member) commented:

TODO: if the benchmark result with 16 MB isn't as good as expected, we may change the default size later. cc: @belltailjp

Labels
cat:enhancement Implementation that does not break interfaces.

Successfully merging this pull request may close these issues.

File-like object returned from open_url is extremely slow with S3