
Buffered S3 reader #247

Merged · 11 commits · Dec 28, 2021
Conversation

@belltailjp (Member) commented Dec 26, 2021

This tries to fix #241.

What is done in this PR

The main purpose is to provide buffering in the S3 reader.

As noted in the TODO comment here, _ObjectReader provided too few features to be used with io.BufferedReader.
In this PR I implemented the readall and readinto methods (which BufferedReader calls) in _ObjectReader and made it a subclass of io.RawIOBase.

This is basically the same as how normal filesystem files are handled in Python: open("xxxx", "rb") returns a BufferedReader whose underlying raw stream is a FileIO object (an unbuffered raw stream for local files).
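
For illustration, here is a minimal sketch of that raw-stream pattern. The class name _RangeRawReader and the fetch_range(start, end) callable are assumptions for the example, not pfio's actual _ObjectReader or S3 API; the point is only that an io.RawIOBase subclass implementing readinto (plus seek/tell) is enough for io.BufferedReader to drive it.

import io


class _RangeRawReader(io.RawIOBase):
    """Hypothetical raw reader over a remote object (not pfio's _ObjectReader)."""

    def __init__(self, fetch_range, size):
        # fetch_range(start, end) is an assumed callable returning the bytes
        # of the range [start, end), e.g. backed by an S3 ranged GET.
        super().__init__()
        self._fetch_range = fetch_range
        self._size = size
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        elif whence == io.SEEK_END:
            self._pos = self._size + offset
        return self._pos

    def tell(self):
        return self._pos

    def readinto(self, b):
        # BufferedReader hands us a pre-allocated buffer b; fill it from the
        # current position and return the number of bytes written.
        n = min(len(b), self._size - self._pos)
        if n <= 0:
            return 0
        data = self._fetch_range(self._pos, self._pos + n)
        b[:len(data)] = data
        self._pos += len(data)
        return len(data)


# Wrapping the raw stream is then analogous to what open("xxxx", "rb") does with FileIO:
#   f = io.BufferedReader(_RangeRawReader(fetch, size), buffer_size=32 * 1024 * 1024)

Note that io.RawIOBase already provides a default readall built on top of readinto, which reads from the current position to EOF; that matches the consistency point discussed later in this PR.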

Several other changes are also included.

  • Add a **_ kwargs trap to each filesystem (see the sketch after this list).
    • S3 accepts "buffering" and some other options in open_url and from_url. Similarly, other filesystems may have their own specific arguments. Passing an option meant for one filesystem type to another raises an "unexpected keyword argument" error, which makes open_url and from_url calls filesystem-dependent and hurts pfio's transparency.
    • By adding **_ to __init__ of each filesystem, unsupported arguments are simply ignored.
    • The downsides would be:
      • Code readability: I need to provide sufficient documentation.
      • Debuggability: e.g., a typo in an option name would be silently ignored.
  • Slightly reorganized tests
    • I split some of the existing test cases according to their purpose, so that behavior with and without buffering is tested clearly.
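
A minimal sketch of the **_ trap idea, with hypothetical class names (not pfio's actual filesystem classes):

class LocalLikeFS:
    def __init__(self, root="", **_):
        # Options meant for other filesystems (e.g. buffering for S3)
        # are swallowed by **_ instead of raising TypeError.
        self.root = root


class S3LikeFS:
    def __init__(self, bucket="", buffering=-1, **_):
        self.bucket = bucket
        self.buffering = buffering


# The same call now works regardless of the backend:
local_fs = LocalLikeFS(root="/tmp", buffering=0)   # buffering is ignored
s3_fs = S3LikeFS(bucket="mybucket", buffering=0)   # buffering is honored

The flip side, as noted above, is that a misspelled option name is also silently ignored.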

Usage

>>> with pfio.v2.from_url("s3://<bucket>/") as fs:
...     fs.open("foo.dat")   # returns a BufferedReader (which wraps `_ObjectReader`)
>>> with pfio.v2.open_url("s3://<bucket>/", "rb") as f:
...     f                    # also a BufferedReader

Buffering can be controlled by the buffering option of __init__ (suggested in #247 (comment)):

  • buffering=0: No buffering
  • buffering=-1 (default): Use the default buffer size of 32MiB
  • buffering>0: Use the specified value (integer) as the buffer size

The buffering option can also be passed to from_url and open_url.

>>> with pfio.v2.from_url("s3://<bucket>/", buffering=0) as fs:
...     fs.open("foo.dat")   # returns an `_ObjectReader`
>>> with pfio.v2.open_url("s3://<bucket>/", "rb", buffering=0) as f:
...     f                    # also an `_ObjectReader`

The buffering option has no effect when opening a file in text mode or in write mode.
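
As a rough sketch of this dispatch (an assumption for illustration, not the exact PR code; the helper name is hypothetical), the buffering value could be translated like this when opening a file:

import io

DEFAULT_BUFFER_SIZE = 32 * 1024 * 1024  # 32 MiB, the default chosen in this PR


def _wrap_with_buffer(raw, mode, buffering):
    # buffering only matters for binary reads; text mode and write mode
    # return the stream unchanged, and buffering=0 disables buffering entirely.
    if "b" not in mode or "r" not in mode or buffering == 0:
        return raw
    size = DEFAULT_BUFFER_SIZE if buffering < 0 else buffering
    return io.BufferedReader(raw, buffer_size=size)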

Performance

I confirmed a significant speedup when loading the same test data (a PIL image) as in #241.

pfio 2.0.1 (brought from #241)

>>> %%time
>>> with pfio.v2.open_url('s3://<bucket>/DSC07917.jpg', 'rb') as f:
...     print(PIL.Image.open(f).size)
(2626, 1776)
CPU times: user 642 ms, sys: 56.5 ms, total: 698 ms
Wall time: 2.22 s

pfio 2.0.1 with BytesIO (brought from #241 as well)

>>> %%time
>>> with pfio.v2.open_url('s3://<bucket>/DSC07917.jpg', 'rb') as f:
...     print(PIL.Image.open(io.BytesIO(f.read())).size)
(2626, 1776)
CPU times: user 46.9 ms, sys: 0 ns, total: 46.9 ms
Wall time: 108 ms

This PR

>>> %%time
>>> with pfio.v2.open_url('s3://<bucket>/DSC07917.jpg', 'rb') as f:
...     print(PIL.Image.open(f).size)
(2626, 1776)
CPU times: user 51.5 ms, sys: 0 ns, total: 51.5 ms
Wall time: 115 ms

Thanks to the buffering, it now performs at a speed very close to that of the BytesIO workaround.

Discussion

I set pfio.v2.s3.DEFAULT_BUFFER_SIZE to 32 MiB.

When handling local (POSIX) files, Python uses 8192 bytes as the default buffer size ([io.DEFAULT_BUFFER_SIZE](https://docs.python.org/3/library/io.html#io.DEFAULT_BUFFER_SIZE)).
However, for S3 the per-request HTTP overhead is far larger than for a native filesystem, so the buffer size should be much larger for efficiency. I thought 32 MiB would be a good balance, but it may be better to run some micro-benchmarks.

If you can suggest a specific value or a way to identify the best number, please advise 🙇.

Another fix

_ObjectReader.readall was previously implemented as follows:

    def readall(self):
        self.seek(0)         # <-- seeks back to the head
        return self.read(-1)

However, this behavior (specifically, seeking back to the head as marked above) is inconsistent with normal filesystems:

>>> import os
>>> f = open('hoge.dat', 'rb')   # hoge.dat contains b'0123456789'
>>> print(f.read())
b'0123456789'
>>> f.seek(5, os.SEEK_SET)
5
>>> print('tell() =', f.raw.tell())
tell() = 5
>>> print(f.raw.readall())
b'56789'

This behavior prevented the buffering from working properly; this PR fixes it as well.
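
A FileIO-consistent version reads from the current position only; a minimal sketch of what such a fix could look like (not the exact PR diff):

    def readall(self):
        # No seek(0): read from the current offset to EOF, like FileIO.readall.
        return self.read(-1)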

References

The official Python documentation provides far too little information here. These resources helped me a lot in learning how and what to implement in readinto.

@belltailjp belltailjp changed the title [WIP] Buffering S3 reader [WIP] Buffered S3 reader Dec 26, 2021
@kuenishi kuenishi added this to the 2.1.0 milestone Dec 27, 2021
@kuenishi kuenishi self-requested a review December 27, 2021 07:20
@kuenishi kuenishi added the cat:enhancement Implementation that does not break interfaces. label Dec 27, 2021
@kuenishi (Member) commented:

Thank you for the very thoughtful suggestion backed by real benchmarks. I welcome the idea; it's what I wanted to do, as indicated in the TODO comment.

My suggestion would be to add a buffering keyword argument, as in the open() builtin and the other filesystems, to align the interface with respect to such options, e.g. open_url(path, buffering=-1). See also Local#open and Hdfs#open. The documentation of open() says:

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer.

Thus the default value of 32 MiB would be nice, but I want to leave a means for app developers to configure the buffer size.

@belltailjp force-pushed the make-s3-buffered branch 2 times, most recently from 1a085c6 to 52d5cf5 on December 27, 2021 15:20
@belltailjp (Member, Author): /test

@pfn-ci-bot: Successfully created a job for commit 52d5cf5.

@belltailjp force-pushed the make-s3-buffered branch 2 times, most recently from 7419f2e to b322a63 on December 27, 2021 15:24
@belltailjp (Member, Author): /test

@pfn-ci-bot: Successfully created a job for commit b322a63.

@belltailjp (Member, Author) commented Dec 28, 2021

Although the results heavily depend on the S3 configuration, network conditions, and the content, I ran a small toy single-stream read-throughput benchmark with varying file sizes and buffer sizes on our internal Ozone cluster.

Benchmark code
import os
import pickle
import random
import string
import time

import pfio
import numpy as np


def rand_str(n):
    return ''.join(random.choices(string.ascii_letters + string.digits, k=n))


def bench(path, n_loop=5):
    print("| File size (MiB) | buffer size (MiB) | time (s) | stddev (s) | throughput (MiB/s) |")
    print("|:----|:----|:----|:----|:----|")

    for k in range(-2, 13, 2):
        # approx 2^k MB of pickle data.
        size = int(1024 * (2 ** k))
        data = {rand_str(32): rand_str(1024) for _ in range(size)}
        with pfio.v2.open_url(path, 'wb') as f:
            pickle.dump(data, f)
        with pfio.v2.from_url('s3://<bucket>/') as fs:
            actual_size = fs.stat(os.path.basename(path)).size

        for s in range(11):
            buffer_size = int(1024 * 1024 * (2 ** s))

            times = []
            for _ in range(n_loop):
                with pfio.v2.open_url(path, 'rb', buffering=buffer_size) as f:
                    before = time.time()
                    loaded_data = pickle.load(f)
                    assert list(loaded_data.keys())[0] == list(data.keys())[0]
                    after = time.time()
                    times.append(after - before)
            times = np.array(times)
            throughput = (actual_size / times.mean()) / (1024 ** 2)
            print('| {:.2f} | {:.0f} | {:.3f} | {:.3f} | {:.3f} |'
                  .format(actual_size / (1024 ** 2), buffer_size / (1024 ** 2),
                          times.mean(), times.std(), throughput))


bench("s3://<bucket>/bench.pkl")
The full output of the script:

| File size (MiB) | Buffer size (MiB) | Time (s) | Stddev (s) | Throughput (MiB/s) |
|:----|:----|:----|:----|:----|
| 0.26 | 1 | 0.061 | 0.071 | 4.298 |
| 0.26 | 2 | 0.025 | 0.002 | 10.541 |
| 0.26 | 4 | 0.026 | 0.001 | 10.090 |
| 0.26 | 8 | 0.029 | 0.001 | 8.901 |
| 0.26 | 16 | 0.035 | 0.004 | 7.386 |
| 0.26 | 32 | 0.047 | 0.002 | 5.546 |
| 0.26 | 64 | 0.068 | 0.001 | 3.834 |
| 0.26 | 128 | 0.099 | 0.006 | 2.634 |
| 0.26 | 256 | 0.175 | 0.001 | 1.484 |
| 0.26 | 512 | 0.315 | 0.019 | 0.826 |
| 0.26 | 1024 | 0.590 | 0.038 | 0.441 |
| 1.04 | 1 | 0.110 | 0.010 | 9.498 |
| 1.04 | 2 | 0.039 | 0.002 | 26.376 |
| 1.04 | 4 | 0.038 | 0.001 | 27.619 |
| 1.04 | 8 | 0.038 | 0.001 | 27.375 |
| 1.04 | 16 | 0.044 | 0.002 | 23.856 |
| 1.04 | 32 | 0.058 | 0.005 | 17.891 |
| 1.04 | 64 | 0.076 | 0.005 | 13.619 |
| 1.04 | 128 | 0.106 | 0.003 | 9.785 |
| 1.04 | 256 | 0.179 | 0.002 | 5.797 |
| 1.04 | 512 | 0.324 | 0.011 | 3.207 |
| 1.04 | 1024 | 0.580 | 0.031 | 1.792 |
| 4.16 | 1 | 0.321 | 0.007 | 12.982 |
| 4.16 | 2 | 0.212 | 0.012 | 19.615 |
| 4.16 | 4 | 0.142 | 0.010 | 29.297 |
| 4.16 | 8 | 0.096 | 0.018 | 43.422 |
| 4.16 | 16 | 0.114 | 0.021 | 36.466 |
| 4.16 | 32 | 0.092 | 0.007 | 45.190 |
| 4.16 | 64 | 0.102 | 0.007 | 40.959 |
| 4.16 | 128 | 0.122 | 0.002 | 33.966 |
| 4.16 | 256 | 0.190 | 0.007 | 21.931 |
| 4.16 | 512 | 0.334 | 0.007 | 12.454 |
| 4.16 | 1024 | 0.579 | 0.023 | 7.189 |
| 16.64 | 1 | 1.251 | 0.048 | 13.298 |
| 16.64 | 2 | 0.798 | 0.053 | 20.866 |
| 16.64 | 4 | 0.526 | 0.018 | 31.658 |
| 16.64 | 8 | 0.411 | 0.024 | 40.462 |
| 16.64 | 16 | 0.366 | 0.011 | 45.504 |
| 16.64 | 32 | 0.304 | 0.015 | 54.684 |
| 16.64 | 64 | 0.305 | 0.020 | 54.649 |
| 16.64 | 128 | 0.349 | 0.047 | 47.695 |
| 16.64 | 256 | 0.323 | 0.020 | 51.504 |
| 16.64 | 512 | 0.386 | 0.005 | 43.116 |
| 16.64 | 1024 | 0.645 | 0.043 | 25.818 |
| 66.57 | 1 | 5.125 | 0.120 | 12.989 |
| 66.57 | 2 | 3.048 | 0.107 | 21.842 |
| 66.57 | 4 | 2.059 | 0.041 | 32.334 |
| 66.57 | 8 | 1.632 | 0.074 | 40.787 |
| 66.57 | 16 | 1.291 | 0.074 | 51.580 |
| 66.57 | 32 | 1.315 | 0.118 | 50.609 |
| 66.57 | 64 | 1.227 | 0.082 | 54.256 |
| 66.57 | 128 | 1.140 | 0.085 | 58.408 |
| 66.57 | 256 | 1.138 | 0.039 | 58.519 |
| 66.57 | 512 | 1.217 | 0.052 | 54.716 |
| 66.57 | 1024 | 1.302 | 0.090 | 51.142 |
| 266.29 | 1 | 20.679 | 0.271 | 12.877 |
| 266.29 | 2 | 12.699 | 0.095 | 20.969 |
| 266.29 | 4 | 8.538 | 0.075 | 31.188 |
| 266.29 | 8 | 6.700 | 0.307 | 39.744 |
| 266.29 | 16 | 5.461 | 0.077 | 48.765 |
| 266.29 | 32 | 5.011 | 0.161 | 53.140 |
| 266.29 | 64 | 5.149 | 0.187 | 51.720 |
| 266.29 | 128 | 5.067 | 0.272 | 52.549 |
| 266.29 | 256 | 4.708 | 0.156 | 56.562 |
| 266.29 | 512 | 4.794 | 0.304 | 55.548 |
| 266.29 | 1024 | 4.493 | 0.077 | 59.262 |
| 1065.15 | 1 | 87.949 | 1.321 | 12.111 |
| 1065.15 | 2 | 52.801 | 0.467 | 20.173 |
| 1065.15 | 4 | 35.530 | 0.441 | 29.979 |
| 1065.15 | 8 | 27.057 | 0.421 | 39.367 |
| 1065.15 | 16 | 21.776 | 0.416 | 48.914 |
| 1065.15 | 32 | 20.257 | 0.650 | 52.583 |
| 1065.15 | 64 | 20.440 | 0.476 | 52.111 |
| 1065.15 | 128 | 20.220 | 0.423 | 52.679 |
| 1065.15 | 256 | 19.298 | 0.626 | 55.194 |
| 1065.15 | 512 | 19.216 | 0.646 | 55.431 |
| 1065.15 | 1024 | 19.457 | 1.350 | 54.745 |

[Plot of the benchmark results: throughput vs. buffer size, one curve per file size]

When reading relatively small files (~1 MiB), a relatively small buffer performs best.
For files larger than a few megabytes, a buffer size of 16 MiB or more becomes efficient. Making the buffer as large as possible seems better for very large files in terms of throughput, but the difference is slight. Considering the performance on small files as well as the wasted memory, it is better not to use an excessively large buffer.

In conclusion, a buffer size of around 16 MiB to 64 MiB is the best in my experimental setup.

@belltailjp (Member, Author): /test

@pfn-ci-bot: Successfully created a job for commit 6405725.

@belltailjp belltailjp changed the title [WIP] Buffered S3 reader Buffered S3 reader Dec 28, 2021
@kuenishi (Member) left a review comment:

The code is awesome and looks very good... but please fix the CI failure from the annoying isort check.

@belltailjp (Member, Author): /test

@pfn-ci-bot: Successfully created a job for commit af6f2f5.

@kuenishi kuenishi merged commit 0f59be9 into pfnet:master Dec 28, 2021
@kuenishi (Member) commented:

TODO: if the benchmark result with 16 MB isn't as good as expected, we may change the default size later. cc: @belltailjp

Labels
cat:enhancement Implementation that does not break interfaces.

Successfully merging this pull request may close these issues.

File-like object returned from open_url is extremely slow with S3