ARROW-12522: [C++] Add ReadRangeCache::WaitFor #10145
Conversation
CC @westonpace if you'd like to take a look. There's an example of it being used to adapt the IPC reader here: lidavidm@24f3bb1#diff-e992169684aea9845ac776ada4cbb2b5dc711b49e5a3fbc6046c92299e1aefceR1380-R1412
I wonder if, given a bunch of small record batches, we might sometimes want to coalesce across record batches. I think the current design preempts that. Although I think there would be more challenges than just this tool to tackle that problem.
For example, AWS likes 8 MB/16 MB reads (source) for S3 and prefers 256 KB reads (source), and Linux doesn't really care (it does its own read coalescing in the kernel, although in theory maybe this would apply to the EBS case as well).
However, if the filesystem was waiting for 8MB to fill up before issuing a read and batch readahead was low we may never fill up enough data to trigger the read. I don't think this is something we should solve right now but maybe something to think about going forwards.
I suppose we could update batch readahead to be bytes instead of # of batches (makes more sense anyways given the unpredictable nature of batch sizes). Then the read cache could be given a max "buffering" size and then it could issue reads whenever the cache gets bigger than either the "buffering limit" or the "ideal read size" of the filesystem.
cpp/src/arrow/io/caching.cc
Outdated
}

RangeCacheEntry Cache(const ReadRange& range) override {
  return {range, Future<std::shared_ptr<Buffer>>()};
What fills this future?
It's a little unclear, my bad. What happens is: the user calls Cache(vector<Range>), which coalesces the ranges and calls Cache(Range) for each coalesced range to make a cache entry. I'll rename the functions and clarify inline.
cpp/src/arrow/io/caching.cc
Outdated
if (next.offset >= entry.range.offset &&
    next.offset + next.length <= entry.range.offset + entry.range.length) {
  include = true;
  ranges.pop_back();
Is there no case where a range will span two entries?
You are expected to give the ranges up front at the granularity you expect to read them, so no. In principle it could be supported, and if we wanted to split large ranges to take advantage of I/O parallelism, we'd have to do that.
Status Cache(std::vector<ReadRange> ranges) override {
  std::unique_lock<std::mutex> guard(entry_mutex);
  return ReadRangeCache::Impl::Cache(std::move(ranges));
The current file->ReadAsync has some leeway in it which allows the method to be synchronous if needed. If that is the case, this could end up holding onto the lock for a while. Actually, it looks like you have guards on the Wait/WaitFor methods as well, so perhaps this isn't intended to be consumed by multiple threads?
Could you maybe add a short comment explaining how you expect this class to be used? (E.g. first a thread does a bunch of cache calls and then a bunch of read calls? Or maybe there are multiple threads calling cache or read?)
Adding to this, could you create a simple test case around whatever type of multithreading you expect to guard against with these mutexes?
Hmm. I feel that if ReadAsync is synchronous, that's because it's also very fast (e.g. in-memory copy), in which case it's not a concern. I'll document the usage pattern and both variants.
Or put another way, when lazy == true this 'passes through' the synchronicity of ReadAsync (an oxymoron if there ever was one), which is the intent.
So overall, the use pattern for this class is: a single thread calls Cache with all the byte ranges up front, then Wait/WaitFor, then Read.
Since all the byte ranges are given up front, you do get coalescing across record batches/column chunks.
cpp/src/arrow/io/memory_test.cc
Outdated
TEST(RangeReadCache, Basics) {
  std::string data = "abcdefghijklmnopqrstuvwxyz";

-  auto file = std::make_shared<BufferReader>(Buffer(data));
+  auto file = std::make_shared<CountingBufferReader>(Buffer(data));
Should you test both lazy and non-lazy versions here?
cpp/src/arrow/io/caching.cc
Outdated
[](const ReadRange& a, const ReadRange& b) { return a.offset > b.offset; });

std::vector<Future<>> futures;
for (auto& entry : entries) {
This algorithm looks a bit unexpected to me. Basically, you're iterating over all known entries in the hope that they might match a requested range? That will be costly if the number of entries is much larger than the number of requested ranges, since you may iterate over all entries.
Why not do the converse? For each requested range, try to find it in the existing entries. That is doable using bisection (see Read above), and you shouldn't need to sort the requested ranges.
Thanks, I've changed the implementation. This is definitely better (it avoids a sort, and since #ranges is likely << #entries, each lookup is ~O(log #entries) instead of ~O(#entries)).
cpp/src/arrow/io/caching.cc
Outdated
}

// Make a cache entry for a range
virtual RangeCacheEntry MakeCacheEntry(const ReadRange& range) {
You may make this std::vector<RangeCacheEntry> MakeCacheEntries(const std::vector<ReadRange>&) instead, and you will issue one virtual call instead of N.
Ah, I should backport a fix from ba7ba9e as well. (The coalescer didn't handle completely-overlapping ranges, which the Parquet reader can generate when reading a 0-row file.)
This should be ready again; I've incorporated the feedback and added a fix for ranges that are completely identical (which the Parquet reader can generate if there are 0 rows).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM
This was split out of ARROW-11883 since it may also be useful to test with ARROW-11772.

This adds a method to get a Future<> from a ReadRangeCache so it can be easily used in an async context. Also, it adds a config flag to make the cache not perform readahead so that readahead can be handled at a different layer of the stack.

Closes apache#10145 from lidavidm/async-cache

Authored-by: David Li <li.davidm96@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>