[BEAM-3689] Fix unbounded reader leak in direct-runner. #4658

rangadi · 2018-02-12T00:53:28Z

The direct runner reads 10 records at a time from a reader. I think the intention is to reuse the reader, but it is reused only if the reader is initially idle, not when the source has messages available.

When I was testing KafkaIO with the direct runner, it kept opening a new reader for every 10 records and soon ran out of file descriptors.

Btw, the direct runner closes the reader every 20 input bundles on average. Since each bundle is at most 10 elements long, we recreate a reader roughly every 200 records (not related to this fix). I know the direct runner is not meant for production applications, but this seems really low even for toy applications.
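The 200-record figure above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the reader is closed independently after each bundle with a fixed probability (1 in 20, matching "every 20 input bundles on average"), and the class and method names are hypothetical, not part of the direct runner.

```java
// Hypothetical helper, not Beam code: estimates how many records one reader serves
// before it is closed, assuming a fixed close probability after every bundle.
final class ReaderLifetimeSketch {

  // Expected number of bundles served by one reader (mean of a geometric distribution).
  static double expectedBundlesPerReader(double closeChance) {
    if (closeChance <= 0.0 || closeChance > 1.0) {
      throw new IllegalArgumentException("closeChance must be in (0, 1]");
    }
    return 1.0 / closeChance;
  }

  // Upper bound on expected records per reader, since a bundle holds at most
  // maxElementsPerBundle elements.
  static double expectedRecordsPerReader(int maxElementsPerBundle, double closeChance) {
    return maxElementsPerBundle * expectedBundlesPerReader(closeChance);
  }

  public static void main(String[] args) {
    // 10 elements per bundle, reader closed after 1 in 20 bundles on average.
    System.out.println(expectedRecordsPerReader(10, 0.05));
  }
}
```

Under these assumptions, raising either the bundle size or the reuse chance increases the number of records served per reader proportionally.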

rangadi · 2018-02-12T00:55:25Z

+R: @tgroh, @jkff

rangadi · 2018-02-12T04:28:25Z

Finalize the checkpoint and close the reader when the reader reaches its end.
Avoid invoking getWatermark() twice on the reader.

tgroh

I mean, the number 10 is explicitly arbitrary (living under ARBITRARY_MAX_ELEMENTS and whatnot) so there's no reason it can't be changed, though you'll probably want to make it configurable with a larger default so you don't have to add a billion elements to all of the tests.

tgroh · 2018-02-13T15:59:38Z

.../direct-java/src/main/java/org/apache/beam/runners/direct/UnboundedReadEvaluatorFactory.java

          }
+          UnboundedSourceShard<OutputT, CheckpointMarkT> residual = UnboundedSourceShard.of(


So the missing element (in the continuous input case) is that getReader is potentially creating a new reader which we don't populate in the shard, right?

The test we have demonstrates that with an input UnboundedSourceShard with a reader we'll reuse it, but I don't know if we have a test that demonstrates that with no input reader we'll produce a shard with one, which I think is the best way to describe the bug. Can you add such a test?

In addition to the test I updated, right? Will do.

> The test we have demonstrates that with an input UnboundedSourceShard with a reader we'll reuse it, but I don't know if we have a test that demonstrates that with no input reader we'll produce a shard with one, which I think is the best way to describe the bug.

Can you rephrase it? The test does start with a shard without a reader and ensures that it is reused.

I mean it ensures that the reader is created only once. And similar to other tests in the file, it does not actually assert the output is correct.

Basically, we have two cases of input:
UnboundedSourceShard{source, deduplicator, reader.absent(), checkpoint}
UnboundedSourceShard{source, deduplicator, reader.present(), checkpoint}

and we have a test to demonstrate that if given UnboundedSourceShard{source, deduplicator, reader.present(), checkpoint}, we produce UnboundedSourceShard{source, deduplicator, reader.present(), newCheckpoint}

but no test to demonstrate that if given UnboundedSourceShard{source, deduplicator, reader.absent(), checkpoint} we produce UnboundedSourceShard{source, deduplicator, reader.present(), newCheckpoint}, which is the base case that we don't handle properly right now, which means we almost never get to the reuse case.
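To make the two cases concrete, here is a toy model of the behaviour being discussed. It is a sketch under stated assumptions, not the actual evaluator: `ToyShard`, `ToyReader`, `processBundle`, and the `populateResidual` flag are hypothetical stand-ins, and the checkpoint is just an integer. With `populateResidual` set to false the residual shard keeps whatever reader the input shard had (absent in the base case), so every bundle opens a new reader; with it set to true the reader that was actually used is carried forward.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only; these classes are hypothetical and do not mirror Beam's real types.
final class ShardReuseSketch {

  // Stand-in for an unbounded reader.
  static final class ToyReader {}

  // Stand-in for UnboundedSourceShard: an optional live reader plus a checkpoint.
  static final class ToyShard {
    final Optional<ToyReader> reader;
    final int checkpoint;

    ToyShard(Optional<ToyReader> reader, int checkpoint) {
      this.reader = reader;
      this.checkpoint = checkpoint;
    }
  }

  // Processes one bundle and returns the residual shard.
  // 'opened' counts how many readers had to be created.
  static ToyShard processBundle(ToyShard input, boolean populateResidual, AtomicInteger opened) {
    ToyReader reader = input.reader.orElseGet(() -> {
      opened.incrementAndGet();
      return new ToyReader();
    });
    int newCheckpoint = input.checkpoint + 10; // pretend 10 elements were read
    // Without populating the residual, the newly created reader is dropped on the floor.
    Optional<ToyReader> residualReader = populateResidual ? Optional.of(reader) : input.reader;
    return new ToyShard(residualReader, newCheckpoint);
  }

  // Runs 'bundles' consecutive bundles starting from a shard with no reader.
  static int readersOpened(int bundles, boolean populateResidual) {
    AtomicInteger opened = new AtomicInteger();
    ToyShard shard = new ToyShard(Optional.empty(), 0);
    for (int i = 0; i < bundles; i++) {
      shard = processBundle(shard, populateResidual, opened);
    }
    return opened.get();
  }

  public static void main(String[] args) {
    System.out.println("residual not populated: " + readersOpened(5, false));
    System.out.println("residual populated: " + readersOpened(5, true));
  }
}
```

A test for the base case would then assert that a shard with an absent reader yields a residual shard with a present one, which is the property the toy `processBundle` only satisfies when the residual is populated.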

Line 291 [[1]] creates an UnboundedSourceShard{source, deduplicator, reader.absent(), checkpoint}, and it is added to inputBundle right below that.

> and we have a test to demonstrate that if given UnboundedSourceShard{source, deduplicator, reader.present(), checkpoint}, we produce UnboundedSourceShard{source, deduplicator, reader.present(), newCheckpoint}

Are you referring to evaluatorReusesReader() updated here or is this another test?

Sorry for the multiple follow-up questions; I think I am misinterpreting what you are saying.

[[1]]: https://github.com/apache/beam/pull/4658/files#diff-41695f8bd3f499370918186c7f369efdR291

Ahah, yep, that's pretty busted.

Really, I think the updates to evaluatorReusesReader are mostly checking that the reader is populated in the second shard.

We should have a test around reusing a reader that is then discarded because there was no input and it was at the end of time (the new branch you've added).

You can probably remove withCheckpoint from the UnboundedSourceShard class as well; I think this is probably the only caller.

Removed withCheckpoint.
Added an option to advance the watermark to infinity when the reader reaches the end.
Updated the test to verify that the number of elements produced matches the input, and that the reader is closed when the end of the stream is reached.
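For readers following along, the end-of-stream branch being tested can be sketched roughly as below. This is a toy model, not the evaluator or its test: `ToyReader`, `readBundle`, `drain`, and `END_OF_TIME` (a `Long.MAX_VALUE` stand-in for the maximum watermark) are all hypothetical names. The idea is that a reader is closed only when a bundle produced no elements and the reader's watermark has advanced to the end of time.

```java
// Toy model of closing a reader at end of stream; all names here are hypothetical.
final class EndOfStreamSketch {

  // Stand-in for "watermark at infinity".
  static final long END_OF_TIME = Long.MAX_VALUE;

  // A finite reader that reports END_OF_TIME as its watermark once exhausted.
  static final class ToyReader implements AutoCloseable {
    private final int total;
    private int emitted;
    boolean closed;

    ToyReader(int total) {
      this.total = total;
    }

    // Returns true if another element was available.
    boolean advance() {
      if (emitted < total) {
        emitted++;
        return true;
      }
      return false;
    }

    long watermark() {
      return emitted < total ? emitted : END_OF_TIME;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Reads at most maxElements; closes the reader only if nothing was read
  // and the watermark is already at the end of time.
  static int readBundle(ToyReader reader, int maxElements) {
    int read = 0;
    while (read < maxElements && reader.advance()) {
      read++;
    }
    if (read == 0 && reader.watermark() == END_OF_TIME) {
      reader.close();
    }
    return read;
  }

  // Keeps reading bundles from the same reader until it is closed.
  static int drain(ToyReader reader, int maxElements) {
    int totalRead = 0;
    while (!reader.closed) {
      totalRead += readBundle(reader, maxElements);
    }
    return totalRead;
  }

  public static void main(String[] args) {
    ToyReader reader = new ToyReader(25);
    System.out.println("elements read: " + drain(reader, 10));
    System.out.println("reader closed: " + reader.closed);
  }
}
```

The two checks in the updated test correspond to the two printed values: the element count should match the input, and the reader should end up closed.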

rangadi · 2018-02-13T18:46:18Z

> so there's no reason it can't be changed, though you'll probably want to make it configurable

A smaller default is fine too. Shall I send a PR to make this and reuse-chance configurable?

Also close the reader at end of reader. Updated the test for reader reuse.

rangadi · 2018-02-15T19:28:44Z

Squashed commits.

tgroh reviewed Feb 13, 2018

tgroh approved these changes Feb 15, 2018

Fix unbounded reader leak in direct-runner.

faf5383

Also close the reader at end of reader. Updated the test for reader reuse.

rangadi force-pushed the fix_reader_leak branch from 5250f6a to faf5383 on February 15, 2018 19:28

tgroh merged commit f4ee8a1 into apache:master Feb 15, 2018
