
[FLINK-10331][network] reduce unnecessary flushing #6692

Merged
9 commits merged into release-1.5 on Sep 19, 2018

Conversation

@NicoK (Contributor) commented Sep 13, 2018

What is the purpose of the change

With the re-design of the record writer interaction with the result(sub)partitions, flush requests can currently pile up in these scenarios:

  • a previous flush request has not been completely handled yet and/or is still enqueued or
  • the network stack is still polling from this subpartition and doesn't need a new notification

Both lead to an increased number of notifications in low-latency settings (short output flusher intervals), which can be avoided.

Brief change log

  • do not flush (again) in the scenarios mentioned above, relying on flushRequested and the buffer queue size (see the sketch after this list)
  • add intensive sanity checks to SpillingAdaptiveSpanningRecordDeserializer
  • several smaller improvement hotfixes (please see the individual commits)
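For illustration, a minimal sketch of the guard described in the first bullet, with simplified names and structure (an assumption for illustration, not the exact Flink code):

	import java.util.ArrayDeque;
	import java.util.Deque;

	/** Sketch of the guarded flush; field and method names are simplified assumptions. */
	class FlushGuardSketch {
		private final Deque<Object> buffers = new ArrayDeque<>(); // stand-in for the BufferConsumer queue
		private boolean flushRequested;

		void flush() {
			synchronized (buffers) {
				// Skip the notification if a flush is already pending or the consumer
				// is still polling from this subpartition (more than one buffer queued).
				if (!buffers.isEmpty() && !flushRequested) {
					flushRequested = true; // set before notifying
					if (buffers.size() == 1) {
						notifyDataAvailable();
					}
				}
			}
		}

		private void notifyDataAvailable() {
			// in Flink this would notify the read view / network stack of new data
		}
	}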

Verifying this change

This change is already covered by existing tests plus a few new tests in PipelinedSubpartitionTest.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): yes (depending on the output flusher interval; rather per buffer than per record)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? JavaDocs

@pnowojski (Contributor) left a comment:

Mostly LGTM

		}
		if (!flushRequested) {
			flushRequested = true; // set this before the notification!
			if (buffers.size() == 1) {
@pnowojski (Contributor) commented:
This if check deserves an explanation, either here or in the JavaDoc above.

If there is more than one buffer, we already notified the reader when the second one was added.

@NicoK (Contributor, Author) commented Sep 18, 2018:

Sure - I think I explained the flushing behaviour in the class's JavaDoc:

Whenever {@link #add(BufferConsumer)} adds a finished {@link BufferConsumer} or a second {@link BufferConsumer} (in which case we will assume the first one finished), we will {@link PipelinedSubpartitionView#notifyDataAvailable() notify} a read view created via {@link #createReadView(BufferAvailabilityListener)} of new data availability. Except by calling {@link #flush()} explicitly, we always only notify when the first finished buffer turns up and then, the reader has to drain the buffers via {@link #pollBuffer()} until its return value shows no more buffers being available.

But it doesn't hurt to have something small / more explicit here as well
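To make that rule a bit more concrete, here is a minimal sketch (simplified names and types, not the actual PipelinedSubpartition code) of when adding a buffer triggers a notification:

	import java.util.ArrayDeque;
	import java.util.Deque;

	/** Sketch of the notification rule; BufferConsumer is reduced to an isFinished() flag. */
	class AddNotifySketch {
		interface SketchBufferConsumer {
			boolean isFinished();
		}

		private final Deque<SketchBufferConsumer> buffers = new ArrayDeque<>();

		void add(SketchBufferConsumer bufferConsumer) {
			synchronized (buffers) {
				buffers.add(bufferConsumer);
				// Notify for a finished buffer, or once a second buffer shows up (the first
				// one is then assumed finished). Afterwards the reader keeps draining via
				// pollBuffer() until it reports no more data, so no further notification
				// is needed until the queue is empty again (unless flush() is called).
				if (bufferConsumer.isFinished() || buffers.size() == 2) {
					notifyDataAvailable();
				}
			}
		}

		private void notifyDataAvailable() {
			// wake up the read view created via createReadView(...)
		}
	}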

	 */
	@Test
	public void testFlushWithUnfinishedBufferBehindFinished2() throws Exception {
		final ResultSubpartition subpartition = createSubpartition();
@pnowojski (Contributor) commented:

Extract those initialisations to setup/teardown/@Rule. This code block is duplicated a couple of times:

		final ResultSubpartition subpartition = createSubpartition();
		AwaitableBufferAvailablityListener availablityListener = new AwaitableBufferAvailablityListener();
		ResultSubpartitionView readView = subpartition.createReadView(availablityListener);
		availablityListener.resetNotificationCounters();

		(...)
		} finally {
				subpartition.release();
		}

@NicoK (Contributor, Author) commented:

Unfortunately, not every unit test works on the same setup - are you proposing to

  • instantiate these nonetheless and leave them unused in some tests, or
  • split the unit test class into one with and one without this initialization?
    Or maybe I'm not aware of some trick that solves this...

@NicoK (Contributor, Author) commented:

OK, I forked off the methods that use a read view and an availability listener into PipelinedSubpartitionWithReadViewTest.
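Roughly, the shared initialisation from the comment above then moves into setup/teardown of that new test class; a sketch, assuming the Flink types and helpers quoted above (ResultSubpartition, ResultSubpartitionView, AwaitableBufferAvailablityListener, createSubpartition()) are available on the classpath:

	import org.junit.After;
	import org.junit.Before;

	/** Sketch only; the concrete Flink types come from the test code quoted above. */
	public abstract class WithReadViewSetupSketch {

		protected ResultSubpartition subpartition;
		protected AwaitableBufferAvailablityListener availablityListener;
		protected ResultSubpartitionView readView;

		abstract ResultSubpartition createSubpartition() throws Exception;

		@Before
		public void setup() throws Exception {
			subpartition = createSubpartition();
			availablityListener = new AwaitableBufferAvailablityListener();
			readView = subpartition.createReadView(availablityListener);
			availablityListener.resetNotificationCounters();
		}

		@After
		public void teardown() throws Exception {
			subpartition.release();
		}
	}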

@@ -198,6 +199,8 @@ public void testConsumptionWithMixedChannels() throws Exception {
	private abstract static class Source {

		abstract void addBufferConsumer(BufferConsumer bufferConsumer) throws Exception;

		abstract void flush();
@pnowojski (Contributor) commented:

Why did you need to add this flush? What was wrong?

@NicoK (Contributor, Author) commented:

Depending on the implementation in PipelinedSubpartition, i.e. if (buffers.size() == 1 && buffers.peekLast().isFinished()) or whatever we change it to (we don't make guarantees here!), the producer thread may not have flushed its last record after finishing, and the source would wait forever (there is no output flusher in that test)
-> we need to flush all channels before leaving the producer
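In other words, the producer side of such a test needs something like the following (illustrative names only; the real test uses the Source abstraction from the diff above):

	/** Sketch: flush every channel before the producer thread exits, since the test has no output flusher. */
	class ProducerSketch implements Runnable {
		interface Channel {
			void flush();
		}

		private final Channel[] channels;

		ProducerSketch(Channel[] channels) {
			this.channels = channels;
		}

		@Override
		public void run() {
			try {
				// ... produce records / BufferConsumers into the channels ...
			} finally {
				// Without this, an unfinished last buffer might never be handed
				// to the consumer and the test would block forever.
				for (Channel channel : channels) {
					channel.flush();
				}
			}
		}
	}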

@NicoK (Contributor, Author) commented Sep 18, 2018:

Thanks for the review - I integrated the changes.
(FYI: I already merged the additional comment in PipelinedSubpartition into the respective commit because that change is so small, so no further squashing is required.)

NicoK pushed a commit that referenced this pull request to NicoK/flink and to dataArtisans/flink on Sep 18, 2018 (same commit message as in the commit list below; closes apache#6692).
@pnowojski (Contributor) left a comment:

Thanks for the changes. LGTM :)

Nico Kruber added 9 commits September 19, 2018 12:26
…e record length

Once we know the record length and if we are not spilling, we should size the
buffer immediately to the expected record size, and not incrementally for each
received buffer chunk.
…guarantees

- producers should flush after writing to make sure all data has been sent
- we can only check bufferConsumer.isFinished() after building a Buffer
- producer/consumer threads should be named
Do not flush (again) if
- a previous flush request has not been completely handled yet and/or is still enqueued or
- the network stack is still polling from this subpartition and doesn't need a new notification

This closes apache#6692.
…nitialization

- add PipelinedSubpartitionWithReadViewTest which always creates a subpartition,
an availability listener, and a read view before each test and cleans up after
each test
- remove mockito use from testBasicPipelinedProduceConsumeLogic()
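As a side note on the first commit in the list above (sizing the deserializer buffer to the record length): the idea, roughly sketched with illustrative names (not the actual SpillingAdaptiveSpanningRecordDeserializer code), is to grow the reassembly buffer once to the known record length instead of resizing it for every received chunk:

	import java.nio.ByteBuffer;

	/** Sketch only: allocate the record buffer to the full record length up front when not spilling. */
	class SpanningBufferSketch {
		private ByteBuffer recordBuffer = ByteBuffer.allocate(0);

		void onRecordLengthKnown(int recordLength) {
			if (recordBuffer.capacity() < recordLength) {
				ByteBuffer grown = ByteBuffer.allocate(recordLength);
				recordBuffer.flip();
				grown.put(recordBuffer); // keep any bytes already received
				recordBuffer = grown;
			}
		}

		void onChunk(ByteBuffer chunk) {
			recordBuffer.put(chunk); // no incremental resizing per chunk any more
		}
	}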
@NicoK merged commit 915db25 into apache:release-1.5 on Sep 19, 2018
NicoK subsequently pushed commits with the same message (closing #6692) that referenced this pull request, to this repository and to NicoK/flink, on Sep 19, 2018.