[FLINK-5021] Makes the ContinuousFileReaderOperator rescalable. #2763

Closed
kl0u wants to merge 4 commits into apache:master from kl0u:repart_fs

@kl0u kl0u commented Nov 7, 2016

This is the last PR in the refactoring of the ContinuousFileReaderOperator that makes it rescalable. With this, the reader can restart from a savepoint with a different parallelism without compromising its exactly-once guarantees.

The whole PR contains 3 commits.

The first removes the special EOS split, which was used to signal that no more splits are to be processed. This was useful in PROCESS_ONCE mode. Now the reader closes by setting a flag and waiting for all pending splits to be fully processed.
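
The flag-based shutdown described for the first commit can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the class `FlagClosingReaderSketch`, its methods, and the use of plain strings as splits are all hypothetical names chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for the reader; names are illustrative, not Flink's API. */
class FlagClosingReaderSketch implements Runnable {
    private final Object lock = new Object();
    private final Queue<String> pendingSplits = new ArrayDeque<>();
    private final List<String> processed = new ArrayList<>();
    private boolean closed = false;

    /** Enqueues a split; rejected once close() has been requested. */
    void addSplit(String split) {
        synchronized (lock) {
            if (closed) {
                throw new IllegalStateException("reader is closed");
            }
            pendingSplits.add(split);
            lock.notifyAll();
        }
    }

    /** Requests shutdown; splits that are already queued are still processed. */
    void close() {
        synchronized (lock) {
            closed = true;
            lock.notifyAll();
        }
    }

    /** Reader loop: exits only when the flag is set AND the queue is empty. */
    @Override
    public void run() {
        while (true) {
            String split;
            synchronized (lock) {
                while (pendingSplits.isEmpty() && !closed) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                if (pendingSplits.isEmpty()) {
                    return; // closed and fully drained
                }
                split = pendingSplits.poll();
            }
            // Stand-in for reading the split's records (done outside the wait loop).
            synchronized (lock) {
                processed.add(split);
            }
        }
    }

    /** Returns a copy of the splits processed so far, in order. */
    List<String> processedSplits() {
        synchronized (lock) {
            return new ArrayList<>(processed);
        }
    }
}
```

Compared with an in-band end-of-stream marker, the flag keeps the split queue free of sentinel values, so the queue contents can be checkpointed as they are.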

The second adds a check to the ContinuousFileMonitoringFunction that guarantees that, in PROCESS_ONCE mode, the source will not reprocess the directory after recovering from a failure.
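
One way such a check could work is to checkpoint a marker that records whether the one-time scan has already happened. The sketch below assumes that approach; `ProcessOnceMonitorSketch`, the `NOT_SCANNED` sentinel, and the snapshot/restore methods are hypothetical and only model the PROCESS_ONCE path.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a process-once directory monitor; not Flink's actual class. */
class ProcessOnceMonitorSketch {
    /** Sentinel meaning "the directory has not been scanned yet" (assumed for this sketch). */
    static final long NOT_SCANNED = Long.MIN_VALUE;

    // Checkpointed marker: NOT_SCANNED before the scan, Long.MAX_VALUE afterwards.
    private long scanMarker = NOT_SCANNED;

    /** Returns the value that would be written into a checkpoint. */
    long snapshotState() {
        return scanMarker;
    }

    /** Restores the marker from a checkpoint after a failure. */
    void restoreState(long restoredMarker) {
        this.scanMarker = restoredMarker;
    }

    /**
     * Forwards every file in the listing exactly once. If the restored marker
     * shows that the scan already completed, nothing is forwarded again.
     */
    List<String> run(List<String> directoryListing) {
        List<String> forwarded = new ArrayList<>();
        if (scanMarker == NOT_SCANNED) {
            forwarded.addAll(directoryListing);
            scanMarker = Long.MAX_VALUE; // mark the one-time scan as done
        }
        return forwarded;
    }
}
```

A monitor restored from a checkpoint taken after the scan forwards nothing, so downstream readers never see the same split twice.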

Finally, the third integrates the new rescalable state abstractions with the reader so that it can restart from a savepoint with a different parallelism and still guarantee exactly-once semantics.
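
To illustrate why redistributable state preserves exactly-once processing across a parallelism change, the sketch below spreads the pending splits checkpointed by the old subtasks over a new number of subtasks so that each split lands on exactly one of them. The round-robin assignment and the names `RescaleSketch` and `redistribute` are assumptions made for this example; the scheme Flink actually uses may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of redistributing checkpointed splits on rescale. */
class RescaleSketch {
    /**
     * Takes the pending splits checkpointed by each old subtask and assigns
     * every split to exactly one of the new subtasks (round-robin, assumed).
     */
    static List<List<String>> redistribute(List<List<String>> checkpointed, int newParallelism) {
        if (newParallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        int next = 0;
        for (List<String> subtaskState : checkpointed) {
            for (String split : subtaskState) {
                result.get(next % newParallelism).add(split);
                next++;
            }
        }
        return result;
    }
}
```

Because no split is dropped or duplicated during the reassignment, each pending split is still read once after the restart, whatever the new parallelism is.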

R: @aljoscha

@kl0u kl0u force-pushed the repart_fs branch 2 times, most recently from 2afb9cd to 5f51f4e on November 11, 2016 12:48
kl0u added 4 commits November 11, 2016 13:50
Without the special EOS split signaling that no more splits are
to arrive, the ContinuousFileReaderOperator now closes by
setting a flag that marks it as closed and exiting once the
flag is true and the pending split queue is empty.

This is the last commit that completes the refactoring of the
ContinuousFileReaderOperator to make it rescalable.
With this, the reader can restart from a savepoint with a
different parallelism without compromising its
exactly-once guarantees.

@aljoscha

I merged this, thanks for your work! 👍

Could you please close this PR and the Jira issue?

@kl0u kl0u closed this Nov 11, 2016
@kl0u kl0u deleted the repart_fs branch March 15, 2017 19:24