[BEAM-8823] Make FnApiRunner work by executing ready elements instead of stages #15441

Merged
merged 30 commits into from Oct 12, 2021

pabloem (Member) commented Sep 1, 2021

This PR modifies the FnApiRunner to execute pipelines per-bundle instead of per-stage.

The previous implementation worked as follows:

  1. Order the pipeline DAG topologically
  2. Pick the next stage in the DAG's topological sort
  3. Pass the input bundles to this stage
  4. If there are any deferred inputs from the stage, then go back to step 3.
  5. If there are more stages to execute, go to step 2
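
The per-stage loop above can be sketched roughly as follows. This is only an illustration: `run_per_stage`, the shape of `dag`, and the `run_stage` callback are hypothetical names, not the FnApiRunner API.

```python
from collections import defaultdict
from graphlib import TopologicalSorter


def run_per_stage(dag, initial_inputs, run_stage):
    """Run stages one at a time in topological order.

    dag: {stage: set of upstream stages}
    initial_inputs: {stage: [bundle, ...]}
    run_stage(stage, bundles) -> (outputs, deferred) is a hypothetical
    callback: outputs is {downstream_stage: [bundle, ...]} and deferred is
    a list of bundles to feed back into the same stage.
    """
    pending = defaultdict(list)
    for stage, bundles in initial_inputs.items():
        pending[stage].extend(bundles)

    executed = []
    # Steps 1-2: sort topologically, then visit each stage in turn.
    for stage in TopologicalSorter(dag).static_order():
        bundles = pending.pop(stage, [])
        # Steps 3-4: keep feeding the stage until nothing is deferred.
        while bundles:
            outputs, deferred = run_stage(stage, bundles)
            executed.append((stage, bundles))
            for consumer, produced in outputs.items():
                pending[consumer].extend(produced)
            bundles = deferred
    return executed
```

Note that a downstream stage never starts until the stage before it in the sort order has drained all of its deferred inputs.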

The new implementation works as follows:

  1. Create three work queues: (1) ready bundles, (2) watermark-pending bundles, and (3) real-time-pending bundles
  2. Add all the initial bundles (i.e. IMPULSE bundles) to the ready queue
  3. Fetch the next ready bundle, and the stage that consumes it
  4. Execute this bundle
  5. Enqueue all outputs from the execution of the bundle (deferred inputs, downstream outputs)
  6. If any of the outputs from the bundle are side inputs downstream, store them in state
  7. Update all of the watermarks
  8. If there are no ready-bundles remaining, check watermark-pending and time-pending queues to find new ready bundles
  9. If there are more ready bundles, go to step 3
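
A minimal sketch of the per-bundle loop listed above, under stated assumptions: `run_per_bundle` and the dictionary returned by the `execute` callback are hypothetical, real time is simulated by firing the earliest pending timer, and step 6 (storing side inputs in state) is left out.

```python
import heapq
from collections import deque
from itertools import count


def run_per_bundle(initial_bundles, execute):
    """Drain bundles until all three queues are empty.

    initial_bundles: iterable of (stage, bundle), e.g. the IMPULSE bundles.
    execute(stage, bundle) is a hypothetical callback returning a dict whose
    keys are all optional:
      "ready":      [(stage, bundle), ...]             runnable immediately
      "watermark":  [(min_watermark, stage, bundle)]   wait for a watermark
      "time":       [(fire_time, stage, bundle)]       wait for the clock
      "watermarks": {stage: new_input_watermark}
    """
    ready = deque(initial_bundles)   # queue (1), steps 1-2
    watermark_pending = []           # queue (2)
    time_pending = []                # queue (3), a heap ordered by fire_time
    watermarks = {}
    tie = count()                    # tie-breaker so heap entries compare
    trace = []

    while ready or watermark_pending or time_pending:
        if not ready:
            # Step 8: promote pending bundles whose condition now holds.
            still_waiting = []
            for wm, stage, bundle in watermark_pending:
                if watermarks.get(stage, float("-inf")) >= wm:
                    ready.append((stage, bundle))
                else:
                    still_waiting.append((wm, stage, bundle))
            watermark_pending = still_waiting
            if not ready and time_pending:
                # Simulated clock: fire the earliest real-time bundle.
                _, _, stage, bundle = heapq.heappop(time_pending)
                ready.append((stage, bundle))
            if not ready:
                raise RuntimeError("stuck: no pending bundle can become ready")

        stage, bundle = ready.popleft()                   # step 3
        result = execute(stage, bundle)                   # step 4
        trace.append((stage, bundle))
        ready.extend(result.get("ready", []))             # step 5
        watermark_pending.extend(result.get("watermark", []))
        for fire_time, s, b in result.get("time", []):
            heapq.heappush(time_pending, (fire_time, next(tie), s, b))
        watermarks.update(result.get("watermarks", {}))   # step 7
    return trace
```

The pending queues are consulted only when the ready queue is empty, which matches steps 8 and 9 of the description.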

I'm happy to dive into detail for this PR.


codecov bot commented Sep 22, 2021

Codecov Report

Merging #15441 (70fd62b) into master (476efbb) will decrease coverage by 0.29%.
The diff coverage is 88.19%.

@@            Coverage Diff             @@
##           master   #15441      +/-   ##
==========================================
- Coverage   83.79%   83.50%   -0.30%     
==========================================
  Files         444      445       +1     
  Lines       60474    61413     +939     
==========================================
+ Hits        50674    51281     +607     
- Misses       9800    10132     +332     
Impacted Files Coverage Δ
...nners/portability/fn_api_runner/worker_handlers.py 79.34% <50.00%> (-0.11%) ⬇️
...eam/runners/portability/fn_api_runner/execution.py 91.94% <87.19%> (-1.40%) ⬇️
.../runners/portability/fn_api_runner/translations.py 93.10% <88.88%> (-0.10%) ⬇️
...eam/runners/portability/fn_api_runner/fn_runner.py 89.75% <89.38%> (-1.06%) ⬇️
sdks/python/apache_beam/io/iobase.py 86.21% <100.00%> (ø)
...ers/portability/fn_api_runner/watermark_manager.py 96.00% <100.00%> (ø)
sdks/python/apache_beam/transforms/util.py 95.84% <100.00%> (+0.02%) ⬆️
sdks/python/apache_beam/io/gcp/bigquery.py 62.72% <0.00%> (-12.84%) ⬇️
...ython/apache_beam/runners/interactive/sql/utils.py 76.09% <0.00%> (-7.91%) ⬇️
...he_beam/runners/interactive/sql/beam_sql_magics.py 49.75% <0.00%> (-4.79%) ⬇️
... and 29 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

pabloem changed the title from "Per bundle execution" to "[BEAM-8823] Make FnApiRunner work by executing ready elements instead of stages" Sep 22, 2021
pabloem marked this pull request as ready for review September 22, 2021 18:59
pabloem requested a review from y1chi September 27, 2021 18:21
pabloem (Member Author) commented Sep 27, 2021

r: @y1chi (see first comment for a quick description of the change)

y1chi (Contributor) commented Sep 27, 2021

> r: @y1chi (see first comment for a quick description of the change)

Thanks Pablo, I'll take a look and reach out if I have questions.

y1chi (Contributor) left a comment

Overall LGTM.

self.watermark_manager = WatermarkManager(stages)
# from apache_beam.runners.portability.fn_api_runner import \
y1chi (Contributor):

Should this be removed?

pabloem (Member Author):

I'd like to keep this here as a hint to show that the pipeline can be visualized here.

_LOGGER.debug(
'Enqueuing stage pending watermark. Stage name: %s', stage.name)
self.queues.watermark_pending_inputs.enque(
((stage.name, MAX_TIMESTAMP), DataInput(data_input, {})))
y1chi (Contributor):

MAX_TIMESTAMP here seems weird to me. Should it be MIN_TIMESTAMP if we expect the input to be ready (maybe for streaming)?

pabloem (Member Author):

that's right - for BATCH, we need the upstream stage to be fully processed before moving forward. As I work on streaming this will change to be attributed to a more appropriate timestamp.
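
To illustrate the point about batch, here is a small sketch of why holding a bundle at MAX_TIMESTAMP forces the upstream stage to finish first. It assumes the usual convention that a stage's input watermark is the minimum of its producers' output watermarks; `is_ready` and the float stand-ins for timestamps are hypothetical, not Beam objects.

```python
MIN_TIMESTAMP = float("-inf")  # stand-ins, not Beam's timestamp objects
MAX_TIMESTAMP = float("inf")


def is_ready(required_watermark, upstream_output_watermarks):
    """Return True once the input watermark reaches the required value.

    Assumption: the input watermark of a stage is the minimum over the
    output watermarks of all of its upstream producers.
    """
    input_watermark = min(upstream_output_watermarks, default=MAX_TIMESTAMP)
    return input_watermark >= required_watermark
```

With a MAX_TIMESTAMP requirement the bundle is released only after every producer has advanced its watermark to the maximum, that is, after it has finished. A MIN_TIMESTAMP requirement would release it immediately.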

# Timer was cleared, so we must skip setting it below.
timer_cleared = True
continue
if timer_cleared or (transform_id,
y1chi (Contributor):

What happens if the same timer is set multiple times with clear being called in between?

pabloem (Member Author):

The SDK would only send back the latest of these events, regardless of what it is. Is that reasonable?

y1chi (Contributor):

I'm not sure that's the case when it is a gRPC data channel: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/bundle_processor.py#L669. IIUC, every time a timer calls set, it will write an output timer to the output queue.

pabloem (Member Author):

ah my bad - only the last one will be saved - see lines 547-548 of the file - we only store the latest event for a given timer+window+key

y1chi (Contributor):

And if a timer is cleared for a certain key and window, we ignore all the other set timers for the timer family in the bundle. Am I misunderstanding the condition here?

pabloem (Member Author):

see in lines 546-548 we decode all the timers that have been written, and we key them by (key, window) in a dictionary. Note that if there are multiple timers in the same (key, window), only the latest one will be saved in the timers_by_key_and_window dictionary.

Then, in the loop starting at line 551, we read the latest timer action for each (key, window).

So we will only apply the latest action - whether it is clear or not.
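
The "latest action wins" behaviour described here can be sketched as below. Only the dictionary name `timers_by_key_and_window` comes from the discussion; `TimerEvent`, its fields, and `apply_timers` are hypothetical stand-ins for the decoded timer records.

```python
from typing import Hashable, NamedTuple


class TimerEvent(NamedTuple):
    # Hypothetical decoded timer record; not Beam's actual class.
    key: Hashable
    window: Hashable
    fire_timestamp: float
    clear_bit: bool


def apply_timers(decoded_timers):
    """Keep only the last event per (key, window), then apply it."""
    timers_by_key_and_window = {}
    for timer in decoded_timers:
        # Later events overwrite earlier ones for the same (key, window).
        timers_by_key_and_window[(timer.key, timer.window)] = timer

    newly_set, cleared = [], []
    for key_window, timer in timers_by_key_and_window.items():
        if timer.clear_bit:
            cleared.append(key_window)
        else:
            newly_set.append((key_window, timer.fire_timestamp))
    return newly_set, cleared
```

For a set, clear, set sequence on one (key, window), only the final set survives. A clear on a different key does not affect it.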

y1chi (Contributor):

After the loop starting at line 551, timer_cleared is set to True as long as one (key, window) has a clear timer, and all other (key, window) timers are skipped and not appended to newly_set_timers, because timer_cleared is True and the continue jumps to the next iteration of the loop starting at line 537. Is that right?

pabloem (Member Author):

ah great observation Yichi.... I've changed this code - we keep timers per dynamic timer tag, and we clean up only previous sets - but allow new sets to work - so we'll only use the LATEST action on the timer.

Pushing code in a bit.
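
The hazard raised in this thread can be shown with a simplified reconstruction. This is not the PR's code: both functions and the `(key_window, is_clear)` input shape are hypothetical, and the only point is how a single shared flag differs from a per-entry check.

```python
def collect_set_timers_shared_flag(actions):
    """Hazard: one clear action suppresses every later set action."""
    timer_cleared = False
    newly_set = []
    for key_window, is_clear in actions:
        if is_clear:
            timer_cleared = True  # flag outlives this (key, window)
            continue
        if timer_cleared:
            continue              # unrelated timers are dropped here
        newly_set.append(key_window)
    return newly_set


def collect_set_timers_per_entry(actions):
    """Each (key, window) is judged only by its own latest action."""
    return [key_window for key_window, is_clear in actions if not is_clear]
```

With a clear on one key followed by a set on another, the first function drops the set and the second keeps it.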

Comment on lines 749 to 756
buffer = runner_execution_context.pcoll_buffers.get(
buffer_id, ListBuffer(None))

if buffer and buffer_id in buffers_to_clean:
runner_execution_context.pcoll_buffers[buffer_id] = buffer.copy()
buffer = runner_execution_context.pcoll_buffers[buffer_id]
if buffer_id in runner_execution_context.pcoll_buffers:
buffers_to_clean.add(buffer_id)
y1chi (Contributor):

I found it a bit hard to follow the logic here. Are we just trying to pop the pcollection buffer with the buffer id?

pabloem (Member Author):

I've added comments for the special cases. lmk if that makes sense.

pabloem (Member Author) commented Oct 5, 2021

oops looking at failures...

pabloem (Member Author) commented Oct 5, 2021

thanks @y1chi ! this is ready for another round of review : )

pabloem requested a review from y1chi October 5, 2021 21:33

y1chi (Contributor) left a comment

The commit addressing the comments seems to be missing.

pabloem (Member Author) commented Oct 6, 2021

the commit that tries to address comments is this one: 4a4067b

it got stumped a bit by the merge commit on top

Comment on lines +758 to +759
runner_execution_context.pcoll_buffers[buffer_id] = buffer.copy()
buffer = runner_execution_context.pcoll_buffers[buffer_id]
y1chi (Contributor):

How is runner_execution_context.pcoll_buffers[buffer_id] = buffer.copy() creating a copy for every stage? Isn't it just overriding the original buffer with its own copy?

pabloem (Member Author):

right - so the flow is like this:

  1. Run stage, and write its outputs to pcoll_buffers
  2. For each stage output, do:
  3. Get the data buffer from pcoll_buffers
  4. Find its next downstream consumer and enqueue this buffer to be consumed
  5. If there are more downstream consumers, make a copy of the buffer, add it to pcoll_buffers, and go back to point 3
  6. If there are no more downstream consumers, continue to next step

In the general case, a PCollection will have only one consumer, so the buffer will not need to be copied. If there are multiple downstream consumers, we create copies starting from the second one, so that each buffer copy is pushed to exactly one consumer.
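
The flow described above can be sketched like this. Plain Python lists stand in for the runner's buffer type, and `fan_out` and its arguments are hypothetical names used only for illustration.

```python
def fan_out(pcoll_buffers, buffer_id, consumers):
    """Give every downstream consumer its own buffer object.

    pcoll_buffers: {buffer_id: list of elements}
    consumers: ordered list of downstream consumer names
    Returns [(consumer, buffer), ...] in enqueue order.
    """
    enqueued = []
    for index, consumer in enumerate(consumers):
        buffer = pcoll_buffers[buffer_id]
        if index > 0:
            # Second and later consumers get a fresh copy. The copy is
            # written back so the next lookup starts from it.
            buffer = list(buffer)
            pcoll_buffers[buffer_id] = buffer
        enqueued.append((consumer, buffer))
    return enqueued
```

With a single consumer no copy is made and the original buffer object is handed over unchanged.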

y1chi (Contributor):

ah ok, I guess what confused me is why we need to write the copy back to pcoll_buffers.

pabloem (Member Author) commented Oct 8, 2021

Run Python 3.8 PostCommit

pabloem (Member Author) commented Oct 8, 2021

Addressed your timer comments here: #15441 (comment), and with the latest commit.

y1chi (Contributor) left a comment

LGTM

pabloem merged commit ef43645 into apache:master Oct 12, 2021
robertwb added a commit to robertwb/incubator-beam that referenced this pull request Oct 14, 2021
…nner work by executing ready elements instead of stages"

This reverts commit ef43645.
dmitriikuzinepam pushed a commit to dmitriikuzinepam/beam that referenced this pull request Nov 2, 2021
…k by executing ready elements instead of stages

* [BEAM-9640] Sketching watermark tracking on FnApiRunner.

* Addressing some comments

* Fixups

* fixing bug with truncation of restrictions

* Fixing output watermark for stages

* Moving visualization tools to different file

* Individual stages are run bundle-based

* [wip] Working on per-bundle execution

* fixups

* Fixing tests

* cleanup

* fixing lint, formatting, some typing info

* More tests passing

* fix import

* fix test

* Fix most other tests

* all passin

* testout

* testing weird fix, hehe

* fixup

* Fix formatting

* fix typing issues

* fixup

* fixup

* cleanup

* addressing comments

* fix typoschmypo

* proper timer handling
dmitriikuzinepam pushed a commit to dmitriikuzinepam/beam that referenced this pull request Nov 2, 2021
…nner work by executing ready elements instead of stages"

This reverts commit ef43645.
pabloem added a commit to pabloem/beam that referenced this pull request Nov 2, 2021
… FnApiRunner work by executing ready elements instead of stages""

This reverts commit a2f08e5.
pabloem added a commit to pabloem/beam that referenced this pull request Feb 13, 2022
… FnApiRunner work by executing ready elements instead of stages""

This reverts commit a2f08e5.
pabloem added a commit to pabloem/beam that referenced this pull request Mar 17, 2022
… FnApiRunner work by executing ready elements instead of stages""

This reverts commit a2f08e5.
pabloem added a commit that referenced this pull request Apr 1, 2022
…xecuting ready elements instead of stages

* Revert "Revert "Merge pull request #15441 from [BEAM-8823] Make FnApiRunner work by executing ready elements instead of stages""

This reverts commit a2f08e5.

* improving/fixing side input handling

* fixup

* fixup

* fixup

* fixup and new test

* fixup

* fixup

* Addressing comments

* Adding comments for clarity