[BEAM-8273] Expand portability environment documentation #10116
tweise merged 6 commits into apache:master
Thanks for taking this up! For the external environment, here are some notes in case you want to include them:
Document the EXTERNAL environment. Add additional instructions for using the PROCESS environment.
Thanks for the suggestions, Thomas. It looks like
Use Python 3.6 as example because of recent difficulties with 3.7 (BEAM-8651).
It is probably also needed for Windows.
Good point. I added your suggestions, PTAL.
Originally, it was a macOS-specific workaround, but it looks like it applies to Windows as well: https://docs.docker.com/docker-for-windows/networking/
@ibzib this entire section doesn't belong in the roadmap. I think this is a good opportunity to add a portability page to https://beam.apache.org/documentation/ and start adding info there.
There's a page dedicated to "Runtime environments," but currently it only focuses on how to "customize, build, and push Beam SDK container images." I suppose that might be a logical place for this? I should also probably update the pipeline options source with some of this information.
Yep, good find. "Runtime Environments" should probably be renamed "Containers" or something like that, and then this could be added as "Portable Pipeline Environments".
website/src/roadmap/portability.md
Outdated
`export BEAM_WORKER_POOL_IN_DOCKER_VM=1`.
- `LOOPBACK`: User code is executed within the same process that submitted the pipeline. This
  option is useful for local testing. However, it is not suitable for a production environment,
  as it requires a connection between the original Python process and the worker nodes, and
This isn't specific to Python.
Removed the reference to Python. Also added a note that while these are Python options, they might apply to other SDKs as well. (Maybe a follow-up could be to fill this in for Java and Go.)
website/src/roadmap/portability.md
Outdated
- `LOOPBACK`: User code is executed within the same process that submitted the pipeline. This
  option is useful for local testing. However, it is not suitable for a production environment,
  as it requires a connection between the original Python process and the worker nodes, and
  performs work on the machine the job originated from, *not the worker nodes*.
There aren't any workers in this case. I would phrase this as "rather than starting up worker nodes, it calls back to the original process that submitted the job to process the data" or something like that.
Removed the references to worker nodes.
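For readers following the thread, here is a minimal sketch of how the three environment types discussed in this PR (`LOOPBACK`, `EXTERNAL`, `PROCESS`) might be selected through pipeline flags. It only assembles flag strings and does not import Beam. The helper `portable_flags`, the job endpoint `localhost:8099`, the worker pool address `localhost:50000`, and the boot path `/path/to/boot` are hypothetical placeholders chosen for illustration, not values taken from this PR.

```python
import json


def portable_flags(environment_type, environment_config=None,
                   job_endpoint="localhost:8099"):
    """Build pipeline flags for a portable runner (illustrative helper only).

    job_endpoint is an assumed local job server address.
    """
    flags = [
        "--runner=PortableRunner",
        f"--job_endpoint={job_endpoint}",
        f"--environment_type={environment_type}",
    ]
    if environment_config is not None:
        # PROCESS takes a JSON object; EXTERNAL takes a plain address string.
        if isinstance(environment_config, dict):
            environment_config = json.dumps(environment_config)
        flags.append(f"--environment_config={environment_config}")
    return flags


# LOOPBACK: no worker is started; the submitting process handles the data.
loopback = portable_flags("LOOPBACK")

# EXTERNAL: the config is the address of an already-running worker pool
# (assumed address).
external = portable_flags("EXTERNAL", "localhost:50000")

# PROCESS: the config is JSON naming the command to run (hypothetical path).
process = portable_flags("PROCESS", {"command": "/path/to/boot"})
```

A list like `loopback` could then be passed as the argument list when constructing the pipeline options in the Python SDK; the exact set of flags a given runner needs may differ.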
@ibzib Do you want to merge/address the remaining suggestions?
@tweise I moved this section to its own page.
tweise left a comment:
Thanks for moving the page!
@ibzib please check if the JIRA should be closed: https://issues.apache.org/jira/browse/BEAM-8273
Document the EXTERNAL environment.
Add additional instructions for using the PROCESS environment.
cc @functicons