Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-1013, BEAM-1353] Website fixups after PTransform Style Guide changes #243

Closed
wants to merge 2 commits into from

Conversation

jkff
Copy link

@jkff jkff commented May 12, 2017

@asfbot
Copy link

asfbot commented May 12, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Website_Stage/474/

Jenkins built the site at commit id 001f20e with Jekyll and staged it here. Happy reviewing.

Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.

@asfbot
Copy link

asfbot commented May 13, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Website_Stage/475/

Jenkins built the site at commit id 98b9b70 with Jekyll and staged it here. Happy reviewing.

Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.

Copy link
Member

@davorbonaci davorbonaci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -24,7 +24,7 @@ Apex was built as stateful stream processor from the ground up. Operators [check

## Translation to Apex DAG

A Beam runner needs to implement the translation from the Beam model to the underlying frameworks execution model. In the case of Apex, the runner will translate the pipeline into the [native (compositional, low level) DAG API](https://www.datatorrent.com/blog/tracing-dags-from-specification-to-execution/) (which is also the base for a number of other API that are available to specify applications that run on Apex). The DAG consists of operators (functional building blocks that are connected with streams. The runner provides the execution layer. In the case of Apex it is distributed stream processing, operators process data event by event. The minimum set of operators covers Beam’s primitive transforms: `ParDo.Bound`, `ParDo.BoundMulti`, `Read.Unbounded`, `Read.Bounded`, `GroupByKey`, `Flatten.FlattenPCollectionList` etc.
A Beam runner needs to implement the translation from the Beam model to the underlying frameworks execution model. In the case of Apex, the runner will translate the pipeline into the [native (compositional, low level) DAG API](https://www.datatorrent.com/blog/tracing-dags-from-specification-to-execution/) (which is also the base for a number of other API that are available to specify applications that run on Apex). The DAG consists of operators (functional building blocks that are connected with streams. The runner provides the execution layer. In the case of Apex it is distributed stream processing, operators process data event by event. The minimum set of operators covers Beam’s primitive transforms: `ParDo.SingleOutput`, `ParDo.MultiOutput`, `Read.Unbounded`, `Read.Bounded`, `GroupByKey`, `Flatten.PCollections` etc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert; we don't change blog posts -- they are dated and should be accurate as the date of publication.

If warranted, we can add a note at the top of the blog that it may be outdated -- but, I wouldn't change the text.

@@ -420,8 +420,9 @@ PCollection<KV<UserId, Event>> events = ...
PCollectionView<Map<UserId, Model>> userModels = events

// A tradeoff between latency and cost
.apply(Window.triggering(
AfterProcessingTime.pastFirstElementInPane(Duration.standardSeconds(1)))
.apply(Window.configure()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(leave this one as is)

Fixes usages of PTransforms affected by changes as part of
https://issues.apache.org/jira/browse/BEAM-1353
Copy link

@melap melap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making these changes!

@@ -1227,7 +1206,7 @@ Each pipeline object has a `CoderRegistry`. The `CoderRegistry` represents a map
The Beam SDK for Python has a `CoderRegistry` that represents a mapping of Python types to the default coder that should be used for `PCollection`s of each type.

{:.language-java}
By default, the Beam SDK for Java automatically infers the `Coder` for the elements of an output `PCollection` using the type parameter from the transform's function object, such as `DoFn`. In the case of `ParDo`, for example, a `DoFn<Integer, String>function` object accepts an input element of type `Integer` and produces an output element of type `String`. In such a case, the SDK for Java will automatically infer the default `Coder` for the output `PCollection<String>` (in the default pipeline `CoderRegistry`, this is `StringUtf8Coder`).
By default, the Beam SDK for Java automatically infers the `Coder` for the elements of a `PCollection` produced by a `PTransform` using the type parameter from the transform's function object, such as `DoFn`. In the case of `ParDo`, for example, a `DoFn<Integer, String>` function object accepts an input element of type `Integer` and produces an output element of type `String`. In such a case, the SDK for Java will automatically infer the default `Coder` for the output `PCollection<String>` (in the default pipeline `CoderRegistry`, this is `StringUtf8Coder`).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest adding a "by" here: "produced by a PTransform by using the type parameter"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(adding the second by in that snippet, I realize upon re-reading there's another by already there)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, didn't notice this before merging. Maybe include this in the next change whatever it is?.. probably doesn't warrant a separate PR.

@asfbot
Copy link

asfbot commented May 15, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Website_Stage/480/

Jenkins built the site at commit id 2a55166 with Jekyll and staged it here. Happy reviewing.

Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.

@asfgit asfgit closed this in 9ffe5ec May 15, 2017
@jkff jkff deleted the style-guide branch May 15, 2017 19:17
robertwb pushed a commit to robertwb/incubator-beam that referenced this pull request Jun 5, 2018
robertwb pushed a commit to robertwb/incubator-beam that referenced this pull request Jun 5, 2018
melap pushed a commit to apache/beam that referenced this pull request Jun 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants