[BEAM-890] Update compatibility matrix for Spark. by amitsela · Pull Request #65 · apache/beam-site

amitsela · 2016-11-04T09:24:50Z

No description provided.

amitsela · 2016-11-04T09:24:58Z

asfbot · 2016-11-04T09:25:41Z

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Website_Test/16/

asfbot · 2016-11-04T09:30:08Z

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Website_Stage/62/

Jenkins built the site at commit id 97c147f with Jekyll and staged it here. Happy reviewing.

Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.

davorbonaci · 2016-11-04T16:06:05Z

R: @kennknowles

kennknowles

So happy to be changing cells to "yes" :-)

Some suggestions about how to focus the explanations.

kennknowles · 2016-11-04T16:49:31Z

_data/capability-matrix.yml

-            l2: group by window in batch only
-            l3: "Uses Spark's groupByKey for grouping. Grouping by window is currently only supported in batch."
+            l2: support for grouping by panes (streaming) is a work in progress.
+            l3: Using groupByKey for grouping, but only if the pipeline explicitly calls for GroupByKey or the model forces it. For efficient group-compute see Combine.


Say something positive about batch first, like:

"Full support in batch mode. GroupByKey with multiple trigger firings in streaming mode is a work in progress."

And then I don't think you actually need to teach users about how to program against it here. The statement is likely true for most runners. If you just say "Using Spark's groupByKey" that tells users what they need to know, if they are familiar with Spark. (do use code font, I'd say)

Maybe we should be clearer about this in general. I know that people seeing GroupByKey associated with Spark in the same context caused a bit of a "riot" on Twitter a while ago ;-)
Maybe clarifying the optimizations via Combine in general is a good idea.

Agree with Amit.

@kennknowles do you mean using the tt tag for groupByKey, combineByKey and such?

Whatever Jekyll requires. I guess it is YAML and I was thinking Markdown.

@jbonofre any input here? I'm the worst web developer in the project. For sure.

I think ' should do the trick.
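
For readers following the GroupByKey versus Combine point raised in this thread, here is a minimal sketch of why the distinction matters. It is plain Python, not Spark or Beam code; the function names `group_then_sum` and `combine_per_key` and the sample data are hypothetical and only illustrate the general idea that a combine-style aggregation can pre-aggregate within each partition before merging, while a group-style aggregation holds every value per key.

```python
from collections import defaultdict


def group_then_sum(pairs):
    """Group-style: collect every value per key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)  # every value is kept until the reduce step
    return {key: sum(values) for key, values in groups.items()}


def combine_per_key(partitions):
    """Combine-style: pre-aggregate inside each partition, then merge partials."""
    partials = []
    for partition in partitions:
        acc = defaultdict(int)
        for key, value in partition:
            acc[key] += value  # only one running total per key per partition
        partials.append(acc)

    merged = defaultdict(int)
    for acc in partials:
        for key, value in acc.items():
            merged[key] += value

    # Number of partial records that would cross partition boundaries.
    moved_records = sum(len(acc) for acc in partials)
    return dict(merged), moved_records


# Hypothetical sample data: two partitions of (key, value) pairs.
partitions = [[("a", 1), ("b", 2), ("a", 3)], [("a", 4), ("b", 5)]]
flat = [pair for partition in partitions for pair in partition]

grouped = group_then_sum(flat)
combined, moved_records = combine_per_key(partitions)
```

Both approaches give the same totals here, but the combine-style version only has to move one partial result per key per partition instead of every raw record.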

kennknowles · 2016-11-04T16:51:54Z

_data/capability-matrix.yml

            l1: 'Yes'
            l2: fully supported
-            l3: Supports GroupedValues, Globally and PerKey.
+            l3: Using combineByKey and aggregate functions.


"Using Spark's combineByKey and aggregate functions."

kennknowles · 2016-11-04T16:55:54Z

_data/capability-matrix.yml

-            l3: "Side input is actually a broadcast variable in Spark so it can't be updated during the life of a job. Spark-runner implementation of side input is more of an immutable, static, side input."
+            l1: 'Yes'
+            l2: fully supported
+            l3: A side input is actually a broadcast variable in Spark. In streaming mode, a side input could be updated between micro-batches. The distribution of side inputs to workers is not partitioned, but only a worker assigned with a relevant task will get a copy of the side input.


"Using Spark's broadcast variables."

I actually think the rest of it is just a description of what a side input is. They are always global views of a PCollection, and are generally always only read by workers who are working on a task that needs them.

I'm not saying you can't have some explanation, but maybe more focused. Maybe the second sentence is good, like "In streaming mode, side input values only update between micro-batches."

I don't understand the comments for Flink and Dataflow. Clearly side inputs have size restrictions by design, so why is streaming different than batch? And why write it here?
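
The suggested wording "side input values only update between micro-batches" can be pictured with a small sketch. This is a plain-Python simulation under stated assumptions, not the Spark runner's implementation: `run_micro_batches`, the batch contents, and the side-input versions are all hypothetical. Each batch reads a frozen snapshot of the side input, and a new snapshot is only picked up when the next batch starts.

```python
def run_micro_batches(batches, side_input_versions):
    """Process each micro-batch against the snapshot taken when that batch starts."""
    results = []
    for batch, snapshot in zip(batches, side_input_versions):
        # Copy the snapshot so it stays fixed for the whole batch,
        # loosely mimicking a read-only broadcast value.
        frozen = dict(snapshot)
        results.append([frozen.get(key, "unknown") for key in batch])
    return results


# Hypothetical data: the side input changes between the first and second batch.
batches = [["x", "y"], ["x", "y"]]
versions = [{"x": "v1"}, {"x": "v2", "y": "v2"}]

out = run_micro_batches(batches, versions)
```

In this sketch the first batch never sees the second version of the side input, which is the behavior the proposed sentence describes.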

amitsela · 2016-11-04T17:39:51Z

@kennknowles @jbonofre trying a 2nd iteration.

asfbot · 2016-11-04T17:46:22Z

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Website_Stage/63/

Jenkins built the site at commit id 6fa3dda with Jekyll and staged it here. Happy reviewing.

Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.

asfbot · 2016-11-04T17:47:23Z

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Website_Test/17/

davorbonaci · 2016-11-04T23:54:34Z

LGTM

jbonofre · 2016-11-05T09:13:18Z

LGTM

[BEAM-890] Update compatibility matrix for Spark.

97c147f

kennknowles requested changes Nov 4, 2016

fixup! second iteration.

6fa3dda

asfgit closed this in e96b07f Nov 4, 2016

amitsela deleted the BEAM-890 branch November 5, 2016 08:36

robertwb pushed a commit to robertwb/incubator-beam that referenced this pull request Jun 5, 2018

This closes apache/beam-site#65

301f299

melap pushed a commit to apache/beam that referenced this pull request Jun 20, 2018

This closes apache/beam-site#65

8879ebe
