[BEAM-2795] Use portable constructs in Flink batch translator #4343

Merged
kennknowles merged 2 commits into apache:master from bsidhom:flink-portable-batch
Jan 13, 2018

Conversation

@bsidhom (Contributor) commented Jan 4, 2018

This was tested by round-tripping batch pipelines to and from protobuf form. It works with both real Java pipelines and rehydrated pipelines.

References and downcasts to specific transform subclasses are replaced with generic PTransforms. Transform metadata is now accessed through the translation utilities under org.apache.beam.runners.core.construction.

CombineTranslation uses a new side input extractor modeled after ParDoTranslation#getSideInputs.

The RawCombine rehydrated transform exposes side inputs via getAdditionalInputs. Side inputs were not previously exposed as "additional" inputs, so FlinkBatchTranslationContext#getInput could not properly extract the main output collection when side inputs were used.

The ParDo union coder is picky about ordering. It appears that coders must appear at the same indexes as their respective output collection tags. This ordering is now preserved.
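The ordering constraint described above can be illustrated with a small self-contained sketch. This is plain Java with made-up names, not Beam's actual UnionCoder API: the point is only that a tagged value is encoded by the coder at its tag's index, so the decoding side must list the coders in exactly the same order.

```java
import java.util.Arrays;
import java.util.List;

// Minimal stand-in for a tagged-union coder. Names are illustrative and do
// not match Beam's real UnionCoder; encode and decode must simply agree on
// the coder order.
public class UnionOrderSketch {
  public interface Coder {
    String encode(Object value);
    Object decode(String data);
  }

  public static final Coder STRING = new Coder() {
    public String encode(Object value) { return (String) value; }
    public Object decode(String data) { return data; }
  };

  public static final Coder INT = new Coder() {
    public String encode(Object value) { return Integer.toString((Integer) value); }
    public Object decode(String data) { return Integer.parseInt(data); }
  };

  // Encode a (tagIndex, value) pair using the coder at tagIndex.
  public static String encode(List<Coder> coders, int tag, Object value) {
    return tag + "|" + coders.get(tag).encode(value);
  }

  // Decode by looking up the coder at the recorded tag index.
  public static Object decode(List<Coder> coders, String data) {
    int sep = data.indexOf('|');
    int tag = Integer.parseInt(data.substring(0, sep));
    return coders.get(tag).decode(data.substring(sep + 1));
  }

  public static void main(String[] args) {
    List<Coder> writerOrder = Arrays.asList(STRING, INT);
    String data = encode(writerOrder, 1, 42);
    // Same coder order on both sides: the Integer round-trips.
    System.out.println(decode(writerOrder, data).getClass().getSimpleName()); // Integer
    // A reader whose coder list is permuted pairs tag 1 with the wrong
    // coder and produces a value of the wrong type.
    List<Coder> permuted = Arrays.asList(INT, STRING);
    System.out.println(decode(permuted, data).getClass().getSimpleName()); // String
  }
}
```

A permuted coder list does not fail loudly here; it silently decodes the wrong type, which is exactly why the index order has to be preserved across a protobuf round-trip.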

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

@bsidhom (Contributor, Author) commented Jan 4, 2018

I've left a protobuf round-trip in this change to ensure that all tests pass on rehydrated pipelines. I can remove this before submitting.

I've left a few TODOs inline as questions to the reviewer.

Finally, I'm not sure whether it's stylistically appropriate to use these try-catch blocks everywhere. How is this normally handled in Beam? Should I bother with better error messages?

@kennknowles (Member) left a comment

Seems fine to me. I think @aljoscha should take a look.

(PCollection<?>) application.getInputs().get(new TupleTag<>(sideInputTag)),
"no input with tag %s",
sideInputTag);
// TODO: Should ParDoTranslation#viewFromProto live elsewhere?
Member:

Yea, seems like it could live in something like PCollectionViewTranslation

Contributor Author:

Moved.


@Override
public Map<TupleTag<?>, PValue> getAdditionalInputs() {
// TODO: This was ripped from ParDoTranslation. Is this correct?
Member:

Yes, this looks fine to me. Probably this could also be a helper that is shared between ParDo and Combine, since the canonical definition of how these side inputs should work is based on the composite definition of Combine as a GBK followed by a ParDo.
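A sketch of the kind of shared helper being suggested, with hypothetical names (this is not Beam's actual API): given a transform's full input map and the tags of its declared side inputs, it returns just the side inputs as the "additional" inputs, so both the ParDo and Combine translations could delegate to it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical shared helper: filters a transform's input map down to its
// side inputs. Both ParDo and Combine getAdditionalInputs() implementations
// could reuse something of this shape.
public class AdditionalInputs {
  public static <TagT, ValueT> Map<TagT, ValueT> sideInputsOnly(
      Map<TagT, ValueT> allInputs, Set<TagT> sideInputTags) {
    Map<TagT, ValueT> additional = new LinkedHashMap<>();
    for (Map.Entry<TagT, ValueT> entry : allInputs.entrySet()) {
      if (sideInputTags.contains(entry.getKey())) {
        additional.put(entry.getKey(), entry.getValue());
      }
    }
    return additional;
  }
}
```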

Contributor Author:

Where should such a helper live?

Contributor Author:

Any suggestions? Based on the size of this code, I don't think it's worth refactoring into a brand new class. I'll leave it here unless you have a better suggestion.

LOG.info(node.getTransform().getClass().toString());
throw new UnsupportedOperationException("The transform " + transform
String transformUrn = PTransformTranslation.urnForTransform(transform);
LOG.info(transformUrn);
Member:

Logging seems redundant with the exception?

Contributor:

Yes, that was probably my mistake, leaving that log output in.

Map<TupleTag<?>, PValue> outputs = context.getOutputs(transform);

TupleTag<?> mainOutputTag;
try {
Member:

Kind of cluttery having this construct everywhere. I wonder if there is a way to lift the exception catching.

Contributor Author:

Is java 8 supported by all the submodules? The only clean way I can think of is to demote the exception to runtime using a lambda wrapper.

Contributor Author:

I've gone the lambda route to address this. It's fairly general and will be needed again in the streaming translator. However, I've left it inside of FlinkBatchTransformTranslators for now because it's unclear to me whether Java 8 is allowed globally.

Please advise on a better location/name.
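The lambda wrapper being discussed might look roughly like this (illustrative names, not necessarily what was added to FlinkBatchTransformTranslators): a functional interface that permits checked exceptions, plus an adapter that rethrows them unchecked, so each translator body no longer needs its own try-catch.

```java
import java.io.IOException;
import java.util.function.Function;

// Sketch of demoting a checked exception to unchecked via a lambda wrapper.
// Requires Java 8+, which is exactly why the -source setting matters here.
public class Unchecked {
  @FunctionalInterface
  public interface ThrowingFunction<T, R> {
    R apply(T input) throws Exception;
  }

  // Wraps a throwing function so callers see only unchecked exceptions.
  public static <T, R> Function<T, R> wrap(ThrowingFunction<T, R> f) {
    return input -> {
      try {
        return f.apply(input);
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    };
  }

  public static void main(String[] args) {
    ThrowingFunction<String, Integer> parse = s -> {
      if (s.isEmpty()) {
        throw new IOException("empty input");
      }
      return Integer.parseInt(s);
    };
    // No try-catch needed at the call site.
    System.out.println(wrap(parse).apply("42")); // 42
  }
}
```

Under -source 1.7 this approach is unavailable, which is the constraint that surfaces later in this thread.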

Contributor Author:

Looks like I wrote that too soon. The Maven build appears to be configured to use -source 1.7, at least for this module. I'm not sure what else to do.

@kennknowles (Member):

run flink validatesrunner

@aljoscha (Contributor) left a comment

I second @kennknowles's comments and made some of my own. Overall I think this looks good but it's currently failing the Jenkins hooks.

sideInputTag);
// TODO: Should ParDoTranslation#viewFromProto live elsewhere?
views.add(
ParDoTranslation.viewFromProto(sideInput, sideInputTag, originalPCollection, combineProto, components));
Contributor:

checkstyle violation: line's too long

Contributor Author:

I was wondering why this wasn't caught at compile time. Forgot I had been disabling checkstyle for faster builds. ;)

}
}

// TODO: Why does the UnionCoder order have to match the output map order? Why does this
Contributor:

The map of tags is created here:

And it's used again here to create individual Flink DataSets for each of the output tag indices:

for (Entry<TupleTag<?>, PValue> output : outputs.entrySet()) {

If the order, i.e. the index, changes in between then the mapping to outputs won't be correct anymore.
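A rough sketch of the index mapping described here (a hypothetical helper, not the actual Flink translator code): the main output tag is pinned to index 0 and the remaining tags take the following indices, and any list of union coders has to be built in this same order.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical tag-to-index map: main output first, remaining tags in
// iteration order. A union coder list must follow these same indices.
public class OutputIndexMap {
  public static Map<String, Integer> indexMap(String mainTag, Collection<String> allTags) {
    Map<String, Integer> indices = new LinkedHashMap<>();
    indices.put(mainTag, 0); // main output is always index 0
    int next = 1;
    for (String tag : allTags) {
      if (!tag.equals(mainTag)) {
        indices.put(tag, next++);
      }
    }
    return indices;
  }
}
```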

Contributor Author:

That first link is exactly what's happening above. Unfortunately, it's not sufficient to make the union coder work properly: leaving that as-is breaks when I do a protobuf round-trip. To get the tests to pass, I have to create a union coder that lists the individual coders in the same order they appear in the output map. My question is: why is this only necessary when using a rehydrated pipeline?

Contributor:

I think it might be because rehydration messes up the order of the individual coders. I also just realised that the code that constructs the outputMap is making sure to put the main coder at index 0 while the code that constructs the lists of union coders doesn't do it. I think this just happened to work because the main-output coder always was at index 0.

Contributor:

To be more specific: I think rehydration changes the order so that the main input is no longer at index 0.

Contributor Author:

Ah, this is interesting. Is there any requirement that the rest of the outputs match the order of their respective coders?

Member:

UnionCoder requires the order of the inputs/outputs for the tags to match because the union coder encodes values in a specific order and, when reading them back, needs to decode them in that same order.

Contributor Author:

OK, thanks for confirming this. I'll leave the fix as-is and remove the comment.

@bsidhom (Contributor, Author) left a comment

Responding to comments.

The main thing to note here is that I've added a dependency on Java 8. Our current Flink dependency (1.4.0) doesn't require this, but Flink 1.5.0 will. I think it makes sense to start moving in this direction, but it's not required yet.

If we need to stick with Java 7, then I can bring the try/catch boilerplate back.

@kennknowles (Member):

We do need to stick with Java 7 until the vote concludes. But this PR could perhaps wait for that result rather than reverting.



<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.8</source>
Contributor Author:

For some reason, compilation fails on my machine even with Java 8 set for the runners/flink module. I'm reverting it for now and adding a TODO to refactor it once Java 8 is the default.

@bsidhom (Contributor, Author) commented Jan 5, 2018

From what I can tell, Jenkins is failing for some reason unrelated to these changes. Let me know if this is not the case.

@bsidhom (Contributor, Author) commented Jan 5, 2018

Rebasing to see if it helps with the Jenkins issue.

@bsidhom force-pushed the flink-portable-batch branch from 8878f94 to a65e98f on January 5, 2018 22:59
@bsidhom (Contributor, Author) commented Jan 6, 2018

OK, at least some of the old failures were fixed. The Gradle build now fails due to what looks like a corrupt dependency file. The Maven build fails due to unspecified "dependency problems".

@bsidhom (Contributor, Author) commented Jan 11, 2018

run Flink ValidatesRunner

@bsidhom (Contributor, Author) commented Jan 11, 2018

Friendly ping.

@kennknowles (Member):

The result of the Flink ValidatesRunner is gone since the last commit was pushed but it was https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Flink/4665/ so this is as green as it gets for now.

@aljoscha (Contributor):

What's up with the pre-commit failures? Other than that I think this LGTM!

(Sorry for the tardy responses, I was traveling and I will be on vacation until end of next week.)

@bsidhom (Contributor, Author) commented Jan 12, 2018

Run Flink ValidatesRunner

@kennknowles (Member):

There's a known issue with Dataflow right now causing the WordCountIT to fail - it is because of the monster classpath containing all of the Flink, Spark, Apex, Gearpump, and Dataflow deps in the same Maven profile. Gradle runs the same tests in a better way, so we know they pass. And the Gradle build is failing only because of a known broken MqttIO test. I will sickbay all of these today; I didn't get to them yesterday.

So, yea, this is green as far as I am concerned.

@kennknowles (Member):

Ben - can you reorganize the commits into a history of meaningful changes, squashing any that were just incremental changes during development and review?

@bsidhom force-pushed the flink-portable-batch branch from a1cd53f to 5d14026 on January 12, 2018 19:59
@bsidhom (Contributor, Author) commented Jan 12, 2018

I've rebased and cleaned this up. Should be ready for merging.

@bsidhom force-pushed the flink-portable-batch branch from 5d14026 to ee95b95 on January 12, 2018 20:30
@kennknowles kennknowles assigned kennknowles and unassigned aljoscha Jan 12, 2018
@kennknowles (Member):

It does still include a commit and a revert of that commit. Seems silly to add both.

@bsidhom (Contributor, Author) commented Jan 12, 2018

I left that so it's easy enough to cherry pick the change back in when we switch to Java 8. I'll just remove it altogether.

…ation

`CombineTranslation` uses a new side input extractor modeled after
`ParDoTranslation#getSideInputs`.

The `RawCombine` rehydrated transform exposes side inputs via
`getAdditionalInputs`. Side inputs were not previously exposed as
"additional" inputs, so portable translators could not properly extract
the main output collection when side inputs were used.

`ParDoTranslation.viewFromProto` was used all over this package for
general view translations. This method has been moved into a new
`PCollectionViewTranslation` class.
This was tested by round-tripping batch pipelines to and from protobuf
form. It works with both real Java pipelines and rehydrated pipelines.

References and downcasts to specific transform subclasses are replaced
with generic `PTransform`s. Transform metadata is now accessed through
the translation utilities under
`org.apache.beam.runners.core.construction`.

The `ParDo` union coder is picky about ordering. It appears that coders
must appear at the same indexes as their respective output collection
tags. This ordering is now preserved.
@bsidhom force-pushed the flink-portable-batch branch from ee95b95 to 5ddf50b on January 12, 2018 22:25
@bsidhom (Contributor, Author) commented Jan 12, 2018

I've removed the commit/revert pair.

@kennknowles kennknowles merged commit b88d150 into apache:master Jan 13, 2018
@lukecwik (Member):

This broke the checkstyle in Flink; filed BEAM-3478 and cut PR #4410

@bsidhom bsidhom deleted the flink-portable-batch branch January 18, 2018 22:42