Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-11] Integrate Spark runner with Beam #42

Closed
wants to merge 10 commits into from

Conversation

@amitsela
Copy link
Member

@amitsela amitsela commented Mar 12, 2016

No description provided.

@davorbonaci
Copy link
Member

@davorbonaci davorbonaci commented Mar 13, 2016

I'll take a peek at this one shortly.

R: @davorbonaci

</dependency>
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-examples-all</artifactId>
Copy link
Member

@davorbonaci davorbonaci Mar 13, 2016

Is this really needed?

Copy link
Member Author

@amitsela amitsela Mar 13, 2016

It sounds fair to me to prefer the runners logger.. Flink runner does the same.

Copy link
Member

@tomwhite tomwhite Mar 14, 2016

The dependency is needed because some of the examples are used to test the Spark runner.

@davorbonaci
Copy link
Member

@davorbonaci davorbonaci commented Mar 13, 2016

LGTM

Nice!

I think we'll have to go over all pom.xml files in the project and fix them up globally -- but, that's unrelated to this pull request.

@davorbonaci
Copy link
Member

@davorbonaci davorbonaci commented Mar 13, 2016

(We should get to the bottom of the Jenkins failure before merging.)

@amitsela
Copy link
Member Author

@amitsela amitsela commented Mar 13, 2016

R: @tomwhite as well

<!--<transformers>-->
<!--<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />-->
<!--</transformers>-->
<!--</configuration>-->
Copy link
Member

@tomwhite tomwhite Mar 14, 2016

Guava will still need to be relocated to run properly on a cluster, won't it?

Copy link
Member Author

@amitsela amitsela Mar 14, 2016

The SDK upgraded to Guava 19 but I guess shading is still necessary for cluster. I'll reinstate the shade configuration.

@tomwhite
Copy link
Member

@tomwhite tomwhite commented Mar 14, 2016

Looks good to me. Thanks for working on it @amitsela. A few comments inline and here:

  • The license headers should be changed to ASF ones.
  • Since you are reorganising packages, how about keeping only the ones that clients use (SparkPipelineRunner, SparkPipelineOptions, EvaluationResult) in the top-level org.apache.beam.runners.spark package, and moving all the others to subpackages?
  • Remove .gitignore and .travis.yml.

@tomwhite
Copy link
Member

@tomwhite tomwhite commented Mar 14, 2016

Also, the note about the Spark runner on https://github.com/apache/incubator-beam#runners should be updated to say that it's now a part of Beam.

@amitsela
Copy link
Member Author

@amitsela amitsela commented Mar 14, 2016

Thanks @tomwhite and @davorbonaci !
I'll do a second iteration and hope that by the time I'm done, the Jenkins issue will be solved as well :)

I plan to address the following:

  • Shade configuration
  • ASF licenses
  • Remove .gitignore and .travis.iml
  • Package organization
  • Updated README

@davorbonaci like you said, I think that we need a cross-project pom.xml work to get all components inline, but let's get this runner running first :)

@amitsela
Copy link
Member Author

@amitsela amitsela commented Mar 14, 2016

This pull request is till pending additional work so please DON'T MERGE.
An early push was executed to trigger a Jenkins job to test a new configuration.

Thanks!

@amitsela
Copy link
Member Author

@amitsela amitsela commented Mar 15, 2016

@tomwhite please review second iteration. Thanks.

@tomwhite
Copy link
Member

@tomwhite tomwhite commented Mar 15, 2016

+1 from me

@asfgit asfgit closed this in a91e115 Mar 15, 2016
aljoscha pushed a commit to aljoscha/beam that referenced this issue Mar 29, 2018
Wire job service API into portable runner PipelineResults
tvalentyn pushed a commit to tvalentyn/beam that referenced this issue May 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants