Skip to content

[BEAM-2076] Shade Runner Dependencies in the DirectRunner#2714

Closed
tgroh wants to merge 2 commits intoapache:masterfrom
tgroh:direct_runner_deps
Closed

[BEAM-2076] Shade Runner Dependencies in the DirectRunner#2714
tgroh wants to merge 2 commits intoapache:masterfrom
tgroh:direct_runner_deps

Conversation

@tgroh
Copy link
Member

@tgroh tgroh commented Apr 26, 2017

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

Move inner classes of the DirectRunner to reduce total API Surface.

Add API Surface Test for the Direct Runner.

<pattern>org.apache.beam.runners</pattern>
<!--suppress MavenModelInspection -->
<shadedPattern>
org.apache.beam.runners.direct.util
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

util?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*.util -> *.repackaged.runners.util

<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.0.0</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no version numbers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm'd

Copy link
Contributor

@dhalperi dhalperi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Post maven dependency:tree and comment on its goodness?

@tgroh tgroh force-pushed the direct_runner_deps branch 3 times, most recently from d16024f to 8648b51 Compare May 3, 2017 17:53
@tgroh tgroh force-pushed the direct_runner_deps branch 3 times, most recently from 6a477f0 to b76481c Compare May 4, 2017 02:07
@tgroh
Copy link
Member Author

tgroh commented May 4, 2017

(Removing test and runtime dependencies)

[INFO] --- maven-dependency-plugin:3.0.0:tree (default-cli) @ beam-runners-direct-java ---
[INFO] org.apache.beam:beam-runners-direct-java:jar:0.7.0-SNAPSHOT
[INFO] +- org.apache.beam:beam-sdks-java-core:jar:0.7.0-SNAPSHOT:compile
...
[INFO] +- org.apache.beam:beam-runners-core-construction-java:jar:0.7.0-SNAPSHOT:compile
[INFO] | - org.apache.beam:beam-sdks-common-runner-api:jar:0.7.0-SNAPSHOT:compile
[INFO] +- org.apache.beam:beam-runners-core-java:jar:0.7.0-SNAPSHOT:compile
[INFO] +- com.google.guava:guava:jar:20.0:compile
...
[INFO] +- joda-time:joda-time:jar:2.4:compile
[INFO] +- com.google.code.findbugs:jsr305:jar:3.0.1:compile
[INFO] +- org.slf4j:slf4j-api:jar:1.7.14:compile
[INFO] +- com.google.auto.service:auto-service:jar:1.0-rc2:compile (optional)
[INFO] | - com.google.auto:auto-common:jar:0.3:compile (optional)
[INFO] +- com.google.auto.value:auto-value:jar:1.4.1:provided
[INFO] +- org.hamcrest:hamcrest-all:jar:1.3:provided
[INFO] +- junit:junit:jar:4.12:provided
...

We shade and repackage guava, runners-core-construction-java and runners-core-java. It may also be appropriate to repackage the runner API. My belief is that all other dependencies are also dependencies of the Java SDK, and thus cannot be repackaged.

Copy link
Contributor

@dhalperi dhalperi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not also transitive dependency on protobuf through common-runner-api? That would surprise me, but I don't see it in your list.

If proto's not in the dependency-reduced-pom.xml, I'm down.. but I'm not sure how it's not, exactly. (Unless shade does the transitive magic).

public class DirectRunnerApiSurfaceTest {
@Test
public void testDirectRunnerApiSurface() throws Exception {
// The DirectRunner should expose no more than the Core SDK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this comment is now out of date?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@RunWith(JUnit4.class)
public class SdkCoreApiSurfaceTest {
@SuppressWarnings("unchecked")
public static final Set<String> ALLOWED_PACKAGES =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no longer needed in this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm'd

@tgroh
Copy link
Member Author

tgroh commented May 4, 2017

Whoops, ran on the non-reduced POM.

We don't have a direct dependency on proto; right now we get a transitive dependency via the core SDK and runners-core-construction (and probably runners-core). Those can (and probably should) be shaded independently

[INFO] --- maven-dependency-plugin:3.0.0:tree (default-cli) @ beam-runners-direct-java ---
[INFO] org.apache.beam:beam-runners-direct-java:jar:0.7.0-SNAPSHOT
[INFO] +- org.apache.beam:beam-sdks-java-core:jar:0.7.0-SNAPSHOT:compile
[INFO] | +- com.google.http-client:google-http-client:jar:1.22.0:compile
[INFO] | | - org.apache.httpcomponents:httpclient:jar:4.0.1:compile
[INFO] | | +- org.apache.httpcomponents:httpcore:jar:4.0.1:compile
[INFO] | | +- commons-logging:commons-logging:jar:1.1.1:compile
[INFO] | | - commons-codec:commons-codec:jar:1.3:compile
[INFO] | +- com.google.protobuf:protobuf-java:jar:3.2.0:compile
[INFO] | +- com.fasterxml.jackson.core:jackson-core:jar:2.8.8:compile
[INFO] | +- com.fasterxml.jackson.core:jackson-annotations:jar:2.8.8:compile
[INFO] | +- com.fasterxml.jackson.core:jackson-databind:jar:2.8.8:compile
[INFO] | +- net.bytebuddy:byte-buddy:jar:1.6.8:compile
[INFO] | +- org.apache.avro:avro:jar:1.8.1:compile
[INFO] | | +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile
[INFO] | | +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile
[INFO] | | +- com.thoughtworks.paranamer:paranamer:jar:2.7:compile
[INFO] | | - org.tukaani:xz:jar:1.5:compile
[INFO] | +- org.xerial.snappy:snappy-java:jar:1.1.4-M3:compile
[INFO] | +- org.apache.commons:commons-compress:jar:1.9:compile
[INFO] | - org.apache.commons:commons-lang3:jar:3.5:compile
[INFO] +- com.google.guava:guava-testlib:jar:20.0:test
[INFO] | - com.google.guava:guava:jar:20.0:compile
[INFO] +- joda-time:joda-time:jar:2.4:compile
[INFO] +- com.google.code.findbugs:jsr305:jar:3.0.1:compile
[INFO] +- org.slf4j:slf4j-api:jar:1.7.14:compile
[INFO] +- com.google.auto.service:auto-service:jar:1.0-rc2:compile (optional)
[INFO] | - com.google.auto:auto-common:jar:0.3:compile (optional)

@tgroh
Copy link
Member Author

tgroh commented May 4, 2017

Protobuf is shaded.

@tgroh tgroh force-pushed the direct_runner_deps branch from 91a22bf to f3f4b1f Compare May 4, 2017 22:20
@tgroh
Copy link
Member Author

tgroh commented May 4, 2017

R: @dhalperi

@tgroh
Copy link
Member Author

tgroh commented May 5, 2017

retest this please

@tgroh
Copy link
Member Author

tgroh commented May 5, 2017

The failure was in archetype generation, which appears unrelated.

@tgroh tgroh force-pushed the direct_runner_deps branch from f3f4b1f to 1804465 Compare May 5, 2017 17:07
@tgroh
Copy link
Member Author

tgroh commented May 5, 2017

retest this please

@tgroh tgroh force-pushed the direct_runner_deps branch from 1804465 to f21a6b5 Compare May 8, 2017 16:51
@coveralls
Copy link

Coverage Status

Changes Unknown when pulling f21a6b5 on tgroh:direct_runner_deps into ** on apache:master**.

@tgroh tgroh force-pushed the direct_runner_deps branch from f21a6b5 to 914193e Compare May 9, 2017 00:54
@tgroh
Copy link
Member Author

tgroh commented May 9, 2017

PTAL

I needed to include the Runner API protos to shade runners-core, as they aren't otherwise on the classpath, and then I'm shading them so the dependency doesn't leak.

@tgroh tgroh force-pushed the direct_runner_deps branch from 38cf0b9 to 039b047 Compare May 9, 2017 16:42
When a worker dies because of an error, propagate that error and fail
the Pipeline.
@tgroh tgroh force-pushed the direct_runner_deps branch from 039b047 to 0cf63ac Compare May 9, 2017 16:51
@tgroh tgroh force-pushed the direct_runner_deps branch from 0cf63ac to 7f4853d Compare May 9, 2017 17:08
// The DirectRunner can expose the Core SDK, anything exposed by the Core SDK, and itself
@SuppressWarnings("unchecked")
final Set<String> allowed =
ImmutableSet.of("org.apache.beam.sdk", "org.apache.beam.runners.direct", "org.joda.time");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future improvement, would be nice to make this be just specific classes Pipeline, PipelineOptions, PipelineRunner, and maybe that's roughly it.

@dhalperi
Copy link
Contributor

dhalperi commented May 9, 2017

LGTM, merging. please CP to release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants