Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java creates an incorrect pipeline proto when core-construction-java jar is not in the CLASSPATH #21111

Open
damccorm opened this issue Jun 4, 2022 · 6 comments

Comments

@damccorm
Copy link
Contributor

damccorm commented Jun 4, 2022

Seems like Java ends up creating incorrect pipeline API proto segments when "core-construction-java" is not available in the system.

 

For example, this may create  a GBK transform without the URN. 

 

This can be reproduced by running the wordcount example in [1] after  removing the following dependency from the pom.xml file.

    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-runners-core-construction-java</artifactId>
      <version>${beam.version}</version>
    </dependency>
 
Seems like this is because Java is unable to register corresponding XYZTranslation classes (for example [2]). 
 

I think we should fail here instead of generating default incorrect proto segments.

 

[1] https://github.com/chamikaramj/multi-language-pipelines

[2] https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/GroupByKeyTranslation.java

 

Imported from Jira BEAM-12807. Original Jira may contain additional context.
Reported by: chamikara.

@kennknowles
Copy link
Member

Agreed. We should fail if we have a primitive that cannot be translated. Is this still occurring? /cc @lukecwik @robertwb who may have some context on the portable translation recent state of affairs.

@lukecwik
Copy link
Member

I haven't seen this being a large issue.

Any reason why we don't merge most of these into sdks/java/core and make classes package private?

I feel as though we have a lot of modules like core-construction-java or java-fn-execution that don't add more dependencies but are split up into separate modules making the code organization more complex then necessary.

@kennknowles
Copy link
Member

Agree. The separation of core-construction-java was to keep protobuf out of the core dependencies. That is clearly obsolete. The separation of java-fn-execution I don't know. I assume it had something to do with the SDK harness and runners sharing the logic there?

@kennknowles
Copy link
Member

To mitigate this, we should have all the runners depend on core-construction so it is always on the classpath.

@kennknowles kennknowles self-assigned this Oct 19, 2022
@kennknowles
Copy link
Member

(until we feel like taking the time to merge the modules)

@kennknowles kennknowles removed their assignment Dec 7, 2022
@kennknowles kennknowles added P2 and removed P1 labels Dec 7, 2022
@kennknowles
Copy link
Member

Given the lack of user reports of this, downgrading to P2. We could pick it up during some cleanup time if we want. It probably isn't a good first issue because it requires knowledge of our code organization, even though it isn't complex.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants