Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate targets when mixing Scala and Java dependencies #299

Open
salvalcantara opened this issue May 4, 2021 · 4 comments
Open

Duplicate targets when mixing Scala and Java dependencies #299

salvalcantara opened this issue May 4, 2021 · 4 comments

Comments

@salvalcantara
Copy link

salvalcantara commented May 4, 2021

There seems to be a bug caused by mixing Scala and Java dependencies. The tool seems to use the same target name for both, and thus produces duplicate targets (one for Scala and one for Java). In particular, I'm hitting this problem in a very simple Flink/Scala project where the dependencies look like this:

dependencies:
  org.apache.flink:
    flink:
      lang: scala
      version: "1.11.2"
      modules: [clients, scala, streaming-scala]
    flink-connector-kafka:
      lang: java
      version: "0.10.2"
    flink-test-utils:
      lang: java
      version: "0.10.2"

FYI, the full code is available here: https://github.com/salvalcantara/bazel-flink-scala.

When I try to build the project, I'm getting this error:

Starting local Bazel server and connecting to it...
ERROR: Traceback (most recent call last):
        File "/Users/salvalcantara/Projects/me/bazel-flink-scala/WORKSPACE", line
44, column 25, in <toplevel>
                build_external_workspace(name = "vendor")
        File
"/Users/salvalcantara/Projects/me/bazel-flink-scala/vendor/target_file.bzl",
line 258, column 91, in build_external_workspace
                return build_external_workspace_from_opts(name = name, target_configs =
list_target_data(), separator = list_target_data_separator(), build_header =
build_header())
        File
"/Users/salvalcantara/Projects/me/bazel-flink-scala/vendor/target_file.bzl",
line 251, column 40, in list_target_data
                "vendor/org/apache/flink:flink_clients":
["lang||||||scala:2.12.11","name||||||//vendor/org/apache/flink:flink_clients","visibility||||||//visibility:public","kind||||||import","deps|||L|||","jars|||L|||//external:jar/org/apache/flink/flink_clients_2_12","sources|||L|||","exports|||L|||","runtimeDeps|||L|||//vendor/commons_cli:commons_cli|||//vendor/org/slf4j:slf4j_api|||//vendor/org/apache/flink:force_shading|||//vendor/com/google/code/findbugs:jsr305|||//vendor/org/apache/flink:flink_streaming_java_2_12|||//vendor/org/apache/flink:flink_core|||//vendor/org/apache/flink:flink_java|||//vendor/org/apache/flink:flink_runtime_2_12|||//vendor/org/apache/flink:flink_optimizer_2_12","processorClasses|||L|||","generatesApi|||B|||false","licenses|||L|||","generateNeverlink|||B|||false"],
Error: dictionary expression has duplicate key:
"vendor/org/apache/flink:flink_clients"
ERROR: error loading package 'external': Package 'external' contains errors
INFO: Elapsed time: 3.644s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)

If you look at the dict lines that are reported as conflicting, you'll see the duplicate "vendor/org/apache/flink:flink_clients" target:

 "vendor/org/apache/flink:flink_clients": ["lang||||||java","name||||||//vendor/org/apache/flink:flink_clients", ...],
 "vendor/org/apache/flink:flink_clients": ["lang||||||scala:2.12.11","name||||||//vendor/org/apache/flink:flink_clients", ...],

Maybe it's possible to customize the target templates to suffix targets with the language? Something like:

_JAVA_LIBRARY_TEMPLATE = """
java_library(
  name = "{name}_java",
..."""

_SCALA_IMPORT_TEMPLATE = """
scala_import(
    name = "{name}_scala",
..."""

In any case, I would really appreciate having this issue addressed, might it be a bug or not.

@johnynek
Copy link
Collaborator

johnynek commented May 4, 2021

huh...

I don't think we've considered this case. Due to the _version approach of scala jar publishing there could be two different jars which collide if you chop off the version information from the name.

I think a good fix would be to:

  1. make an explicit check for this in the code and give a good error message before writing the graph out.
  2. require the user to somehow disambiguate in their yaml config file

If we automatically disambiguate, we would change the output if we happen to add a dependency that causes a collision, but that could happen late in the life of a big repo, which would be a pain.

to rename, we could something like:

  1. add field to the record: "alias" which allows you to rename a single item (if there are no modules).
  2. if you do use modules then the modules can either be strings, or a map with one item modules: [{"client": "client_java"}] or something like that...

I can coach you on what files need to change to get this done, but I probably can't get to this very soon personally. cc @ianoc who may or may not have any other thoughts on this.

@salvalcantara
Copy link
Author

Sounds good @johnynek ... I might be able to help if you point me on the right direction (don't promise anything, though).

@johnynek
Copy link
Collaborator

johnynek commented May 5, 2021

So, the key things would be:

  1. add support for representing the data in the DepsModel file: https://github.com/johnynek/bazel-deps/blob/3bed1b71bf94a62b566d6d98f119f2708b28c367/src/scala/com/github/johnynek/bazel_deps/DepsModel.scala#L601
  2. check for duplicates after language normalization I think in this method: https://github.com/johnynek/bazel-deps/blob/3bed1b71bf94a62b566d6d98f119f2708b28c367/src/scala/com/github/johnynek/bazel_deps/Writer.scala#L117

I hope that's a good starting point.

@johnynek
Copy link
Collaborator

johnynek commented May 5, 2021

BTW: you might be able to make two PRs: the second one: error when there is a duplicate with a nice error, might be the easier of the two and would have helped you figure out (though not remedy) the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants