Skip to content
This repository has been archived by the owner on Nov 10, 2023. It is now read-only.

Buck is inconsistent in dependency resolution order between java tests and binaries, problematic with duplicates #1830

Open
romanoid opened this issue Apr 2, 2018 · 13 comments
Labels
P3 Priority 3

Comments

@romanoid
Copy link
Contributor

romanoid commented Apr 2, 2018

I'm planning to work on fixing this issue, please comment and advice!

We've noticed issue with Buck resolving dependencies in different order for different (but related targets).

This is especially dangerous in the case if any duplicates are present in classpath.

Common reason for it is repository including multiple versions of 3rdparty library different modules need (or authors assume different version is needed), then suddenly one of these modules depends on another. (see example)

Buck uses different lookup order for tests and binaries, this may make all the tests pass but binary may fail with "Method not found in production".

Behaviors:
java_binary: lexicographically earlier target selected first (in case of common version naming, lowest version), sorting happens in AbstractSourcePathResolver

java_test: I haven't looked into implementation details yet, but experiments shown postorder DFS with direct children sorted.

Additionally, if any attempt to trace class path is made with buck query, buck query uses BFS, which is different for both.

Demonstrated in experiment:
https://github.com/romanoid/buck-tinkering/blob/master/deps-experiment/buildables/BUCK

This uses resources instead of java files since they are lighter and easier to trace, but behavior is same for .class files

Proposed changes: I'd like to update both java_test and java_binary to be consistent in dependency traversal order.

Options we have (please add if I'm missing some):

  1. Always sort lexicographically: (java_binary of today)
  • (+) This is the consistent and easy to trace option. (sort output of deps query)
  • (-) With duplicate deps, this always "downgrades dep".
  • (+) Change of dep order does not change artifact.
  • (-) Gives no control to owners of individual module to override deps.
  1. Always sort reversed lexicographically:
  • (+) This is the consistent and easy to trace option. (sort output of deps query)
  • (+) With duplicate deps, this always "upgrades dep".
  • (+) Change of dep order does not change artifact.
  • (-) Gives no control to owners of individual module to override deps.
  1. (pre/post order) DFS with children order equal to declared deps:
  • (-) This is hard to exactly trace since there no command to output the order.
  • (-) With duplicate deps, this will get one that is closer in the order
  • (-) Change of dep order changes results (may require changes to how target keys are computed).
  • (+) Gives module owners ability to force upgrade/downgrade by specifying dependency in their module.
  1. (pre/post order) DFS with sorted children: (java_test of today)
  • (-) This is extremely hard to exactly trace since there no command to output the order and order is not clear from BUCK files.
  • (-) With duplicate deps, this will get one that is closer in the order.
  • (-) Change of dep order changes results (may require changes to how target keys are computed).
  • (+) Gives module owners ability to force upgrade/downgrade by specifying dependency in their module, but requires some naming tricks and knowledge of this sorting.
  1. BFS with children order equal to declared deps:
  • (-) This is hard to exactly trace since there no command to output the order.
  • (+) With duplicate deps, one declared directly in target will always be picked.
  • (+) With transitive deps, one declared closer will be picked.
  • (-) Change of dep order changes results (may require changes to how target keys are computed).
  • (+) Gives owners control by allowing to specify picked transitive deps directly.
  1. BFS with children sorted children: (buck query deps of today)
  • (+) Easy to trace with buck query.
  • (+) With duplicate deps, one declared directly in target will always be picked.
  • (+) With transitive deps, one declared closer will be picked.
  • (+) Change of dep order does not change artifact.
  • (+) Gives owners control by allowing to specify picked transitive deps directly.

Options I like the most are:

  • (2) Easiest to implement (adding sorting to tests, reversing sorting for java binaries), on the other hand it only solves it for duplicated consistently named 3rdparty deps and forces upgrade behavior.
  • (6) Most control and consistency gain, may be hard to implement.

Either behavior change should be no-op for people who don't have duplicates.

One other notice: this is potentially breaking change for anyone who has duplicates in classpath. How do you handle such cases?

I were considering to (if possible) add a new .buckconfig setting for java to keep old behavior but change default to new behavior.

Please advice,
Roman

@bobyangyf
Copy link
Contributor

Could you post the errors for "Buck uses different lookup order for tests and binaries, this may make all the tests pass but binary may fail with "Method not found in production"."

I'm not sure what you are talking about here. Since to my understanding, as long as dependencies are all listed, there shouldn't be problems building/running due to missing dependencies.

@romanoid
Copy link
Contributor Author

romanoid commented Apr 3, 2018

@bobyangyf Hi Bob, please see the example I linked in description I constructed with resources:

https://github.com/romanoid/buck-tinkering/blob/master/deps-experiment/buildables/BUCK

I may construct more sophisticated example with opensource library usages if needed for clarification, I though having clear minimal example is preferred though.

Reiterating, tests see resource contents as:
/1-2.txt: 2
/1-3.txt: 1
/1-4.txt: 1
/1-5.txt: 1
/2-3.txt: 2
/2-4.txt: 2
/2-5.txt: 2
/3-4.txt: 4
/3-5.txt: 3
/4-5.txt: 4

While java binary sees resources as:
/1-2.txt: 1
/1-3.txt: 1
/1-4.txt: 1
/1-5.txt: 1
/2-3.txt: 2
/2-4.txt: 2
/2-5.txt: 2
/3-4.txt: 3
/3-5.txt: 3
/4-5.txt: 4

@kageiit
Copy link
Contributor

kageiit commented Apr 5, 2018

@styurin @ttsugriy Can you please help resolve the direction we want to go with this?

@bobyangyf
Copy link
Contributor

I see.
We'd love you to have a fix.
I think the general consensus amongst buck is to have option 1. lexicographically, as the rest of the code base uses a lot of ImmutableSortedSets that sort lexicographically.

@romanoid
Copy link
Contributor Author

romanoid commented Apr 9, 2018

I don't like option 1 since it gives individual developers no control at all to resolve conflicts (if they happen to depend on 2 versions), so it provides the worse behavior possible.

It is still good from consistency perspective though.

@bobyangyf
Copy link
Contributor

Depending on 2 versions is never a good thing ;)
Consistency, determinism and well defined behaviour is desired, so we are opting for option 1, and enforcing developers to have no conflicts.

There is too much of the code base (well beyond just java) that enforces lexicographic sorting and relies on such behaviour.

romanoid added a commit to romanoid/buck that referenced this issue Apr 10, 2018
…ing regressions.

This is integration test for facebook#1830.
When issue is fixed, the test will be updated accordingly.

Meanwhile as well as after the fix, test will indicate any regressions in the dependency conflict resolution order.
@romanoid
Copy link
Contributor Author

"Depending on 2 versions is never a good thing ;)" – totally agree, but that's not something easy to solve for us at the moment.

@styurin @ttsugriy, do you share the same opinion as @bobyangyf ?

romanoid added a commit to romanoid/buck that referenced this issue Apr 10, 2018
…ing regressions.

This is integration test for facebook#1830.
When issue is fixed, the test will be updated accordingly.

Meanwhile as well as after the fix, test will indicate any regressions in the dependency conflict resolution order.
romanoid added a commit to romanoid/buck that referenced this issue Apr 10, 2018
…ing regressions.

This is integration test for facebook#1830.
When issue is fixed, the test will be updated accordingly.

Meanwhile as well as after the fix, test will indicate any regressions in the dependency conflict resolution order.
@romanoid
Copy link
Contributor Author

Meanwhile, can you please take a look on #1841 this should be useful regardless.

@bobyangyf
Copy link
Contributor

I've talked to them internally regarding this already.

romanoid added a commit to romanoid/buck that referenced this issue Apr 10, 2018
…ing regressions.

This is integration test for facebook#1830.
When issue is fixed, the test will be updated accordingly.

Meanwhile as well as after the fix, test will indicate any regressions in the dependency conflict resolution order.
@romanoid
Copy link
Contributor Author

Sounds good, I'll have PR shortly. (in addition to #1841)

@romanoid
Copy link
Contributor Author

PR: #1844

romanoid added a commit to romanoid/buck that referenced this issue Apr 11, 2018
…ing regressions.

This is integration test for facebook#1830.
When issue is fixed, the test will be updated accordingly.

Meanwhile as well as after the fix, test will indicate any regressions in the dependency conflict resolution order.
romanoid added a commit to romanoid/buck that referenced this issue Apr 11, 2018
…ing regressions.

This is integration test for facebook#1830.
When issue is fixed, the test will be updated accordingly.

Meanwhile as well as after the fix, test will indicate any regressions in the dependency conflict resolution order.
romanoid added a commit to romanoid/buck that referenced this issue Apr 11, 2018
…ing regressions.

This is integration test for facebook#1830.
When issue is fixed, the test will be updated accordingly.

Meanwhile as well as after the fix, test will indicate any regressions in the dependency conflict resolution order.
romanoid added a commit to romanoid/buck that referenced this issue May 2, 2018
@styurin
Copy link

styurin commented May 4, 2018

Agree that we should unify classpath creation in all mentioned cases.

A few thoughts:

  • Why do we need to sort? The only reason I can think of is rule key computation, but, for example, java_binaries with different order of classes should have different rule keys because their runtime behavior can be different.

  • We should give people a way to control the order of dependencies. This is important in cases when stubs are used in tests. With lexicographical sorting the only way to control that is to rename the target names which is counter-intuitive.

Since Java doesn't have notion of dependency graphs (classpath is a flat list), it doesn't enforce sorting or traversal order.

I suggest we choose some stable ordering produced by traversing the graph of dependencies.

@styurin
Copy link

styurin commented May 4, 2018

@jkeljo any specific reasons why we sort classpath entries?

@v-jizhang v-jizhang added the P3 Priority 3 label Mar 18, 2020
shangxinli pushed a commit to shangxinli/parquet-mr that referenced this issue Aug 9, 2021
…ency ordering

Summary:
Roman's research regarding dependency ordering:
facebook/buck#1830

Reviewers: shangx

Reviewed By: shangx

Differential Revision: https://code.uberinternal.com/D3190249
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
P3 Priority 3
Projects
None yet
Development

No branches or pull requests

5 participants