Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement exports literal in jvm_target #4329

Merged
merged 7 commits into from Mar 20, 2017

Conversation

JieGhost
Copy link
Contributor

@JieGhost JieGhost commented Mar 14, 2017

Problem (#4315)

Given a scenario with 3 jvm targets A, B, C.
A depends on B. B depends on C.
In some cases (annotations, inheritance, etc), even though A does not directly use C, C has to be in A's classpath for A to compile. With strict_deps turned on, this means C has to be in A's direct dependency list.

This causes trouble if B is a core library that thousands of targets depends on. Anytime B adds a new dependency, if strict_deps is True, then it is very likely that all those thousands of targets that depend on B have to include this new dependency in their BUILD files as well. Given that strict_deps is the direction we are heading, we have to find a way for libraries to "export" some of its dependencies.

Solution

Implemented bazel-like "exports" literal (https://bazel.build/versions/master/docs/be/java.html#java_library.exports) in pants. Now pants user can specify which of a target's dependencies will be exported so the consumers would not need to specify/know about these in case of strict_deps.

An example of this usage would be:

java_library(
  name='A',
  dependencies=[
    'src/java/dep/it:needs',
  ]
  exports=[
    'src/java/dep/it:needs',
  ]
)

With strict_deps = True, any targets that depend on A will have access to 'src/java/dep/it:needs' as well (in compile scope, that means A will have 'src/java/dep/it:needs' in its classpath).

@stuhood stuhood requested review from benjyw and removed request for ity March 15, 2017 06:12
Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Yujie!

@@ -75,6 +77,10 @@ def __init__(self,
at compilation time. This enforces that all direct deps of the target
are declared, and can improve compilation speed due to smaller
classpaths. Transitive deps are always provided at runtime.
:param list exports: A list of exported libraries, which will be accessible to dependents even
with strict_deps turned on. A common use case is for library targets to
export its dependencies. Then any dependents of that library target will
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"to export dependencies that it knows its dependents will need"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

:rtype: list
"""
exports = [Address.parse(spec, relative_to=self.address.spec_path) for spec in self.payload.exports]
return [self._build_graph.get_target(addr) for addr in exports]
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, this will only work if these exports are definitely in the context. This should always be the case when the exports are a subset of a target's dependencies, but that isn't currently being checked.

Another way to ensure that exports and dependencies intersect might be to declare exports as traversable_dependency_specs of a target, which would cause them to be added to the dependencies of the target after construction: basically, the complete set of dependencies would be exports+dependencies, and so you wouldn't need to repeat them.

Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the one hand it would be nice to avoid the redundancy. OTOH, we want people to use exports sparingly, I assume, and if all exports were effectively dependencies it might cause people to use it as a synonym for dependencies without thinking too hard about what actually needs exporting.

Still, I'm leaning towards agreeing with Stu's comment, just providing the counterpoint.

Copy link
Sponsor Member

@stuhood stuhood Mar 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, and now I'm less sure about it. You're right that folks would begin to just use it by default. On the other hand, the number of things that need to be exported in scala code can be high. I can imagine the frustration of needing to repeat 25 out of 100 things in a dep list, for example (a common one in our repo).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think exports is always a subset of dependencies, if a non-existent target in exports list, then "None" will be returned, which should not cause exceptions. Bazel also has similar constraints on that:

Important: an exported rule is not a regular dependency. Sticking to the previous example, if B exports C and wants to also use C, it has to also list it in its own deps.

Allowing "exports" to be a complement of "dependencies" may lead to another cleanup in the future.

What do you think?

Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe it's OK to have the redundancy here, to discourage frivolous use of exports. But then if someone lists an export that isn't in dependencies we should error, not silently return None...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the logic to throw exception in "None" case.

pants_run = self.run_pants(['compile', '--compile-scalafmt-skip', target])
self.assert_success(pants_run)

def modify_exports_and_compile(self, target, modify_file):
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than cloning the sources, you might consider just defining a target that is expected to fail to compile: just need to add it to the list of testprojects targets that are expected to fail here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cloned here because I want to modify the source file to test cache invalidation. I don't want to corrupt the repo, thus I copied it to a temp dir. But I will add a failure case. Benjy also mentioned it. Are you saying I don't have to run the failure test here but just add to the list you linked?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was saying is that if you want to create a testprojects test that fails to compile, you can do that. You just have to add it to the list of targets that are known to fail.

But if you're trying to test invalidation, you do need to create the clone, so carry on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a fail target, and added it to the list you linked.
Those targets are skipped during integration, so I added another test in test_dep_exports_integration.py to make sure it fails.

Copy link
Sponsor Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - this is a very neat and useful piece of functionality!

@@ -40,6 +41,7 @@ def __init__(self,
services=None,
platform=None,
strict_deps=None,
exports=None,
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this could even go on Target itself? There's nothing intrinsically JVM-y about the concept, that I can see. The same is true for strict_deps itself, of course, but I wouldn't expect you to move that in this change.

Or is there some reason why it makes more sense for this to be only on JVM targets?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The drive for "exports" literal was a common issue in both javac and scalac. So far we have only seen the issue with java and scala targets. Also bazel only has this under java_library rule (not saying we should do whatever bazel does, but I feel limiting the usage of exports is reasonable).

Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. It wouldn't be hard to expand later.

:rtype: list
"""
exports = [Address.parse(spec, relative_to=self.address.spec_path) for spec in self.payload.exports]
return [self._build_graph.get_target(addr) for addr in exports]
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the one hand it would be nice to avoid the redundancy. OTOH, we want people to use exports sparingly, I assume, and if all exports were effectively dependencies it might cause people to use it as a synonym for dependencies without thinking too hard about what actually needs exporting.

Still, I'm leaning towards agreeing with Stu's comment, just providing the counterpoint.

sources = ['D.java'],
dependencies=[':C'],
strict_deps=True,
)
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add newline at end of file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


def _resolve_exports(target):
for export in getattr(target, 'exports', []):
for exp in _resolve_exports(export):
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So exports are transitive? That should be documented on the exports target attribute.

In fact, this whole functionality should be written up on the docsite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I feel like they shouldn't be transitive. If the goal is to make the exports of a library explicit at the library's declaration, then I think exports being transitive is confusing.

For example, in order to find all the exported targets of t(exports=['a','b','c']), you'd need to look at a, b and c's declarations too, and their exported targets declarations if they exist. If exports are intransitive, then all the necessary information is in t(exports=['a','b','c']).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify between the two scenarios:

  1. If A exports B, B exports C, then C should be available for A at compile time, which is what this bazel discussion is about https://groups.google.com/d/msg/bazel-discuss/CRd1BbkHVUw/8Vi1gPAjIAAJ
  2. If A depends on B, B exports C, C exports D, is D available for A at compile time? (I am not too clear about this one)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah the bazel discussion actually means that it is transitively for 2), so it will definitely be transitive for 1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. After reading that thread, I think I changed my mind. I hadn't thought it through.

  1. This is the case we're shooting for either way
A -d-> B
B -e-> C

classpath(A) => B, C
  1. This is the transitive case
A -d-> B
B -e-> C
C -e-> D
# should either
#  only include direct exports
classpath(A) => B, C
#  or, include transitive exports
classpath(A) => B, C, D
  • If we make exports transitive, strict_deps makes dependencies intransitive, but not exports. I think that's reasonable, in that strict_deps applies to dependencies not exports.
  • For the classfile parsing case, eg if D provides an annotation, will we get a warning/error if it is not on the classpath along with C? I think we might get an error.
  • For the inheritance case, making exports transitive means that if you inherit from C in A, that scalac's LUB check will be able to find D, which is an outcome we want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Exports should be transitive. Sorry. I think I wasn't clear enough in that last comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was my bad. I didn't realize you post the comment already until I post mine and refresh the page.

Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify: If A has B in its dependencies and B has C in its exports then would we want C to be on A's classpath? Or does A have to have B in its exports? I guess the latter, otherwise anything depending on A won't get C on its classpath?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it ought to work like this: A has C on its compile classpath. Given an X depending on A, X won't have C on its compile classpath because A doesn't export B or C. If A exports B, then C will be on X's classpath.

A -d-> B
B -e-> C
X -d-> A

cp(A) == {B, C}
cp(X) == {A}

# if we add
A -e-> B
# then
cp(X) == {A, B, C}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjyw A target's "exports" only has effect on dependents of the target, not on target itself. Nick's description is the exact behavior I implemented in this patch.

@@ -22,6 +22,16 @@ def _resolve_aliases(target):
else:
yield declared

for export in _resolve_exports(declared):
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A function named _resolve_aliases is a slightly unexpected place to find this logic, since it has nothing to do with aliases, although it's clearly the most straightforward way to implement this. Maybe clarify by renaming _resolve_aliases to _resolve_strict_dependencies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

def hermetic(cls):
return True

def test_compilation(self):
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding a test case demonstrating that these do indeed fail to compile without the exports mechanism. Otherwise some change to the test files might cause the compilations in this test to succeed for other reasons, and the test wouldn't be testing what we think it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a case where exports is needed but not added. Thanks for the suggestion!

@@ -75,6 +77,10 @@ def __init__(self,
at compilation time. This enforces that all direct deps of the target
are declared, and can improve compilation speed due to smaller
classpaths. Transitive deps are always provided at runtime.
:param list exports: A list of exported libraries, which will be accessible to dependents even
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be more precise to say exported targets instead of exported libraries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.


def _resolve_exports(target):
for export in getattr(target, 'exports', []):
for exp in _resolve_exports(export):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I feel like they shouldn't be transitive. If the goal is to make the exports of a library explicit at the library's declaration, then I think exports being transitive is confusing.

For example, in order to find all the exported targets of t(exports=['a','b','c']), you'd need to look at a, b and c's declarations too, and their exported targets declarations if they exist. If exports are intransitive, then all the necessary information is in t(exports=['a','b','c']).

Copy link
Sponsor Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending other reviews. Thanks again for this, it's a neat change.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Mar 17, 2017

Oh, but please also update the docsite, so that this functionality is discoverable...

@JieGhost
Copy link
Contributor Author

@benjyw Is there any documents I can follow on how to update the docsite?

@lahosken
Copy link
Contributor

@JieGhost
Copy link
Contributor Author

@lahosken Thanks! I will publish it once this patch lands.

Copy link
Contributor

@baroquebobcat baroquebobcat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me. 👍

Are you going to follow up with a docsite update? I feel like the JVM docs could stand to have sections on this + strict_deps. Maybe we should file an issue for it.

@JieGhost
Copy link
Contributor Author

Yeah, I will publish to docsite when this patch lands.

Good idea to have a dedicated section on strict_deps and exports. I will create the ticket.

@JieGhost JieGhost merged commit 29d7dca into pantsbuild:master Mar 20, 2017
@JieGhost JieGhost deleted the yujieproject/implement_exports branch March 20, 2017 18:59
lenucksi pushed a commit to lenucksi/pants that referenced this pull request Apr 25, 2017
### Problem (pantsbuild#4315)

Given a scenario with 3 jvm targets A, B, C.
A depends on B. B depends on C.
In some cases (annotations, inheritance, etc), even though A does not directly use C, C has to be in A's classpath for A to compile. With strict_deps turned on, this means C has to be in A's direct dependency list.

This causes trouble if B is a core library that thousands of targets depends on. Anytime B adds a new dependency, if strict_deps is True, then it is very likely that all those thousands of targets that depend on B have to include this new dependency in their BUILD files as well. Given that strict_deps is the direction we are heading, we have to find a way for libraries to "export" some of its dependencies.

### Solution

Implemented bazel-like "exports" literal (https://bazel.build/versions/master/docs/be/java.html#java_library.exports) in pants. Now pants user can specify which of a target's dependencies will be exported so the consumers would not need to specify/know about these in case of strict_deps.

An example of this usage would be:

```
java_library(
  name='A',
  dependencies=[
    'src/java/dep/it:needs',
  ]
  exports=[
    'src/java/dep/it:needs',
  ]
)
```

With strict_deps = True, any targets that depend on A will have access to 'src/java/dep/it:needs' as well (in compile scope, that means A will have 'src/java/dep/it:needs' in its classpath).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants