New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement exports literal in jvm_target #4329
implement exports literal in jvm_target #4329
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Yujie!
@@ -75,6 +77,10 @@ def __init__(self, | |||
at compilation time. This enforces that all direct deps of the target | |||
are declared, and can improve compilation speed due to smaller | |||
classpaths. Transitive deps are always provided at runtime. | |||
:param list exports: A list of exported libraries, which will be accessible to dependents even | |||
with strict_deps turned on. A common use case is for library targets to | |||
export its dependencies. Then any dependents of that library target will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"to export dependencies that it knows its dependents will need"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
:rtype: list | ||
""" | ||
exports = [Address.parse(spec, relative_to=self.address.spec_path) for spec in self.payload.exports] | ||
return [self._build_graph.get_target(addr) for addr in exports] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, this will only work if these exports are definitely in the context. This should always be the case when the exports
are a subset of a target's dependencies
, but that isn't currently being checked.
Another way to ensure that exports and dependencies intersect might be to declare exports as traversable_dependency_specs
of a target, which would cause them to be added to the dependencies of the target after construction: basically, the complete set of dependencies would be exports+dependencies
, and so you wouldn't need to repeat them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the one hand it would be nice to avoid the redundancy. OTOH, we want people to use exports
sparingly, I assume, and if all exports were effectively dependencies it might cause people to use it as a synonym for dependencies
without thinking too hard about what actually needs exporting.
Still, I'm leaning towards agreeing with Stu's comment, just providing the counterpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, and now I'm less sure about it. You're right that folks would begin to just use it by default. On the other hand, the number of things that need to be exported in scala code can be high. I can imagine the frustration of needing to repeat 25 out of 100 things in a dep list, for example (a common one in our repo).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think exports is always a subset of dependencies, if a non-existent target in exports list, then "None" will be returned, which should not cause exceptions. Bazel also has similar constraints on that:
Important: an exported rule is not a regular dependency. Sticking to the previous example, if B exports C and wants to also use C, it has to also list it in its own deps.
Allowing "exports" to be a complement of "dependencies" may lead to another cleanup in the future.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, maybe it's OK to have the redundancy here, to discourage frivolous use of exports
. But then if someone lists an export that isn't in dependencies we should error, not silently return None...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the logic to throw exception in "None" case.
pants_run = self.run_pants(['compile', '--compile-scalafmt-skip', target]) | ||
self.assert_success(pants_run) | ||
|
||
def modify_exports_and_compile(self, target, modify_file): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than cloning the sources, you might consider just defining a target that is expected to fail to compile: just need to add it to the list of testprojects targets that are expected to fail here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cloned here because I want to modify the source file to test cache invalidation. I don't want to corrupt the repo, thus I copied it to a temp dir. But I will add a failure case. Benjy also mentioned it. Are you saying I don't have to run the failure test here but just add to the list you linked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I was saying is that if you want to create a testprojects
test that fails to compile, you can do that. You just have to add it to the list of targets that are known to fail.
But if you're trying to test invalidation, you do need to create the clone, so carry on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a fail target, and added it to the list you linked.
Those targets are skipped during integration, so I added another test in test_dep_exports_integration.py to make sure it fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - this is a very neat and useful piece of functionality!
@@ -40,6 +41,7 @@ def __init__(self, | |||
services=None, | |||
platform=None, | |||
strict_deps=None, | |||
exports=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this could even go on Target
itself? There's nothing intrinsically JVM-y about the concept, that I can see. The same is true for strict_deps
itself, of course, but I wouldn't expect you to move that in this change.
Or is there some reason why it makes more sense for this to be only on JVM targets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The drive for "exports" literal was a common issue in both javac and scalac. So far we have only seen the issue with java and scala targets. Also bazel only has this under java_library rule (not saying we should do whatever bazel does, but I feel limiting the usage of exports is reasonable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. It wouldn't be hard to expand later.
:rtype: list | ||
""" | ||
exports = [Address.parse(spec, relative_to=self.address.spec_path) for spec in self.payload.exports] | ||
return [self._build_graph.get_target(addr) for addr in exports] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the one hand it would be nice to avoid the redundancy. OTOH, we want people to use exports
sparingly, I assume, and if all exports were effectively dependencies it might cause people to use it as a synonym for dependencies
without thinking too hard about what actually needs exporting.
Still, I'm leaning towards agreeing with Stu's comment, just providing the counterpoint.
sources = ['D.java'], | ||
dependencies=[':C'], | ||
strict_deps=True, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add newline at end of file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
def _resolve_exports(target): | ||
for export in getattr(target, 'exports', []): | ||
for exp in _resolve_exports(export): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So exports are transitive? That should be documented on the exports
target attribute.
In fact, this whole functionality should be written up on the docsite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. I feel like they shouldn't be transitive. If the goal is to make the exports of a library explicit at the library's declaration, then I think exports being transitive is confusing.
For example, in order to find all the exported targets of t(exports=['a','b','c'])
, you'd need to look at a
, b
and c
's declarations too, and their exported targets declarations if they exist. If exports are intransitive, then all the necessary information is in t(exports=['a','b','c'])
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify between the two scenarios:
- If A exports B, B exports C, then C should be available for A at compile time, which is what this bazel discussion is about https://groups.google.com/d/msg/bazel-discuss/CRd1BbkHVUw/8Vi1gPAjIAAJ
- If A depends on B, B exports C, C exports D, is D available for A at compile time? (I am not too clear about this one)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah the bazel discussion actually means that it is transitively for 2), so it will definitely be transitive for 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. After reading that thread, I think I changed my mind. I hadn't thought it through.
- This is the case we're shooting for either way
A -d-> B
B -e-> C
classpath(A) => B, C
- This is the transitive case
A -d-> B
B -e-> C
C -e-> D
# should either
# only include direct exports
classpath(A) => B, C
# or, include transitive exports
classpath(A) => B, C, D
- If we make exports transitive,
strict_deps
makes dependencies intransitive, but not exports. I think that's reasonable, in that strict_deps applies to dependencies not exports. - For the classfile parsing case, eg if D provides an annotation, will we get a warning/error if it is not on the classpath along with C? I think we might get an error.
- For the inheritance case, making exports transitive means that if you inherit from C in A, that scalac's LUB check will be able to find D, which is an outcome we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Exports should be transitive. Sorry. I think I wasn't clear enough in that last comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it was my bad. I didn't realize you post the comment already until I post mine and refresh the page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify: If A has B in its dependencies and B has C in its exports then would we want C to be on A's classpath? Or does A have to have B in its exports? I guess the latter, otherwise anything depending on A won't get C on its classpath?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it ought to work like this: A has C on its compile classpath. Given an X depending on A, X won't have C on its compile classpath because A doesn't export B or C. If A exports B, then C will be on X's classpath.
A -d-> B
B -e-> C
X -d-> A
cp(A) == {B, C}
cp(X) == {A}
# if we add
A -e-> B
# then
cp(X) == {A, B, C}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benjyw A target's "exports" only has effect on dependents of the target, not on target itself. Nick's description is the exact behavior I implemented in this patch.
@@ -22,6 +22,16 @@ def _resolve_aliases(target): | |||
else: | |||
yield declared | |||
|
|||
for export in _resolve_exports(declared): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A function named _resolve_aliases
is a slightly unexpected place to find this logic, since it has nothing to do with aliases, although it's clearly the most straightforward way to implement this. Maybe clarify by renaming _resolve_aliases
to _resolve_strict_dependencies
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
def hermetic(cls): | ||
return True | ||
|
||
def test_compilation(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth adding a test case demonstrating that these do indeed fail to compile without the exports
mechanism. Otherwise some change to the test files might cause the compilations in this test to succeed for other reasons, and the test wouldn't be testing what we think it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a case where exports is needed but not added. Thanks for the suggestion!
@@ -75,6 +77,10 @@ def __init__(self, | |||
at compilation time. This enforces that all direct deps of the target | |||
are declared, and can improve compilation speed due to smaller | |||
classpaths. Transitive deps are always provided at runtime. | |||
:param list exports: A list of exported libraries, which will be accessible to dependents even |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be more precise to say exported targets
instead of exported libraries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
def _resolve_exports(target): | ||
for export in getattr(target, 'exports', []): | ||
for exp in _resolve_exports(export): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. I feel like they shouldn't be transitive. If the goal is to make the exports of a library explicit at the library's declaration, then I think exports being transitive is confusing.
For example, in order to find all the exported targets of t(exports=['a','b','c'])
, you'd need to look at a
, b
and c
's declarations too, and their exported targets declarations if they exist. If exports are intransitive, then all the necessary information is in t(exports=['a','b','c'])
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM pending other reviews. Thanks again for this, it's a neat change.
Oh, but please also update the docsite, so that this functionality is discoverable... |
@benjyw Is there any documents I can follow on how to update the docsite? |
@lahosken Thanks! I will publish it once this patch lands. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me. 👍
Are you going to follow up with a docsite update? I feel like the JVM docs could stand to have sections on this + strict_deps. Maybe we should file an issue for it.
Yeah, I will publish to docsite when this patch lands. Good idea to have a dedicated section on strict_deps and exports. I will create the ticket. |
### Problem (pantsbuild#4315) Given a scenario with 3 jvm targets A, B, C. A depends on B. B depends on C. In some cases (annotations, inheritance, etc), even though A does not directly use C, C has to be in A's classpath for A to compile. With strict_deps turned on, this means C has to be in A's direct dependency list. This causes trouble if B is a core library that thousands of targets depends on. Anytime B adds a new dependency, if strict_deps is True, then it is very likely that all those thousands of targets that depend on B have to include this new dependency in their BUILD files as well. Given that strict_deps is the direction we are heading, we have to find a way for libraries to "export" some of its dependencies. ### Solution Implemented bazel-like "exports" literal (https://bazel.build/versions/master/docs/be/java.html#java_library.exports) in pants. Now pants user can specify which of a target's dependencies will be exported so the consumers would not need to specify/know about these in case of strict_deps. An example of this usage would be: ``` java_library( name='A', dependencies=[ 'src/java/dep/it:needs', ] exports=[ 'src/java/dep/it:needs', ] ) ``` With strict_deps = True, any targets that depend on A will have access to 'src/java/dep/it:needs' as well (in compile scope, that means A will have 'src/java/dep/it:needs' in its classpath).
Problem (#4315)
Given a scenario with 3 jvm targets A, B, C.
A depends on B. B depends on C.
In some cases (annotations, inheritance, etc), even though A does not directly use C, C has to be in A's classpath for A to compile. With strict_deps turned on, this means C has to be in A's direct dependency list.
This causes trouble if B is a core library that thousands of targets depends on. Anytime B adds a new dependency, if strict_deps is True, then it is very likely that all those thousands of targets that depend on B have to include this new dependency in their BUILD files as well. Given that strict_deps is the direction we are heading, we have to find a way for libraries to "export" some of its dependencies.
Solution
Implemented bazel-like "exports" literal (https://bazel.build/versions/master/docs/be/java.html#java_library.exports) in pants. Now pants user can specify which of a target's dependencies will be exported so the consumers would not need to specify/know about these in case of strict_deps.
An example of this usage would be:
With strict_deps = True, any targets that depend on A will have access to 'src/java/dep/it:needs' as well (in compile scope, that means A will have 'src/java/dep/it:needs' in its classpath).