New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce compilation invalidation scope of targets with strict_deps=True #4143

Merged
merged 14 commits into from Jan 9, 2017

Conversation

Projects
None yet
4 participants
@JieGhost
Contributor

JieGhost commented Dec 16, 2016

Problem

Related issue: #3060

Now, with strict_deps turned on, only direct dependencies will be used to compile jvm targets. But it isn't taken into account when generating fingerprints for compile. We can reduce the invalidation scope as well.

An example will be following:
Let's assume we have 3 targets A, B, and C. A depends on B; B depends on C, and A does not directly depends on C (C is a transitive dependency of A). In current implementation, invalidation hash of A takes hashes of A, B and C into account. But with strict_deps turned on, changing C does not have to invalidate A, since A does not depend on C for compilation.

Solution

I add a new "direct(self, target)" method in FingerpringStrategy abstract base class. This method returns "False" by default, derived class of FingerpringStrategy can override this method. In the case of this review, ResolvedJarAwareTaskIdentityFingerprintStrategy overrides this method so that when a target's strict_deps is defined then return its value, otherwise, platform's strict_deps property will be used. The target.transitive_invalidation_hash() method will use direct() method to determine the scope of invalidation hash.

This review also adds a unit test which verifies direct invalidation logic works as expected and several integration tests.

@JieGhost JieGhost requested review from baroquebobcat, stuhood and peiyuwang Dec 19, 2016

Show outdated Hide outdated src/python/pants/build_graph/target.py
@@ -473,7 +473,10 @@ def transitive_invalidation_hash(self, fingerprint_strategy=None, depth=0):
def dep_hash_iter():
for dep in self.dependencies:
try:
dep_hash = dep.transitive_invalidation_hash(fingerprint_strategy, depth=depth+1)
if getattr(self, 'strict_deps', None):

This comment has been minimized.

@baroquebobcat

baroquebobcat Dec 19, 2016

Contributor

Hm. The thing that's tricky here is that we only want this behavior for compilation. A target with strict_deps should invalidate normally for things like test runs or bundles.

Also, the getattr here won't account for the global strict_deps value, which could cause issues.

One approach would be to lift the transitive vs not q into the strategy.

Some unit tests around the behavior would also be good. Maybe in JvmTargetTest or TargetTest. There might be better places for them to live, but those would be a good starting point.

@baroquebobcat

baroquebobcat Dec 19, 2016

Contributor

Hm. The thing that's tricky here is that we only want this behavior for compilation. A target with strict_deps should invalidate normally for things like test runs or bundles.

Also, the getattr here won't account for the global strict_deps value, which could cause issues.

One approach would be to lift the transitive vs not q into the strategy.

Some unit tests around the behavior would also be good. Maybe in JvmTargetTest or TargetTest. There might be better places for them to live, but those would be a good starting point.

This comment has been minimized.

@JieGhost

JieGhost Dec 19, 2016

Contributor

fingerprint strategy is on per target level without considering dependencies, thus I tend to try Stu's advice here if it's OK with you.

@JieGhost

JieGhost Dec 19, 2016

Contributor

fingerprint strategy is on per target level without considering dependencies, thus I tend to try Stu's advice here if it's OK with you.

This comment has been minimized.

@JieGhost

JieGhost Dec 20, 2016

Contributor

hmm, a second thought on this, it may be neater to do it in fingerstrategy.

@JieGhost

JieGhost Dec 20, 2016

Contributor

hmm, a second thought on this, it may be neater to do it in fingerstrategy.

This comment has been minimized.

@baroquebobcat

baroquebobcat Dec 20, 2016

Contributor

Yeah. My thinking with the suggestion was that the strategy determines how hashing works, so it might also be reasonable for it to determine traversal.

@baroquebobcat

baroquebobcat Dec 20, 2016

Contributor

Yeah. My thinking with the suggestion was that the strategy determines how hashing works, so it might also be reasonable for it to determine traversal.

This comment has been minimized.

@JieGhost

JieGhost Dec 21, 2016

Contributor

Sorry for changing my mind again. Implement this logic in fingerprintstrategy is a bit tricky. target.invalidation_hash and target.transitive_invalidation_hash both use fingerprintstrategy, thus will result in recursive invalidation in unwanted case. I think it's cleaner to modify key_for_target method.

@JieGhost

JieGhost Dec 21, 2016

Contributor

Sorry for changing my mind again. Implement this logic in fingerprintstrategy is a bit tricky. target.invalidation_hash and target.transitive_invalidation_hash both use fingerprintstrategy, thus will result in recursive invalidation in unwanted case. I think it's cleaner to modify key_for_target method.

Show outdated Hide outdated ...hon/pants_test/invalidation/test_strict_deps_invalidation_integration.py
from pants_test.pants_run_integration_test import PantsRunIntegrationTest
class RunTrackerIntegrationTest(PantsRunIntegrationTest):

This comment has been minimized.

@baroquebobcat

baroquebobcat Dec 19, 2016

Contributor

class name

@baroquebobcat

baroquebobcat Dec 19, 2016

Contributor

class name

Show outdated Hide outdated ...hon/pants_test/invalidation/test_strict_deps_invalidation_integration.py
TEST_SRC = 'testprojects/tests/java/org/pantsbuild/testproject/strictdeps'
def copy_src_and_run_test(self, target_name, match_str):

This comment has been minimized.

@baroquebobcat

baroquebobcat Dec 19, 2016

Contributor

match_str isn't a good name. Maybe, invalidates_root_target?

@baroquebobcat

baroquebobcat Dec 19, 2016

Contributor

match_str isn't a good name. Maybe, invalidates_root_target?

Show outdated Hide outdated ...hon/pants_test/invalidation/test_strict_deps_invalidation_integration.py
self.copy_src_and_run_test('A2', True)
def test_strict_deps_true(self):
self.copy_src_and_run_test('A1', False)

This comment has been minimized.

@baroquebobcat

baroquebobcat Dec 19, 2016

Contributor

Could you also add a test that checks when the global value is True and the target leaves it unspecified?

@baroquebobcat

baroquebobcat Dec 19, 2016

Contributor

Could you also add a test that checks when the global value is True and the target leaves it unspecified?

This comment has been minimized.

@JieGhost

JieGhost Jan 4, 2017

Contributor

Not sure how does "outdated" algorithm work, but it is done :)

@JieGhost

JieGhost Jan 4, 2017

Contributor

Not sure how does "outdated" algorithm work, but it is done :)

Show outdated Hide outdated src/python/pants/build_graph/target.py
@@ -473,7 +473,10 @@ def transitive_invalidation_hash(self, fingerprint_strategy=None, depth=0):
def dep_hash_iter():
for dep in self.dependencies:
try:
dep_hash = dep.transitive_invalidation_hash(fingerprint_strategy, depth=depth+1)
if getattr(self, 'strict_deps', None):
dep_hash = dep.invalidation_hash(fingerprint_strategy)

This comment has been minimized.

@stuhood

stuhood Dec 19, 2016

Member

Unfortunately, I think that this will break assumptions in a lot of Tasks. Whether invalidation is transitive or direct is currently a task-level concern. And in particular, "strict_deps" needs to only apply to tasks that explicitly ask for it.

As an example: zinc will use strict deps, and only pass strict deps to the compiler. But junit always uses the full transitive graph.

So I think that (with the current task API) it is necessary for this to be a change to this argument:

invalidate_dependents: If True then any targets depending on changed targets are invalidated.
... I definitely don't love that tasks are responsible for specifying this, but it's the way of things currently.

@stuhood

stuhood Dec 19, 2016

Member

Unfortunately, I think that this will break assumptions in a lot of Tasks. Whether invalidation is transitive or direct is currently a task-level concern. And in particular, "strict_deps" needs to only apply to tasks that explicitly ask for it.

As an example: zinc will use strict deps, and only pass strict deps to the compiler. But junit always uses the full transitive graph.

So I think that (with the current task API) it is necessary for this to be a change to this argument:

invalidate_dependents: If True then any targets depending on changed targets are invalidated.
... I definitely don't love that tasks are responsible for specifying this, but it's the way of things currently.

This comment has been minimized.

@JieGhost

JieGhost Dec 19, 2016

Contributor

that sounds reasonable. so I will add a new state here to invalidate direct dependees. I will change the parameter type from boolean to integers as now there are 3 cases.

@JieGhost

JieGhost Dec 19, 2016

Contributor

that sounds reasonable. so I will add a new state here to invalidate direct dependees. I will change the parameter type from boolean to integers as now there are 3 cases.

Show outdated Hide outdated src/python/pants/build_graph/target.py
@@ -473,7 +473,10 @@ def transitive_invalidation_hash(self, fingerprint_strategy=None, depth=0):
def dep_hash_iter():
for dep in self.dependencies:
try:
dep_hash = dep.transitive_invalidation_hash(fingerprint_strategy, depth=depth+1)
if direct:
dep_hash = dep.invalidation_hash(fingerprint_strategy)

This comment has been minimized.

@stuhood

stuhood Dec 27, 2016

Member

Is this equivalent to transitive_invalidation_hash(.., depth=0)? It should be, right?

@stuhood

stuhood Dec 27, 2016

Member

Is this equivalent to transitive_invalidation_hash(.., depth=0)? It should be, right?

This comment has been minimized.

@JieGhost

JieGhost Jan 3, 2017

Contributor

No it's not, at least not in current implementation.

transitive_invalidation_hash method hashes dependencies recursively until no dependencies are seen. "depth" parameter here does not mean the wanted depth of invalidation hash. It is only used to track depth of recursion and avoid hitting python stack limit.

@JieGhost

JieGhost Jan 3, 2017

Contributor

No it's not, at least not in current implementation.

transitive_invalidation_hash method hashes dependencies recursively until no dependencies are seen. "depth" parameter here does not mean the wanted depth of invalidation hash. It is only used to track depth of recursion and avoid hitting python stack limit.

Show outdated Hide outdated src/python/pants/backend/jvm/tasks/jvm_compile/jvm_compile.py
@@ -394,6 +420,7 @@ def classpath_for_context(context):
# all targets to finish.
with self.invalidated(relevant_targets,
invalidate_dependents=True,
invalidate_direct_checker=JvmCompile.enable_strict_deps,

This comment has been minimized.

@stuhood

stuhood Dec 27, 2016

Member

This isn't a global property, so I don't think this will do the right thing.

@stuhood

stuhood Dec 27, 2016

Member

This isn't a global property, so I don't think this will do the right thing.

This comment has been minimized.

@JieGhost

JieGhost Jan 3, 2017

Contributor

I am a little confused. What is not a global property? "JvmCompile.enable_strict_deps"? It is a static method thus I think as long as class JvmCompile is used, this method will work?

@JieGhost

JieGhost Jan 3, 2017

Contributor

I am a little confused. What is not a global property? "JvmCompile.enable_strict_deps"? It is a static method thus I think as long as class JvmCompile is used, this method will work?

This comment has been minimized.

@stuhood

stuhood Jan 3, 2017

Member

That method is global, but it is used as the default value of the option, if a target doesn't specify it directly. Each target may have a different value for this field, as specified in https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/jvm/tasks/jvm_compile/jvm_compile.py#L829

@stuhood

stuhood Jan 3, 2017

Member

That method is global, but it is used as the default value of the option, if a target doesn't specify it directly. Each target may have a different value for this field, as specified in https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/jvm/tasks/jvm_compile/jvm_compile.py#L829

Show outdated Hide outdated src/python/pants/invalidation/build_invalidator.py
@@ -77,10 +77,13 @@ def key_for_target(self, target, transitive=False, fingerprint_strategy=None):
hasher = hashlib.sha1()
hasher.update(self._cache_key_gen_version)
key_suffix = hasher.hexdigest()[:12]
if transitive:
target_key = target.transitive_invalidation_hash(fingerprint_strategy)
if direct_checker and direct_checker(target):

This comment has been minimized.

@stuhood

stuhood Dec 27, 2016

Member

Again, not sure the global property works. But if it actually does, this should be a property of the fingerprint_strategy (ie fingerprint_strategy.direct) rather than an argument to this method.

@stuhood

stuhood Dec 27, 2016

Member

Again, not sure the global property works. But if it actually does, this should be a property of the fingerprint_strategy (ie fingerprint_strategy.direct) rather than an argument to this method.

This comment has been minimized.

@stuhood

stuhood Dec 27, 2016

Member

Actually, one way to do this might be to define an abstract method: fingerprint_strategy.direct(target).

@stuhood

stuhood Dec 27, 2016

Member

Actually, one way to do this might be to define an abstract method: fingerprint_strategy.direct(target).

This comment has been minimized.

@JieGhost

JieGhost Jan 3, 2017

Contributor

hmm, maybe not an abstract one, the method in base class can return False. Otherwise, all derived classes have to implement this, but only 1 derived class actually overrides it.

@JieGhost

JieGhost Jan 3, 2017

Contributor

hmm, maybe not an abstract one, the method in base class can return False. Otherwise, all derived classes have to implement this, but only 1 derived class actually overrides it.

This comment has been minimized.

@stuhood

stuhood Jan 3, 2017

Member

Ah, yea. That makes sense.

@stuhood

stuhood Jan 3, 2017

Member

Ah, yea. That makes sense.

@stuhood

This is very close now! The semantics all look correct, there is just a bit of disagreement about which dependencies will be used between invalidation and actual usage. I think once that is fixed it will be ready to land.

Show outdated Hide outdated src/python/pants/build_graph/target.py
@@ -473,7 +473,10 @@ def transitive_invalidation_hash(self, fingerprint_strategy=None, depth=0):
def dep_hash_iter():
for dep in self.dependencies:

This comment has been minimized.

@stuhood

stuhood Jan 4, 2017

Member

It looks like there is currently some disagreement in the definition of "strict" deps between https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/jvm/tasks/jvm_compile/compile_context.py#L66-L77 and this code.

The largest difference is that CompileContext.strict_deps expands aliases... a relatively minor bit is that it includes the transitive deps of plugins.

Perhaps the FingerprintStrategy could actually be what computes the dep list?

@stuhood

stuhood Jan 4, 2017

Member

It looks like there is currently some disagreement in the definition of "strict" deps between https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/jvm/tasks/jvm_compile/compile_context.py#L66-L77 and this code.

The largest difference is that CompileContext.strict_deps expands aliases... a relatively minor bit is that it includes the transitive deps of plugins.

Perhaps the FingerprintStrategy could actually be what computes the dep list?

@peiyuwang

minor comments

Show outdated Hide outdated ...hon/pants_test/invalidation/test_strict_deps_invalidation_integration.py
TEST_SRC = 'testprojects/tests/java/org/pantsbuild/testproject/strictdeps'
def copy_src_and_run_test(self, target_name, invalidate_root_target, *extra_args):

This comment has been minimized.

@peiyuwang

peiyuwang Jan 5, 2017

Contributor
  • copy_src_and_run_test -> modify_transitive_deps_and_compile
  • invalidate_root_target*_expected*
@peiyuwang

peiyuwang Jan 5, 2017

Contributor
  • copy_src_and_run_test -> modify_transitive_deps_and_compile
  • invalidate_root_target*_expected*
@stuhood

stuhood approved these changes Jan 6, 2017

Thanks Yujie!

@baroquebobcat

One large comment and a couple nitpicks. I like where this is going though.

Show outdated Hide outdated src/python/pants/backend/jvm/tasks/jvm_compile/jvm_compile.py
@@ -351,16 +351,42 @@ def _dep_analyzer(self):
def _analysis_parser(self):
return self._analysis_tools.parser
@staticmethod
def _compute_language_property(target, selector):
"""Computes the a language property setting for the given target sources.

This comment has been minimized.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

grammar nit: s/the a/a/ I know this is just a move, but might be good to fix it while you're in here.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

grammar nit: s/the a/a/ I know this is just a move, but might be good to fix it while you're in here.

Show outdated Hide outdated src/python/pants/backend/jvm/tasks/jvm_compile/jvm_compile.py
def _fingerprint_strategy(self, classpath_products):
return ResolvedJarAwareTaskIdentityFingerprintStrategy(self, classpath_products)
@staticmethod
def enable_strict_deps(target):

This comment has been minimized.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

This name feels like it's flipping strict deps on rather than checking the value. How about strict_deps_enabled?

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

This name feels like it's flipping strict deps on rather than checking the value. How about strict_deps_enabled?

Show outdated Hide outdated src/python/pants/invalidation/build_invalidator.py
@@ -77,10 +77,13 @@ def key_for_target(self, target, transitive=False, fingerprint_strategy=None):
hasher = hashlib.sha1()
hasher.update(self._cache_key_gen_version)
key_suffix = hasher.hexdigest()[:12]
if transitive:
target_key = target.transitive_invalidation_hash(fingerprint_strategy)
if direct_checker and direct_checker(target):

This comment has been minimized.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

I think direct_checker only makes sense if transitive is True. Maybe you could simplify this if else sequence using that invariant.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

I think direct_checker only makes sense if transitive is True. Maybe you could simplify this if else sequence using that invariant.

Show outdated Hide outdated src/python/pants/invalidation/build_invalidator.py
@@ -62,7 +62,7 @@ def __init__(self, cache_key_gen_version=None):
self._cache_key_gen_version = '_'.join([cache_key_gen_version or '',
GLOBAL_CACHE_KEY_GEN_VERSION])
def key_for_target(self, target, transitive=False, fingerprint_strategy=None):
def key_for_target(self, target, transitive=False, direct_checker=None, fingerprint_strategy=None):
"""Get a key representing the given target and its sources.

This comment has been minimized.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

add docstring for direct_checker.

It feels like there could be a better description for this than direct / direct_checker. direct here limits transitive traversals of the graph from recursing beyond direct dependencies if the predicate direct_checker is true. It prevents recursing beyond the direct dependencies.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

add docstring for direct_checker.

It feels like there could be a better description for this than direct / direct_checker. direct here limits transitive traversals of the graph from recursing beyond direct dependencies if the predicate direct_checker is true. It prevents recursing beyond the direct dependencies.

Show outdated Hide outdated src/python/pants/backend/jvm/tasks/jvm_compile/jvm_compile.py
return ResolvedJarAwareTaskIdentityFingerprintStrategy(self, classpath_products, self._dep_context)
@staticmethod
def enable_strict_deps(target):

This comment has been minimized.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

I think it'd be clearer if this were named strict_deps_enabled. It sounds more like a setter with the current name.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

I think it'd be clearer if this were named strict_deps_enabled. It sounds more like a setter with the current name.

Show outdated Hide outdated src/python/pants/build_graph/target.py
try:
dep_hash = dep.transitive_invalidation_hash(fingerprint_strategy, depth=depth+1)
if fingerprint_strategy.direct(self):

This comment has been minimized.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

The recursion here makes this tricky. I'd like to see a few test cases using targets where strict_deps=False whose dependencies have strict_deps=True.

Consider

A => B
B => C
C => D
where B has strict_deps=True

Currently, the dependencies used in A's compile classpath will be B, C, D. A's transitive fingerprint will also be based on those three targets.

With this change as it is, the classpath will remain the same, but the fingerprint will not include D because B is strict_deps=True and direct(B) will return True.

We should resolve this discrepancy. I'd also like to see a test case that looks like that and specifies the expected behavior.

We can resolve this by either having direct only apply to the top level target, ie with depth 0, or by changing the compile classpath. I think having direct only apply to the top level target would be better.

@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

The recursion here makes this tricky. I'd like to see a few test cases using targets where strict_deps=False whose dependencies have strict_deps=True.

Consider

A => B
B => C
C => D
where B has strict_deps=True

Currently, the dependencies used in A's compile classpath will be B, C, D. A's transitive fingerprint will also be based on those three targets.

With this change as it is, the classpath will remain the same, but the fingerprint will not include D because B is strict_deps=True and direct(B) will return True.

We should resolve this discrepancy. I'd also like to see a test case that looks like that and specifies the expected behavior.

We can resolve this by either having direct only apply to the top level target, ie with depth 0, or by changing the compile classpath. I think having direct only apply to the top level target would be better.

@baroquebobcat

Looks good to me!

@JieGhost JieGhost merged commit 797cd82 into pantsbuild:master Jan 9, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@JieGhost JieGhost deleted the JieGhost:yujieproject/github_3060 branch Jan 9, 2017

lenucksi added a commit to lenucksi/pants that referenced this pull request Apr 25, 2017

Reduce compilation invalidation scope of targets with strict_deps=True (
#4143)

### Problem

Related issue: #3060 

Now, with strict_deps turned on, only direct dependencies will be used to compile jvm targets. But it isn't taken into account when generating fingerprints for compile. We can reduce the invalidation scope as well.

An example will be following:
Let's assume we have 3 targets A, B, and C. A depends on B; B depends on C, and A does not directly depends on C (C is a transitive dependency of A). In current implementation, invalidation hash of A takes hashes of A, B and C into account. But with strict_deps turned on, changing C does not have to invalidate A, since A does not depend on C for compilation.

### Solution

I add a new "direct(self, target)" method in FingerpringStrategy abstract base class. This method returns "False" by default, derived class of FingerpringStrategy can override this method. In the case of this review, ResolvedJarAwareTaskIdentityFingerprintStrategy overrides this method so that when a target's strict_deps is defined then return its value, otherwise, platform's strict_deps property will be used. The target.transitive_invalidation_hash() method will use direct() method to determine the scope of invalidation hash.

 This review also adds a unit test which verifies direct invalidation logic works as expected and several integration tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment