New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that invalid vts have results_dir cleaned before passing to ta… #4139

Merged
merged 10 commits into from Dec 19, 2016

Conversation

Projects
None yet
3 participants
@mateor
Member

mateor commented Dec 14, 2016

Tasks that sometimes fail due to outside factors (download failures,
resolve issues, etc) often would call safe_mkdir(vt.results_dir, clean=True)
in order to wipe possibly truncated or crufty state.

But since vt.results_dir is a symlink, that replaced it with a real dir.
That ended up breaking caching, since the task output was therefore
never making it into the artifact cache, and other weird
bugs.

This is a small change that deletes the existing directories if
a target is invalid, removing the need for tasks to wipe the
results_dir, and also now checks to make sure that the results_dir
is legal before marking valid and passing to the artifact_caches.

The majority of the change is added test coverage around the breaks.
I have a couple immediate followups that cover an additional failure
state, and reworks the cache_manager to remove some of the harder
to reason about bits.

Closes: #4137

Ensure that invalid vts have results_dir cleaned before passing to ta…
…sks.

Tasks that sometimes fail due to outside factors (download failures,
resolve issues, etc) often would call safe_mkdir(vt.results_dir, clean=True)
in order to wipe possibly truncated or crufty state.

But since vt.results_dir is a symlink, that replaced it with a real dir.
That ended up breaking caching, since the task output was therefore
never making it into the artifact cache, and other weird
bugs.

This is a small change that deletes the existing directories if
a target is invalid, removing the need for tasks to wipe the
results_dir, and also now checks to make sure that the results_dir
is legal before marking valid and passing to the artifact_caches.

The majority of the change is added test coverage around the breaks.
I have a couple immediate followups that cover an additional failure
state, and reworks the cache_manager to remove some of the harder
to reason about bits.

Closes: #4137
@mateor

This comment has been minimized.

Show comment
Hide comment
@mateor

mateor Dec 14, 2016

Member

I am pushing this to CI to get Travis to take a crack at it - this will be my first non-release CR through github so forgive me if I muck the process at all.

You can read whenever you want, but I won't be adding specific reviewers until it passes CI.

ETA: I have two followups getting polished, one of which should also be backported to 1.2.1 (the error message for relative_symlink failure state)

Member

mateor commented Dec 14, 2016

I am pushing this to CI to get Travis to take a crack at it - this will be my first non-release CR through github so forgive me if I muck the process at all.

You can read whenever you want, but I won't be adding specific reviewers until it passes CI.

ETA: I have two followups getting polished, one of which should also be backported to 1.2.1 (the error message for relative_symlink failure state)

@mateor

This comment has been minimized.

Show comment
Hide comment
@mateor

mateor Dec 14, 2016

Member

I looked into the failures a bit tonight. They are legit, although it wasn't clear to me exactly what went wrong.

I will dig at it tomorrow and add some unit tests coverage around whatever it ends up deriving from.

Member

mateor commented Dec 14, 2016

I looked into the failures a bit tonight. They are legit, although it wasn't clear to me exactly what went wrong.

I will dig at it tomorrow and add some unit tests coverage around whatever it ends up deriving from.

+ # If the vt is invalid, clean. Deletes both because if the stable_dir has somehow been replaced with a real dir,
+ # the relative_symlink call below raises an uncaught exception when it attempts to unlink the real directory.
+ safe_rmtree(current_dir)
+ safe_rmtree(stable_dir)

This comment has been minimized.

@stuhood

stuhood Dec 14, 2016

Member

IMO, it would be better to blow up in the case of a deleted/replaced results_dir symlink to expose an issue in a task, rather than to silently try to fix it.

@stuhood

stuhood Dec 14, 2016

Member

IMO, it would be better to blow up in the case of a deleted/replaced results_dir symlink to expose an issue in a task, rather than to silently try to fix it.

This comment has been minimized.

@mateor

mateor Dec 14, 2016

Member

I have a followup that does that - I broke it out into a separate commit from due to your original comment. Can add here if you prefer.

@mateor

mateor Dec 14, 2016

Member

I have a followup that does that - I broke it out into a separate commit from due to your original comment. Can add here if you prefer.

@@ -264,11 +287,13 @@ def __init__(self,
def update(self, vts):
"""Mark a changed or invalidated VersionedTargetSet as successfully processed."""
for vt in vts.versioned_targets:
+ vt._ensure_legal()

This comment has been minimized.

@stuhood

stuhood Dec 14, 2016

Member

This method should be public if it will be called here (it needn't be API: public)

@stuhood

stuhood Dec 14, 2016

Member

This method should be public if it will be called here (it needn't be API: public)

This comment has been minimized.

@mateor

mateor Dec 14, 2016

Member

oh yes, that is true.

@mateor

mateor Dec 14, 2016

Member

oh yes, that is true.

@@ -95,15 +105,17 @@ def test_incremental(self):
vtB = task.execute()
self.assertContent(vtB, one + two)
- # Incremental atop existing directory for vtA.
+ # vtC.previous_cache_key == vtB.cache_key so the task appends the file to vtB's results.
+ # This behavior used to be skipped because the current_dir existed when checked by cache_manager._use_previous_dir.

This comment has been minimized.

@stuhood

stuhood Dec 14, 2016

Member

I thought about this a bit more... the idea was that if we switched back to exactly the same inputs as some previous build with zinc (for example), then zinc would quickly do the right thing: which is noop.

On the other hand, clearing the directory regardless as you're doing now is definitely less error prone then ever reusing an existing directory, so I think your change makes sense.

I would still love to see the test updated before landing this though.

@stuhood

stuhood Dec 14, 2016

Member

I thought about this a bit more... the idea was that if we switched back to exactly the same inputs as some previous build with zinc (for example), then zinc would quickly do the right thing: which is noop.

On the other hand, clearing the directory regardless as you're doing now is definitely less error prone then ever reusing an existing directory, so I think your change makes sense.

I would still love to see the test updated before landing this though.

This comment has been minimized.

@mateor

mateor Dec 14, 2016

Member

I am tinkering with this now...I think I can break out the previous_results move to happen after the cache check. It is tough to reason about the cache manager code, I have written another followup that deletes a lot (but not all) of the circular referencing and object mutation.

The above test should currently pass (although maybe it doesn't? I am working on a different integration test failure atm). The case it presents is no longer very useful though. Can you elaborate on what you are asking for with:

"I would still love to see the test updated before landing this though."

@mateor

mateor Dec 14, 2016

Member

I am tinkering with this now...I think I can break out the previous_results move to happen after the cache check. It is tough to reason about the cache manager code, I have written another followup that deletes a lot (but not all) of the circular referencing and object mutation.

The above test should currently pass (although maybe it doesn't? I am working on a different integration test failure atm). The case it presents is no longer very useful though. Can you elaborate on what you are asking for with:

"I would still love to see the test updated before landing this though."

This comment has been minimized.

@stuhood

stuhood Dec 14, 2016

Member

You added a comment at the top of the method indicating that it should be updated.

@stuhood

stuhood Dec 14, 2016

Member

You added a comment at the top of the method indicating that it should be updated.

This comment has been minimized.

@mateor

mateor Dec 14, 2016

Member

Hmmm...yes that sounds like something I would do :)

Okay, I will add a some more coverage. I also believe that I was able to add some additional correctness as mentioned above.

Assuming that the code is working, the incremental task behavior should now be:

  • full noop if target unchanged from previous run
  • noop from zinc on cached VT
  • previous_results copied in before zinc execution for invalid, uncached VT

I will add some tests to show the above, as much as I am able.

I didn't see a check for already populated current_result_dirs under the pants_workdir, just local and remote cache checks. Which surprised me, since the incremental implementation proves Pants is willing to consume from previous result_dirs. I wonder if that case happens often enough to be worth considering. Perhaps rare to the point of not mattering...

@mateor

mateor Dec 14, 2016

Member

Hmmm...yes that sounds like something I would do :)

Okay, I will add a some more coverage. I also believe that I was able to add some additional correctness as mentioned above.

Assuming that the code is working, the incremental task behavior should now be:

  • full noop if target unchanged from previous run
  • noop from zinc on cached VT
  • previous_results copied in before zinc execution for invalid, uncached VT

I will add some tests to show the above, as much as I am able.

I didn't see a check for already populated current_result_dirs under the pants_workdir, just local and remote cache checks. Which surprised me, since the incremental implementation proves Pants is willing to consume from previous result_dirs. I wonder if that case happens often enough to be worth considering. Perhaps rare to the point of not mattering...

This comment has been minimized.

@stuhood

stuhood Dec 14, 2016

Member

I didn't see a check for already populated current_result_dirs under the pants_workdir, just local and remote cache checks. Which surprised me, since the incremental implementation proves Pants is willing to consume from previous result_dirs.

AFAICT, that's actually exactly what this test was doing before: vtA had the same directory as vtC... I'm fine with dropping that behaviour for the reasons mentioned above, but I believe that that is what was being tested here.

@stuhood

stuhood Dec 14, 2016

Member

I didn't see a check for already populated current_result_dirs under the pants_workdir, just local and remote cache checks. Which surprised me, since the incremental implementation proves Pants is willing to consume from previous result_dirs.

AFAICT, that's actually exactly what this test was doing before: vtA had the same directory as vtC... I'm fine with dropping that behaviour for the reasons mentioned above, but I believe that that is what was being tested here.

This comment has been minimized.

@mateor

mateor Dec 14, 2016

Member

I understood why it didn't copy the previous_results (the directory was found to exist) but it was not clear to me that was happening on purpose.

And the task itself doesn't noop, which is what I meant above. The invalidation block runs all three times. Zinc just passively inherits the existing results_dir since the namespace is reused. That may be equivalent to a noop for the zinc tool, for all I know, but either way no other tasks even hit that check, since zinc is the only thing that hits that previous_dir code.

Maybe the build_invalidator could go ahead and be fully namespaced in the same manner as the results_dir, and therefore hold an arbitrary number of VT results. Since those would only be VTs that successfully exited invalidation, the results_dir would only require a wipe if the current_results_dir exists but no hash file was dropped.

I guess that is firmly in the land of a new feature, in any rate, and possibly you know reasons that my speculating is flawed :)

I will be able to provide the contract detailed above. Tasks that are rebuilding cache_keys older than the most recent will consume from the cache, and if not found will be opting out of incremental support since they will no longer passively inherit their results.

Which is maybe preferred when taken as a whole? Cache hits look like they supercede the previous_results, in which case all cached VTs were copying over the previous results in the cache_manager, and then immediately pruning the files within the artifact_cache unpack.

The only case where a zinc task would newly miss out on both artifact cache and incremental results is if the results_dir existed but the artifact was uncached, which is probably pretty rare in practice.

@mateor

mateor Dec 14, 2016

Member

I understood why it didn't copy the previous_results (the directory was found to exist) but it was not clear to me that was happening on purpose.

And the task itself doesn't noop, which is what I meant above. The invalidation block runs all three times. Zinc just passively inherits the existing results_dir since the namespace is reused. That may be equivalent to a noop for the zinc tool, for all I know, but either way no other tasks even hit that check, since zinc is the only thing that hits that previous_dir code.

Maybe the build_invalidator could go ahead and be fully namespaced in the same manner as the results_dir, and therefore hold an arbitrary number of VT results. Since those would only be VTs that successfully exited invalidation, the results_dir would only require a wipe if the current_results_dir exists but no hash file was dropped.

I guess that is firmly in the land of a new feature, in any rate, and possibly you know reasons that my speculating is flawed :)

I will be able to provide the contract detailed above. Tasks that are rebuilding cache_keys older than the most recent will consume from the cache, and if not found will be opting out of incremental support since they will no longer passively inherit their results.

Which is maybe preferred when taken as a whole? Cache hits look like they supercede the previous_results, in which case all cached VTs were copying over the previous results in the cache_manager, and then immediately pruning the files within the artifact_cache unpack.

The only case where a zinc task would newly miss out on both artifact cache and incremental results is if the results_dir existed but the artifact was uncached, which is probably pretty rare in practice.

mateor added some commits Dec 15, 2016

Raise an Exception if relative_symlink tries to unlink a concrete file
As said in the comment, this has been a long-term uncaught exception,
surfacing when os.unlink was called on a directory. This resulted
in non-zero exit for Pants, with Operation not Permitted raising
from the OS.

Should now give a decent msg, but the outcome of Pants in this
situation will continue to be an execution-halting exception.

This behavior has diverged between relative and absolute symlinks
methods. the absolute_symlink checks for dirs, and silently rms
them. That behavior should probably be reconciled, one way or another,
one of these days.

Once a determination is made, it would likely inform the final determination
on whether safe_mkdir should be allowed to continue to be symlink agnostic.
Only copy over previous results after checking for cache hits
This was probably really clogging the IO, since every jvm
target was:

* copying the entire previous_results_dir over
* doing it a second time
* checking for cached artifact
* if found, parsing the analysis and pruning unneeded results.

Now it creates the current VT results_dir, and then checks for the
cache. This includes the double check, so the last thing that is
done before yielding the VT as invalid is to copy any
available previous results.

My read on this is that it will save a good chunk of unnecessary system
calls.
Add some test coverage around the delayed previous_results copy
This appears to save a bunch of unnecessary results_dir copies.
I added some new test coverage to exercise this, but there was
very limited testing to begin with.

I have one more followup commit that significantly cleans up
the logic in cache_manager. That can land separatly from this commit,
I wanted to get some test coverage in before I tried that one.
@mateor

This comment has been minimized.

Show comment
Hide comment
@mateor

mateor Dec 16, 2016

Member

I added some additional test coverage around the discoveries we discussed in the bug thread. I probably should do another pass over this once it passes CI - will do this afternoon.

Member

mateor commented Dec 16, 2016

I added some additional test coverage around the discoveries we discussed in the bug thread. I probably should do another pass over this once it passes CI - will do this afternoon.

@mateor mateor requested review from peiyuwang and gmalmquist Dec 16, 2016

Python lint error
WHoops. I have gotten so used to disabling the style checks while
developing!

Otherwise appears to be good. There is still not enough test coverage
around invalidation in general, and incremental tasks specifically.
But I have added a lot more then there used to be.

I have a logic refactor as a followup, and I will try and
continue to add coverage as a part of that.
@stuhood

Thanks Mateo... excellent work. Some minor nits, but not blocking probably.

- # Finally, create the stable symlink.
- relative_symlink(current_dir, stable_dir)
+ # If the target is valid, both directories are assumed to exist.

This comment has been minimized.

@stuhood

stuhood Dec 16, 2016

Member

Maybe worth validating that here in the else clause?

@stuhood

stuhood Dec 16, 2016

Member

Maybe worth validating that here in the else clause?

This comment has been minimized.

@mateor

mateor Dec 16, 2016

Member

I agree - I have a followup with that change. My hope was to change the behavior here as little as possible in the change, since the coverage was light and we were scheduling to backport to a stable branch.

But I ended up needing to make somewhat significant changes due to moving the previous check, so I can go ahead and add that here safely, I think.

@mateor

mateor Dec 16, 2016

Member

I agree - I have a followup with that change. My hope was to change the behavior here as little as possible in the change, since the coverage was light and we were scheduling to backport to a stable branch.

But I ended up needing to make somewhat significant changes due to moving the previous check, so I can go ahead and add that here safely, I think.

+ # Set the self._previous last, so that it is only True after the copy completed.
+ self._previous_results_dir = previous_path
+
+ def _use_previous_dir(self, allow_incremental, root_dir):

This comment has been minimized.

@stuhood

stuhood Dec 16, 2016

Member

Unused method? It returns the self._previous_results_dir, which wouldn't be set when this method was called.

@stuhood

stuhood Dec 16, 2016

Member

Unused method? It returns the self._previous_results_dir, which wouldn't be set when this method was called.

This comment has been minimized.

@mateor

mateor Dec 16, 2016

Member

Yep, thanks for the catch. All the functions of this method are covered elsewhere, so will delete.

It was a miss when splitting from the followup commit, which cleans up the cache_manager's more mutation-heavy choices: mateor@62e49a3

It needs to be updated but I intend to put up a roughly analogous RB as a followup.

@mateor

mateor Dec 16, 2016

Member

Yep, thanks for the catch. All the functions of this method are covered elsewhere, so will delete.

It was a miss when splitting from the followup commit, which cleans up the cache_manager's more mutation-heavy choices: mateor@62e49a3

It needs to be updated but I intend to put up a roughly analogous RB as a followup.

mateor added some commits Dec 16, 2016

Add some extra tests and error message to vt.ensure_legal
At the moment this throws an error in these circumstances.
This is probably a win from a UX perspective, since we will
finally get some caught exceptions if someone starts
deleting dirs under pants.d. But perhaps you would prefer
to silently restore the working state for now, so let me
know if that is the case.

Also removes no longer needed method and retitles the
InvalidResultsDir() exception to avoid overloading "invalid" any
more than it already is.
@wisechengyi

Thanks @mateor. a few nits

+ raise self.IllegalResultsDir(
+ '\nThe results_dirs state should not be manually cleaned or recreated by tasks.\n{}'.format(errors)
+ )
+ return True

This comment has been minimized.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

no need to return any value since it is not used.

docstring += 'i.e. self._result_dir has be to be a link to a dir'

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

no need to return any value since it is not used.

docstring += 'i.e. self._result_dir has be to be a link to a dir'

This comment has been minimized.

@mateor

mateor Dec 19, 2016

Member

Yes...I added the return value because things like this often end up being useful as bools, and people naturally would try it. Like added to a should_cache_artifacts method or something. It was an attempt to conform to expectation. Without the return True, `vt.ensure_legal() would be Falsey, which felt confusing.

I would prefer to leave it, but will remove if you let me know you disagree.

@mateor

mateor Dec 19, 2016

Member

Yes...I added the return value because things like this often end up being useful as bools, and people naturally would try it. Like added to a should_cache_artifacts method or something. It was an attempt to conform to expectation. Without the return True, `vt.ensure_legal() would be Falsey, which felt confusing.

I would prefer to leave it, but will remove if you let me know you disagree.

- # Finally, create the stable symlink.
- relative_symlink(current_dir, stable_dir)
+ def copy_previous_results(self, root_dir):

This comment has been minimized.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

could privatize

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

could privatize

This comment has been minimized.

@mateor

mateor Dec 19, 2016

Member

I don't think so - this is called by Task.py after cache is checked.

@mateor

mateor Dec 19, 2016

Member

I don't think so - this is called by Task.py after cache is checked.

This comment has been minimized.

@mateor

mateor Dec 19, 2016

Member

Might make better sense to move to Task honestly - it is weird that the cache manager manages that part.

I have a followup that refactors cache_manager, will consider doing that there.

@mateor

mateor Dec 19, 2016

Member

Might make better sense to move to Task honestly - it is weird that the cache manager manages that part.

I have a followup that refactors cache_manager, will consider doing that there.

+ task_output = os.path.join(vt.results_dir, 'a_file')
+ self.create_file(task_output, 'foo')
+
+ def is_empty(self, dirname):

This comment has been minimized.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

could make this static method.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

could make this static method.

This comment has been minimized.

@mateor

mateor Dec 19, 2016

Member

yes, that sounds good. I considered adding it to the base class. Will go ahead and make static for now.

@mateor

mateor Dec 19, 2016

Member

yes, that sounds good. I considered adding it to the base class. Will go ahead and make static for now.

+ def is_empty(self, dirname):
+ return not os.listdir(dirname)
+
+ def matching_result_dirs(self, vt):

This comment has been minimized.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

static

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

static

+ # Ensure that the result_dirs contain the same files.
+ return self.is_empty(vt.results_dir) == self.is_empty(vt.current_results_dir)
+
+ def clobber_symlink(self, vt):

This comment has been minimized.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

static

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

static

+
+ def matching_result_dirs(self, vt):
+ # Ensure that the result_dirs contain the same files.
+ return self.is_empty(vt.results_dir) == self.is_empty(vt.current_results_dir)

This comment has been minimized.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

is it safe to conclude the content of the dirs are the same if neither of them are empty? or should the docstring be updated?

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

is it safe to conclude the content of the dirs are the same if neither of them are empty? or should the docstring be updated?

This comment has been minimized.

@mateor

mateor Dec 19, 2016

Member

Yeah, this for sure needs to be changed. I was continually testing the file list and decided to break it out into a function - but I did not do a very good job.

@mateor

mateor Dec 19, 2016

Member

Yeah, this for sure needs to be changed. I was continually testing the file list and decided to break it out into a function - but I did not do a very good job.

+
+ # The main protection for this is the exception raised when the cache_manager attempts to mark the VT valid.
+ self.assertFalse(vt.valid)
+ vt.update()

This comment has been minimized.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

iiuc the exception will be raised at vt.update(), would it be more clear to say

<blob>
with self.assertRaises(VersionedTargetSet.IllegalResultsDir):
  vt.update()

instead of

with self.assertRaises(VersionedTargetSet.IllegalResultsDir):
  <blob>
  vt.update()
@wisechengyi

wisechengyi Dec 19, 2016

Contributor

iiuc the exception will be raised at vt.update(), would it be more clear to say

<blob>
with self.assertRaises(VersionedTargetSet.IllegalResultsDir):
  vt.update()

instead of

with self.assertRaises(VersionedTargetSet.IllegalResultsDir):
  <blob>
  vt.update()

This comment has been minimized.

@mateor

mateor Dec 19, 2016

Member

agree, will bring above the contextmanager block.

@mateor

mateor Dec 19, 2016

Member

agree, will bring above the contextmanager block.

+ vt = self.make_vt()
+ self.clobber_symlink(vt)
+ vts = VersionedTargetSet.from_versioned_targets([vt])
+ vts.update()

This comment has been minimized.

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

same

@wisechengyi

wisechengyi Dec 19, 2016

Contributor

same

mateor added some commits Dec 19, 2016

Make some common methods static, remove the custom CacheKeyGenerator …
…subclass.

Also added some additional test coverage and some TODOs.
@mateor

This comment has been minimized.

Show comment
Hide comment
@mateor

mateor Dec 19, 2016

Member

Okay, followed up on review, thanks @wisechengyi .

Member

mateor commented Dec 19, 2016

Okay, followed up on review, thanks @wisechengyi .

@mateor mateor merged commit 6a43414 into pantsbuild:master Dec 19, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

stuhood added a commit to twitter/pants that referenced this pull request Dec 20, 2016

Ensure that invalid vts have results_dir cleaned before passing to ta… (
#4139)

Tasks that sometimes fail due to outside factors (download failures,
resolve issues, etc) often would call safe_mkdir(vt.results_dir, clean=True)
in order to wipe possibly truncated or crufty state.

But since vt.results_dir is a symlink, that replaced it with a real dir.
That ended up breaking caching, since:
    * Product pipeline was consuming from the vt.results_dir
    * But the artifacts in the cache were created using vt.current_results_dir,
       which no longer got the output.

If a target is invalid, the cache_manager will now wipe the results_dir prior
to passing to the task, removing the need for tasks to clean it themself.
It also now checks to make sure that the results_dir is legal before marking
a VT valid and passing to the artifact_cache.

Also:
* Raise an Exception if relative_symlink tries to unlink a concrete file
* Copy over previous results *only* after checking for cache hits
    * Before this change every file in compile.zinc results_dir was copied twice
       per cache hit, and then thrown away. Now it only copies after a cache miss.

The majority of the change is added test coverage to cover the original
and added behavior.

lenucksi added a commit to lenucksi/pants that referenced this pull request Apr 25, 2017

Ensure that invalid vts have results_dir cleaned before passing to ta… (
#4139)

Tasks that sometimes fail due to outside factors (download failures,
resolve issues, etc) often would call safe_mkdir(vt.results_dir, clean=True)
in order to wipe possibly truncated or crufty state.

But since vt.results_dir is a symlink, that replaced it with a real dir.
That ended up breaking caching, since:
    * Product pipeline was consuming from the vt.results_dir
    * But the artifacts in the cache were created using vt.current_results_dir,
       which no longer got the output.

If a target is invalid, the cache_manager will now wipe the results_dir prior
to passing to the task, removing the need for tasks to clean it themself.
It also now checks to make sure that the results_dir is legal before marking
a VT valid and passing to the artifact_cache.

Also:
* Raise an Exception if relative_symlink tries to unlink a concrete file
* Copy over previous results *only* after checking for cache hits
    * Before this change every file in compile.zinc results_dir was copied twice
       per cache hit, and then thrown away. Now it only copies after a cache miss.

The majority of the change is added test coverage to cover the original
and added behavior.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment