Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

many: detect local source changes #2167

Merged

Conversation

kyrofa
Copy link
Contributor

@kyrofa kyrofa commented Jun 23, 2018

  • Have you followed the guidelines for contributing?
  • Have you signed the CLA?
  • If this is a bugfix. Have you checked that there is a bug report open for the issue you are trying to fix on bug reports?
  • If this is a new feature. Have you discussed the design on the forum?
  • Have you successfully run ./runtests.sh static?
  • Have you successfully run ./runtests.sh unit?

The snapcraft CLI doesn't notice changes to local sources once the parts have been pulled: one must clean and re-pull in order to take them into account. This PR resolves LP: #1583718 by comparing local source timestamps to the pull state timestamp, and updating the pull (and all subsequent) steps as necessary.

@kyrofa kyrofa force-pushed the feature/1583718/detect_source_changes branch from 6b262c9 to 0c85aba Compare June 23, 2018 02:30
@sergiusens
Copy link
Collaborator

Hey, what a PR, it is quite big and lots of tests seem to be failing, so I'll leave it to you to fix them before moving forward with an actual review.

@kyrofa kyrofa force-pushed the feature/1583718/detect_source_changes branch from 0c85aba to 3abc332 Compare June 25, 2018 16:36
@codecov-io
Copy link

codecov-io commented Jun 25, 2018

Codecov Report

Merging #2167 into master will decrease coverage by 0.03%.
The diff coverage is 87.63%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2167      +/-   ##
==========================================
- Coverage   91.32%   91.28%   -0.04%     
==========================================
  Files         201      202       +1     
  Lines       12617    12757     +140     
  Branches     1874     1900      +26     
==========================================
+ Hits        11522    11645     +123     
- Misses        741      752      +11     
- Partials      354      360       +6
Impacted Files Coverage Δ
snapcraft/internal/pluginhandler/_dirty_report.py 100% <ø> (ø) ⬆️
snapcraft/file_utils.py 97.56% <100%> (ø) ⬆️
snapcraft/_baseplugin.py 93.4% <100%> (+0.07%) ⬆️
...apcraft/internal/pluginhandler/_outdated_report.py 100% <100%> (ø)
snapcraft/plugins/cmake.py 89.28% <100%> (+2.18%) ⬆️
snapcraft/internal/errors.py 99.53% <100%> (ø) ⬆️
snapcraft/internal/lifecycle/_status_cache.py 100% <100%> (ø) ⬆️
snapcraft/internal/sources/_base.py 92.06% <69.23%> (-5.94%) ⬇️
snapcraft/internal/sources/_local.py 84.9% <78.94%> (-15.1%) ⬇️
snapcraft/internal/pluginhandler/__init__.py 91.87% <82.6%> (-0.42%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 25043ab...6cc1d79. Read the comment docs.

@kyrofa
Copy link
Contributor Author

kyrofa commented Jun 25, 2018

[...] lots of tests seem to be failing, so I'll leave it to you to fix them before moving forward with an actual review.

Yeah I had to move the in-test XDG cache around, so integration tests are taking a few iterations. I'll ping you.

The snapcraft CLI doesn't notice changes to local sources once the parts
have been pulled: one must clean and re-pull in order to take them into
account. Start comparing local source timestamps to the pull state
timestamp, updating the pull (and all subsequent) steps as necessary.

LP: #1583718

Signed-off-by: Kyle Fazzari <kyrofa@ubuntu.com>
@kyrofa kyrofa force-pushed the feature/1583718/detect_source_changes branch from 3abc332 to 77481b3 Compare June 25, 2018 18:09
@kyrofa
Copy link
Contributor Author

kyrofa commented Jun 26, 2018

Alright @sergiusens, this one should be good, now.



def _link(source: str, destination: str, follow_symlinks: bool=False) -> None:
def link(source: str, destination: str, follow_symlinks: bool=False) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that they are public it would be nice to force kwargs and add some docstrings if you don't mind

Copy link
Contributor Author

@kyrofa kyrofa Jun 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API needs to be compatible with copy2 and link_or_copy, so the only kwarg we can force is follow_symlinks, but I agree that it's an improvement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thank you for catching the lack of docs!

@@ -137,7 +137,7 @@ def _link(source: str, destination: str, follow_symlinks: bool=False) -> None:
raise SnapcraftCopyFileNotFoundError(source)


def _copy(source: str, destination: str, follow_symlinks: bool=False) -> None:
def copy(source: str, destination: str, follow_symlinks: bool=False) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dito

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -277,11 +285,18 @@ def _run_step(self, *, step: steps.Step, part, progress, hint=''):

part = _replace_in_part(part)

def _run_step(self, *, step: steps.Step, part, progress, hint=''):
self._prepare_to_run(step=step, part=part)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe _prepare_to_run_step?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went with _prepare_step so we have _{prepare,run,complete}_step.

notify_part_progress(part, progress, hint)
getattr(part, step.name)()

# We know we just ran this step, so rather than check, manually twiddle
# the cache
self._step_complete(part, step)

def _step_complete(self, part, step):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_complete_step for parity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, done.

raise RuntimeError("source must be checked before it's updated")
self._update()

def _check(self, target: str):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just do the abc. thing here, you can leave the docstring, it will still be valid code.

Copy link
Contributor Author

@kyrofa kyrofa Jun 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because then we'd need to add it to all sources, even those that don't actually support doing this. Since the vast majority of sources don't support it, it seemed to make sense to assume a lack of support in the API, and have only that source that does support it provide an implementation.

else:
getattr(self, '_re{}'.format(step.name))(part, '({})'.format(
outdated_report.get_summary()))


def notify_part_progress(part, progress, hint='', debug=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is old, but what is the difference?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just keep it green 😄

Copy link
Contributor Author

@kyrofa kyrofa Jun 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I'm not clear on what you're saying, here. Are you asking me to change something?

@@ -316,6 +331,28 @@ def _handle_dirty(self, part, step, dirty_report, cli_config):
getattr(self, '_re{}'.format(step.name))(part, hint='({})'.format(
dirty_report.get_summary()))

def _handle_outdated(self, part, step, outdated_report, cli_config):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as handle_dirty, can this take a dirty_action as a parameter instead of passing down the entire object?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the introduction of the outdated concept, I think this is just another case of dirty. Can we add more precise qualifiers like handle_dirty_definitions and handle_dirty_[local]_source?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me give you my thought process. A step (for a specific part) can be in one of two states:

  1. Not run (we call this "clean")
  2. Run, which is further broken down into sub-states:
    2.1. Complete
    2.2. Needs to be cleaned and run again (we call this "dirty", i.e. it needs to be cleaned)
    2.3. Needs to be run again, but does not need to be cleaned (in this PR we call this "outdated", i.e. it needs to be updated)

Both 2.2 and 2.3 are caused by something, which needs to be recorded somewhere so we can tell the user what it was. We do that with the {Dirty,Outdated}Report classes. This allows us to isolate concerns, separating the "what makes this {dirty,outdated}" from "oh, this is {dirty,outdated}, I need to take appropriate action." If we combine these reports into one, we have a few issues:

  1. We would need to come up with a new name, since "dirty" no longer works (i.e. the solution is not to clean it anymore).
  2. Concerns leak. Now lifecycle needs to know "Oh, the project properties changed, okay, I know I need to clean" or "Oh, an earlier step happened, I know I need to update". This spread of knowledge seems error-prone.

(1) isn't a big deal, we can figure that one out. The word "outdated" could be used to refer to both situations, I suppose. (2) we can probably solve by keeping things similar to how they are today, but have a master class composed of both a dirty and outdated report, and take action appropriately. That at least gives us a single report to deal with in the PluginHandler's API and thus the lifecycle, but other than that doesn't buy us much. Do you like that idea better?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not say combine them, I just meant that "dirty" and "outdated" are easier to confuse and that we should narrow the scope of what it means down.

Your thought process is from the point of view of the action wrt lifecycle and my mind is trending towards the state of things (why is it dirty/outdated) whilst the report comes from a component that does not determine the future actions.

These two terms lead to confusions, is this dirty because it is outdated (so far that has been our logic, right?). The what is outdated is what I am asking to more clearly state in the method calls, checks and class names.

But yeah, by no means, unless it gets converted into the state class merge these two.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are really stuck with the names, I have added a couple more comments that would satisfy me. My intention was that from the names of variables and classes it was crystal clear what it meant without having to read the code, I can compromise with a bit more documentation under the internal class names and methods called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this should be cleaned up, but as we discussed on the call, let's see if we can get this PR in a decent state with docs and then take another pass (in another PR) at improving it further.

part,
'Skipping {}'.format(current_step.name),
'(already ran)')
outdated_report = self._cache.get_outdated_report(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have complications accepting this procedural flow, just the level of nesting.
How is dirty_report so different from outdated_report? How can we have a dirty_report if we haven't has_step_run? Do you think this can be shuffled a bit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because it's the ugliest thing on Earth. I tried to avoid touching too much here, but let me see what I can do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There, that's quite a bit better. Thoughts?

self.plugin.statedir, steps.PULL)

# Not all sources support checking for updates
with contextlib.suppress(NotImplementedError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I see the reason for raising now, still would be nice to have a different exception to avoid over catching (or check the payload) to not swallow exceptions like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with a custom exception.

if os.path.exists(self.plugin.build_basedir):
shutil.rmtree(self.plugin.build_basedir)

# FIXME: It's not necessary to ignore here anymore since it's now
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, this is so old 😅

@@ -25,13 +26,79 @@

class Local(Base):

def __init__(self, *args, copy_function=file_utils.link_or_copy, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a scenario for copy_function to be different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, right here. Didn't want to duplicate functionality.


def _update(self):
# First, copy the directories
for directory in self._updated_directories:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels a lot like the pluginhandlers "migration" code. Maybe we need some generalization there (in the future)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are definitely some similarities in that they both operate on directories and files, but the pluginhandler does some special stuff as well.

@@ -116,6 +116,13 @@ def setUp(self):
patcher_dirs.start()
self.addCleanup(patcher_dirs.stop)

self.useFixture(fixtures.EnvironmentVariable(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are back!

self.useFixture(fixtures.EnvironmentVariable(
'XDG_DATA_HOME', os.path.join(self.path, 'data')))
# Use a separate path for XDG dirs, or changes there may be detected as
# source changes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kudos on the code comment, makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most annoying change I had to make in the PR! Touched so many things. I should have extracted it, in retrospect.

Copy link
Collaborator

@sergiusens sergiusens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty good, scanned through sans tests which I will look at later. Just a couple of niggles here and there, nothing much.

Signed-off-by: Kyle Fazzari <kyrofa@ubuntu.com>
Signed-off-by: Kyle Fazzari <kyrofa@ubuntu.com>
Signed-off-by: Kyle Fazzari <kyrofa@ubuntu.com>
Signed-off-by: Kyle Fazzari <kyrofa@ubuntu.com>
# If this step hasn't yet run, all we need to do is run it
if not self._cache.has_step_run(part, current_step):
getattr(self, '_run_{}'.format(current_step.name))(part)
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of multiple returns here, it seems this could be handled very well with elif and a final else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're fetching reports before checking them, which is heavy to do upfront if we don't actually need to, so it doesn't fit particularly well within an if/elif unless we want to nest within elses.


fmt = (
'Failed to update source: '
"{source!s} sources don't support updating"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a . to end the sentence please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -249,6 +250,32 @@ def is_clean(self, step):
except errors.NoLatestStepError:
return True

def is_outdated(self, step):
"""Return true if the given step needs to be updated (no cleaning)."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify this doc string (what is in between parens).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, though mostly by directing the reader to get_outdated_report().

return self.get_outdated_report(step) is not None

def get_outdated_report(self, step: steps.Step):
"""Return an OutdatedReport class describing why step is outdated.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a follow up paragraph on what it means to be outdated? Simil for what it means to be dirty in its counterpart.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, for both outdated and dirty.



class OutdatedReport:
"""The OutdatedReport class explains why a given step is outdated."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you further expand in a paragraph what it means to be outdated (conditions that trigger it)? Simil for DirtyReport please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, for both outdated and dirty.

Signed-off-by: Kyle Fazzari <kyrofa@ubuntu.com>
Signed-off-by: Kyle Fazzari <kyrofa@ubuntu.com>
@@ -34,8 +35,8 @@ def __init__(self, config: _config.Config) -> None:
"""
self.config = config
self._steps_run = dict() # type: Dict[str, Set[steps.Step]]
self._outdated_reports = collections.defaultdict(dict) # type: _Report
self._dirty_reports = collections.defaultdict(dict) # type: _Report
self._outdated_reports = collections.defaultdict(dict) # type: _OutdatedReport # noqa
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why noqa or why noqa with no qualifier, and if it is to prevent a line break, I'd rather change the line limit to 99 😉
Use your parens until then 😆

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Signed-off-by: Kyle Fazzari <kyrofa@ubuntu.com>
@sergiusens
Copy link
Collaborator

atta boy 😄

@sergiusens sergiusens merged commit 39c4c39 into canonical:master Jul 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants