New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix python lint dependency on pyprep goal #6606

Merged
merged 19 commits into from Oct 18, 2018

Conversation

Projects
None yet
4 participants
@CMLivingston
Contributor

CMLivingston commented Oct 8, 2018

Problem

Python linting for a target set that contains incompatible python version constraints throws a fatal exception and terminates the lint run. This happens because the lint task is coupled to pyprep, triggering interpreter selection for the target closure.

Solution

Use InterpreterCache to partition the target set based on compatibility constraint, and run the link task for each partition.

Result

Linting incompatible target closures does not throw an exception and instead lints each constraint partition of the closure separately.

Resolves: #5764

Chris Livingston added some commits Oct 4, 2018

Chris Livingston
Refactor target partitioning in interpreter cache and consume that in…
… the lint task. Filter out Python 3 targets if both are in the target closure.
@jsirois

This is going to take some work to do right. Are you sure its not better to just fix the real issue?

)
if not python_tgts:
return
interpreter_cache = PythonInterpreterCache(PythonSetup.global_instance(),

This comment has been minimized.

@jsirois

jsirois Oct 8, 2018

Member

PythonSetup and PythonRepos are subsystem_dependencies you need to declare above.

# Check for Python 3-only constraints.
context_has_py3_targets = False
for constraint in tgts_by_compatibility.copy():
if '>3' in str(constraint) or '>=3' in str(constraint):

This comment has been minimized.

@jsirois

jsirois Oct 8, 2018

Member

This is easily foiled with spaces around operators, '>2.8,<3.7', etc. For robustness you really need to interact with constraints in Requirement form. Something like:

requirement = Requirement.parse(constraint)
if '3' in requirement:
  # Although what about? `>=2.7` - includes 2.7 _and_ 3; ie code works with both so probably _should_ be linted under 2.7

This comment has been minimized.

@stuhood

stuhood Oct 9, 2018

Member

Would it be possible to intersect a "3 -> 4" range with each of them to find matches?

This comment has been minimized.

@jsirois

jsirois Oct 9, 2018

Member

Perhaps. The key point is string comparisons are not going to cut it here.

@stuhood

stuhood approved these changes Oct 9, 2018

This is going to take some work to do right. Are you sure its not better to just fix the real issue?

@jsirois : I talked with @CMLivingston about this one a bit offline. My feeling is that this is a step in the right direction for the ticket he linked, in that once you have a pex/subprocess per linter, you can fork target(s) under their appropriate interpreter. So that ticket cleanly replaces the warning added here.

In the meantime, this makes something like ./pants --changed=.. lint "work" (not fail).

# Check for Python 3-only constraints.
context_has_py3_targets = False
for constraint in tgts_by_compatibility.copy():
if '>3' in str(constraint) or '>=3' in str(constraint):

This comment has been minimized.

@stuhood

stuhood Oct 9, 2018

Member

Would it be possible to intersect a "3 -> 4" range with each of them to find matches?

@jsirois

This comment has been minimized.

Member

jsirois commented Oct 9, 2018

My feeling is that this is a step in the right direction for the ticket he linked,

Sure. My thrust though is effort. I just finished cleaning up blacklists in pex and the effort was not large. I was hoping we could avoid another set of hacks at a reasonable cost. It may not be reasonable though.

@CMLivingston

This comment has been minimized.

Contributor

CMLivingston commented Oct 9, 2018

I just pushed what I believe to be a comprehensive solution to the problem of constraint checking.

From my perspective, this work is significantly less than the work required to bring python 3 linting to parity with py2 linting by pex-ifying each lint plugin for execution by the lint task. This fix is to provide a quick resolution to some internal Twitter issues we are seeing right now with linting the targets that own a file (via owner of). My team has a new hire allocated towards the complete long term solution for #5764 at the end of October, and I will be personally shepherding this implementation (during which we will eliminate all cruft and workarounds).

Chris Livingston
https://github.com/pantsbuild/pants/issues/5764
"""
DISALLOWED_DISTS = [
Distribution(project_name="CPython", version="3.0"),

This comment has been minimized.

@jsirois

jsirois Oct 9, 2018

Member

Pants does not restrict its support to CPython.

@jsirois

This comment has been minimized.

Member

jsirois commented Oct 9, 2018

From my perspective, this work is significantly less than the work required to bring python 3 linting to parity with py2 linting by pex-ifying each lint plugin for execution by the lint task.

I think that's not what's needed. I think, instead, the task can call 1 pex binary and the binary can be built at Pants runtime requiring no changes to any plugins and just a change to the task to break it into 1/2 task 1/2 binary entrypoint. I'll put my money where my mouth is and spend a few hours tonight on the chance we can avoid temporary cruft again.

@jsirois

This comment has been minimized.

Member

jsirois commented Oct 10, 2018

Alright - I have this working.

Python 2 targets:

$ PEX_VERBOSE=9 ./pants --no-verify-config lint testprojects/src/python/interpreter_selection/python_3_selection_testing:lib_py2
...
12:52:03 00:01     [pythonstyle]
                   Invalidated 1 target.
12:52:03 00:01       [pythonstyle]
                     pex: PEX.run invoking /home/jsirois/dev/pantsbuild/jsirois-pants/build-support/pants_dev_deps.venv/bin/python2.7 /home/jsirois/dev/pantsbuild/jsirois-pants/.pants.d/lint/pythonstyle/checker/CPython-2.7.15 --root-dir=/home/jsirois/dev/pantsbuild/jsirois-pants --severity=COMMENT testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py

                     E111:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py):012 indentation is not a multiple of four
                          |  print('I am a python 2 library method.')
                     
                     E114:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py):013 indentation is not a multiple of four (comment)
                          |  # Note that ascii exists as a built-in in Python 3 and
                     
                     E114:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py):014 indentation is not a multiple of four (comment)
                          |  # does not exist in Python 2.
                     
                     E111:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py):015 indentation is not a multiple of four
                          |  try:
                     
                     E111:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py):017 indentation is not a multiple of four
                          |  except NameError:
                     
                     E111:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py):019 indentation is not a multiple of four
                          |  assert ret is None
                     
                     E111:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py):020 indentation is not a multiple of four
                          |  return ret
                     
                     T302:ERROR   testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py2.py:011 Expected 2 blank lines, found 3
                          |def say_something():
                     
                     
FAILURE: 8 Python Style issues found. You may try `./pants fmt <targets>`


               Waiting for background workers to finish.
12:52:03 00:01   [complete]
               FAILURE

Python 3 targets:

$ PEX_VERBOSE=9 ./pants --no-verify-config lint testprojects/src/python/interpreter_selection/python_3_selection_testing:lib_py3
...
12:52:15 00:02     [pythonstyle]
                   Invalidated 1 target.
12:52:15 00:02       [pythonstyle]
                     pex: PEX.run invoking /usr/bin/python3.4 /home/jsirois/dev/pantsbuild/jsirois-pants/.pants.d/lint/pythonstyle/checker/CPython-3.4.9 --root-dir=/home/jsirois/dev/pantsbuild/jsirois-pants --severity=COMMENT testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py3.py

                     E302:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py3.py):007 expected 2 blank lines, found 1
                          |def say_something():
                     
                     E111:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py3.py):008 indentation is not a multiple of four
                          |  print('I am a python 3 library method.')
                     
                     E111:ERROR   PythonFile(testprojects/src/python/interpreter_selection/python_3_selection_testing/lib_py3.py):009 indentation is not a multiple of four
                          |  return ascii
                     
                     
FAILURE: 3 Python Style issues found. You may try `./pants fmt <targets>`


               Waiting for background workers to finish.
12:52:15 00:02   [complete]
               FAILURE

I'm leaving out chunking the targets into groups by compatibility constraints - the change just converts the checks to be run under a pex using a given interpreter. Are you willing to let me clean this up, send a PR, then have you take over with something roughly like this PR but simpler that chunks targets by interpreter?

@CMLivingston

This comment has been minimized.

Contributor

CMLivingston commented Oct 10, 2018

Great! I am willing to do this - but do you mean chunking them and invoking separate runs for each interpreter?

@jsirois

This comment has been minimized.

Member

jsirois commented Oct 10, 2018

but do you mean chunking them and invoking separate runs for each interpreter?

Yes. The checkstyle method:

Now has signature checkstyle(self, interpreter, sources) - so your change need only call that method once per unique interpreter constraint/sources chunk combo.

Chris Livingston added some commits Oct 12, 2018

Show resolved Hide resolved ...on/src/python/pants/contrib/python/checks/tasks/checkstyle/checkstyle.py Outdated
Show resolved Hide resolved ...on/src/python/pants/contrib/python/checks/tasks/checkstyle/checkstyle.py Outdated
if sources:
return self.checkstyle(interpreter, sources)
tgts_by_compatibility, _ = interpreter_cache.partition_targets_by_compatibility([vt.target for vt in invalidation_check.invalid_vts])
for constraint, targets in tgts_by_compatibility.items():

This comment has been minimized.

@jsirois

jsirois Oct 12, 2018

Member

Perhaps s/constaint/_/. This is pointing out the granularity of interpreter_cache.partition_targets_by_compatibility is a bit large for this use, but I'm happy enough keeping the change scope targeted too.

This comment has been minimized.

@CMLivingston

CMLivingston Oct 15, 2018

Contributor

Agreed. I was unsure how to break this out as the other call site uses both and I did not want to loop over everything twice by using a separate helper. I'm happy to incorporate suggestions.

def test_lint_runs_for_py2_only(self):
command = ['lint', self.TESTPROJECT.format(':main_py2')]
pants_run = self.run_pants(command=command)

This comment has been minimized.

@jsirois

jsirois Oct 12, 2018

Member

Oof. Integration tests are easy to setup but we really do pay for these in CI time. How about building on the existing unit test?

This comment has been minimized.

@CMLivingston

CMLivingston Oct 15, 2018

Contributor

I'll do that.

"""Partition targets by their compatibility constraints.
:param targets: a list of PythonTarget objects

This comment has been minimized.

@jsirois

jsirois Oct 12, 2018

Member

Kill extra blank line.

:param targets: a list of PythonTarget objects
:return: tgts_by_compatibilities: a dict that maps compatibility constraint to a list of

This comment has been minimized.

@jsirois

jsirois Oct 12, 2018

Member

Style would be:

:returns: (tgts_by_compatibilities, filters) <description>
:rtype: (dict(str, list), set)
)
if not python_tgts_and_reqs:

This comment has been minimized.

@CMLivingston

CMLivingston Oct 15, 2018

Contributor

@jsirois @stuhood any idea why we are grabbing PythonRequirementLibrarys here? There is a single test case for this that is breaking when I remove this check, and it doesn't seem to be doing anything afaict.

This comment has been minimized.

@jsirois

jsirois Oct 15, 2018

Member

It looks like a testing hack to me: https://github.com/pantsbuild/pants/pull/6039/files#diff-3d23a0b5af958441c095d5acd551ac16L58

@illicitonion should have the definitive take on it, but I think you want to undo the hack or else add a comment that calls the shot and file an issue to clean up the hack. Afaict this cleanup you did was extraneous to the purpose of this PR even if it was a nice cleanup.

Chris Livingston added some commits Oct 15, 2018

@CMLivingston CMLivingston force-pushed the CMLivingston:clivingston/fix-dep-on-pyprep branch from c931a4d to 5b92f21 Oct 15, 2018

Chris Livingston added some commits Oct 15, 2018

Chris Livingston
if not os.path.exists(pex_path):
with self.context.new_workunit(name='build-checker'):
with safe_concurrent_creation(pex_path) as chroot:
builder = PEXBuilder(path=chroot, interpreter=interpreter)
builder.add_interpreter_constraint(str(interpreter.identity.requirement))

This comment has been minimized.

@jsirois

jsirois Oct 15, 2018

Member

This should not be needed. We pass an explicit interpreter to all build steps and use the same one for the run step (PEX(... interpreter=interpreter)). Can you dig and figure out what is going on here?

This comment has been minimized.

@CMLivingston

CMLivingston Oct 15, 2018

Contributor

This is needed to guard against the case where a user has a pexrc with PEX_PYTHON or PEX_PYTHON_PATH set. If no constraints are supplied, pex will exec with the lowest compatible interpreter, hence python2.7 executing for a python 3 pex chroot.

This comment has been minimized.

@jsirois

jsirois Oct 15, 2018

Member

Hrm - that's seems like a bug. If a PEX() is constructed with a specific interpreter it should probably use that interpreter and ignore pexrc, etc. This seems like an out of scope API hole, but lots of pants code is currently broken I think in the face of a user with a pexrc. Thanks for the explanation, I'll file an issue.

This comment has been minimized.

@jsirois
target = self.make_target('a/python:fail', PythonLibrary, sources=['fail.py'])
with self.assertRaises(TaskError) as task_error:
self.execute_task(target_roots=[target])
self.assertIn('3 Python Style issues found', str(task_error.exception))

This comment has been minimized.

@jsirois

jsirois Oct 15, 2018

Member

String searching - assertIn here - or assertEqual (above) but not both.

This comment has been minimized.

@CMLivingston

CMLivingston Oct 15, 2018

Contributor

This is following prior art from the failure tests above. Do you want me to refactor all these?

This comment has been minimized.

@jsirois

jsirois Oct 15, 2018

Member

Urg - you're right. No, that's fine.

target = self.make_target('a/python:fail', PythonLibrary, sources=['fail.py'])
with self.assertRaises(TaskError) as task_error:
self.execute_task(target_roots=[target])
self.assertIn('3 Python Style issues found', str(task_error.exception))

This comment has been minimized.

@jsirois

jsirois Oct 15, 2018

Member

Urg - you're right. No, that's fine.

if not os.path.exists(pex_path):
with self.context.new_workunit(name='build-checker'):
with safe_concurrent_creation(pex_path) as chroot:
builder = PEXBuilder(path=chroot, interpreter=interpreter)
builder.add_interpreter_constraint(str(interpreter.identity.requirement))

This comment has been minimized.

@jsirois

jsirois Oct 15, 2018

Member

Hrm - that's seems like a bug. If a PEX() is constructed with a specific interpreter it should probably use that interpreter and ignore pexrc, etc. This seems like an out of scope API hole, but lots of pants code is currently broken I think in the face of a user with a pexrc. Thanks for the explanation, I'll file an issue.

@stuhood

This comment has been minimized.

Member

stuhood commented Oct 15, 2018

@CMLivingston : As discussed offline, let's:

  1. add a flag to a nearby subsystem/task to enable python3 linting, but leave this disabled for now
  2. add a deprecated_conditional (similar to
    deprecated_conditional(
    lambda: True,
    '1.12.0.dev0',
    "RewriteBase's sideeffecting property should be a class property, not an instance property "
    "but class {} didn't have a class property.".format(cls.__name__),
    )
    ) that warns if the flag is not enabled, and suggests that people fix their code and then enable the flag. We'd target this for at least 3 months out.
  3. toggle the default for the flag when the deprecation triggers, and deprecate the flag itself in a new cycle.
python_tgts = self.context.targets(
lambda tgt: isinstance(tgt, (PythonTarget))
)
if not python_tgts:
return

This comment has been minimized.

@jsirois

jsirois Oct 16, 2018

Member

Technically - return 0

@@ -155,30 +155,36 @@ def checkstyle(self, interpreter, sources):
failure_count = checker.run(args=args,

This comment has been minimized.

@jsirois

jsirois Oct 16, 2018

Member

You might just inline now for return checker.run(....

Chris Livingston added some commits Oct 16, 2018

Chris Livingston
@@ -43,6 +45,16 @@ class Checkstyle(LintTaskMixin, Task):
FlakeCheckSubsystem,
)
@memoized_classproperty
def _python_3_compatible_dists(cls):
_VERSIONS = ["3.0", "3.1", "3.2", "3.3", "3.4", "3.5", "3.6", "3.7", "3.8", "3.9"]

This comment has been minimized.

@jsirois

jsirois Oct 16, 2018

Member

This is making me cry. We're right back to all the problems of the initial PR.

A few things are needed to make this bulletproof:

  1. We don't need to care about CPython, stackless, pypy, etc here, so best not to track
  2. What about "==3.4.5",">3.4,<3.5" etc?

Given this, there are only 2 robust options I can see.

  1. Drop pkg_resources.{Distribution,Requirement}, and leverage packaging.{requirements,specifiers,version} to parse these things fully and then get down to the (operator, version) level to implement a robust test of whether a constraint includes some version of python 3.
  2. Whitelist constraints instead.

Option 2 should be much simpler and - I think - get you what you need. The idea would be to whitelist the value of the python-setup.interpreter_constraints option taking that as the default for the repo and which currently hardcode defaults to ['>=2.7,<3']. Lint any target with constraints in that set only by default. Provide a bool option here on the task to allow all - with no mention or care of "3". Alternatively - if a bool is not enough, make the option a list of interpreter constraints to whitelist that defaults to the value of python-setup.interpreter_constraints and can be appended, replaced or blanked as the magic value to say "all".

This comment has been minimized.

@CMLivingston

CMLivingston Oct 16, 2018

Contributor

What do you think of a third option:
run minimal interpreter selection and see if the minimum interpreter is a 3 interpreter? That way we can get compatibility checking at the cost of setting up the interpreter cache for filters:

allowed_interpreters = set(interpreter_cache.setup(filters=filters))
if not allowed_interpreters:
  raise TaskError('No valid interpreters found for targets: {}\n(filters: {})'
                              .format(targets, filters))
interpreter = min(allowed_interpreters)

I like option 2 but afaict it puts us in the same position as the original code because we need constraint intersection logic per option 1.

This comment has been minimized.

@jsirois

jsirois Oct 16, 2018

Member

We spoke offline, but I don't think intersection is needed. The idea is to treat interpreter constraints as a finite set of labels and whitelist from that set. The motivating assumption being real world repos have 2-3 interpreter constraints. Canonically 2, the default pants hardcoded ">=2.7,<3" and maybe an additional ">=3.4,<4". In this example you'd selectively add ">=3.4,<4" to the new whitelist option. For robustness you still need to parse things using packaging, but you only whitelist exact matches.

Chris Livingston
@jsirois

Thanks Chris.

if self._acceptable_interpreter_constraints == []:
# The user wants to lint everything.
return True
num_matches = 0

This comment has been minimized.

@jsirois

jsirois Oct 17, 2018

Member

In the spirit of your any before, how about just:

return all(version.parse(constraint) in self._acceptable_interpreter_constraints
           for constraint in constraint_tuple)

And this points out that nowhere here do you rely on a constraints_tuple, just an iterable, so consider a less-specific name like constraints.

This comment has been minimized.

@CMLivingston

CMLivingston Oct 17, 2018

Contributor

Great idea

@@ -56,6 +56,8 @@ def product_types(cls):
return [PythonInterpreter]
def execute(self):
# TODO(CMLivingston): Grabbing `PythonRequirementLibrary`s here seems like it is a hack.

This comment has been minimized.

@jsirois

jsirois Oct 17, 2018

Member

I can take this in a follow-up, but if you understand the explanation at #6632 (comment) perhaps you can just change this TODO into a comment that points out downstream interpreter consumers may need an interpreter for any sort of importable python target (including PythonRequirementLibrary targets) even if we don't need all such targets here to setup interpreters. IE: we only need those importable targets with compatibility constraints.

This comment has been minimized.

@CMLivingston

CMLivingston Oct 17, 2018

Contributor

I'll add the comment.

Chris Livingston added some commits Oct 17, 2018

@jsirois

This comment has been minimized.

Member

jsirois commented Oct 17, 2018

@CMLivingston speak up when you get a green CI for the merge.

Chris Livingston added some commits Oct 18, 2018

Chris Livingston
Chris Livingston
@CMLivingston

This comment has been minimized.

Contributor

CMLivingston commented Oct 18, 2018

@jsirois looks green mod an unrelated failure.

@stuhood stuhood merged commit 34c1c7c into pantsbuild:master Oct 18, 2018

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details

@stuhood stuhood added this to the 1.11.x milestone Oct 18, 2018

stuhood added a commit that referenced this pull request Oct 18, 2018

Fix python lint dependency on pyprep goal (#6606)
### Problem

Python linting for a target set that contains incompatible python version constraints throws a fatal exception and terminates the lint run. This happens because the lint task is coupled to pyprep, triggering interpreter selection for the target closure.

### Solution

Use InterpreterCache to partition the target set based on compatibility constraint, and run the link task for each partition.

### Result

Linting incompatible target closures does not throw an exception and instead lints each constraint partition of the closure separately.

Resolves: #5764
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment