Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract a generalized V2 rule to inject __init__.py files #7722

Merged
merged 18 commits into from
May 16, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions src/python/pants/backend/python/register.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
from pants.backend.python.python_artifact import PythonArtifact
from pants.backend.python.python_requirement import PythonRequirement
from pants.backend.python.python_requirements import PythonRequirements
from pants.backend.python.rules import python_test_runner
from pants.backend.python.rules import inject_init, python_test_runner
from pants.backend.python.targets.python_app import PythonApp
from pants.backend.python.targets.python_binary import PythonBinary
from pants.backend.python.targets.python_distribution import PythonDistribution
Expand Down Expand Up @@ -81,4 +81,4 @@ def register_goals():


def rules():
return tuple(python_test_runner.rules())
return inject_init.rules() + python_test_runner.rules()
1 change: 1 addition & 0 deletions src/python/pants/backend/python/rules/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -8,5 +8,6 @@ python_library(
'src/python/pants/engine:selectors',
'src/python/pants/rules/core',
'src/python/pants/source:source',
'src/python/pants/util:objects',
],
)
44 changes: 44 additions & 0 deletions src/python/pants/backend/python/rules/inject_init.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
# coding=utf-8
# Copyright 2019 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

from __future__ import absolute_import, division, print_function, unicode_literals

from pants.backend.python.subsystems.pex_build_util import identify_missing_init_files
from pants.engine.fs import EMPTY_DIRECTORY_DIGEST, Digest, Snapshot
from pants.engine.isolated_process import ExecuteProcessRequest, ExecuteProcessResult
from pants.engine.rules import rule
from pants.engine.selectors import Get
from pants.util.objects import datatype


# TODO(#7710): Once this gets fixed, rename this to InitInjectedDigest.
class InjectedInitDigest(datatype([('directory_digest', Digest)])): pass


@rule(InjectedInitDigest, [Snapshot])
def inject_init(snapshot):
"""Ensure that every package has an __init__.py file in it."""
missing_init_files = tuple(sorted(identify_missing_init_files(snapshot.files)))
if not missing_init_files:
new_init_files_digest = EMPTY_DIRECTORY_DIGEST
else:
# TODO(7718): add a builtin rule for FilesContent->Snapshot, so that we can avoid using touch
# and the absolute path and have the engine build the files for us.
touch_init_request = ExecuteProcessRequest(
argv=("/usr/bin/touch",) + missing_init_files,
Copy link
Contributor

@cosmicexplorer cosmicexplorer May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like us to not use any absolute paths to files in ExecuteProcessRequests in production code without a very clear TODO to fix it. Alternatively we can introduce an intermediate e.g. @rule(FallibleExecuteProcessResult, [TouchFileRequest]) which wraps the creation of the command line, where TouchFileRequest is some new datatype. We're pretty good about not doing this in v1 python code -- it's more important here, since those files won't necessarily be on the remote host if executed remotely. I would like to amortize the process of figuring out which rules need special filesystem/etc resources instead of having to fix them later.

Copy link
Sponsor Member

@stuhood stuhood May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a very near term plan to fix that (in the comment), and @Eric-Arellano has discussed it with two reviewers, so I think that it is fine in this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what the significance of the proposed builtin rule is to the line below the comment. If it could include something about not having absolute paths to files in this comment, I would be satisfied. I don't see why that's something we don't want to do here.

output_files=missing_init_files,
description="Inject missing __init__.py files: {}".format(", ".join(missing_init_files)),
input_files=snapshot.directory_digest,
)
touch_init_result = yield Get(ExecuteProcessResult, ExecuteProcessRequest, touch_init_request)
new_init_files_digest = touch_init_result.output_directory_digest
# TODO(#7710): Once this gets fixed, merge the original source digest and the new init digest
# into one unified digest.
yield InjectedInitDigest(directory_digest=new_init_files_digest)


def rules():
return [
inject_init,
]
30 changes: 9 additions & 21 deletions src/python/pants/backend/python/rules/python_test_runner.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,11 +10,11 @@

from future.utils import text_type

from pants.backend.python.subsystems.pex_build_util import identify_missing_init_files
from pants.backend.python.rules.inject_init import InjectedInitDigest
from pants.backend.python.subsystems.pytest import PyTest
from pants.backend.python.subsystems.python_setup import PythonSetup
from pants.engine.fs import (Digest, DirectoriesToMerge, DirectoryWithPrefixToStrip, FilesContent,
Snapshot, UrlToFetch)
from pants.engine.fs import (Digest, DirectoriesToMerge, DirectoryWithPrefixToStrip, Snapshot,
UrlToFetch)
from pants.engine.isolated_process import (ExecuteProcessRequest, ExecuteProcessResult,
FallibleExecuteProcessResult)
from pants.engine.legacy.graph import BuildFileAddresses, TransitiveHydratedTargets
Expand Down Expand Up @@ -129,25 +129,13 @@ def run_python_test(test_target, pytest, python_setup, source_root_config):
Digest, DirectoriesToMerge(directories=tuple(all_sources_digests)),
)

# TODO(7716): add a builtin rule to go from DirectoriesToMerge->Snapshot or Digest->Snapshot.
# TODO(7715): generalize the injection of __init__.py files.
# TODO(7718): add a builtin rule for FilesContent->Snapshot.
file_contents = yield Get(FilesContent, Digest, sources_digest)
file_paths = [fc.path for fc in file_contents]
injected_inits = tuple(sorted(identify_missing_init_files(file_paths)))
if injected_inits:
touch_init_request = ExecuteProcessRequest(
argv=("/usr/bin/touch",) + injected_inits,
output_files=injected_inits,
description="Inject empty __init__.py into all packages without one already.",
input_files=sources_digest,
)
touch_init_result = yield Get(ExecuteProcessResult, ExecuteProcessRequest, touch_init_request)

all_input_digests = [sources_digest, requirements_pex_response.output_directory_digest]
if injected_inits:
all_input_digests.append(touch_init_result.output_directory_digest)
inits_digest = yield Get(InjectedInitDigest, Digest, sources_digest)

all_input_digests = [
sources_digest,
inits_digest.directory_digest,
requirements_pex_response.output_directory_digest,
]
merged_input_files = yield Get(
Digest,
DirectoriesToMerge,
Expand Down
12 changes: 12 additions & 0 deletions tests/python/pants_test/backend/python/rules/BUILD
Original file line number Diff line number Diff line change
@@ -1,6 +1,18 @@
# Copyright 2019 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

python_tests(
name='inject_init',
sources=['test_inject_init.py'],
dependencies=[
'src/python/pants/backend/python/rules',
'src/python/pants/engine:fs',
'src/python/pants/engine:rules',
'src/python/pants/util:collections',
'tests/python/pants_test:test_base',
]
)

python_tests(
name='python_test_runner',
sources=['test_python_test_runner.py'],
Expand Down
39 changes: 39 additions & 0 deletions tests/python/pants_test/backend/python/rules/test_inject_init.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# coding=utf-8
# Copyright 2019 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

from __future__ import absolute_import, division, print_function, unicode_literals

from pants.backend.python.rules.inject_init import InjectedInitDigest, inject_init
from pants.engine.fs import EMPTY_DIRECTORY_DIGEST, EMPTY_SNAPSHOT, Snapshot
from pants.engine.rules import RootRule
from pants.util.collections import assert_single_element
from pants_test.test_base import TestBase


class TestInjectInit(TestBase):

@classmethod
def rules(cls):
return super(TestInjectInit, cls).rules() + [inject_init, RootRule(Snapshot)]

def assert_result(self, input_snapshot, expected_digest):
injected_digest = assert_single_element(
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert_single_element call can be replaced by a single comma. Just saying! =D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah true. I'm personally rooting for us to add scheduler.single_product_request() and converge on that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a bit of an anti-pattern when doing tuple destructuring, I think, but I didn't realize this at first so there's lots of bad examples.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And @Eric-Arellano I would really like to encourage task parallelism if possible (requesting multiple things at once) and using tuples by default encourages that with a really neat syntax!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using tuples by default encourages that with a really neat syntax!

To clarify, this means that I should use the tuple syntax of injected_digest, = blah(), right?

Why is this the case that using tuple syntax would increase parallelism? Is the idea that it's a good practice, or will it actually result in that here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this as-is.

I agree with some other folks that the foo, = bar syntax is pretty subtle and can be confusing, and we should probably avoid it.

In test code, I'd advocate for introducing:

class TestBase
  def single_scheduler_product_request(self, product, subject):
    return assert_single_element(self.scheduler.product_request(product, [subject]))

which IIRC we used to have before, but I can't find?

I'm mildly against scheduler.single_product_request as it may discourage parallelism in non-test code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add to the noise, I also find the foo, = bar syntax very confusing and easy to miss, the seconds we save when typing longer accessors are all lost when we spend an extra hour or two going on fruitless debug tangents when we miss the comma and assume things that are not true. Rant over.

I'm in favour of the helper method being in TestBase vs the scheduler, since I agree that we should make it slightly painful to request only one product. This may avoid problems like this in the future (not v2-specific, but it's the same UX reasoning).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mildly against scheduler.single_product_request as it may discourage parallelism in non-test code.

Had no idea that'd be a thing. Good to know, and agreed this can live in TestBase. I can put up a PR later today to add this and refactor every current usage of the alternatives.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also find the foo, = bar syntax very confusing and easy to miss

Ok, I hadn't realized this. It might make sense to have an explicit single product request method if we make it clear in the docstring how to request in parallel (since tuple destructuring with more than one element should be less ambiguous, I think?). Not sure.

But also, syntax highlighting saves lives!

the seconds we save when typing longer accessors

If anyone isn't using an editor that allows them to autocomplete python symbols, we could consider adding documentation on how to do that in their preferred editor perhaps!

self.scheduler.product_request(InjectedInitDigest, [input_snapshot])
)
self.assertEqual(injected_digest.directory_digest, expected_digest)

def test_noops_when_empty_snapshot(self):
self.assert_result(input_snapshot=EMPTY_SNAPSHOT, expected_digest=EMPTY_DIRECTORY_DIGEST)

def test_noops_when_init_already_present(self):
snapshot = self.make_snapshot({
"test/foo.py": "",
"test/__init__.py": ""
})
self.assert_result(input_snapshot=snapshot, expected_digest=EMPTY_DIRECTORY_DIGEST)

def test_adds_when_init_missing(self):
snapshot = self.make_snapshot({"test/foo.py": ""})
expected_digest = self.make_snapshot({"test/__init__.py": ""}).directory_digest
self.assert_result(input_snapshot=snapshot, expected_digest=expected_digest)