Extract a generalized V2 rule to inject `__init__.py` files #7722
@@ -0,0 +1,44 @@
# coding=utf-8
# Copyright 2019 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

from __future__ import absolute_import, division, print_function, unicode_literals

from pants.backend.python.subsystems.pex_build_util import identify_missing_init_files
from pants.engine.fs import EMPTY_DIRECTORY_DIGEST, Digest, Snapshot
from pants.engine.isolated_process import ExecuteProcessRequest, ExecuteProcessResult
from pants.engine.rules import rule
from pants.engine.selectors import Get
from pants.util.objects import datatype


# TODO(#7710): Once this gets fixed, rename this to InitInjectedDigest.
class InjectedInitDigest(datatype([('directory_digest', Digest)])): pass


@rule(InjectedInitDigest, [Snapshot])
def inject_init(snapshot):
  """Ensure that every package has an __init__.py file in it."""
  missing_init_files = tuple(sorted(identify_missing_init_files(snapshot.files)))
  if not missing_init_files:
    new_init_files_digest = EMPTY_DIRECTORY_DIGEST
  else:
    # TODO(7718): add a builtin rule for FilesContent->Snapshot, so that we can avoid using touch
    # and the absolute path and have the engine build the files for us.
    touch_init_request = ExecuteProcessRequest(
      argv=("/usr/bin/touch",) + missing_init_files,
      output_files=missing_init_files,
      description="Inject missing __init__.py files: {}".format(", ".join(missing_init_files)),
      input_files=snapshot.directory_digest,
    )
    touch_init_result = yield Get(ExecuteProcessResult, ExecuteProcessRequest, touch_init_request)
    new_init_files_digest = touch_init_result.output_directory_digest
  # TODO(#7710): Once this gets fixed, merge the original source digest and the new init digest
  # into one unified digest.
  yield InjectedInitDigest(directory_digest=new_init_files_digest)


def rules():
  return [
    inject_init,
  ]
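For readers who have not seen `identify_missing_init_files`, the sketch below shows one way such a helper could compute the paths that the rule then touches. It is an illustrative stand-in, not the implementation in `pex_build_util`: the name `missing_init_files_sketch`, the assumption that snapshot paths are relative and `/`-separated, and the choice to treat every ancestor directory of a `.py` file as a package are all assumptions made for this example.

```python
import posixpath


def missing_init_files_sketch(files):
    """Return the sorted __init__.py paths that are absent from `files`.

    Hypothetical helper: assumes relative, '/'-separated paths.
    """
    sources = set(files)
    packages = set()
    for path in sources:
        if not path.endswith(".py"):
            continue
        # Walk up from the file's directory, collecting every ancestor directory.
        directory = posixpath.dirname(path)
        while directory not in ("", "/"):
            packages.add(directory)
            directory = posixpath.dirname(directory)
    expected = {posixpath.join(package, "__init__.py") for package in packages}
    # Only report the __init__.py files that are not already present.
    return tuple(sorted(expected - sources))
```

Under these assumptions, an input that already contains `test/__init__.py` yields an empty tuple, which corresponds to the branch of `inject_init` that returns `EMPTY_DIRECTORY_DIGEST` without running a process.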
@@ -0,0 +1,39 @@
# coding=utf-8
# Copyright 2019 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

from __future__ import absolute_import, division, print_function, unicode_literals

from pants.backend.python.rules.inject_init import InjectedInitDigest, inject_init
from pants.engine.fs import EMPTY_DIRECTORY_DIGEST, EMPTY_SNAPSHOT, Snapshot
from pants.engine.rules import RootRule
from pants.util.collections import assert_single_element
from pants_test.test_base import TestBase


class TestInjectInit(TestBase):

  @classmethod
  def rules(cls):
    return super(TestInjectInit, cls).rules() + [inject_init, RootRule(Snapshot)]

  def assert_result(self, input_snapshot, expected_digest):
    injected_digest = assert_single_element(
This

Hah true. I'm personally rooting for us to add

Yes, it's a bit of an anti-pattern when doing tuple destructuring, I think, but I didn't realize this at first so there's lots of bad examples.

And @Eric-Arellano I would really like to encourage task parallelism if possible (requesting multiple things at once) and using tuples by default encourages that with a really neat syntax!

To clarify, this means that I should use the tuple syntax of Why is this the case that using tuple syntax would increase parallelism? Is the idea that it's a good practice, or will it actually result in that here?

I'm happy with this as-is. I agree with some other folks that the In test code, I'd advocate for introducing:

    class TestBase:
      def single_scheduler_product_request(self, product, subject):
        return assert_single_element(self.scheduler.product_request(product, [subject]))

which IIRC we used to have before, but I can't find? I'm mildly against

To add to the noise, I also find the I'm in favour of the helper method being in

Had no idea that'd be a thing. Good to know, and agreed this can live in

Ok, I hadn't realized this. It might make sense to have an explicit single product request method if we make it clear in the docstring how to request in parallel (since tuple destructuring with more than one element should be less ambiguous, I think?). Not sure. But also, syntax highlighting saves lives!
If anyone isn't using an editor that allows them to autocomplete python symbols, we could consider adding documentation on how to do that in their preferred editor perhaps!
      self.scheduler.product_request(InjectedInitDigest, [input_snapshot])
    )
    self.assertEqual(injected_digest.directory_digest, expected_digest)

  def test_noops_when_empty_snapshot(self):
    self.assert_result(input_snapshot=EMPTY_SNAPSHOT, expected_digest=EMPTY_DIRECTORY_DIGEST)

  def test_noops_when_init_already_present(self):
    snapshot = self.make_snapshot({
      "test/foo.py": "",
      "test/__init__.py": ""
    })
    self.assert_result(input_snapshot=snapshot, expected_digest=EMPTY_DIRECTORY_DIGEST)

  def test_adds_when_init_missing(self):
    snapshot = self.make_snapshot({"test/foo.py": ""})
    expected_digest = self.make_snapshot({"test/__init__.py": ""}).directory_digest
    self.assert_result(input_snapshot=snapshot, expected_digest=expected_digest)
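The trade-off raised in the review thread on `assert_result` (tuple destructuring versus an explicit single-element helper) can be illustrated outside of Pants. The sketch below is not Pants code: `FakeScheduler` is a hypothetical class whose `product_request` simply applies a callable to each subject, and `assert_single_element` is a local stand-in for the function of the same name in `pants.util.collections`, whose exact error behaviour may differ.

```python
def assert_single_element(iterable):
    """Local stand-in: return the only element, or raise ValueError."""
    items = list(iterable)
    if len(items) != 1:
        raise ValueError("expected exactly one element, got {}".format(len(items)))
    return items[0]


class FakeScheduler(object):
    """Hypothetical scheduler that returns one result per subject."""

    def product_request(self, product, subjects):
        # Here the "product" is just a callable applied to each subject.
        return [product(subject) for subject in subjects]


scheduler = FakeScheduler()

# Tuple destructuring: terse, but the trailing comma is easy to overlook.
length, = scheduler.product_request(len, ["abc"])

# Explicit helper: the single-element expectation is spelled out at the call site.
same_length = assert_single_element(scheduler.product_request(len, ["abc"]))

# With several subjects in one request, destructuring reads unambiguously.
first, second = scheduler.product_request(len, ["a", "bcd"])
```

Both single-element forms fail if the request returns zero or several results; the difference the reviewers debate is readability, plus the fact that the multi-subject form makes it natural to request several things in one call.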
I would really like us to not use any absolute paths to files in `ExecuteProcessRequest`s in production code without a very clear TODO to fix it. Alternatively, we can introduce an intermediate rule, e.g. `@rule(FallibleExecuteProcessResult, [TouchFileRequest])`, which wraps the creation of the command line, where `TouchFileRequest` is some new `datatype`. We're pretty good about not doing this in v1 python code -- it's more important here, since those files won't necessarily be on the remote host if executed remotely. I would like to amortize the process of figuring out which rules need special filesystem/etc resources instead of having to fix them later.
There is a very near-term plan to fix that (in the comment), and @Eric-Arellano has discussed it with two reviewers, so I think that it is fine in this case.

It's not clear what the significance of the proposed builtin rule is to the line below the comment. If it could include something about not having absolute paths to files in this comment, I would be satisfied. I don't see why that's something we don't want to do here.
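The `TouchFileRequest` idea from the first comment in this thread could look roughly like the following. This is a sketch under stated assumptions, not code from the PR: plain `namedtuple`s stand in for Pants `datatype`s, `ProcessRequestSketch` only mirrors the keyword arguments that `inject_init` passes to `ExecuteProcessRequest`, and the `touch_binary` parameter is a hypothetical way to keep the binary's location in one place until the engine can supply it.

```python
from collections import namedtuple

# Hypothetical stand-ins for Pants datatypes; the field names are assumptions.
TouchFileRequest = namedtuple("TouchFileRequest", ["files", "input_digest"])
ProcessRequestSketch = namedtuple(
    "ProcessRequestSketch", ["argv", "output_files", "description", "input_files"]
)


def touch_request_to_process(request, touch_binary="touch"):
    """Build the command line in one place so callers never hard-code a path.

    `touch_binary` is an assumption: how the binary is actually located
    (locally or on a remote host) is left to whoever calls this function.
    """
    files = tuple(sorted(request.files))
    return ProcessRequestSketch(
        argv=(touch_binary,) + files,
        output_files=files,
        description="Touch files: {}".format(", ".join(files)),
        input_files=request.input_digest,
    )


process = touch_request_to_process(
    TouchFileRequest(files=["b/__init__.py", "a/__init__.py"], input_digest=None)
)
```

With a wrapper like this, a rule such as `inject_init` would only state which files it needs created, and the question of which filesystem resources the process depends on would be answered in a single function.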