New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

warn or error on matching source globs #5769

Merged
merged 57 commits into from May 31, 2018

Conversation

Projects
None yet
4 participants
@cosmicexplorer
Copy link
Contributor

cosmicexplorer commented Apr 30, 2018

Problem

Pants gives no indication when a file specified in the sources kwarg of a target doesn't exist, or if a glob pattern doesn't match anything. This is often an error. See #5430, as well as this longstanding TODO in wrapped_globs.py.

Solution

  • Add the --glob-expansion-failure global option (defaulting to warn).
  • Warn or raise an error if any of the arguments to globs, rglobs, zglobs, or a literal file list doesn't expand to any existing source files.

Result

There are now warning messages emitted (warnings don't work yet, see #5863) when a glob doesn't match in a sources or bundles argument in a target. Setting the global option --error-on-glob-match-failure will cause an exception to be raised if any glob fails to match a file. Error messages contain a representation of the globs/rglobs/etc call from the BUILD file to aid in debugging.

Followup Issues

  1. #5863
  2. #5864
  3. #5871

@cosmicexplorer cosmicexplorer changed the title warn on matching source globs warn or error on matching source globs May 1, 2018

@cosmicexplorer cosmicexplorer changed the title warn or error on matching source globs [WIP] warn or error on matching source globs May 2, 2018

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented May 2, 2018

I'm still working on this. After @illicitonion mentioned on slack that I'm re-parsing the global options, I realized I had to take a different approach and haven't fully worked it out. I will leave another comment when this is resolved.

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:warn-or-err-matching-source-globs branch 2 times, most recently from fa21d9f to 7062880 May 2, 2018

@cosmicexplorer cosmicexplorer changed the title [WIP] warn or error on matching source globs warn or error on matching source globs May 3, 2018

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented May 3, 2018

So I took a step back and started with a fresh branch and quickly replicated the functionality I wanted from the previous attempt without any changes to the options system, then force-pushed, so this is much cleaner and simpler. This is ready for review now.

@jsirois: I changed TypedDatatypeInstanceConstructionError to subclass TypeError instead of Exception in 4f9ba6c, and caught TypeError in tests, but thinking about your comment in the other pr I feel like I might want to revert that? It's not clear to me whether the distinction between the exception types here matters, since they really are just to add exception message prefixes. There may be a better way to arrange the exceptions as well.

relpath_adjusted_filespec = FilesetRelPathWrapper.to_filespec(filespec['globs'], spec_path)
rel_include_globs = filespec['globs']

_get_globs_owning_files(files,

This comment has been minimized.

@stuhood

stuhood May 7, 2018

Member

Eek... sorry. Unfortunately, I think this is the wrong place to do this check. At this point, it's necessary to reverse engineer the match.


In the ticket, I said:

...since we now control the implementation of glob matching, we no longer have a great excuse not to.

...by which I meant: we have an implementation of glob matching in:

/// Recursively expands PathGlobs into PathStats while applying excludes.

What I had in my head when I said that (but didn't explain very well: sorry), was that in fn expand we could track the original/input of each PathGlob instance (ie, this PathGlob came from this original glob string), and then at the end see which original/input glob strings never matched anything. I don't know concretely what this would look like, but I suspect that 1) it would be more efficient than re-constructing later, 2) it would be less likely to break.

@stuhood stuhood force-pushed the pantsbuild:master branch from b6bb42d to 9e2fdb5 May 11, 2018

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:warn-or-err-matching-source-globs branch 2 times, most recently from 609cf65 to 69a04e0 May 13, 2018

@illicitonion
Copy link
Contributor

illicitonion left a comment

I like the goal of this change :)

I worry that we're adding a bit much complexity to the model, and a few too many layers of wrapping in the rust code.

How would you feel about having the rust code always returning up the match information when expanding PathGlobs, and having the Python code decide whether to ignore it? That way we can avoid a bunch of the do-we-report-stuff state passing around. Hopefully the overhead of tracking that data is low enough that people shouldn't notice/care...

@@ -121,6 +121,9 @@ def file_stats(self):
)


class SnapshotWithMatchData(datatype([('snapshot', Snapshot), 'match_data'])): pass

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

It seems weird to have semi-typed datatypes...

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Yes, that one wasn't all the way thought through anyway. It is now gone regardless (I removed the intrinsic and all changes to the native interface).

@@ -239,3 +282,10 @@ def register_options(cls, register):
register('--lock', advanced=True, type=bool, default=True,
help='Use a global lock to exclude other versions of pants from running during '
'critical operations.')
# TODO(cosmicexplorer): make a custom type abstract class to automate the production of an

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

This sounds nice, assuming it can be done simply :)

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Left it as a TODO due to the length of this PR, I agree 100%.

@@ -121,18 +122,24 @@ impl PathStat {
}
}

impl fmt::Debug for PathStat {

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

Why are we dropping the stat here?

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

This change seems unrelated... would be good to remove it from this PR.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

This has been removed.

@@ -171,13 +184,26 @@ impl PathGlob {
}

pub fn create(filespecs: &[String]) -> Result<Vec<PathGlob>, String> {
let mut path_globs = Vec::new();
let filespecs_globs = Self::spread_filespecs(filespecs)?;

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

Would be nice to phrase these spread and flattens as things which accept and return iterators, which can be collected by the consumer, rather than needing to allocate intermediate Vecs. Probably worth waiting for this until 1.26 lands with impl trait, though :)

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 21, 2018

Contributor

After some very brief research -- is there a way to phrase that without impl trait that I'm not seeing?

This comment has been minimized.

@stuhood

stuhood May 21, 2018

Member

@cosmicexplorer : In theory you can use Box<Iterator<..>>, but it's semi-advanced usage, so I wouldn't worry about it here.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 21, 2018

Contributor

ok, great! This actually can probably be removed entirely when implementing your suggestion below.

&PathGlob::Wildcard {
ref symbolic_path,
ref wildcard,
..

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

Why are we ignoring fields here?

But if we're going to, let's explicitly ref _canonical_dir rather than .., so it becomes a compile error not to consider Debug formatting when adding new fields

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 21, 2018

Contributor

This is a great piece of advice!!! I think the canonical_dir should be added to the Debug for sure, I think I didn't add it because it was unnecessary screen space when I was Debugging earlier, but I will add it now. I will remember your second comment for future reference though!

) -> BoxFuture<(Vec<PathStat>, Vec<PathGlob>), E> {
match path_glob {
) -> BoxFuture<SingleExpansionResult, E> {
// TODO(cosmicexplorer): I would love to do something like `glob @ PathGlob::Wildcard {...}`,

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

Yeah, this makes me sad too :( I've generally favoured the re-construction over the clone, but I have no strong feelings other than sadness :)

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 22, 2018

Contributor

Thinking about this now, it makes sense, I think to disallow:

match path_glob {
  glob @ PathGlob::Wildcard { canonical_dir, symbolic_path, wildcard } => {
    path_glob
  }
}

Because you're moving path_glob into the destructured canonical_dir, symbolic_path, and wildcard, but then trying to move it again into the result of the expression -- there needs to be a clone happening somewhere. However if we were to match on &path_glob (which I just tried), it seems like it's impossible to move something while borrowing it, even if the move happens at the end of the borrowed scope. That seems like it doesn't need to happen? I'm not sure if my analysis is correct.

This comment has been minimized.

@illicitonion

illicitonion May 22, 2018

Contributor

Yeah; your analysis is correct that there isn't a nice way to make this work, and the reasons why. I'd quite like the language to somehow support (maybe as a special case), some syntax like:

match path_globs {
  PathGlob::Wildcard => path_globs
}

or

match path_globs {
  wildcard: PathGlob::Wildcard => wildcard
}

or something similar, to allow this, because it's a pretty common desire.

WithGlobMatchData,
}

impl fmt::Debug for Snapshot {

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

Why not derive?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Deriving now!

ref globs,
} = entry;
let match_found_for_input: bool = globs
.into_iter()

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

Feels like this could just be an iter()?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

That particular line has been removed, but I am looking out for other places where I may have made this mistake.

@@ -915,7 +1087,8 @@ impl NodeKey {
keystr(&s.subject),
typstr(&s.product)
),
&NodeKey::Snapshot(ref s) => format!("Snapshot({})", keystr(&s.0)),
// TODO(cosmicexplorer): why aren't we implementing an fmt::Debug for all of these nodes?

This comment has been minimized.

@illicitonion

illicitonion May 21, 2018

Contributor

We probably should be :)

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Turned that into a TODO for a followup PR.

#[derive(Clone, Debug)]
pub struct SnapshotExpansionResult {
snapshot: fs::Snapshot,
found_files: HashMap<PathGlob, GlobMatch>,

This comment has been minimized.

@stuhood

stuhood May 21, 2018

Member

Perhaps I'm missing something, but why would we expose the found_files out of VFS::expand rather than just eagerly failing in that method based on the configured behaviour?

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:warn-or-err-matching-source-globs branch from a135429 to dd2f422 May 22, 2018

@@ -42,7 +46,9 @@ def parse_address_family(address_mapper, directory):
The AddressFamily may be empty, but it will not be None.
"""
logger.debug("directory: {}".format(directory))

This comment has been minimized.

@stuhood

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Removed!

default_option_value = 'warn'

@classmethod
def create(cls, value=None):

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

Rather than creating a new instance of the class for each call to create, you could return singletons here:

  if value == 'ignore':
    return GlobMatchErrorBehavior.IGNORE
  ..

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 22, 2018

Contributor

Great!! I'll do that.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

(did that)

@@ -121,18 +122,24 @@ impl PathStat {
}
}

impl fmt::Debug for PathStat {

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

This change seems unrelated... would be good to remove it from this PR.

}

///
/// Given a filespec as Patterns, create a series of PathGlob objects.
///
fn parse_globs(
// FIXME: add as_zsh_glob() method which generates a string from the available PathGlob info, and

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

I don't think this fixme is concrete enough. Would remove, or post as a ticket if it's a thing you actually think we should do. But these weren't really intended to be zsh globs so much as git globs: see the comment in the method body about git. ...but it looks like you removed that comment maybe got removed in the refactoring?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 22, 2018

Contributor

Oh, I actually added it in the as_zsh_style_globs() method, and used it to generate the error message in expand() (which is why I wanted it). But I will absolutely change the naming to gitignore-style globs.

"""Given a BundlesField, request Snapshots for each of its filesets and create BundleAdaptors."""
snapshot_list = yield [Get(Snapshot, PathGlobs, pg) for pg in bundles_field.path_globs_list]
logger.debug("glob_match_error_behavior={}".format(glob_match_error_behavior))

This comment has been minimized.

@stuhood

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Also removed!

@@ -714,12 +837,13 @@ pub trait VFS<E: Send + Sync + 'static>: Clone + Send + Sync + 'static {
return future::ok(vec![]).to_boxed();
}

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

Darn, sorry... this was more challenging than I thought it would be. I'm worried that the result is too complicated right now (because the post expansion loop essentially ends up recreating/rexecuting the expansion logic, the same way you were recreating it in the python code).

In theory tracking associations all the way through (input glob string -> PathGlob -> PathStat) should allow us to avoid reconstructing the match logic afterward. It's also possible I'm missing something: would be happy to pair on this to try and figure it out together.

I also expect that this would be easier with (memoized) recursion, and wonder whether we shouldn't land a separate PR to switch this to memoized recursion before tackling this.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 22, 2018

Contributor

In the previous warn-globs-v2 branch I messaged in the slack, I had done exactly that, by having a Vec of "source" PathGlobs stored in an IndexMap (so we can iterate in reverse to do this in topological order), which would memoize exactly what we describe. I can reintroduce that now that the other kinks are worked out.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 22, 2018

Contributor

I can also land that in a separate PR, though.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

This should be fixed now -- we merely record all the "sources" as GlobSource instances, for each glob in completed. A GlobSource can be either another PathGlob, or a GlobParsedSource. We then iterate in reverse over completed. This is very low-cost for the ignore case.

@@ -2,6 +2,7 @@
// Licensed under the Apache License, Version 2.0 (see LICENSE).

use std::ffi::OsString;
use std::iter::Iterator;

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

Unused?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Removed!

static ref PYTHON_FALSE: Value = eval("False").unwrap();
}

impl From<bool> for Value {

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

Is this used?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

No; removed!


resources(
name='missing-zglobs',
sources=zglobs('**/*.a'),

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

All of the globs types (z/r/etc) are converted into one underlying format, so accounting for just "recursive", "non-recursive", and "literal" should be sufficient.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Sure, but I'd like to leave the testprojects and integration testing as is until we can close #5863. Does that sound reasonable? I would like to make sure the testing captures the error messages and ensures they display something useful to the user, and all of the skipped tests in the new test_graph_integration.py need to be updated when that happens.

@@ -9,8 +9,7 @@
import pickle
from abc import abstractmethod

from pants.util.objects import (Exactly, SubclassesOf, SuperclassesOf, TypeCheckError,
TypedDatatypeInstanceConstructionError, datatype)
from pants.util.objects import Exactly, SubclassesOf, SuperclassesOf, datatype

This comment has been minimized.

@stuhood

stuhood May 22, 2018

Member

Unrelated changes?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 24, 2018

Contributor

Reverted!

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:warn-or-err-matching-source-globs branch from 97098d1 to 3d78392 May 23, 2018

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:warn-or-err-matching-source-globs branch from 42f2533 to 322c16e May 24, 2018

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:warn-or-err-matching-source-globs branch from 9976e8b to 846f0be May 24, 2018

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented May 24, 2018

This should be ready for another look. I have responded to all the review comments left previously.

@stuhood
Copy link
Member

stuhood left a comment

Thanks Danny... I have a bunch of comments, but I think once those are addressed this should be landable.

@@ -239,3 +283,10 @@ def register_options(cls, register):
register('--lock', advanced=True, type=bool, default=True,
help='Use a global lock to exclude other versions of pants from running during '
'critical operations.')
# TODO(cosmicexplorer): Make a custom type abstract class to automate the production of an

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

We used to use the TODO($name) format, but have moved away from it in favor of including ticket links in cases where we plan to follow up on something.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 26, 2018

Contributor

I have edited all the TODOs to that effect and will use tickets from now on, or unnamed smaller TODOs. Sorry about that.

@@ -134,6 +134,8 @@ def __str__(self):
field_value = getattr(self, field_name)
if not constraint_for_field:
elements_formatted.append(
# FIXME(cosmicexplorer): use {field_value!r} in this method, even though this is

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

Ditto above regarding names. Fine to leave TODOs/FIXMEs, but rather than targeting them at a particular person, create a ticket with enough explanation to allow anyone to fix the issue. Or if the TODO is self-explanatory or low priority, can skip a ticket.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 26, 2018

Contributor

See above. Thanks for clarifying this.

@@ -121,15 +121,63 @@ impl PathStat {
}
}

#[derive(Clone, Debug)]

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

Cloning this type would be expensive, so it's good that GitignoreStyleExcludes::create returns an Arc<Self>. But in cases where cloning things would be expensive, it's best not to derive Clone: that way you send the signal to a user that they should avoid copying the type (such as by using an Arc, or sharing it in some other way).

I think I found the callsite below where you were cloning this, and it should not be necessary.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 25, 2018

Contributor

Removed the Clone derive and everything seems to work!

@@ -147,6 +195,31 @@ pub enum PathGlob {
},
}

#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct GlobParsedSource {

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

I'm on the fence about whether this type is useful... but I guess I'd bias toward leaving it in.

Rather than giving the field a long name (since it is already in a wrapper type), you might consider using what's known as a "tuple struct":

pub struct GlobParsedSource(String);

...which comes with different matching and construction syntax:

let str = "*.txt".to_string();
match GlobParsedSource(str) {
  GlobParsedSource(s) => ..
}

The only downside to tuple structs is that because their fields aren't named, you have to refer to them by index in cases where you're not destructuring:

let gps = GlobParsedSource("*.txt".to_string());
let str = gps.0;

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 25, 2018

Contributor

I think the type is useful in the sense that it means we're not passing around strings, there's at least some context. I converted it to a 1-element tuple struct as you suggested -- the named field is definitely unnecessary here.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 26, 2018

Contributor

Thanks for the in-depth description of tuple structs here, by the way!

"warn" => Ok(StrictGlobMatching::Warn),
"error" => Ok(StrictGlobMatching::Error),
_ => Err(format!(
"???/unrecognized strict glob matching behavior: {}",

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

Can drop the ???.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 25, 2018

Contributor

Dropped it!

})
self.assert_failure(pants_run)
self.assertIn(AddressLookupError.__name__, pants_run.stderr_data)
expected_msg = """

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

This is much, much too general. Changing either error reporting or the shape of the rule graph would break it.

Should validate a much more specific thing... maybe just that both the target address and glob are included.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 26, 2018

Contributor

c3750a4 makes the testing here much more useful as you suggest.

self.assert_success(pants_run)
self.assertNotIn("WARN]", pants_run.stderr_data)

@unittest.skip('Skipped to expedite landing #5769: see #5863')

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

Since you have this test set to error, I don't think this is related to the warnings issue... is it possible to actually enable this test?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 25, 2018

Contributor

Yes, it requires overriding the __repr__ of BaseGlobs -- after b72ab17 this can now be unskipped.

# TODO: this is passing, but glob_match_error_behavior='ignore' in the target. We should have a
# field which targets with bundles (and/or sources) can override which is the same as the global
# option.
expected_msg = """

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

ditto

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 26, 2018

Contributor

See above response!

"""
self.assertIn(expected_msg, pants_run.stderr_data)

def test_exception_invalid_option_value(self):

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

This effectively just tests options parsing... would remove it (or if you don't see a test of the choices impl, move something like this test to the relevant test class).

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 26, 2018

Contributor

I have removed this test. Thanks!


class GlobalOptionsTest(BaseTest):

def test_exception_glob_match_constructor(self):

This comment has been minimized.

@stuhood

stuhood May 24, 2018

Member

Ditto testing options parsing.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer May 26, 2018

Contributor

I have removed this file, there is ample testing already for this functionality.

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:warn-or-err-matching-source-globs branch 2 times, most recently from a18b5ba to 34e4aa9 May 26, 2018

cosmicexplorer added some commits May 19, 2018

cosmicexplorer added some commits May 23, 2018

@stuhood
Copy link
Member

stuhood left a comment

Thanks for iterating @cosmicexplorer ! I don't think any of my comments are blockers, so we can merge once @illicitonion 's feedback is in.

@@ -288,6 +293,8 @@ def legacy_globs_class(self):
"""The corresponding `wrapped_globs` class for this BaseGlobs."""

def __init__(self, *patterns, **kwargs):
self._patterns = patterns
self._kwargs = kwargs

This comment has been minimized.

@stuhood

stuhood May 31, 2018

Member

Since this is mutated below, it will not end up containing anything interesting. Can fix in a followup, but IMO, attempting to reproduce the syntax here is probably overkill.

@@ -272,6 +272,9 @@ fn execute(top_match: clap::ArgMatches) -> Result<(), ExitError> {
.map(|s| s.to_string())
.collect::<Vec<String>>(),
&[],
// By using `Ignore`, we assume all elements of the globs will definitely expand to

This comment has been minimized.

@stuhood

stuhood May 31, 2018

Member

I believe that this case could at least warn if something isn't matched, but @illicitonion could confirm.

}

pub fn should_check_glob_matches(&self) -> bool {
match self {

This comment has been minimized.

@stuhood

stuhood May 31, 2018

Member

If you derive Eq/PartialEq for this enum, you can do &StrictGlobMatching::Ignore == self.

source: GlobSource,
}

#[derive(Clone, Debug, Eq, Hash, PartialEq)]

This comment has been minimized.

@stuhood

stuhood May 31, 2018

Member

For enums that don't own any data (or even enums that own very small amounts of data), it's generally fine to derive Copy, which avoids having to manually call Clone.

.collect();

if !non_matching_inputs.is_empty() {
// TODO(#5684): explain what global and/or target-specific option to set to

This comment has been minimized.

@stuhood

stuhood May 31, 2018

Member

Bad ticket reference.


class GlobMatchErrorBehavior(datatype(['failure_behavior'])):
"""Describe the action to perform when matching globs in BUILD files to source files.

This comment has been minimized.

@stuhood

stuhood May 31, 2018

Member

Potentially controversial, but: given that this type will be validated on the rust side, I think removing the python wrapper type and just having it be a checked string would be totally fine. Take it or leave it.

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

Yeah, I think the wrapper type is slightly overkill here :)

(Would maybe consider re-introducing it if it led to an easy way to keep the python and rust enums in sync, but I don't see an obvious one)

@illicitonion
Copy link
Contributor

illicitonion left a comment

Let's address comments as a follow-up to avoid merge conflicts :)

@@ -272,6 +272,9 @@ fn execute(top_match: clap::ArgMatches) -> Result<(), ExitError> {
.map(|s| s.to_string())
.collect::<Vec<String>>(),
&[],
// By using `Ignore`, we assume all elements of the globs will definitely expand to
// something here, or we don't care. Is that a valid assumption?

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

Yeah, I'd say we don't care here because fs_util users are generally doing weird hacky one-off experiments, rather than wanting correctness safety nets :)

}

impl GitignoreStyleExcludes {
fn create(patterns: &[String]) -> Result<Arc<Self>, String> {

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

Why does this return an Arc? It looks like it's to avoid the overhead of cloning EMPTY_IGNORE? Maybe add a comment explaining that (and that it's presumably much more expensive it is to clone a Gitignore than an Arc)?

fn to_sourced_globs(&self) -> Vec<GlobWithSource> {
self
.globs
.clone()

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

I would assume that:

self.globs
  .iter()
  .map(|path_glob| GlobWithSource {
    path_glob.clone(),
    source: GlobSource::ParsedInput(self.input.clone()),
  })
  .collect()

would be slightly more efficient here; cloning the Vec and then doing an into_iter forces you to clone more of the underlying structure of the Vec which you're just going to throw away, whereas cloning each PathGlob as you encounter it, only requires cloning the underlying data you actually need to copy.

@@ -171,13 +240,28 @@ impl PathGlob {
}

pub fn create(filespecs: &[String]) -> Result<Vec<PathGlob>, String> {
let mut path_globs = Vec::new();
// Getting a Vec<PathGlob> per filespec is needed to create a `PathGlobs`, but we don't need

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

Drop this comment?

// Getting a Vec<PathGlob> per filespec is needed to create a `PathGlobs`, but we don't need
// that here.
let filespecs_globs = Self::spread_filespecs(filespecs)?;
let all_globs = Self::flatten_entries(filespecs_globs);

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

I'd probably inline flatten_entries; it's small enough and only used once that it's more clear to be able to read its body, rather than needing to look around for its implementation

// reverse order of expansion (using .rev()), we ensure that we have already visited every
// "child" glob of the glob we are operating on while iterating. This is a reverse
// "topological ordering" which preserves the partial order from parent to child globs.
let all_globs: Vec<PathGlob> = completed.keys().rev().map(|pg| pg.clone()).collect();

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

.cloned() instead of .map(|pg| pg.clone())

// `GlobSource`), which may be another glob, or it might be a `GlobParsedSource`. We record
// all `GlobParsedSource` inputs which transitively expanded to some file here, and below we
// warn or error if some of the inputs were not found.
let mut inputs_with_matches: HashSet<GlobParsedSource> = HashSet::new();

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

Shouldn't need this type ascription?

@@ -29,24 +30,47 @@ class Path(datatype(['path', 'stat'])):
"""


class PathGlobs(datatype(['include', 'exclude'])):
class PathGlobs(datatype([
'include',

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

I'd be tempted to add tuple type constraints to include and exclude - having half-typed objects seems weird

@@ -70,7 +70,7 @@ class Field(object):
"""A marker for Target(Adaptor) fields for which the engine might perform extra construction."""


class SourcesField(datatype(['address', 'arg', 'filespecs', 'path_globs']), Field):
class SourcesField(datatype(['address', 'arg', 'filespecs', 'base_globs', 'path_globs']), Field):

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

Can you add pydoc explaining what base_globs are? It's not self-evident...


class GlobMatchErrorBehavior(datatype(['failure_behavior'])):
"""Describe the action to perform when matching globs in BUILD files to source files.

This comment has been minimized.

@illicitonion

illicitonion May 31, 2018

Contributor

Yeah, I think the wrapper type is slightly overkill here :)

(Would maybe consider re-introducing it if it led to an easy way to keep the python and rust enums in sync, but I don't see an obvious one)

@illicitonion illicitonion merged commit bbd8466 into pantsbuild:master May 31, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@benjyw

This comment has been minimized.

Copy link
Contributor

benjyw commented Jul 25, 2018

There is a major issue here, which is that if you have multiple globs in a target, every one has to match something separately. For example, the default globs for python_tests is ('test_*.py', '*_test.py'), so that shops can pick whether they prefer test_foo.py or foo_test.py as their idiom. But now you'll get a warning on whichever one you didn't pick...

I will open an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment