Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gazelle): Add "python_test_file_pattern" directive #1819

Merged
merged 26 commits into from
Apr 5, 2024

Conversation

dougthor42
Copy link
Contributor

@dougthor42 dougthor42 commented Mar 22, 2024

Add the python_test_file_pattern directive. This directive allows
users to configure what python files get mapped to the py_test
rule.

The default behavior is unchanged: both test_* and *_test.py
files generate py_test targets if the directive is unset.

The directive supports multiple glob patterns, separated by a comma.

Note: The original code used, effectively, test_* for one of the
patterns. This code uses test_*.py instead. These are equivalent
because of the .py extension check prior to pattern matching.

Fixes #1816.

@dougthor42 dougthor42 marked this pull request as ready for review March 22, 2024 03:57
func makeCompiledGlobs(globs []string) []glob.Glob {
compiledGlobs := []glob.Glob{}
for _, value := range globs {
compiledGlob := glob.MustCompile(value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might prefer a more helpful error message if the glob fails. Could we add an integration test? Maybe it is good enough already? Right now this code would panic

Copy link
Contributor Author

@dougthor42 dougthor42 Mar 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Failure to compile: error message added, test added

I had a similar thought last night: we also need to handle the case where the user has # gazelle:python_test_file_pattern (with no value). Right now no files will be mapped to py_test if there is no pattern. If we want to keep that behavior, I'll need to update the docs.

gazelle/python/generate.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@dougthor42 dougthor42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added a test that subpackages can override the value.

func makeCompiledGlobs(globs []string) []glob.Glob {
compiledGlobs := []glob.Glob{}
for _, value := range globs {
compiledGlob := glob.MustCompile(value)
Copy link
Contributor Author

@dougthor42 dougthor42 Mar 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Failure to compile: error message added, test added

I had a similar thought last night: we also need to handle the case where the user has # gazelle:python_test_file_pattern (with no value). Right now no files will be mapped to py_test if there is no pattern. If we want to keep that behavior, I'll need to update the docs.

Comment on lines 185 to 187
case pythonconfig.TestFilePattern:
globStrings := strings.Split(strings.TrimSpace(d.Value), ",")
config.SetTestFilePattern(globStrings)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #1787 I added an option for resetting the directive to the rules_python default. Do we want to support the same for this directive?

// N.B.: pseudocode; haven't tested yet
case pythonconfig.TestFilePattern:
    directiveArg := strings.TrimSpace(d.Value)
    if directiveArg == "DEFAULT" {
        directiveArg = pythonconfig.DefaultTestFilePatternString
    }
    globStrings := strings.Split(directiveArg, ",")
    config.SetTestFilePattern(globStrings)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency I would agree, but at the same time my brain is also telling me that we could have:

gazelle:python_test_file_pattern

to reset the config to defaults.

The empty pattern yielding no test targets is not super intuitive, IMHO.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty pattern yielding no test targets is not super intuitive, IMHO.

Agreed.

That said, I would argue that the empty pattern resetting to defaults is also not super intuitive. Some readers will see that and think "no test targets" while others will think "resets to default".

I like explicit things 😁. There's no ambiguity and it's easier to grok when returning to something N years down the road.

If we wanted to support yielding no test targets1 it should be something like gazelle:python_test_file_pattern NONE or have a different directive altogether (gazelle:python_no_tests or something).

Footnotes

  1. I can't think of a reason why someone would want that, but that doesn't mean there aren't reasons

Copy link
Contributor

@wingsofovnia wingsofovnia Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passerby opinion: yielding no test targets can (and imho should) be achieved by ignoring files with gazelle:python_ignore_files. That said, setting pattern to NONE should not be allowed (also, it's a bit confusing to set pattern to NONE, some might thing there is NONE file or smths). gazelle:python_test_file_pattern (blank) should also fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yielding no test targets can (and imho should) be achieved by ignoring files with gazelle:python_ignore_files

I disagree. There are definitely cases where you want to yield no test targets but do want to yield lib and binary targets.

project/
+ BUILD.bazel
+ MODULE.bazel
+ src/myfoo/
    + BUILD.bazel
    + foo.py
    + foo_test.py
    + subdir/
        + BUILD.bazel
        + run_test.py      # py_binary
        + example_test.py  # py_library

Using gazelle:python_ignore_files would mean that no targets are generated in subdir above, when really we'd want only py_library and py_binary targets made.

There's also the case where test files are outside of the package, as described by PyPA Tutorial. In that case, you can be sure that all targets in src/mypackage below are py_library and py_binary.

project/
+ BUILD.bazel
+ MODULE.bazel
+ src/mypackage/               # guaranteed to have no py_test targets
    + BUILD.bazel
    + run_electrical_test.py   # py_binary
    + electrical_test.py       # py_library
+ tests/                       # any py_* targets
    + BUILD.bazel
    + foo_test.py              # py_test
    + bar_test.py              # py_test
    + test_utils.py            # py_library
    + run_all_test.py          # py_binary

also, it's a bit confusing to set pattern to NONE, some might thing there is NONE file

A valid point. Perhaps using something that is unlikely to be a filename would be better? Maybe one of these or similar?

# gazelle:python_test_file_pattern %NONE%
# gazelle:python_test_file_pattern !!NO_FILES
# gazelle:python_test_file_pattern ~!~NO_PATTERN~!~

One issue is that linux filenames can be basically anything haha so it's not feasible to pick a string that can't be a filename.

If we do go with a more "special" string, I'd probably want to update the python_default_visibility directive special strings to match.

gazelle:python_test_file_pattern (blank) should also fail.

+1 to that.

Copy link
Contributor

@wingsofovnia wingsofovnia Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with arguments around gazelle:python_ignore_files.

How about...not having any magic names at all? I mean, by the time someone wants to reset the test file patterns, he/she already learned what default values are (while customising them in the first place), and if resetting the patterns is needed, it can be done with # gazelle:python_test_file_pattern *_test,test_*.

Yet another idea is to have dedicated no-args directive for resetting to default, something like:
# gazelle:default_python_test_file_pattern. However, it only makes sense if applied conventionally everywhere in the plugin imho. Otherwise, I'd vote for not having any special values and let users "reset" by explicitly setting python_test_file_pattern to default values.

Copy link
Contributor Author

@dougthor42 dougthor42 Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with no magic names at all and a "reset" by explicitly setting the value to defaults. gazelle:python_test_file_pattern *_test.py,test_*.py

That said, I'd still like to mention the two (minor) issues with no magic values:

  1. [can ignore] There's no way to defer to rules_python's defaults. For example, if rules_python changes the default from *_test.py,test_*.py to something else, users would have to update their configs manually if they wanted to always use the rules_python default.
    • TBH this is probably a good thing because it results in less unexpected behavior.
  2. We're back to the # gazelle:python_test_file_pattern (no value) for preventing any files from mapping to py_test. As mentioned previously:
    • this is not very intuitive.
    • we said that "no value" should fail.

For (2), perhaps just making sure that's documented and tested is sufficient? Unless we implement another directive gazelle:python_no_test_files or something (which I'm against - it seems to only complicate things).

have dedicated no-args directive for resetting to default ... it only makes sense if applied conventionally everywhere in the plugin

+1. I'm slightly unsure about having directives to reset to default (as you said, people can just set the default manually), but if we add them they definitely should be applied everywhere.

Copy link
Contributor

@wingsofovnia wingsofovnia Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, the plugin currently has no switches for enabling/disabling parts of the generation process. Setting pattern to none sounds to me like an abuse of the directive to disable tests and I'd rather have an explicit python_no_test_files so that there is a clear indication no tests are being processed for a) confusing fewer humans, and b) plugin itself, so that it can potentially deviate its processing depending on that fact (e.g. print debug/warning message etc).

This should be, however, out of context for this PR. I'd say, we should just disallow no-arg python_test_file_pattern, provide no magic values and wrap-up the PR.

Have I already mentioned this is great work and nice improvement 🙂? Thank you! Looking forward to it. I have exactly the place where I will use it already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting pattern to none sounds to me like an abuse of the directive

I can see that and it's a good point.

I'd rather have an explicit python_no_test_files so that there is a clear indication no tests are being processed ... we should just disallow no-arg python_test_file_pattern, provide no magic values and wrap-up the PR.

SGTM. I'd like to continue the discussion, so I've opened #1826.

this is great work and nice improvement

Thanks! I've been really enjoying being able to add more features.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this SGTM, thanks for the discussion. I've been enjoying the read! :)

Copy link
Contributor Author

@dougthor42 dougthor42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the test case for "no value causes error" and updated the docs for this directive. I think we're ready for re-review.

gazelle/README.md Show resolved Hide resolved
gazelle/go.mod Outdated
@@ -9,6 +9,7 @@ require (
github.com/bmatcuk/doublestar v1.3.4
github.com/emirpasic/gods v1.18.1
github.com/ghodss/yaml v1.0.0
github.com/gobwas/glob v0.2.3
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the most popular glob package, though I don't see any github activity in 6 years. Other options were:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about doublestar.Match? It is used by bazel-gazelle and is already included in this go.mod:9 too.

import (
	"fmt"

	"github.com/bmatcuk/doublestar/v4"
)

func matchesAnyGlob(s string, globs []string) bool {
	for _, g := range globs {
		if ok, _ := doublestar.Match(g, s); ok {
			return true
		}
	}
	return false
}

func main() {
	defaultGlob := []string{"*_test.py", "test_*.py"}
	fmt.Println(matchesAnyGlob("test_my_target.py", defaultGlob))
	fmt.Println(matchesAnyGlob("my_target_test.py", defaultGlob))

	customGlob := []string{"*_pytest.py", "pytest_*.py"}
	fmt.Println(matchesAnyGlob("pytest_upload.py", customGlob))
	fmt.Println(matchesAnyGlob("download_pytest.py", customGlob))
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for not adding another dep! Updated.

Copy link
Contributor Author

@dougthor42 dougthor42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed gobwas/glob and migrated/upgraded to doublestar/v4.

gazelle/go.mod Outdated
@@ -9,6 +9,7 @@ require (
github.com/bmatcuk/doublestar v1.3.4
github.com/emirpasic/gods v1.18.1
github.com/ghodss/yaml v1.0.0
github.com/gobwas/glob v0.2.3
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for not adding another dep! Updated.

Copy link
Contributor

@wingsofovnia wingsofovnia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the great work!

@aignas aignas added this pull request to the merge queue Apr 3, 2024
@aignas aignas removed this pull request from the merge queue due to a manual request Apr 3, 2024
@@ -54,6 +54,19 @@ func GetActualKindName(kind string, args language.GenerateArgs) string {
return kind
}

func matchesAnyGlob(s string, globs []string) bool {
for _, g := range globs {
ok, err := doublestar.Match(g, s)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly concerned that we are compiling the glob here. Would it be possible to compile it elsewhere? When parsing the config would be another option.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is used in larger repos I am worried that we perform a linear number of glob compilations as opposed to doing it once when loading the config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We were doing that before1 with gobwas/glob. From what I can tell, doublestar doesn't have any kind of "compiled pattern" object that I can pass around instead.

We can validate patterns during config load, but we wouldn't be able to tell doublestart.Match to not validate patterns and thus save some cycles. (Unless there's a way to access matchWithSeparator? My understanding of go is that lowercase names aren't exported and are thus not accessible.)

I think we have two (three?) options:

  1. Go back to gobwas/glob and precompile patterns
  2. Accept the perf hit.
  3. Some go ability that I'm not aware of?

Either way we should validate/compile patterns when loading the config so that we fail earlier. I've made that update. It means we validate one more time, but we're already validating (N_files * Patterns) times so (N*P)+1 is no big deal

Footnotes

  1. Well, almost. It wasn't done when loading config, but at least it was outside of the for _, f := range args.RegularFiles loop.

Copy link
Contributor

@wingsofovnia wingsofovnia Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the match's code there is nothing to "compile" in glob really, like we typically do with regex. This is common for globs to be matched directly as they are simple, do not need complicated parsing, creating underlying tree used for matching etc.

This [rules_python] repo already does doublestart.Match in a loop here and so does bazel-gazelle here and it hasn't been an issue so far.

Validating earlier is a good improvement.

Unless there's a way to access matchWithSeparator? My understanding of go is that lowercase names aren't exported and are thus not accessible.

That is correct for go. I created a ticket bmatcuk/doublestar#92

@@ -54,6 +54,19 @@ func GetActualKindName(kind string, args language.GenerateArgs) string {
return kind
}

func matchesAnyGlob(s string, globs []string) bool {
for _, g := range globs {
ok, err := doublestar.Match(g, s)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We were doing that before1 with gobwas/glob. From what I can tell, doublestar doesn't have any kind of "compiled pattern" object that I can pass around instead.

We can validate patterns during config load, but we wouldn't be able to tell doublestart.Match to not validate patterns and thus save some cycles. (Unless there's a way to access matchWithSeparator? My understanding of go is that lowercase names aren't exported and are thus not accessible.)

I think we have two (three?) options:

  1. Go back to gobwas/glob and precompile patterns
  2. Accept the perf hit.
  3. Some go ability that I'm not aware of?

Either way we should validate/compile patterns when loading the config so that we fail earlier. I've made that update. It means we validate one more time, but we're already validating (N_files * Patterns) times so (N*P)+1 is no big deal

Footnotes

  1. Well, almost. It wasn't done when loading config, but at least it was outside of the for _, f := range args.RegularFiles loop.

func (c *Config) SetTestFilePattern(patterns []string) {
for _, p := range patterns {
if !doublestar.ValidatePattern(p) {
log.Fatalf("ERROR: Failed to compile glob '%v'. Error: syntax error in pattern\n", p)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

globstar errors with path.ErrBadPAttern but also states "users should not count on the returned error, doublestar.ErrBadPattern, being equal to path.ErrBadPattern" so I've opted to inject the string manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually... I could just do:

// We only care if the pattern is valid. Use `Match` instead of `ValidatePattern` (which
// just returns `bool`) so that we always use the error message provided by `doublestar`.
if _, err := doublestar.Match(p, ""); err != nil {
    log.Fatalf("ERROR: Failed to compile glob '%v'. Error: %v\n", p, err)
}
...

Any preference between
(a) using ValidatePattern and manually injecting the error string as I've already done or
(b) abusing Match?

Copy link
Contributor

@wingsofovnia wingsofovnia Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the word "abusing" in your comment speaks for itself :) Looking at the Match's code, err.ErrBadPattern brings no more than just an error flag (no error message, no extra context etc.) so it is effectively does the same as having true/false. I'd keep it as is. Except maybe a nit regarding go error message style (see the comment in configure.go).

Suggested change
log.Fatalf("ERROR: Failed to compile glob '%v'. Error: syntax error in pattern\n", p)
log.Fatalf("invalid glob pattern '%s'", p)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 kept, aside from style nit

// Will exit 1 if the pattern is invalid.
func (c *Config) SetTestFilePattern(patterns []string) {
for _, p := range patterns {
if !doublestar.ValidatePattern(p) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this check be better in configure.go case pythonconfig.TestFilePattern:?

Copy link
Contributor

@wingsofovnia wingsofovnia Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say it would be slightly more aligned with existing code to do to in a case indeed as there are already some invalid value errors and parsing there while these stay simple setters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 moved.

@aignas
Copy link
Collaborator

aignas commented Apr 4, 2024 via email

case pythonconfig.TestFilePattern:
value := strings.TrimSpace(d.Value)
if value == "" {
log.Fatal("ERROR: Directive 'python_test_file_pattern' requires a value.")
Copy link
Contributor

@wingsofovnia wingsofovnia Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: there is no need to double severity in the message here as Fatal will be already in the log. Also, go's error messages are conventionally lowercased since errors can be wrapped/nested (e.g. something bad happened: directive 'python...). \n is also unnecessary.

Suggested change
log.Fatal("ERROR: Directive 'python_test_file_pattern' requires a value.")
log.Fatal("directive 'python_test_file_pattern' requires a value.")

(See https://go.dev/wiki/CodeReviewComments#error-strings)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL, thanks! Updated.

As an FYI, the ERROR: and \n were included to maintain consistency with the other log.Fatal messages:

$ rg 'log.Fatal.?\("'
manifest/generate/generate.go
90:             log.Fatalf("ERROR: %v\n", err)
109:            log.Fatalf("ERROR: %v\n", err)

pythonconfig/pythonconfig.go
444:                    log.Fatalf("ERROR: Failed to compile glob '%v'. Error: syntax error in pattern\n", p)

python/lifecycle.go
41:                     log.Fatalf("failed to write parser zip: %v", err)
47:                     log.Fatalf("cannot write %q: %v", helperPath, err)

python/generate.go
238:                    log.Fatalf("ERROR: %v\n", err)
319:                    log.Fatalf("ERROR: %v\n", err)
353:                    log.Fatalf("ERROR: %v\n", err)
384:                    log.Fatalf("ERROR: %v\n", err)
504:            log.Fatalf("ERROR: %v\n", err)

python/configure.go
188:                            log.Fatal("ERROR: Directive 'python_test_file_pattern' requires a value.")

Prior to this PR, 7 of 9 calls included both ERROR: and \n.

@dougthor42 dougthor42 requested a review from aignas April 5, 2024 20:54
@aignas aignas added this pull request to the merge queue Apr 5, 2024
Merged via the queue into bazelbuild:main with commit 5667a86 Apr 5, 2024
4 checks passed
@dougthor42 dougthor42 deleted the test-file-pattern-gh1816 branch April 6, 2024 02:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gazelle considers both test_*.py and *_test.py files to be py_test targets.
3 participants