-
Notifications
You must be signed in to change notification settings - Fork 516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(gazelle): create py_binary targets for if __name__ == "__main__"
#1566
Conversation
Previously a `py_binary` target was only created if a file named `__main__.py` existed in the package. Now we look at each file that would otherwise be added to a `py_library` target and see if it contains a check for `if __name__ == "__main__"`. If such a check is found in the file, then we create a `py_binary` target for it instead. No existing tests fail with this change, though I think in real code it's possible this could change target generation. New tests have been added to check the new functionality.
// hasNameEqualsMain determines if the file contains 'if __name__ == "__main__"'. | ||
func hasNameEqualsMain(path string) bool { | ||
searchString := `if __name__ == ['"]__main__['"]:` | ||
bytesContents, err := os.ReadFile(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now we are reading the python files once to get the imports and here it would be for once more to check if there is an if __name__ == "main"
which may not scale well in super large repos because we have now twice the number of files to process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me see how the read to get the import works. Maybe I can figure out a way to not read it twice.
if err != nil { | ||
return false | ||
} | ||
match, err := regexp.Match(searchString, bytesContents) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading from the end of the file line by line in reverse and matching the whole line (stripped) could be faster? If you want to still use regexp
, you could at least compile the regexp upfront and store it as a var
at the top of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let me look into making this faster.
@@ -463,6 +470,20 @@ func hasEntrypointFile(dir string) bool { | |||
return false | |||
} | |||
|
|||
// hasNameEqualsMain determines if the file contains 'if __name__ == "__main__"'. | |||
func hasNameEqualsMain(path string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think that this new feature needs a feature toggle. There should be a gazelle directive that enables the behaviour.
What is more, sometimes developers add if __name__
in order to test the script locally but do not intend to create a binary target for others to consume. I wonder if in those cases we should be able to say in the python file:
if __name__ == "__main__": # gazelle: ignore
main()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A feature toggle is a great call. Let me look into making that work.
As for the ignore, is there any reason you wouldn't want that to get moved to a py_binary
? Then you could bazel run
it and test it locally. (If the script is already in a test target, then this change won't move it - it only moves scripts from py_library
-> py_binary
, not from py_test
, because you can already bazel run
a test.) I guess I'm just curious when you might want an ignore / might want to keep a script out of a py_binary
target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the ignore part out of scope for now. It probably does not need to be supported in the initial version.
Thanks for the feedback! I'll work on it and put up an improved version. |
Many existing Python repos don't use `__main__.py` to indicate the the main module. Instead, they put something like below in any Python files: ```python if __name__ == "__main__": main() ``` This PR makes the Gazelle extension able to recognize main modules like this, when `__main__.py` doesn't exist. This reduces the need to create `__main__.py` when enabling Gazelle extensions in existing Python repos. The current behavior of creating single `py_binary` for `__main__.py` is preserved and takes precedence. So this is a backward-compatible change. Closes #1566.
Previously a
py_binary
target was only created if a file named__main__.py
existed in the package. All other Python files get put into one or morepy_library
targets. This change makes Gazelle look at each file that would otherwise be added to apy_library
target and see if it contains a check forif __name__ == "__main__"
. If such a check is found in the file, then apy_binary
target is created for it instead.No existing tests fail with this change, though I think in real code it's possible this could be a breaking change for target generation. New tests have been added to check the new functionality.
I used the check for
if __name__ == "__main__"
because that seems like the most Pythonic way of showing that a script is meant to be run.