Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build(gazelle): embed Python zip file #1485

Merged
merged 9 commits into from
Oct 17, 2023
Merged

build(gazelle): embed Python zip file #1485

merged 9 commits into from
Oct 17, 2023

Conversation

linzhp
Copy link
Contributor

@linzhp linzhp commented Oct 11, 2023

The runtime dependencies of Gazelle Python extension makes it hard to distribute Gazelle binaries: we have to preserve the runfiles structure and distribute it with Gazelle binaries.

Instead, we can build a single Python zip file that comes a built-in interpreter, and embed the zip file into the Go binary in compile time and avoid the runtime dependency.

Fixes #1455

@linzhp linzhp requested a review from f0rmiga as a code owner October 11, 2023 23:00
gazelle/python/parser.go Outdated Show resolved Hide resolved
gazelle/python/BUILD.bazel Outdated Show resolved Hide resolved
@fmeum
Copy link
Contributor

fmeum commented Oct 12, 2023

  1. It assumes that the Gazelle plugin repo is called `rules_python_gazelle_plugin". This may not be the case when the Gazelle plugin is managed as a go dependency, and by convention named as "com_github_bazelbuild_rules_python_gazelle".

This could also be fixed by baking the $(rlocationpath ...) into x_defs of the go_library.

  1. It makes Gazelle only runnable under bazel run. So we won't be able to debug Gazelle using regular Go IDEs without special Bazel plugins, or distribute pre-built Gazelle binary

Could you explain why that is the case? Runfiles discovery should work equally well when the binary is executed directly as long as its .runfiles directory is present beside it. If that doesn't work, then there is another bug to fix in the setup.

@linzhp
Copy link
Contributor Author

linzhp commented Oct 14, 2023

Could you explain why that is the case? Runfiles discovery should work equally well when the binary is executed directly as long as its .runfiles directory is present beside it. If that doesn't work, then there is another bug to fix in the setup.

The concept of ".runfiles" doesn't exist outside Bazel world. If we want to distribute pre-built Gazelle binary, we have to also distribute the runfiles in a certain layout; if people want to debug Gazelle using IDEs that use "go build" under the hood, they won't be able to find the runfiles.

In addition, I think runtime dependency makes sense when we need to change it at runtime, e.g., a config file for prod, and another config file for staging. For such files, the location should be configurable through command line arguments or environment variables. For static content like html template or Python scripts like this, we never want to change them in runtime, it's better to embed them into the binary, so we have one less thing to worry about when deploying/distributing the binary.

@fmeum
Copy link
Contributor

fmeum commented Oct 14, 2023

Makes sense. In the Go case, Gazelle with Bzlmod reads files generated by Bazel (the list of repos), so it would be difficult to distribute it as a single binary anyway. But if it works for Python, that's good.

Copy link
Collaborator

@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the change looks good to me (once the imports are tidied up), thanks for the improvements! Could you please change the description of the PR so that it more suitable as a commit message (after the commit is merged, the PR description is going to be used for it), add a CHANGELOG.md entry and mention the #1455 ticket in the description. I personally think that it may fix the issue reported in #1455, but it would be good to double check with the issue author.

@rickeylev, I am not too familiar with the python_zip functionality that is used here and I am wondering if there are any gotchas here which we should be careful about before merging?

In the long term, I think maight probably need include zip files for each X.Y interpreter version we support and add additional support in gazelle which would allow us to tell it to use a particular Python version to use. That way we could correctly handle edge cases where tomllib could be detected as a std_module in Python 3.11 and above but not in earlier versions. That is definitely outside the scope of this PR.

gazelle/python/python_test.go Outdated Show resolved Hide resolved
gazelle/python/parser.go Outdated Show resolved Hide resolved
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
t.Cleanup(cancel)
cmd := exec.CommandContext(ctx, gazellePath, args...)
cmd := exec.Command(gazellePath, args...)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why are we removing the 2 second timeout here? Does running gazelle become really expensive because of the needing to write the embedded file? What is roughly the impact?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, writing and extracting dozens of zip files take time. I introduced a new variable so tests can use the runfiles directly

@aignas aignas changed the title build: Embed Python scripts build(gazelle): embed Python scripts Oct 15, 2023
@linzhp linzhp changed the title build(gazelle): embed Python scripts build(gazelle): embed Python zip file Oct 15, 2023
@linzhp
Copy link
Contributor Author

linzhp commented Oct 15, 2023

In the long term, I think maight probably need include zip files for each X.Y interpreter version we support and add additional support in gazelle which would allow us to tell it to use a particular Python version to use. That way we could correctly handle edge cases where tomllib could be detected as a std_module in Python 3.11 and above but not in earlier versions. That is definitely outside the scope of this PR.

We can just document that for those who are prebuilding and distributing Gazelle binaries, the registered Python interpreter in the repo where Gazelle is built needs to be the same as the target repo. It's the distributor's responsibility to build a Gazelle binary for each combination of OS+arch+python interpreter version.

CHANGELOG.md Outdated Show resolved Hide resolved
@aignas
Copy link
Collaborator

aignas commented Oct 15, 2023

In the long term, I think maight probably need include zip files for each X.Y interpreter version we support and add additional support in gazelle which would allow us to tell it to use a particular Python version to use. That way we could correctly handle edge cases where tomllib could be detected as a std_module in Python 3.11 and above but not in earlier versions. That is definitely outside the scope of this PR.

We can just document that for those who are prebuilding and distributing Gazelle binaries, the registered Python interpreter in the repo where Gazelle is built needs to be the same as the target repo. It's the distributor's responsibility to build a Gazelle binary for each combination of OS+arch+python interpreter version.

+1 for documenting this for now. The gazelle_python.yaml needs to be version aware for this to fully work, right now the gazelle Python plugin does not work well in cases where repos are using version aware rules and have multiple Python toolchains being used at the same time. However this is not something we can solve right now with the current structure.

linzhp and others added 2 commits October 16, 2023 08:42
Co-authored-by: Ignas Anikevicius <240938+aignas@users.noreply.github.com>
@linzhp
Copy link
Contributor Author

linzhp commented Oct 16, 2023

Updated the doc, but somehow caused integration tests on Windows to fail

gazelle/README.md Outdated Show resolved Hide resolved
@aignas aignas enabled auto-merge October 17, 2023 04:52
@aignas aignas added this pull request to the merge queue Oct 17, 2023
Merged via the queue into bazelbuild:main with commit a94deb8 Oct 17, 2023
2 checks passed
@linzhp linzhp deleted the embed branch October 19, 2023 20:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to compile standalone gazelle binary with rules_python plugin
3 participants