Refactor Python Toolchain Configuration to use Bazel Repository Rules#1050
Conversation
…is required for bazel to recognize it as a package. This allows us to include Python dependencies in our bazel build.
There was a problem hiding this comment.
Code Review
This pull request migrates Python interpreter detection and toolchain registration from the configure.sh script to a native Bazel repository rule in third_party/python_configure.bzl. This change streamlines the build configuration and improves integration with Bazel's toolchain mechanism. Key feedback includes addressing potential path escaping issues on Windows, ensuring the repository rule tracks changes to the system PATH to avoid stale configurations, and maintaining consistency by prioritizing the PYTHON_BIN_PATH environment variable in both the configuration logic and documentation. Additionally, the reviewer noted potential redundancy in the creation of multiple identical python repositories.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
mhucka
left a comment
There was a problem hiding this comment.
Thanks for the updates! Looks good.
Summary
This PR refactors how Python is detected and configured in TensorFlow Quantum. It moves away from manual script-based file generation for Python paths and adopts a more robust, Bazel-idiomatic approach using repository rules and rules_python toolchains.
Key Changes
Bazel-Managed Python: Introduced third_party/python_configure.bzl to handle Python detection via a repository rule. This dynamically generates the Python toolchain based on the environment.
Fixes multiple python versions:
One of the primary pain points in the previous setup was ambiguity when multiple Python versions (e.g., system Python, pyenv, Conda, and venvs) were present on the same machine. This PR addresses those issues through:
Strict Path Locking: Locks the specific interpreter identified during configuration into the Bazel workspace via --repo_env. This prevents "it works in my shell but fails in Bazel" errors caused by path mismatches.
Explicit Toolchain Registration: Registers a dedicated py_runtime so Bazel uses the exact same binary for testing, dependency resolution, and execution, removing reliance on ambiguous host environment variables like PYTHONPATH.
Ambiguity Resolution via --python: Provides a surgical override through ./configure.sh --python=/path/to/python, allowing developers to bypass auto-detection and explicitly define the interpreter Bazel should use.
Hermetic Pip Parsing: Synchronizes the pip_parse interpreter with the build toolchain, ensuring that downloaded wheels and environment scans are binary-compatible with the runtime environment.