Surfacing all bootstrap runfiles in PyExecutableInfo #3745

@Illedran

🚀 feature request

Relevant Rules

Adding a runfile object containing all "runfiles" of the (stage2) bootstrappers to PyExecutableInfo (impacts py_binary, py_test).

Description

We have a custom collect_layers rule that takes a py_binary and produces output groups, each mapping to a layer in an OCI image that downstream rules build. Most of what we need is captured by PyRuntimeInfo (for the interpreter) and PyExecutableInfo.app_runfiles. However, we would like to split third-party deps/runfiles (from PyPI) and our own code into different layers for optimization. At the moment, we effectively rebuild depsets similar to app_runfiles via an aspect that traverses the build graph, which lets us do this efficiently without flattening the depset.
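For readers unfamiliar with the approach, here is a rough sketch of what such an aspect could look like. It is not our actual implementation: `LayerSplitInfo`, `layer_split_aspect`, and the "external repository means third party" heuristic are all made up for illustration.

```starlark
# Hypothetical provider; the name and fields are illustrative only.
LayerSplitInfo = provider(
    doc = "Transitive files split by where the owning target lives.",
    fields = {
        "first_party": "depset of Files from targets in the main repository",
        "third_party": "depset of Files from targets in external repositories",
    },
)

def _layer_split_aspect_impl(target, ctx):
    # Collect the split already computed for each dependency.
    deps = getattr(ctx.rule.attr, "deps", [])
    infos = [d[LayerSplitInfo] for d in deps if LayerSplitInfo in d]

    # ASSUMPTION: a non-empty workspace name means "third party" (e.g. a pip repo).
    is_external = target.label.workspace_name != ""
    own = [target[DefaultInfo].files]

    # Nest depsets via `transitive` so nothing is flattened at analysis time.
    return [LayerSplitInfo(
        first_party = depset(
            transitive = [i.first_party for i in infos] + ([] if is_external else own),
        ),
        third_party = depset(
            transitive = [i.third_party for i in infos] + (own if is_external else []),
        ),
    )]

layer_split_aspect = aspect(
    implementation = _layer_split_aspect_impl,
    attr_aspects = ["deps"],
)
```

The key point is that every depset is only ever combined through `transitive`, so the split stays cheap. It cannot see files that the executable rule itself adds to app_runfiles, which is the gap described below.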

#3324 added extra fields to identify the stage2_bootstrap, but additional files that the stage2 bootstrapper needs to work correctly (like the venv site-packages files _bazel_site_init.pth and bazel_site_init.py) are still only surfaced via PyExecutableInfo.app_runfiles. This makes it impossible to add them to my output groups without flattening that depset, which is not desirable.

Describe the solution you'd like

It would be great if PyExecutableInfo could expose a stage2_runfiles field (for stage2 specifically) or a bootstrap_runfiles field (for all bootstrap-related files) that includes any additional runfiles the bootstrapper needs.
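To illustrate how a consumer might use this, here is a minimal sketch of a collect_layers-style rule. It assumes the proposed field is named `bootstrap_runfiles` and is a runfiles object; that field does not exist in rules_python today. The output group names and the `binary` attribute are likewise made up, and the load path is assumed to be the public one for the provider.

```starlark
load("@rules_python//python:py_executable_info.bzl", "PyExecutableInfo")

def _collect_layers_impl(ctx):
    info = ctx.attr.binary[PyExecutableInfo]

    # Existing field: runfiles of the application itself.
    # Note: `.files` does not include runfiles symlinks / root_symlinks.
    groups = {"app": info.app_runfiles.files}

    # HYPOTHETICAL field proposed in this issue. getattr() with a default
    # keeps the sketch loadable on versions that do not provide it.
    bootstrap = getattr(info, "bootstrap_runfiles", None)
    if bootstrap != None:
        # A depset handed straight to an output group, with no flattening.
        groups["bootstrap"] = bootstrap.files

    return [OutputGroupInfo(**groups)]

collect_layers = rule(
    implementation = _collect_layers_impl,
    attrs = {
        "binary": attr.label(
            mandatory = True,
            providers = [PyExecutableInfo],
        ),
    },
)
```

With a dedicated field, the bootstrap files could go into their own output group (or be merged into an existing one) without ever iterating over app_runfiles.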

Describe alternatives you've considered

At the moment, these "extra runfiles" are already modeled to some extent in code under files_without_interpreter. We surface them via a targeted patch of the provider, like so:

# This patch is for rules_python 1.9.0, as of time of writing
diff --git python/private/py_executable.bzl python/private/py_executable.bzl
index 284aea6b..95bb4676 100644
--- python/private/py_executable.bzl
+++ python/private/py_executable.bzl
@@ -491,6 +491,7 @@ WARNING: Target: {}
         app_runfiles = app_runfiles.build(ctx),
         # File|None; the venv `bin/python3` file, if any.
         venv_python_exe = venv.interpreter if venv else None,
+        files_without_interpreter = venv.files_without_interpreter if venv else None,
     )
 
 def _create_zip_main(ctx, *, stage2_bootstrap, runtime_details, venv):
@@ -1094,6 +1095,7 @@ def py_executable_base_impl(ctx, *, semantics, is_test, inherited_environment =
         app_runfiles = app_runfiles,
         venv_python_exe = exec_result.venv_python_exe,
         interpreter_args = ctx.attr.interpreter_args,
+        files_without_interpreter = exec_result.files_without_interpreter,
     )
 
 def _get_build_info(ctx, cc_toolchain):
@@ -1639,7 +1641,8 @@ def _create_providers(
         stage2_bootstrap,
         app_runfiles,
         venv_python_exe,
-        interpreter_args):
+        interpreter_args,
+        files_without_interpreter):
     """Creates the providers an executable should return.
 
     Args:
@@ -1700,6 +1703,7 @@ def _create_providers(
             app_runfiles = app_runfiles,
             venv_python_exe = venv_python_exe,
             interpreter_args = interpreter_args,
+            files_without_interpreter = files_without_interpreter,
         ),
     ]
 
diff --git python/private/py_executable_info.bzl python/private/py_executable_info.bzl
index defbd3a0..c1b7592a 100644
--- python/private/py_executable_info.bzl
+++ python/private/py_executable_info.bzl
@@ -87,5 +87,6 @@ mode is not enabled.
 :::{versionadded} 1.9.0
 :::
 """,
+        "files_without_interpreter": "Extra files required by the stage2 bootstrapper",
     },
 )

At the moment, we still rely on bootstrap_impl=system_python, but "hack" the shebang to point to the interpreter in runfiles (this is the very first layer in the image, before the other two).

I believe that the venv/bin/python3 symlink is also required for the stage1 bootstrap if one were to use bootstrap_impl=script. If so, it might be nicer to model this as a generic bootstrap_runfiles field in which all bootstrap runfiles could live (so everything under venv and the stage2 bootstrap file itself)?
