Skip to content

Create py_zipapp_binary to for building zip from py_binary #2586

@rickeylev

Description

@rickeylev

Right now, py_binary has an implicit output ({name}.zip}) and output group (python_zip_file) that builds a zip file of the py_binary. This zip file is a "zipapp", basically a zip file with a top-level __main__.py file. See https://peps.python.org/pep-0441/ and https://docs.python.org/3/library/zipapp.html for details.

If the Bazel builtin flag --build_python_zip is set, then the zip output changes slightly: it is turned into a self-executable zip file by prepending some shell code to the zip file.

Having this part of py_binary itself is problematic for four main reasons:

  1. It makes maintaining py_binary harder. The logic for creating the zip is subtle and introduces several branches on code that is otherwise straight forward.
  2. A zipapp is just one of many ways to create a "distributable" py_binary. It doesn't make sense to bake one directly into py_binary.
  3. A zipapp only works if it's pure-python code. If there are C library dependencies, then the zip must be extracted, which then breaks the logic for handling zipapp-based startup.
  4. A zip file doesn't have native support for symlinks. This makes it incompatible with bootstrap=script and how it creates a venv with symlinks to the desired python interpreter. Working around this requires hacks being applied at runtime with their own problems.

What to do:

  • Create a py_zipapp rule. As input, they take a py_binary or py_test. As output, they produce a zipapp zip file.
  • Create a py_zipapp_binary and py_zipapp_test rule. As input, they take a py_zipapp target. As output, they produce a self-executable zip file.
  • As part of the above: create a tool to generate zip files that contain symlinks. Use File.is_symlink to detect if inputs should be symlinks or not. For a python-based implementation, how to create a symlink: https://stackoverflow.com/questions/35782941/archiving-symlinks-with-python-zipfile . It doesn't necessairily have to be implemented in Python. Another language is fine, however, if it's a compiled language, then we'll need to update rules_python handle building and consuming prebuilt artifacts.

I'll note that there are rule sets, e.g. rules_pkg that support somewhat arbitrary creation of zip files. Using one of those would also be fine.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Good first issueA good first issue for people looking to contribute

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions