From c41c6d2c36d77283238ba4c448ef51addfa9adaf Mon Sep 17 00:00:00 2001 From: Ofek Lev Date: Fri, 4 Aug 2023 02:13:07 -0400 Subject: [PATCH 1/7] PEP 723: Embedding metadata in single-file scripts --- pep-0723.rst | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 185 insertions(+) create mode 100644 pep-0723.rst diff --git a/pep-0723.rst b/pep-0723.rst new file mode 100644 index 00000000000..e314e997b5d --- /dev/null +++ b/pep-0723.rst @@ -0,0 +1,185 @@ +PEP: 723 +Title: Embedding metadata in single-file scripts +Author: Ofek Lev +PEP-Delegate: Brett Cannon +Discussions-To: https://discuss.python.org/t/29905 +Status: Draft +Type: Standards Track +Topic: Packaging +Content-Type: text/x-rst +Created: 04-Aug-2023 +Post-History: `19-Jul-2023 `__ + + +Abstract +======== + +This PEP specifies a format for defining metadata about single-file Python +scripts, such as their runtime dependencies. This PEP merely extends +:pep:`722` with an alternative format and therefore elides describing the +motivation for such a proposal. + +Specification +============= + +This PEP chooses to follow the latest developments of other modern packaging +ecosystems (namely `Rust `__ +and `Go `__) by embedding the existing +standard :pep:`621` that is used to describe projects. + +Any Python script may assign a variable named ``__pyproject__`` to a multi-line +double-quoted string containing a valid TOML document. This document contains +what you would expect to see in a ``pyproject.toml`` file, with the following +exceptions regarding the ``[project]`` table: + +* The ``name`` and ``version`` fields MUST be defined dynamically by tools if + the user does not define them +* These fields do not need to be listed in the ``dynamic`` array + +Example +------- + +The following is an example of a script with an embedded ``pyproject.toml``: + +.. 
code:: python + + __pyproject__ = """ + [project] + requires-python = ">=3.11" + dependencies = [ + "requests<3", + "rich", + ] + """ + + import requests + from rich.pretty import pprint + + resp = requests.get("https://peps.python.org/api/peps.json") + data = resp.json() + pprint([(k, v["title"]) for k, v in data.items()][:10]) + + +Reference Implementation +======================== + +This regular expression may be used to parse the metadata:: + + (?ms)^__pyproject__ *= *"""$(.+?)^"""$ + +For languages that do not support easily accessing the match group, one may +parse the entire match as TOML and then access the ``__pyproject__`` key. + +The following is an example of how to read the metadata on Python 3.11 or +higher. + +.. code:: python + + import re, tomllib + + def read(script: str) -> dict | None: + match = re.search(r'(?ms)^__pyproject__ *= *"""$(.+?)^"""$', script) + return tomllib.loads(match.group(1)) if match else None + +Often tools will provide commands to manage dependencies. The following is a +crude example of modifying the content using the ``tomlkit`` library. + +.. code:: python + + import re, tomlkit + + def add(script: str, dependency: str) -> str: + match = re.search(r'(?ms)^__pyproject__ *= *"""$(.+?)^"""$', script) + config = tomlkit.parse(match.group(1)) + config['project']['dependencies'].append(dependency) + + start, end = match.span(1) + return script[:start] + tomlkit.dumps(config) + script[end:] + + +Backwards Compatibility +======================= + +At the time of writing, ``__pyproject__`` only appears 5 times +`on GitHub `__ and 4 of +those belong to a user who appears to already be using this PEP's format +for its intended purpose. + + +How to Teach This +================= + +The format is intended to bridge the gap between users just writing scripts and +users writing packages. Knowledge of how to write metadata for one use case will +be directly transferable to the other. 
+ +Everything in the parent PEP regarding teaching is applicable to this as well. + + +Recommendations +=============== + +For situations in which users do not define the required name and version +fields, the following defaults should be preferred: + +* ``name``: ``script-`` e.g. ``script-3a5c6b...`` to + provide interoperability with other tools that use the name to derive file + system storage paths for things like virtual environments +* ``version``: ``0.0.0`` + + +Benefits +======== + +Ecosystem cohesion +------------------ + +One of the central themes we discovered from the recent +`packaging survey `__ is that users have +begun getting frustrated with the lack of unification regarding both tooling +and specs. + +Adding yet another way to define metadata would further fragment the community. + + +Extensibility +------------- + +The parent PEP allows for extensions to the custom metadata block format. This +PEP benefits from all future extensions to project metadata and immediately +supports the 2 hypothetical examples that were mentioned: + +* versioning - It is quite common to version scripts for persistence even when + using a VCS like Git. When not using a VCS it is even more common to version, + for example the author has been in multiple time sensitive debugging sessions + with customers where due to the airgapped nature of the environment, the only + way to transfer the script was via email or copying and pasting it into a + chat window. In these cases, versioning is invaluable to ensure that the + customer is using the latest version of the script. +* Python runtime requirements - This is useful for tools that are able to run + specific versions of Python. + + +Broader applicability +--------------------- + +This PEP does not prohibit any class of tooling from using embedded metadata +if they so desire and envisions support to be ubiquitous across the ecosystem. 
+In addition to script runners, one would expect: + +* IDEs to provide TOML syntax highlighting +* Dependency version checkers and security scanners to verify and offer updates + to scripts +* Package managers like Hatch and Poetry to gain the ability to run scripts + +Open Issues +=========== + +None at this point. + + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive. From 5f3fcf2373da6032b7ad53e9bea81e501c4e6115 Mon Sep 17 00:00:00 2001 From: Ofek Lev Date: Sun, 6 Aug 2023 01:29:48 -0400 Subject: [PATCH 2/7] update --- .github/CODEOWNERS | 3 +- pep-0723.rst | 485 +++++++++++++++++++++++++++++++++++++++------ 2 files changed, 422 insertions(+), 66 deletions(-) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 6d605c74736..ba2f5f5a298 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -509,7 +509,7 @@ pep-0627.rst @encukou pep-0628.txt @ncoghlan pep-0629.rst @dstufft pep-0630.rst @encukou -pep-0631.rst @pganssle +pep-0631.rst @ofek @pganssle pep-0632.rst @zooba pep-0633.rst @brettcannon pep-0634.rst @brandtbucher @gvanrossum @@ -601,6 +601,7 @@ pep-0719.rst @Yhg1s pep-0720.rst @FFY00 pep-0721.rst @encukou pep-0722.rst @pfmoore +pep-0723.rst @ofek # ... # pep-0754.txt # ... 
diff --git a/pep-0723.rst b/pep-0723.rst index e314e997b5d..72af2ed8a3a 100644 --- a/pep-0723.rst +++ b/pep-0723.rst @@ -1,41 +1,100 @@ PEP: 723 -Title: Embedding metadata in single-file scripts +Title: Embedding pyproject metadata in single-file scripts Author: Ofek Lev PEP-Delegate: Brett Cannon -Discussions-To: https://discuss.python.org/t/29905 +Discussions-To: https://discuss.python.org/t/30979 Status: Draft Type: Standards Track Topic: Packaging Content-Type: text/x-rst Created: 04-Aug-2023 -Post-History: `19-Jul-2023 `__ +Post-History: `04-Aug-2023 `__ Abstract ======== This PEP specifies a format for defining metadata about single-file Python -scripts, such as their runtime dependencies. This PEP merely extends -:pep:`722` with an alternative format and therefore elides describing the -motivation for such a proposal. +scripts, such as their runtime dependencies. + + +Motivation +========== + +Python is routinely used as a scripting language, with Python scripts as a +(better) alternative to shell scripts, batch files, etc. When Python code is +structured as a script, it is usually stored as a single file and does not +expect the availability of any other local code that may be used for imports. +As such, it is possible to share with others over arbitrary text-based means +such as email, a URL to the script, or even a chat window. Code that is +structured like this may live as a single file forever, never becoming a +full-fledged project with its own directory and ``pyproject.toml`` file. + +An issue that users encounter with this approach is that there is no standard +mechanism to define metadata for tools whose job it is to execute such scripts. +For example, a tool that runs a script may need to know which dependencies are +required or the supported version(s) of Python. + +There is currently no standard tool that addresses this issue, and this PEP +does *not* attempt to define one. 
However, any tool that *does* address this +issue will need to know what the runtime requirements of scripts are. By +defining a standard format for storing such metadata, existing tools, as well +as any future tools, will be able to obtain that information without requiring +users to include tool-specific metadata in their scripts. -Specification -============= -This PEP chooses to follow the latest developments of other modern packaging +Rationale +========= + +This PEP defines a mechanism for embedding metadata *within the script itself*, +and not in an external file. + +We choose to follow the latest developments of other modern packaging ecosystems (namely `Rust `__ and `Go `__) by embedding the existing -standard :pep:`621` that is used to describe projects. +standard :pep:`621` metadata that is used to describe projects. + +The format is intended to bridge the gap between different types of users +of Python. Knowledge of how to write project metadata will be directly +transferable to all use cases, whether writing a script or maintaining a +project that is distributed via PyPI. Additionally, users will benefit from +seamless interoperability with tools that are already familiar with the format. + +One of the central themes we discovered from the recent +`packaging survey `__ is that users have +begun getting frustrated with the lack of unification regarding both tooling +and specs. Adding yet another way to define metadata, even for a currently +unsatisfied use case, would further fragment the community. + + +Specification +============= Any Python script may assign a variable named ``__pyproject__`` to a multi-line -double-quoted string containing a valid TOML document. This document contains -what you would expect to see in a ``pyproject.toml`` file, with the following -exceptions regarding the ``[project]`` table: +*double-quoted* string (``"""``) containing a valid TOML document. The opening +of the string MUST be on the same line as the assignment. 
The closing of the +string MUST be on a line by itself, and MUST NOT be indented. + +The TOML document MUST NOT contain multi-line double-quoted strings, as that +would conflict with the Python string containing the document. Single-quoted +multi-line TOML strings may be used instead. -* The ``name`` and ``version`` fields MUST be defined dynamically by tools if - the user does not define them +This document MUST be formatted according to :pep:`621` and the +`project metadata `_ standard. The document MAY include +the ``[tool]`` table and sub-tables as described in :pep:`518`. + +The ``[project]`` table differs in the following ways: + +* The ``name`` and ``version`` fields are not required and MAY be defined + dynamically by tools if the user does not define them * These fields do not need to be listed in the ``dynamic`` array +Non-script running tools MAY choose to read from their expected ``[tool]`` +sub-table if the script is the only target of the tool's functionality. In all +other cases tools MUST NOT alter behavior based on the embedded metadata. For +example, if a linter is invoked with the path to a directory, it MUST behave +the same as if zero files had embedded metadata. + Example ------- @@ -59,16 +118,41 @@ The following is an example of a script with an embedded ``pyproject.toml``: data = resp.json() pprint([(k, v["title"]) for k, v in data.items()][:10]) +The following is an example of a single-file Rust project that embeds their +version of ``pyproject.toml``, which is called ``Cargo.toml``: + +.. code:: rust + + #!/usr/bin/env cargo + + //! ```cargo + //! [dependencies] + //! regex = "1.8.0" + //! ``` + + fn main() { + let re = Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap(); + println!("Did our date match? {}", re.is_match("2014-01-01")); + } + +One important thing to note is that the metadata is embedded in a comment mostly +for introspection since Rust documentation is generated from comments. 
Another +is that users rarely edit dependencies manually, but rather use their Cargo +package manager. + +We argue that our choice provides easier edits for both humans and tools. + Reference Implementation ======================== -This regular expression may be used to parse the metadata:: +This is the canonical regular expression that may be used to parse the metadata:: - (?ms)^__pyproject__ *= *"""$(.+?)^"""$ + (?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$ For languages that do not support easily accessing the match group, one may -parse the entire match as TOML and then access the ``__pyproject__`` key. +parse the entire match as TOML (that is valid syntax) and then access +the ``__pyproject__`` key from the resulting mapping. The following is an example of how to read the metadata on Python 3.11 or higher. @@ -78,18 +162,19 @@ higher. import re, tomllib def read(script: str) -> dict | None: - match = re.search(r'(?ms)^__pyproject__ *= *"""$(.+?)^"""$', script) + match = re.search(r'(?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$', script) return tomllib.loads(match.group(1)) if match else None -Often tools will provide commands to manage dependencies. The following is a -crude example of modifying the content using the ``tomlkit`` library. +Often tools will edit dependencies like package managers or dependency update +automation in CI. The following is a crude example of modifying the content +using the ``tomlkit`` library. .. code:: python import re, tomlkit def add(script: str, dependency: str) -> str: - match = re.search(r'(?ms)^__pyproject__ *= *"""$(.+?)^"""$', script) + match = re.search(r'(?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$', script) config = tomlkit.parse(match.group(1)) config['project']['dependencies'].append(dependency) @@ -100,77 +185,339 @@ crude example of modifying the content using the ``tomlkit`` library. 
Backwards Compatibility ======================= -At the time of writing, ``__pyproject__`` only appears 5 times -`on GitHub `__ and 4 of -those belong to a user who appears to already be using this PEP's format -for its intended purpose. +At the time of writing, the ``__pyproject__`` variable only appears five times +`on GitHub `__ and four of +those belong to a user who appears to already be using this PEP's exact format. + +For example, `this script `__ +uses ``matplotlib`` and ``pandas`` to plot a timeseries. It is a good example +of a script that you would see in the wild: self-contained and short. + +This user's tooling invokes scripts by creating a project at runtime using the +embedded metadata and then uses an entry point that references the main function. + +This PEP allows this user's tooling to remove that extra step of indirection. + +This PEP's author has discovered after writing a draft that this pattern is +used in the wild by others (sent private messages). + + +Security Implications +===================== + +If a script containing embedded metadata is run using a tool that automatically +installs dependencies, this could cause arbitrary code to be downloaded and +installed in the user's environment. + +The risk here is part of the functionality of the tool being used to run the +script, and as such should already be addressed by the tool itself. The only +additional risk introduced by this PEP is if an untrusted script with +embedded metadata is run, when a potentially malicious dependency might be +installed. This risk is addressed by the normal good practice of reviewing code +before running it. How to Teach This ================= -The format is intended to bridge the gap between users just writing scripts and -users writing packages. Knowledge of how to write metadata for one use case will -be directly transferable to the other. 
+Since the format chosen is the same as the official metadata standard, we can +direct users to the same page that describes +`project metadata `_. On that page we can add a section +that describes how to embed the metadata in scripts or we can have a separate +page for that which links to the page describing project metadata. -Everything in the parent PEP regarding teaching is applicable to this as well. +Additionally, we may want to list some tools that support this PEP's format. Recommendations =============== -For situations in which users do not define the required name and version -fields, the following defaults should be preferred: +For situations in which users do not define the name and version fields, the +following defaults should be preferred by tools: * ``name``: ``script-`` e.g. ``script-3a5c6b...`` to provide interoperability with other tools that use the name to derive file system storage paths for things like virtual environments * ``version``: ``0.0.0`` +Tools that support managing different versions of Python should attempt to use +the highest available version of Python that is compatible with the script's +``requires-python`` metadata, if defined. -Benefits -======== - -Ecosystem cohesion ------------------- -One of the central themes we discovered from the recent -`packaging survey `__ is that users have -begun getting frustrated with the lack of unification regarding both tooling -and specs. +Rejected Ideas +============== -Adding yet another way to define metadata would further fragment the community. +Why not limit to specific metadata fields? +------------------------------------------ +By limiting the metadata to a specific set of fields, for example just +``dependencies``, we would prevent legitimate use cases both known and unknown. +The following are examples of known use cases: -Extensibility -------------- - -The parent PEP allows for extensions to the custom metadata block format. 
This -PEP benefits from all future extensions to project metadata and immediately -supports the 2 hypothetical examples that were mentioned: - -* versioning - It is quite common to version scripts for persistence even when +* ``requires-python``: For tools that support managing Python installations, + this allows users to target specific versions of Python for new syntax + or standard library functionality. +* ``version``: It is quite common to version scripts for persistence even when using a VCS like Git. When not using a VCS it is even more common to version, for example the author has been in multiple time sensitive debugging sessions with customers where due to the airgapped nature of the environment, the only way to transfer the script was via email or copying and pasting it into a chat window. In these cases, versioning is invaluable to ensure that the - customer is using the latest version of the script. -* Python runtime requirements - This is useful for tools that are able to run - specific versions of Python. - - -Broader applicability ---------------------- - -This PEP does not prohibit any class of tooling from using embedded metadata -if they so desire and envisions support to be ubiquitous across the ecosystem. -In addition to script runners, one would expect: + customer is using the latest (or a specific) version of the script. +* ``description``: For scripts that don't need an argument parser, or if the + author has never used one, tools can treat this as help text which can be + shown to the user. + +By not allowing the ``[tool]`` section, we would prevent especially script +runners from allowing users to configure behavior. For example, a script runner +may support configuration instructing to run scripts in containers for +situations in which there is no cross-platform support for a dependency or if +the setup is too complex for the average user like when requiring Nvidia +drivers. 
Situations like this would allow users to proceed with what they want +to do whereas otherwise they may stop at that point altogether. + + +Why not use a comment block resembling requirements.txt? +-------------------------------------------------------- + +This PEP considers there to be different types of users for whom Python code +would live as single-file scripts: + +* Non-programmers who are just using Python as a scripting language to achieve a + specific task. These users are unlikely to be familiar with concepts of + operating systems like shebang lines or the ``PATH`` environment variable. + Some examples: + + * The average person, perhaps at a workplace, who wants to write a script to + automate something for efficiency or to reduce tedium + * Someone doing data science or machine learning in industry or academia who + wants to write a script to analyze some data or for research purposes. + These users are special in that, although they have limited programming + knowledge, they learn from sources like StackOverflow and blogs that have a + programming bent and are increasingly likely to be part of communities that + share knowledge and code. Therefore, a non-trivial number of these users + will have some familiarity with things like Git(Hub), Jupyter, HuggingFace, + etc. +* Non-programmers who manage operating systems e.g. a sysadmin. These users are + able to set up ``PATH``, for example, but are unlikely to be familiar with + Python concepts like virtual environments. These users often operate in + isolation and have limited need to gain exposure to tools intended for sharing + like Git. +* Programmers who manage operating systems/infrastructure e.g. SREs. These + users are not very likely to be familiar with Python concepts like virtual + environments, but are likely to be familiar with Git and most often use it + to version control everything required to manage infrastructure like Python + scripts and Kubernetes config. 
+* Programmers who write scripts primarily for themselves. These users over time + accumulate a great number of scripts in various languages that they use to + automate their workflow and often store them in a single directory, that is + potentially version controlled for persistence. Non-Windows users may set + up each Python script with a shebang line pointing to the desired Python + executable or script runner. + +This PEP argues that reusing our TOML-based metadata format is the best for +each category of user and that the block comment is only approachable for +those who have familiarity with ``requirements.txt``, which represents a +small subset of users. + +* For the average person automating a task or the data scientist, they are + already starting with zero context and are unlikely to be familiar with + TOML nor ``requirements.txt``. These users will very likely rely on + snippets found online via a search engine or utilize AI in the form + of a chat bot or direct code completion software. Searching for Python + metadata formatting will lead them to the TOML-based format that already + exists which they can reuse. The author tested GitHub Copilot with this + PEP and it already supports auto-completion of fields and dependencies. + + Additionally, these users are most susceptible to formatting quirks and + syntax errors. TOML is a well-defined format with existing online + validators that features assignment that is compatible with Python + expressions and has no strict indenting rules. The block comment format + on the other hand could be easily malformed by forgetting the colon, for + example, and debugging why it's not working with a search engine would be + a difficult task for such a user. +* For the sysadmin types, they are equally unlikely as the previously described + users to be familiar with TOML or ``requirements.txt``. For either format + they would have to read documentation. 
They would likely be more comfortable + with TOML since they are used to structured data formats and there would be + less perceived magic in their systems. + + Additionally, for maintenance of their systems ``__pyproject__`` would be + much easier to search for from a shell than a block comment with potentially + numerous extensions over time. +* For the SRE types, they are likely to be familiar with TOML already from other + projects that they might have to work with like configuring the + `GitLab Runner `__ + or `Cloud Native Buildpacks `__. + + These users are responsible for the security of their systems and most likely + have security scanners set up to automatically open PRs to update versions + of dependencies. Such automated tools like Dependabot would have a much easier + time using existing TOML libraries than writing their own custom parser for a + block comment format. +* For the programmer types, they are more likely to be familiar with TOML + than they have ever seen a ``requirements.txt`` file, unless they are a + Python programmer who has had previous experience with writing applications. + In the case of experience with the requirements format, it necessarily means + that they are at least somewhat familiar with the ecosystem and therefore + it is safe to assume they know what TOML is. + + Another benefit of this PEP to these users is that their IDEs like Visual + Studio Code would be able to provide TOML syntax highlighting much more + easily than each writing custom logic for this feature. + + +Why not consider scripts as projects without wheels? +---------------------------------------------------- + +There is `an ongoing discussion `_ about how to +use ``pyproject.toml`` for projects that are not intended to be built as wheels. +Although the outcome of that will likely be that the project name and version +become optional in certain circumstances, this PEP considers the discussion only +tangentially related. 
+ +The use case described in that thread is primarily talking about projects that +represent applications like a Django app or a Flask app. These projects are +often installed on each server in a virtual environment with strict dependency +pinning e.g. a lock file with some sort of hash checking. Such projects would +never be distributed as a wheel (except for maybe a transient editable one +that is created when doing ``pip install -e .``). + +In contrast, scripts are managed loosely by their runner and would almost +always have relaxed dependency constraints. Additionally, to reduce +friction associated with managing small projects there may be a future +in which there is a standard prescribed way to ship projects that are in +the form of a single file. The author of the Rust RFC for embedding metadata +`mentioned to us `__ that they are +actively looking into that based on user feedback. + +Why not just set up a Python project with a ``pyproject.toml``? +--------------------------------------------------------------- + +Again, a key issue here is that the target audience for this proposal is people +writing scripts which aren't intended for distribution. Sometimes scripts will +be "shared", but this is far more informal than "distribution" - it typically +involves sending a script via an email with some written instructions on how to +run it, or passing someone a link to a GitHub gist. + +Expecting such users to learn the complexities of Python packaging is a +significant step up in complexity, and would almost certainly give the +impression that "Python is too hard for scripts". + +In addition, if the expectation here is that the ``pyproject.toml`` will somehow +be designed for running scripts in place, that's a new feature of the standard +that doesn't currently exist. At a minimum, this isn't a reasonable suggestion +until the `current discussion on Discourse `_ about +using ``pyproject.toml`` for projects that won't be distributed as wheels is +resolved. 
And even then, it doesn't address the "sending someone a script in a +gist or email" use case. + +Why not use a requirements file for dependencies? +------------------------------------------------- + +Putting your requirements in a requirements file doesn't require a PEP. You can +do that right now, and in fact it's quite likely that many ad hoc solutions do +this. However, without a standard, there's no way of knowing how to locate a +script's dependency data. And furthermore, the requirements file format is +pip-specific, so tools relying on it are depending on a pip implementation +detail. + +So in order to make a standard, two things would be required: + +1. A standardised replacement for the requirements file format. +2. A standard for how to locate the requirements file for a given script. + +The first item is a significant undertaking. It has been discussed on a number +of occasions, but so far no-one has attempted to actually do it. The most likely +approach would be for standards to be developed for individual use cases +currently addressed with requirements files. One option here would be for this +PEP to simply define a new file format which is simply a text file containing +:pep:`508` requirements, one per line. That would just leave the question of how +to locate that file. + +The "obvious" solution here would be to do something like name the file the same +as the script, but with a ``.reqs`` extension (or something similar). However, +this still requires *two* files, where currently only a single file is needed, +and as such, does not match the "better batch file" model (shell scripts and +batch files are typically self-contained). It requires the developer to remember +to keep the two files together, and this may not always be possible. For +example, system administration policies may require that *all* files in a +certain directory are executable (the Linux filesystem standards require this of +``/usr/bin``, for example). 
And some methods of sharing a script (for example, +publishing it on a text file sharing service like Github's gist, or a corporate +intranet) may not allow for deriving the location of an associated requirements +file from the script's location (tools like ``pipx`` support running a script +directly from a URL, so "download and unpack a zip of the script and its +dependencies" may not be an appropriate requirement). + +Essentially, though, the issue here is that there is an explicitly stated +requirement that the format supports storing dependency data *in the script file +itself*. Solutions that don't do that are simply ignoring that requirement. + +Why not use (possibly restricted) Python syntax? +------------------------------------------------ + +This would typically involve storing metadata like dependencies as +a (runtime) list variable with a conventional name, such as:: + + __requires__ = [ + "requests", + "click", + ] -* IDEs to provide TOML syntax highlighting -* Dependency version checkers and security scanners to verify and offer updates - to scripts -* Package managers like Hatch and Poetry to gain the ability to run scripts +The most significant problem with this proposal is that it requires all +consumers of the dependency data to implement a Python parser. Even if the +syntax is restricted, the *rest* of the script will use the full Python syntax, +and trying to define a syntax which can be successfully parsed in isolation from +the surrounding code is likely to be extremely difficult and error-prone. + +Furthermore, Python's syntax changes in every release. If extracting dependency +data needs a Python parser, the parser will need to know which version of Python +the script is written for, and the overhead for a generic tool of having a +parser that can handle *multiple* versions of Python is unsustainable. + +Even if the above issues could be addressed, the format would give the +impression that the data could be altered at runtime. 
However, this is not the +case in general, and code that tries to do so will encounter unexpected and +confusing behaviour. + +And finally, there is no evidence that having metadata available at +runtime is of any practical use for scripts. Should such a use be found, +it is simple enough to get the data by parsing the value as TOML. + +It is worth noting, though, that the ``pip-run`` utility does implement (an +extended form of) this approach. `Further discussion `_ of +the ``pip-run`` design is available on the project's issue tracker. + +Should scripts be able to specify a package index? +-------------------------------------------------- + +Dependency metadata is about *what* package the code depends on, and not *where* +that package comes from. There is no difference here between metadata for +scripts, and metadata for distribution packages (as defined in +``pyproject.toml``). In both cases, dependencies are given in "abstract" form, +without specifying how they are obtained. + +Some tools that use the dependency information may, of course, need to locate +concrete dependency artifacts - for example if they expect to create an +environment containing those dependencies. But the way they choose to do that +will be closely linked to the tool's UI in general, and this PEP does not try to +dictate the UI for tools. + +There is more discussion of this point, and in particular of the UI choices made +by the ``pip-run`` tool, in `the previously mentioned pip-run issue `_. + +What about local dependencies? +------------------------------ + +These can be handled without needing special metadata and tooling, simply by +adding the location of the dependencies to ``sys.path``. This PEP simply isn't +needed for this case. 
If, on the other hand, the "local dependencies" are actual +distributions which are published locally, they can be specified as usual with a +:pep:`508` requirement, and the local package index specified when running a +tool by using the tool's UI for that. Open Issues =========== @@ -178,6 +525,14 @@ Open Issues None at this point. +References +========== + +.. _pyproject metadata: https://packaging.python.org/en/latest/specifications/declaring-project-metadata/ +.. _pip-run issue: https://github.com/jaraco/pip-run/issues/44 +.. _pyproject without wheels: https://discuss.python.org/t/projects-that-arent-meant-to-generate-a-wheel-and-pyproject-toml/29684 + + Copyright ========= From 498262460a1d379fdefeb76fb08c620794394215 Mon Sep 17 00:00:00 2001 From: Ofek Lev Date: Sun, 6 Aug 2023 13:15:26 -0400 Subject: [PATCH 3/7] address feedback --- .github/CODEOWNERS | 4 +- pep-0723.rst | 130 +++++++++++++++++++++++++-------------------- 2 files changed, 74 insertions(+), 60 deletions(-) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index ba2f5f5a298..2f8c175b53d 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -509,7 +509,7 @@ pep-0627.rst @encukou pep-0628.txt @ncoghlan pep-0629.rst @dstufft pep-0630.rst @encukou -pep-0631.rst @ofek @pganssle +pep-0631.rst @pganssle pep-0632.rst @zooba pep-0633.rst @brettcannon pep-0634.rst @brandtbucher @gvanrossum @@ -601,7 +601,7 @@ pep-0719.rst @Yhg1s pep-0720.rst @FFY00 pep-0721.rst @encukou pep-0722.rst @pfmoore -pep-0723.rst @ofek +pep-0723.rst # ... # pep-0754.txt # ... 
diff --git a/pep-0723.rst b/pep-0723.rst index 72af2ed8a3a..995a65478ab 100644 --- a/pep-0723.rst +++ b/pep-0723.rst @@ -1,5 +1,5 @@ PEP: 723 -Title: Embedding pyproject metadata in single-file scripts +Title: Embedding pyproject.toml in single-file scripts Author: Ofek Lev PEP-Delegate: Brett Cannon Discussions-To: https://discuss.python.org/t/30979 @@ -9,13 +9,14 @@ Topic: Packaging Content-Type: text/x-rst Created: 04-Aug-2023 Post-History: `04-Aug-2023 `__ +Replaces: 722 Abstract ======== This PEP specifies a format for defining metadata about single-file Python -scripts, such as their runtime dependencies. +scripts that is required for proper runtime execution. Motivation @@ -52,7 +53,8 @@ and not in an external file. We choose to follow the latest developments of other modern packaging ecosystems (namely `Rust `__ and `Go `__) by embedding the existing -standard :pep:`621` metadata that is used to describe projects. +`metadata standard `_ that is used to describe +projects. The format is intended to bridge the gap between different types of users of Python. Knowledge of how to write project metadata will be directly @@ -66,6 +68,16 @@ begun getting frustrated with the lack of unification regarding both tooling and specs. Adding yet another way to define metadata, even for a currently unsatisfied use case, would further fragment the community. +A use case that this PEP wishes to support that other formats may preclude is +a script that desires to transition to a directory-type project. A user may +be rapidly prototyping locally or in a remote REPL environment and then decide +to transition to a more formal project if their idea works out. This +intermediate script stage would be very useful to have fully reproducible bug +reports. By using the same metadata format, the user can simply copy and paste +the metadata into a ``pyproject.toml`` file and continue working without having +to learn a new format. 
More likely, even, is that tooling will eventually support +this transformation with a single command. + Specification ============= @@ -79,9 +91,13 @@ The TOML document MUST NOT contain multi-line double-quoted strings, as that would conflict with the Python string containing the document. Single-quoted multi-line TOML strings may be used instead. -This document MUST be formatted according to :pep:`621` and the -`project metadata `_ standard. The document MAY include -the ``[tool]`` table and sub-tables as described in :pep:`518`. +Tools reading embedded metadata MAY respect the standard Python encoding +declaration. If they choose not to do so, they MUST process the file as UTF-8. + +This document MAY include the ``[project]`` and ``[tool]`` tables but MUST NOT +define the ``[build-system]`` table. The ``[build-system]`` table MAY be +allowed in a future PEP that standardizes how backends are to build distributions +from single file scripts. The ``[project]`` table differs in the following ways: @@ -90,10 +106,10 @@ The ``[project]`` table differs in the following ways: * These fields do not need to be listed in the ``dynamic`` array Non-script running tools MAY choose to read from their expected ``[tool]`` -sub-table if the script is the only target of the tool's functionality. In all -other cases tools MUST NOT alter behavior based on the embedded metadata. For -example, if a linter is invoked with the path to a directory, it MUST behave -the same as if zero files had embedded metadata. +sub-table. If a single-file script is not the sole input to a tool then +behavior SHOULD NOT be altered based on the embedded metadata. For example, +if a linter is invoked with the path to a directory, it SHOULD behave the same +as if zero files had embedded metadata. Example ------- @@ -140,7 +156,13 @@ for introspection since Rust documentation is generated from comments. Another is that users rarely edit dependencies manually, but rather use their Cargo package manager. 
-We argue that our choice provides easier edits for both humans and tools. +We argue that our choice, in comparison to the Rust format, is easier to read +and provides easier edits for humans by virtue of the contents starting at the +beginning of lines so would precisely match the contents of a ``pyproject.toml`` +file. It is also is easier for tools to parse and modify this continuous block +of text which was +`one of the concerns `__ +raised in the Rust pre-RFC. Reference Implementation @@ -150,10 +172,6 @@ This is the canonical regular expression that may be used to parse the metadata: (?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$ -For languages that do not support easily accessing the match group, one may -parse the entire match as TOML (that is valid syntax) and then access -the ``__pyproject__`` key from the resulting mapping. - The following is an example of how to read the metadata on Python 3.11 or higher. @@ -181,6 +199,10 @@ using the ``tomlkit`` library. start, end = match.span(1) return script[:start] + tomlkit.dumps(config) + script[end:] +Note that this example used a library that preserves TOML formatting. This is +not a requirement for editing by any means but rather is a "nice to have" +especially since there are unlikely to be embedded comments. + Backwards Compatibility ======================= @@ -221,25 +243,21 @@ How to Teach This ================= Since the format chosen is the same as the official metadata standard, we can -direct users to the same page that describes -`project metadata `_. On that page we can add a section -that describes how to embed the metadata in scripts or we can have a separate -page for that which links to the page describing project metadata. +have a page that describes how to embed the metadata in scripts and to learn +about metadata itself direct users to the living document that describes +`project metadata `_. 
+ +We will document that the name and version fields in the ``[project]`` table +may be elided for simplicity. Additionally, we will have guidance (perhaps +temporary) explaining that single-file scripts cannot be built into a wheel +and therefore you would never see the associated ``[build-system]`` metadata. -Additionally, we may want to list some tools that support this PEP's format. +Finally, we may want to list some tools that support this PEP's format. Recommendations =============== -For situations in which users do not define the name and version fields, the -following defaults should be preferred by tools: - -* ``name``: ``script-`` e.g. ``script-3a5c6b...`` to - provide interoperability with other tools that use the name to derive file - system storage paths for things like virtual environments -* ``version``: ``0.0.0`` - Tools that support managing different versions of Python should attempt to use the highest available version of Python that is compatible with the script's ``requires-python`` metadata, if defined. @@ -329,6 +347,8 @@ small subset of users. metadata formatting will lead them to the TOML-based format that already exists which they can reuse. The author tested GitHub Copilot with this PEP and it already supports auto-completion of fields and dependencies. + In contrast, a new format may take years of being trained on the Internet + for models to learn. Additionally, these users are most susceptible to formatting quirks and syntax errors. TOML is a well-defined format with existing online @@ -367,6 +387,21 @@ small subset of users. Studio Code would be able to provide TOML syntax highlighting much more easily than each writing custom logic for this feature. +Additionally, the block comment format goes against the recommendation of +:pep:`008`:: + + Each line of a block comment starts with a ``#`` and a single space (unless + it is indented text inside the comment). [...] 
Paragraphs inside a block + comment are separated by a line containing a single ``#``. + +Linters and IDE auto-formatters that respect this long-time recommendation +would fail by default. The following uses the example from :pep:`722`:: + + $ flake8 . + .\script.py:3:1: E266 too many leading '#' for block comment + .\script.py:4:1: E266 too many leading '#' for block comment + .\script.py:5:1: E266 too many leading '#' for block comment + Why not consider scripts as projects without wheels? ---------------------------------------------------- @@ -458,10 +493,13 @@ itself*. Solutions that don't do that are simply ignoring that requirement. Why not use (possibly restricted) Python syntax? ------------------------------------------------ -This would typically involve storing metadata like dependencies as -a (runtime) list variable with a conventional name, such as:: +This would typically involve storing metadata as multiple special variables, +such as the following. - __requires__ = [ +.. code:: python + + __requires_python__ = ">=3.11" + __dependencies__ = [ "requests", "click", ] @@ -477,38 +515,14 @@ data needs a Python parser, the parser will need to know which version of Python the script is written for, and the overhead for a generic tool of having a parser that can handle *multiple* versions of Python is unsustainable. -Even if the above issues could be addressed, the format would give the -impression that the data could be altered at runtime. However, this is not the -case in general, and code that tries to do so will encounter unexpected and -confusing behaviour. - -And finally, there is no evidence that having metadata available at -runtime is of any practical use for scripts. Should such a use be found, -it is simple enough to get the data by parsing the value as TOML. +With this approach there is the potential to clutter scripts with many variables +as new extensions get added. 
Additionally, intuiting which metadata fields +correspond to which variable names would cause confusion for users. It is worth noting, though, that the ``pip-run`` utility does implement (an extended form of) this approach. `Further discussion `_ of the ``pip-run`` design is available on the project's issue tracker. -Should scripts be able to specify a package index? --------------------------------------------------- - -Dependency metadata is about *what* package the code depends on, and not *where* -that package comes from. There is no difference here between metadata for -scripts, and metadata for distribution packages (as defined in -``pyproject.toml``). In both cases, dependencies are given in "abstract" form, -without specifying how they are obtained. - -Some tools that use the dependency information may, of course, need to locate -concrete dependency artifacts - for example if they expect to create an -environment containing those dependencies. But the way they choose to do that -will be closely linked to the tool's UI in general, and this PEP does not try to -dictate the UI for tools. - -There is more discussion of this point, and in particular of the UI choices made -by the ``pip-run`` tool, in `the previously mentioned pip-run issue `_. - What about local dependencies? ------------------------------ From f683a74984011b14bd27a7c813117f597963df0f Mon Sep 17 00:00:00 2001 From: Ofek Lev Date: Sun, 6 Aug 2023 13:34:55 -0400 Subject: [PATCH 4/7] address final feedback --- .github/CODEOWNERS | 2 +- pep-0723.rst | 11 ++++++++--- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 2f8c175b53d..f4e0840981e 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -601,7 +601,7 @@ pep-0719.rst @Yhg1s pep-0720.rst @FFY00 pep-0721.rst @encukou pep-0722.rst @pfmoore -pep-0723.rst +pep-0723.rst @AA-Turner # ... # pep-0754.txt # ... 
diff --git a/pep-0723.rst b/pep-0723.rst index 995a65478ab..016c3664892 100644 --- a/pep-0723.rst +++ b/pep-0723.rst @@ -1,6 +1,7 @@ PEP: 723 Title: Embedding pyproject.toml in single-file scripts Author: Ofek Lev +Sponsor: Adam Turner PEP-Delegate: Brett Cannon Discussions-To: https://discuss.python.org/t/30979 Status: Draft @@ -15,8 +16,9 @@ Replaces: 722 Abstract ======== -This PEP specifies a format for defining metadata about single-file Python -scripts that is required for proper runtime execution. +This PEP specifies a metadata format that can be embedded in single-file Python +scripts to assist launchers, IDEs and other external tools which may need to +interact with such scripts. Motivation @@ -168,10 +170,13 @@ raised in the Rust pre-RFC. Reference Implementation ======================== -This is the canonical regular expression that may be used to parse the metadata:: +This regular expression may be used to parse the metadata:: (?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$ +In circumstances where there is a discrepancy between the regular expression +and the text specification, the text specification takes precedence. + The following is an example of how to read the metadata on Python 3.11 or higher. From 3aee5c80bcf3f86b25d1b10a973ff564ca3c6c73 Mon Sep 17 00:00:00 2001 From: Adam Turner <9087854+aa-turner@users.noreply.github.com> Date: Sun, 6 Aug 2023 18:55:23 +0100 Subject: [PATCH 5/7] Wrap to 79 --- pep-0723.rst | 180 +++++++++++++++++++++++++++------------------------ 1 file changed, 97 insertions(+), 83 deletions(-) diff --git a/pep-0723.rst b/pep-0723.rst index 016c3664892..c0f00376da1 100644 --- a/pep-0723.rst +++ b/pep-0723.rst @@ -53,11 +53,13 @@ This PEP defines a mechanism for embedding metadata *within the script itself*, and not in an external file. 
We choose to follow the latest developments of other modern packaging -ecosystems (namely `Rust `__ -and `Go `__) by embedding the existing +ecosystems (namely `Rust`__ and `Go`__) by embedding the existing `metadata standard `_ that is used to describe projects. +__ https://github.com/rust-lang/rfcs/blob/master/text/3424-cargo-script.md +__ https://github.com/erning/gorun + The format is intended to bridge the gap between different types of users of Python. Knowledge of how to write project metadata will be directly transferable to all use cases, whether writing a script or maintaining a @@ -77,8 +79,8 @@ to transition to a more formal project if their idea works out. This intermediate script stage would be very useful to have fully reproducible bug reports. By using the same metadata format, the user can simply copy and paste the metadata into a ``pyproject.toml`` file and continue working without having -to learn a new format. More likely, even, is that tooling will eventually support -this transformation with a single command. +to learn a new format. More likely, even, is that tooling will eventually +support this transformation with a single command. Specification @@ -98,8 +100,8 @@ declaration. If they choose not to do so, they MUST process the file as UTF-8. This document MAY include the ``[project]`` and ``[tool]`` tables but MUST NOT define the ``[build-system]`` table. The ``[build-system]`` table MAY be -allowed in a future PEP that standardizes how backends are to build distributions -from single file scripts. +allowed in a future PEP that standardizes how backends are to build +distributions from single file scripts. The ``[project]`` table differs in the following ways: @@ -153,26 +155,28 @@ version of ``pyproject.toml``, which is called ``Cargo.toml``: println!("Did our date match? 
{}", re.is_match("2014-01-01")); } -One important thing to note is that the metadata is embedded in a comment mostly -for introspection since Rust documentation is generated from comments. Another -is that users rarely edit dependencies manually, but rather use their Cargo -package manager. +One important thing to note is that the metadata is embedded in a comment +mostly for introspection since Rust documentation is generated from comments. +Another is that users rarely edit dependencies manually, but rather use their +Cargo package manager. We argue that our choice, in comparison to the Rust format, is easier to read and provides easier edits for humans by virtue of the contents starting at the -beginning of lines so would precisely match the contents of a ``pyproject.toml`` -file. It is also is easier for tools to parse and modify this continuous block -of text which was -`one of the concerns `__ -raised in the Rust pre-RFC. +beginning of lines so would precisely match the contents of a +``pyproject.toml`` file. It is also easier for tools to parse and modify +this continuous block of text which was `one of the concerns`__ raised in the +Rust pre-RFC. +__ https://github.com/epage/cargo-script-mvs/blob/main/0000-cargo-script.md#embedded-manifest-format Reference Implementation ======================== -This regular expression may be used to parse the metadata:: +This regular expression may be used to parse the metadata: + +.. code:: text - (?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$ + (?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$ In circumstances where there is a discrepancy between the regular expression and the text specification, the text specification takes precedence. @@ -213,15 +217,20 @@ Backwards Compatibility ======================= At the time of writing, the ``__pyproject__`` variable only appears five times -`on GitHub `__ and four of -those belong to a user who appears to already be using this PEP's exact format.
+`on GitHub`__ and four of those belong to a user who appears to already be +using this PEP's exact format. + +__ https://github.com/search?q=__pyproject__&type=code -For example, `this script `__ -uses ``matplotlib`` and ``pandas`` to plot a timeseries. It is a good example -of a script that you would see in the wild: self-contained and short. +For example, `this script`__ uses ``matplotlib`` and ``pandas`` to plot a +timeseries. It is a good example of a script that you would see in the wild: +self-contained and short. + +__ https://github.com/cjolowicz/scripts/blob/31c61e7dad8d17e0070b080abee68f4f505da211/python/plot_timeseries.py This user's tooling invokes scripts by creating a project at runtime using the -embedded metadata and then uses an entry point that references the main function. +embedded metadata and then uses an entry point that references the main +function. This PEP allows this user's tooling to remove that extra step of indirection. @@ -307,8 +316,8 @@ Why not use a comment block resembling requirements.txt? This PEP considers there to be different types of users for whom Python code would live as single-file scripts: -* Non-programmers who are just using Python as a scripting language to achieve a - specific task. These users are unlikely to be familiar with concepts of +* Non-programmers who are just using Python as a scripting language to achieve + a specific task. These users are unlikely to be familiar with concepts of operating systems like shebang lines or the ``PATH`` environment variable. Some examples: @@ -325,8 +334,8 @@ would live as single-file scripts: * Non-programmers who manage operating systems e.g. a sysadmin. These users are able to set up ``PATH``, for example, but are unlikely to be familiar with Python concepts like virtual environments. These users often operate in - isolation and have limited need to gain exposure to tools intended for sharing - like Git. 
+ isolation and have limited need to gain exposure to tools intended for + sharing like Git. * Programmers who manage operating systems/infrastructure e.g. SREs. These users are not very likely to be familiar with Python concepts like virtual environments, but are likely to be familiar with Git and most often use it @@ -371,16 +380,18 @@ small subset of users. Additionally, for maintenance of their systems ``__pyproject__`` would be much easier to search for from a shell than a block comment with potentially numerous extensions over time. -* For the SRE types, they are likely to be familiar with TOML already from other - projects that they might have to work with like configuring the - `GitLab Runner `__ - or `Cloud Native Buildpacks `__. +* For the SRE types, they are likely to be familiar with TOML already from + other projects that they might have to work with like configuring the + `GitLab Runner`__ or `Cloud Native Buildpacks`__. + + __ https://docs.gitlab.com/runner/configuration/advanced-configuration.html + __ https://buildpacks.io/docs/reference/config/ These users are responsible for the security of their systems and most likely have security scanners set up to automatically open PRs to update versions - of dependencies. Such automated tools like Dependabot would have a much easier - time using existing TOML libraries than writing their own custom parser for a - block comment format. + of dependencies. Such automated tools like Dependabot would have a much + easier time using existing TOML libraries than writing their own custom + parser for a block comment format. * For the programmer types, they are more likely to be familiar with TOML than they have ever seen a ``requirements.txt`` file, unless they are a Python programmer who has had previous experience with writing applications. @@ -393,14 +404,16 @@ small subset of users. easily than each writing custom logic for this feature. 
Additionally, the block comment format goes against the recommendation of -:pep:`008`:: +:pep:`8`: Each line of a block comment starts with a ``#`` and a single space (unless it is indented text inside the comment). [...] Paragraphs inside a block comment are separated by a line containing a single ``#``. Linters and IDE auto-formatters that respect this long-time recommendation -would fail by default. The following uses the example from :pep:`722`:: +would fail by default. The following uses the example from :pep:`722`: + +.. code:: bash $ flake8 . .\script.py:3:1: E266 too many leading '#' for block comment @@ -412,10 +425,10 @@ Why not consider scripts as projects without wheels? ---------------------------------------------------- There is `an ongoing discussion `_ about how to -use ``pyproject.toml`` for projects that are not intended to be built as wheels. -Although the outcome of that will likely be that the project name and version -become optional in certain circumstances, this PEP considers the discussion only -tangentially related. +use ``pyproject.toml`` for projects that are not intended to be built as +wheels. Although the outcome of that will likely be that the project name and +version become optional in certain circumstances, this PEP considers the +discussion only tangentially related. The use case described in that thread is primarily talking about projects that represent applications like a Django app or a Flask app. These projects are @@ -445,20 +458,20 @@ Expecting such users to learn the complexities of Python packaging is a significant step up in complexity, and would almost certainly give the impression that "Python is too hard for scripts". -In addition, if the expectation here is that the ``pyproject.toml`` will somehow -be designed for running scripts in place, that's a new feature of the standard -that doesn't currently exist. 
At a minimum, this isn't a reasonable suggestion -until the `current discussion on Discourse `_ about -using ``pyproject.toml`` for projects that won't be distributed as wheels is -resolved. And even then, it doesn't address the "sending someone a script in a -gist or email" use case. +In addition, if the expectation here is that the ``pyproject.toml`` will +somehow be designed for running scripts in place, that's a new feature of the +standard that doesn't currently exist. At a minimum, this isn't a reasonable +suggestion until the `current discussion on Discourse +`_ about using ``pyproject.toml`` for projects that +won't be distributed as wheels is resolved. And even then, it doesn't address +the "sending someone a script in a gist or email" use case. Why not use a requirements file for dependencies? ------------------------------------------------- -Putting your requirements in a requirements file, doesn't require a PEP. You can -do that right now, and in fact it's quite likely that many adhoc solutions do -this. However, without a standard, there's no way of knowing how to locate a +Putting your requirements in a requirements file doesn't require a PEP. You +can do that right now, and in fact it's quite likely that many ad hoc solutions +do this. However, without a standard, there's no way of knowing how to locate a script's dependency data. And furthermore, the requirements file format is pip-specific, so tools relying on it are depending on a pip implementation detail. @@ -469,31 +482,32 @@ So in order to make a standard, two things would be required: 2. A standard for how to locate the requiements file for a given script. The first item is a significant undertaking. It has been discussed on a number -of occasions, but so far no-one has attempted to actually do it.
The most +likely approach would be for standards to be developed for individual use cases currently addressed with requirements files. One option here would be for this PEP to simply define a new file format which is simply a text file containing -:pep:`508` requirements, one per line. That would just leave the question of how -to locate that file. - -The "obvious" solution here would be to do something like name the file the same -as the script, but with a ``.reqs`` extension (or something similar). However, -this still requires *two* files, where currently only a single file is needed, -and as such, does not match the "better batch file" model (shell scripts and -batch files are typically self-contained). It requires the developer to remember -to keep the two files together, and this may not always be possible. For -example, system administration policies may require that *all* files in a -certain directory are executable (the Linux filesystem standards require this of -``/usr/bin``, for example). And some methods of sharing a script (for example, -publishing it on a text file sharing service like Github's gist, or a corporate -intranet) may not allow for deriving the location of an associated requirements -file from the script's location (tools like ``pipx`` support running a script -directly from a URL, so "download and unpack a zip of the script and its -dependencies" may not be an appropriate requirement). +:pep:`508` requirements, one per line. That would just leave the question of +how to locate that file. + +The "obvious" solution here would be to do something like name the file the +same as the script, but with a ``.reqs`` extension (or something similar). +However, this still requires *two* files, where currently only a single file is +needed, and as such, does not match the "better batch file" model (shell +scripts and batch files are typically self-contained). 
It requires the +developer to remember to keep the two files together, and this may not always +be possible. For example, system administration policies may require that *all* +files in a certain directory are executable (the Linux filesystem standards +require this of ``/usr/bin``, for example). And some methods of sharing a +script (for example, publishing it on a text file sharing service like Github's +gist, or a corporate intranet) may not allow for deriving the location of an +associated requirements file from the script's location (tools like ``pipx`` +support running a script directly from a URL, so "download and unpack a zip of +the script and its dependencies" may not be an appropriate requirement). Essentially, though, the issue here is that there is an explicitly stated -requirement that the format supports storing dependency data *in the script file -itself*. Solutions that don't do that are simply ignoring that requirement. +requirement that the format supports storing dependency data *in the script +file itself*. Solutions that don't do that are simply ignoring that +requirement. Why not use (possibly restricted) Python syntax? ------------------------------------------------ @@ -512,17 +526,17 @@ such as the following. The most significant problem with this proposal is that it requires all consumers of the dependency data to implement a Python parser. Even if the syntax is restricted, the *rest* of the script will use the full Python syntax, -and trying to define a syntax which can be successfully parsed in isolation from -the surrounding code is likely to be extremely difficult and error-prone. +and trying to define a syntax which can be successfully parsed in isolation +from the surrounding code is likely to be extremely difficult and error-prone. Furthermore, Python's syntax changes in every release.
If extracting dependency -data needs a Python parser, the parser will need to know which version of Python -the script is written for, and the overhead for a generic tool of having a -parser that can handle *multiple* versions of Python is unsustainable. +data needs a Python parser, the parser will need to know which version of +Python the script is written for, and the overhead for a generic tool of having +a parser that can handle *multiple* versions of Python is unsustainable. -With this approach there is the potential to clutter scripts with many variables -as new extensions get added. Additionally, intuiting which metadata fields -correspond to which variable names would cause confusion for users. +With this approach there is the potential to clutter scripts with many +variables as new extensions get added. Additionally, intuiting which metadata +fields correspond to which variable names would cause confusion for users. It is worth noting, though, that the ``pip-run`` utility does implement (an extended form of) this approach. `Further discussion `_ of @@ -533,10 +547,10 @@ What about local dependencies? These can be handled without needing special metadata and tooling, simply by adding the location of the dependencies to ``sys.path``. This PEP simply isn't -needed for this case. If, on the other hand, the "local dependencies" are actual -distributions which are published locally, they can be specified as usual with a -:pep:`508` requirement, and the local package index specified when running a -tool by using the tool's UI for that. +needed for this case. If, on the other hand, the "local dependencies" are +actual distributions which are published locally, they can be specified as +usual with a :pep:`508` requirement, and the local package index specified when +running a tool by using the tool's UI for that. 
Open Issues =========== From 930dd9b8bc22ff5a800ef787607b8a3c20527355 Mon Sep 17 00:00:00 2001 From: Ofek Lev Date: Sun, 6 Aug 2023 14:10:38 -0400 Subject: [PATCH 6/7] final feedback before merge --- pep-0723.rst | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/pep-0723.rst b/pep-0723.rst index c0f00376da1..74e2f5152de 100644 --- a/pep-0723.rst +++ b/pep-0723.rst @@ -3,13 +3,14 @@ Title: Embedding pyproject.toml in single-file scripts Author: Ofek Lev Sponsor: Adam Turner PEP-Delegate: Brett Cannon -Discussions-To: https://discuss.python.org/t/30979 +Discussions-To: https://discuss.python.org/t/31151 Status: Draft Type: Standards Track Topic: Packaging Content-Type: text/x-rst Created: 04-Aug-2023 Post-History: `04-Aug-2023 `__ + `06-Aug-2023 `__ Replaces: 722 @@ -426,9 +427,7 @@ Why not consider scripts as projects without wheels? There is `an ongoing discussion `_ about how to use ``pyproject.toml`` for projects that are not intended to be built as -wheels. Although the outcome of that will likely be that the project name and -version become optional in certain circumstances, this PEP considers the -discussion only tangentially related. +wheels. This PEP considers the discussion only tangentially related. The use case described in that thread is primarily talking about projects that represent applications like a Django app or a Flask app. 
These projects are From 0aba3a5909c76257e40c82217d767e888d92c4e3 Mon Sep 17 00:00:00 2001 From: Adam Turner <9087854+AA-Turner@users.noreply.github.com> Date: Sun, 6 Aug 2023 19:11:18 +0100 Subject: [PATCH 7/7] comma --- pep-0723.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pep-0723.rst b/pep-0723.rst index 74e2f5152de..cef338337fd 100644 --- a/pep-0723.rst +++ b/pep-0723.rst @@ -9,8 +9,8 @@ Type: Standards Track Topic: Packaging Content-Type: text/x-rst Created: 04-Aug-2023 -Post-History: `04-Aug-2023 `__ - `06-Aug-2023 `__ +Post-History: `04-Aug-2023 `__, + `06-Aug-2023 `__, Replaces: 722