Implement PEP 723 support (script dependencies in a TOML block) #96

Merged 7 commits on Jan 20, 2024
6 changes: 6 additions & 0 deletions README.rst
@@ -209,6 +209,12 @@ variable or ``# Requirements:`` section to the script:
# Requirements:
# requests

# or (PEP 723)

# /// script
# dependencies = ['requests']
# ///

import requests

req = requests.get('https://pypi.org/project/pip-run')
1 change: 1 addition & 0 deletions newsfragments/+96.feature.rst
@@ -0,0 +1 @@
Add support for script dependencies in a TOML block per PEP 723.
8 changes: 8 additions & 0 deletions pip_run/compat/py310.py
@@ -0,0 +1,8 @@
"""
Compatibility for Python 3.10 and earlier.
"""

try:
    import tomllib  # type: ignore
except ImportError:  # pragma: no cover
    import tomli as tomllib  # type: ignore
37 changes: 36 additions & 1 deletion pip_run/scripts.py
@@ -11,6 +11,9 @@
from jaraco.context import suppress
from jaraco.functools import compose

from .compat.py310 import tomllib


ValidRequirementString = compose(str, packaging.requirements.Requirement)


@@ -73,7 +76,39 @@ def search(cls, params):
         return cls.try_read(next(files, None)).params()

     def read(self):
-        return self.read_comments() or self.read_python()
+        return self.read_toml() or self.read_comments() or self.read_python()
+
+    def read_toml(self):
+        r"""
+        >>> DepsReader('# /// script\n# dependencies = ["foo", "bar"]\n# ///\n').read()
+        ['foo', 'bar']
+        >>> DepsReader('# /// pyproject\n# dependencies = ["foo", "bar"]\n# ///\n').read_toml()
+        []
+        >>> DepsReader('# /// pyproject\n#dependencies = ["foo", "bar"]\n# ///\n').read_toml()
+        []
+        >>> DepsReader('# /// script\n# dependencies = ["foo", "bar"]\n').read_toml()
+        []
+        >>> DepsReader('# /// script\n# ///\n\n# /// script\n# ///').read_toml()
+        Traceback (most recent call last):
+        ...
+        ValueError: Multiple script blocks found
+        """
+        TOML_BLOCK_REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)*)^# ///$'
+        name = 'script'
+        matches = list(
+            filter(lambda m: m.group('type') == name, re.finditer(TOML_BLOCK_REGEX, self.script))
+        )
+        if len(matches) > 1:
+            raise ValueError(f'Multiple {name} blocks found')
+        elif len(matches) == 1:
+            content = ''.join(
+                line[2:] if line.startswith('# ') else line[1:]
+                for line in matches[0].group('content').splitlines(keepends=True)
+            )
+            deps = tomllib.loads(content).get("dependencies", [])
+        else:
+            deps = []
Comment on lines +98 to +110
Comment by @bswck (Contributor), Jan 19, 2024:

The performance gain from filter() is already lost due to lambda. I'd suggest using a clean generator expression to enhance readability.

In the suggested change, because only one match will be needed, we take only one. Next up, we check if there's another one left and raise immediately, not knowing about the rest of the matches as we don't need them.
It's also safer to use .get("dependencies") or [] instead of .get("dependencies", []) in case "dependencies" is defined as null; and if there's an empty list already, we save time on not creating another default empty list.

line[2:] if line.startswith('# ') else line[1:] can also either be replaced with line[2 if line.startswith('# ') else 1:] or line[1+line.startswith('# '):].

Suggested change
-        matches = list(
-            filter(lambda m: m.group('type') == name, re.finditer(TOML_BLOCK_REGEX, self.script))
-        )
-        if len(matches) > 1:
-            raise ValueError(f'Multiple {name} blocks found')
-        elif len(matches) == 1:
-            content = ''.join(
-                line[2:] if line.startswith('# ') else line[1:]
-                for line in matches[0].group('content').splitlines(keepends=True)
-            )
-            deps = tomllib.loads(content).get("dependencies", [])
-        else:
-            deps = []
+        deps = []
+        iter_matches = (
+            m for m in re.finditer(TOML_BLOCK_REGEX, self.script)
+            if m.group('type') == name
+        )
+        match = next(iter_matches, None)
+        if match:
+            if any(iter_matches):  # Check if there are any more matches left.
+                raise ValueError(f'Multiple {name} blocks found')
+            content = ''.join(
+                line[1 + line.startswith('# '):]
+                for line in match.group('content').splitlines(keepends=True)
+            )
+            deps[:] = tomllib.loads(content).get("dependencies") or ()
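
The three prefix-stripping spellings discussed in the comment above can be checked against each other with a small sketch. The helper names `strip_v1`, `strip_v2` and `strip_v3` are made up for illustration and are not part of pip-run; the third form relies on `bool` being a subclass of `int` in Python.

```python
def strip_v1(line):
    # Form used in the PR (taken from the PEP's reference code, per the thread).
    return line[2:] if line.startswith('# ') else line[1:]


def strip_v2(line):
    # Same choice, moved inside the slice.
    return line[2 if line.startswith('# ') else 1:]


def strip_v3(line):
    # str.startswith returns a bool; True counts as 1 in arithmetic,
    # so the slice start is 2 for '# ...' lines and 1 otherwise.
    return line[1 + line.startswith('# '):]


# Sample comment lines: with a space after '#', bare '#', and no space.
samples = ['# dependencies = ["foo"]\n', '#\n', '#dependencies\n']
for sample in samples:
    assert strip_v1(sample) == strip_v2(sample) == strip_v3(sample)
```

All three return the same text for these inputs, so the choice between them is about readability rather than behavior.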

Comment (Contributor Author):

This is taken straight from the PEP, where it's the canonical implementation of parsing. I'd rather not mess with it, simply because I don't see the benefit in risking the possibility that we introduce bugs by doing so.

Comment by @ofek, Jan 19, 2024:

As the author, I am a strong -1 on this change because it is not as easy to translate into other languages as before.

edit: I didn't realize which repository this was, feel free to do as you wish if it works but I agree with Paul that there is an inherent risk of introducing a bug.

Comment (Contributor):

> As the author, I am a strong -1 on this change because it is not as easy to translate into other languages as before.

Thank you for your feedback. I was suggesting the change for pip-run specifically. Is there any other reason you don't like the suggestions?

Comment:

Nope, sorry about that. As I mentioned in the edit to my comment, feel free to do as you wish!

Comment by @bswck (Contributor), Jan 19, 2024:

I tested my changes against tests in this PR and observed no regression. In my opinion, a more detailed analysis of the change can disprove the concern regarding the risk of introducing a bug.

Comment by @bswck (Contributor), Jan 19, 2024:

What is more, the suggested change handles, at no extra cost, an edge case that the original version does not: "dependencies" being defined as null (hence .get("dependencies") or [] instead of .get("dependencies", [])). The original version already assumes the input data might be invalid, judging by its use of tomllib.loads(content).get("dependencies") rather than tomllib.loads(content)["dependencies"].

Comment:

Hi, sorry for joining an already crowded conversation, but @bswck asked for my thoughts, so here they are:

  1. I do think it's worth checking that dependencies is a list of strings. Currently it looks like the user will get a confusing traceback if it's null or a number. Most importantly, if it's a single string, then each character will be treated as a dependency.
  2. The rest of the suggestion doesn't seem to have any significant benefit and isn't worth this much discussion. If @pfmoore doesn't want to change this or talk about it then I support that, I'm sure he has plenty of things to do.
  3. Checking the length of a list is vastly more readable than using next/any. I don't think it makes sense to worry about creating a big list of matches. If it does, I'd suggest using islice instead like here.
  4. deps[:] = instead of deps = is jarring and weird to look at. And even if deps[:] = () is faster than deps = [] (is it?), even if the difference was significant (it obviously isn't), setting deps = () would be fine anyway.
  5. line[1 + line.startswith('# '):] is fun and clever but not super readable. line[2 if line.startswith('# ') else 1:] is maybe an improvement but it's subjective and insignificant.
  6. I do prefer a list comprehension over filter/lambda, but 🤷
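
Points 1 and 3 of the list above can be sketched roughly as follows. This is a minimal illustration, not pip-run's code: `find_script_block` and `validate_dependencies` are hypothetical names, the regex is copied from the PR, and the value passed to `validate_dependencies` is assumed to be whatever `tomllib.loads(content).get("dependencies")` returns.

```python
import re
from itertools import islice

# Regex copied from the PR under review.
TOML_BLOCK_REGEX = (
    r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)*)^# ///$'
)


def find_script_block(script, name='script'):
    """Return the uncommented text of the single `name` block, or None."""
    candidates = (
        m for m in re.finditer(TOML_BLOCK_REGEX, script)
        if m.group('type') == name
    )
    # Two matches are enough to tell "none", "one" and "several" apart,
    # so islice stops the search early while keeping the len() checks.
    matches = list(islice(candidates, 2))
    if len(matches) > 1:
        raise ValueError(f'Multiple {name} blocks found')
    if not matches:
        return None
    return ''.join(
        line[2:] if line.startswith('# ') else line[1:]
        for line in matches[0].group('content').splitlines(keepends=True)
    )


def validate_dependencies(value):
    """Hypothetical check: accept a list of strings, or None for a missing key."""
    if value is None:
        return []
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise TypeError(f'dependencies must be a list of strings, got {value!r}')
    return value
```

With a check like this, a bare string such as `"requests"` is rejected with a `TypeError` up front, which is the single-string case point 1 warns about.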

Comment (Contributor Author):

I'll see if I can find the time to review the various suggestions. TBH, this was very much a "drive by" PR, motivated by the fact that I use pip-run and I wanted to ensure that it supported the new standard. Copying the reference implementation was the fastest way to achieve that. I didn't want to spend a lot of time over details that could be hashed out later.

In many ways I'd argue that a robust PEP 723 parser should go in a library somewhere, although I can sympathise if people don't want a new dependency just for that one thing. I'm a little bit sad if we end up with lots of different implementations of the parsing code, all with their own quirks and trade-offs.

My recommendation would be that unless I get the time to do an update, the code can go in as is, and it can be fixed in a follow-up PR. I'm absolutely not going to get upset about someone updating the code in a follow-up.

+        return Dependencies.load(deps)

     def read_comments(self):
         r"""
1 change: 1 addition & 0 deletions setup.cfg
@@ -26,6 +26,7 @@ install_requires =
jaraco.context
jaraco.text
platformdirs
tomli; python_version < "3.11"
importlib_resources; python_version < "3.9"
jaraco.functools >= 3.7
jaraco.env
13 changes: 13 additions & 0 deletions tests/test_scripts.py
@@ -111,6 +111,19 @@ def test_comment_style(self):
        reqs = scripts.DepsReader(script).read()
        assert reqs == ['foo==3.1']

    def test_toml_style(self):
        script = textwrap.dedent(
            """
            #! shebang

            # /// script
            # dependencies = ["foo == 3.1"]
            # ///
            """
        )
        reqs = scripts.DepsReader(script).read()
        assert reqs == ['foo==3.1']

    def test_search_long_parameter(self):
        """
        A parameter that is too long to be a filename should not fail.