-
-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for --scie-busybox
.
#2468
Conversation
This allows you to turn your PEX into a native BusyBox scie by specifying which entry points to expose as BusyBox commands.
Reviewers, a quickie before I hit the road. I'll be back on 7/25 to add tests. For example though: # Build a BusyBox PEX scie with 3 entry-points: 1 console script and 2 module eps from the stdlib:
:; python -mpex cowsay --scie lazy --scie-busybox cowsay,venv:venv,http-server:http.server -obusy-box
# Since the exe name does not match any ep, you get BusyBox help:
:; ./busy-box
Error: Could not determine which command to run.
Please select from the following boot commands:
cowsay
http-server
venv
You can select a boot command by setting the SCIE_BOOT environment variable or else by passing it as the 1st argument.
# Now use the BusyBox:
:; ./busy-box http-server -h
usage: server.py [-h] [--cgi] [--bind ADDRESS] [--directory DIRECTORY] [port]
positional arguments:
port specify alternate port (default: 8000)
options:
-h, --help show this help message and exit
--cgi run as CGI server
--bind ADDRESS, -b ADDRESS
specify alternate bind address (default: all interfaces)
--directory DIRECTORY, -d DIRECTORY
specify alternate directory (default: current directory)
# Now install a link farm for more convenient use (if the install directory is on your PATH anyhow):
:; SCIE=install ./busy-box -s /tmp/bin
:; ls -l /tmp/bin/
total 0
lrwxrwxrwx 1 jsirois jsirois 39 Jul 17 15:15 cowsay -> /home/jsirois/dev/pex-tool/pex/busy-box
lrwxrwxrwx 1 jsirois jsirois 39 Jul 17 15:15 http-server -> /home/jsirois/dev/pex-tool/pex/busy-box
lrwxrwxrwx 1 jsirois jsirois 39 Jul 17 15:15 venv -> /home/jsirois/dev/pex-tool/pex/busy-box
:; /tmp/bin/venv -h
usage: venv [-h] [--system-site-packages] [--symlinks | --copies] [--clear] [--upgrade] [--without-pip] [--prompt PROMPT] [--upgrade-deps] ENV_DIR [ENV_DIR ...]
Creates virtual Python environments in one or more target directories.
positional arguments:
ENV_DIR A directory to create the environment in.
options:
-h, --help show this help message and exit
--system-site-packages
Give the virtual environment access to the system site-packages dir.
--symlinks Try to use symlinks rather than copies, when symlinks are not the default for the platform.
--copies Try to use copies rather than symlinks, even when symlinks are the default for the platform.
--clear Delete the contents of the environment directory if it already exists, before environment creation.
--upgrade Upgrade the environment directory to use this version of Python, assuming Python has been upgraded in-place.
--without-pip Skips installing or upgrading pip in the virtual environment (pip is bootstrapped by default)
--prompt PROMPT Provides an alternative prompt prefix for this environment.
--upgrade-deps Upgrade core dependencies: pip setuptools to the latest version in PyPI
Once an environment has been created, you may wish to activate it, e.g. by sourcing an activate script in its bin directory.
# And always end on a Moo!:
:; mv busy-box cowsay
./cowsay -t Moo!
____
| Moo! |
====
\
\
^__^
(oo)\_______
(__)\ )\/\
||----w |
|| ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll wait for the tests, but this LGTM!
@@ -30,6 +33,88 @@ class Value(Enum.Value): | |||
EAGER = Value("eager") | |||
|
|||
|
|||
@attr.s(frozen=True) | |||
class ConsoleScriptsManifest(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sureshjoshi I have not updated the --scie-busybox
help string yet to reflect this idea - I wanted to run it by you 1st:
You can now, for example, say:
python -m pex ansible --scie eager --scie-busybox *:* -o ansible
That nets you a binary that executes as ansible
, but allows SCIE=install ./ansible ...
to get the BusyBox of all ansible scripts. Now ansible is clean in terms of transitive deps not having unwanted scripts. But for cases where that is so, you can restrict to all scripts in a dist:
python -m pex ansible --scie eager --scie-busybox ansible:*,ansible-core:* -o ansible
Or use some combination of the above subtracting off individual scripts or all scripts from a distribution by prefixing with -
.
Good, bad or ugly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, switched from -
to !
so the CLI use is not a total PITA, but otherwise filled out help docs and tests. I think this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey sorry, just a heads up that I just made it down to the states for a vacation, so I wouldn't be able to review anything until I'm back in the cold north
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd defer to @sureshjoshi but this seems like reasonable syntax to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the <project-name>:*
and *:*
are excessive, (the *
's are not needed at all for parsing disambiguation) but my thought was just <project-name>:
and :
respectively would confuse people (they did enough in Pants-land). I think *
as the wildcard is more expected and neither <project-name>:*
nor *:*
are likely to match in the current directory; so they are not shell-onerous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding flags, this un-overloads syntax. : used to mean many things, now it means just index a function in a module, which is pre-existing Python-ecosystem syntax.
Good approach, I like it!
With that change, I'm happy where this landed. Either of *@*
or :all:@:all:
are fine by me, definitely agree that :all:
doesn't look particularly nice... tolerable (to me), but not nice.
I'm unsure where we landed with validation of the @
-specified module scripts, but definitely 100% happy to go with *
if there is validation. And like 80% happy to go with *
without the validation. I.e. happy with *
either way 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually lean towards dropping the wildcard with the switch from :
-> (=
, :
, @
). There is no need for a wildcard (just as there was none for :
), but I think @ansible-core
, !@ansible-core
, and (less so) @
all make sense on their own side-stepping the shell potentially turning *
into invalid entry points. As I said above though, I'm writing up a documentation page to go with this PR; so we'll see how the explanation reads there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hoisted on my own petard. Can't use @
since argparse is configured with @fromfile
support via a user contribution in 2021. I'll noodle a bit more...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignoring the pre-compile bindings, this looks like a use case for the adhoc=module.deeper:func style, am I correct?
Aha, yes - for the "seed" command, not the default command. There were 2 things going on in that second example and I didn't read closely enough. So, you'd run:
pex uvicorn -c uvicorn --inject-args "apigateway.main:app --port 7999" --scie eager --scie-busybox seed=seeder:main,uvicorn -o uvicorn
I've just finished the --inject-args / --inject-python-args / --inject-env support, which was a bug to be missing under the "behaves just like the PEX" mantra.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. @sureshjoshi, @huonw and @benjyw probably worth another look. I've switched to @
syntax, made the needed change in the Pex argparse setup to still support @fromfiles, fixed --inject*
propagation to the matching BusyBox command, and added a new doc page rendered here: https://jsirois.github.io/pex/scie.html.
Alright reviewers, I got to this earlier than expected and I think its good to go for a final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice feature. NB. I haven't reviewed the functionality in detail.
@@ -30,6 +33,88 @@ class Value(Enum.Value): | |||
EAGER = Value("eager") | |||
|
|||
|
|||
@attr.s(frozen=True) | |||
class ConsoleScriptsManifest(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@huonw ... Any thoughts on this?
If you're asking about *
and escaping, I'll think out loud:
- There is a risk of people writing a script that doesn't escape properly and it seeming to work until they add a file with an unlucky name.
- If one added
foo:bar
, the immediate consequence would be expanding*:*
to--scie-busybox foo:bar
. - It looks like this would be silently accepted and result in a pex with an unexpected configuration (at least, I don't see validation that the project name
foo
is understood, but maybe I missed it). - Having validation that all input project names are known distributions would reduce that consequence; is that possible?
- Alternatively, use some other shell-safe signifier. I don't have great ideas, only
...
, e.g.ansible:...
or...:...
but that's pretty ugly.*
matches my expectations more.
Summary: I can't think of a better *
I like, but maybe we could validate the foo:bar
values are actually known entrypoints (at least the project name) to reduce the consequence of problems here?
type=str, | ||
default=[], | ||
action="append", | ||
help=( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor feedback: these docs look good, but they're a bit of a wall of text that'll usually be read in the context of a terminal without any formatting aids, and thus I'd expect most people (even well-intentioned diligent ones) to fail to synthesise the information out of it.
Is it possible to make it easier for users by breaking it up into paragraphs and/or lists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not - Argparse doesn't support formatting I know of. I've both avoided spending time on any bespoke argument system to remedy this or turning towards documentation, which needs major work going on 10 years now. One or both are coming, but I'm prioritizing robust users so far in favor of righting the Pex ship going on 8 years. I do finally feel close though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Thanks for the background. 👍
if qualname_separator: | ||
for attr in qualname.split('.'): | ||
entry_point = getattr(entry_point, attr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The re-factoring of {EntryPoint,CallableEntryPoint}
-> NamedEntryPoint <<containing>> {ModuleEntryPoint,CallableEntryPoint}
highlighted that this generated console script was wrong for modules (i.e.: when this if branch is not taken) since a module is not a callable! It turns out that in practice, console script entrypoints are always functions (callables) and not modules and so no bug reports. That said, I switched Pex's own pex
entrypoint from a function to the pex
module in its own pyproject.toml
and verified that the pex wheel still builds fine and then a Pex PEX venv built from that wheel had a pex
console script that then failed. Thus, the fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it, only minor feedback
@sureshjoshi I'd like to proceed with this by tomorrow (August 1st) evening. Let me know if you need more time to review; otherwise, I'll assume you're fine with the |
@jsirois I'm unavailable for any in-depth review until probably Tuesday/Wednesday - but I think we've converged on something and the docs make sense to me |
This allows you to turn your PEX into a native BusyBox scie by
specifying which entry points to expose as BusyBox commands.