Add -t/--target flag for uv pip install #1517

Closed
imdoroshenko opened this issue Feb 16, 2024 · 32 comments · Fixed by #3257
Labels
compatibility (Compatibility with another interface, e.g. `pip`), enhancement (New feature or request)

Comments

@imdoroshenko

imdoroshenko commented Feb 16, 2024

This option is used a lot for vendoring and for custom build systems.

From pip install --help:

  -t, --target <dir>          Install packages into <dir>. By
                              default this will not replace
                              existing files/folders in <dir>. Use
                              --upgrade to replace existing
                              packages in <dir> with new versions.
zanieb added the enhancement (New feature or request) and compatibility (Compatibility with another interface, e.g. `pip`) labels Feb 16, 2024
@pfmoore
Contributor

pfmoore commented Feb 17, 2024

I'd recommend not following pip's design here, as --target has some undesirable behaviours (it doesn't properly support uninstalls and upgrades, for example, even though the docs suggest it's OK). The use case is important to support, but I'd design a solution from scratch rather than simply taking pip's approach.

In particular, pip's choice of where to put binaries, scripts and include files is essentially arbitrary and may not be the best approach.

@rgilton

rgilton commented Feb 17, 2024

I'd like to add that a key application of this is to enable the building of standalone zipapps that contain all of their dependencies. I have some workflows that use pip at the moment for this, and they would be massively accelerated if uv supported -t.

@jcollingj

Added a thumbs up. I am also currently using -t from pip and would love to use uv for that use case instead!

@imdoroshenko
Author

I'd recommend not following pip's design here, as --target has some undesirable behaviours (it doesn't properly support uninstalls and upgrades, for example, even though the docs suggest it's OK). The use case is important to support, but I'd design a solution from scratch rather than simply taking pip's approach.

In particular, pip's choice of where to put binaries, scripts and include files is essentially arbitrary and may not be the best approach.

As mentioned by other people, -t is mainly used in a very controlled way that does not expect uninstalls or upgrades. Examples include vendoring tools (example), build systems for zipapps and, in my case, an in-house build system.

Creating another solution from scratch is OK, but then UV should call itself "Opinionated replacement for pip" rather than drop-in replacement.

@pfmoore
Contributor

pfmoore commented Feb 19, 2024

As mentioned by other people, -t is mainly used in a very controlled way that does not expect uninstalls or upgrades.

That doesn't match with our experience on pip, where we have had a number of requests for better support for upgrades, etc. Another issue is pypa/pip#10110, which seems to revolve around people incrementally adding to a directory populated using -t.

But if uv is willing to only support the case where the directory specified in -t is empty, then yes, this is fine.

but then UV should call itself "Opinionated replacement for pip" rather than drop-in replacement.

I don't think uv is a "drop in replacement" for pip, nor should it try to be one. It should aim to be the best installer it can, and not be constrained by pip's legacy behaviours.

@shayanhoshyari

shayanhoshyari commented Feb 25, 2024

I also heavily use this feature to make standalone envs. In my use case they are used to bootstrap workers in a cloud service. I too only need the case where the directory specified in -t is empty.

Other past alternatives were conda pack and embedding in a container, which were both even slower than pip!

Now desperately looking at this issue to see when I can reduce our 40 min env packing time when everything changes (there are various envs for various workers) to minutes or perhaps seconds :)

For context, we already started using uv for our dev env pip sync (which does not need -t support), and it went from 6 min to 16 sec. pip compile time went from 2-3 min to a few secs! ❤️

@matthew-chambers-pushly

We use pip's -t | --target to vendor packages for Lambda@Edge deployments. It would be nice to have this supported as an option, or to have some way of programmatically copying the packages from the cache without needing to script it ourselves.

@Tankanow

Tankanow commented Mar 5, 2024

I second @matthew-chambers-pushly, @shayanhoshyari, et al. We do something similar to package for AWS Lambda.

I think the general use case is:

I would like to use uv outside a virtual environment to install dependencies into a directory consumable by common Python runtimes. This can be vendored format or otherwise, though vendored format may be useful for backwards compatibility.

Essentially what I want is something equivalent to this script:

function uv.target(){
  local target=${1?} ; shift
  local args=("$@")
  uv venv
  # shellcheck source=/dev/null
  source ".venv/bin/activate"
  uv pip install "${args[@]}"
  rm -rf "${target}" || true
  mkdir -p "${target}/"
  cp -r .venv/bin "${target}/"
  cp -r .venv/lib/python3.*/site-packages/* "${target}/"
  deactivate || true
  rm -rf .venv
}

@mpderbec

mpderbec commented Apr 1, 2024

But if uv is willing to only support the case where the directory specified in -t is empty, then yes, this is fine.

FWIW, this matches our use case: The directory is empty. (Largely because of all the limitations described above regarding upgrade/uninstall.)

@henryiii
Contributor

henryiii commented Apr 2, 2024

This is the recipe for zip apps:

$ mkdir tmp
$ pip install --no-compile --target=tmp . dep_1 dep_2
$ cd tmp
$ python -m zipapp --compress --python="/usr/bin/env python3" --main=myapp.__main__:main --output=../myapp.pyz .

Would be quite happy in this case to have the extra dirs (like bin) simply ignored, since they won't show up in the zipapp.
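The same recipe can be driven from Python with the standard-library `zipapp` module. This is only a sketch: the staging directory stands in for whatever `--target` produced, and the `myapp` package and its `main` function are placeholders taken from the recipe above.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_pyz(staging: Path, output: Path, entry_point: str) -> Path:
    """Pack a staging directory (as filled by --target) into a zipapp."""
    zipapp.create_archive(
        staging,
        target=output,
        interpreter="/usr/bin/env python3",
        main=entry_point,  # "package.module:function"
        compressed=True,
    )
    return output


def demo_zipapp() -> str:
    """Build a tiny placeholder app and run the resulting archive."""
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp) / "staging"
        pkg = staging / "myapp"  # placeholder package, stands in for installed code
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text("")
        (pkg / "__main__.py").write_text(
            "def main():\n    print('hello from myapp')\n"
        )
        pyz = build_pyz(staging, Path(tmp) / "myapp.pyz", "myapp.__main__:main")
        result = subprocess.run(
            [sys.executable, str(pyz)], capture_output=True, text=True, check=True
        )
        return result.stdout.strip()
```

Only the importable packages in the staging directory matter to the archive's entry point; a stray `bin` directory would be packed as well, but nothing in the archive would use it.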

@acatalucci-synth

Without --prefix or --target it's really challenging to build multi-stage images; my use case is copying packages to the final image. Is there any workaround while waiting for either of these mechanisms to be supported?

@pfmoore
Contributor

pfmoore commented Apr 18, 2024

is there any workaround

It depends how careful you want to be. If you don't mind being somewhat imprecise, the script from the comment above looks reasonable. It's Unix-specific and would need some tweaking for Windows because the lib directory on Windows is structured differently. But it's likely good enough for most use cases.

If you want to handle wheels that have both purelib and platlib sections, on a target system where purelib != platlib, you might need some extra complexity. If you want to install other sections of the wheel, like headers, scripts or data, or you want to handle script wrappers (usually placed outside of the lib directory), then you'll need to do some more work. But these are the sorts of complicated questions that are part of the reason that (a) this isn't a simple feature to add, and (b) I'm advising the uv developers not to blindly follow what pip does (because I'm not sure pip gets the answers to these questions right...)
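For the "lib directory is structured differently" point, one way to avoid hard-coding `.venv/lib/python3.*/site-packages` is to ask `sysconfig` where purelib and platlib live for a given root. A minimal sketch, assuming the virtual environment was created by the same interpreter that runs this code; the helper names are made up for illustration.

```python
import os
import sysconfig
from pathlib import Path


def venv_scheme_name() -> str:
    """Pick an install scheme that describes a virtual environment layout."""
    if "venv" in sysconfig.get_scheme_names():  # present on Python 3.11+
        return "venv"
    return "nt" if os.name == "nt" else "posix_prefix"


def venv_lib_dirs(venv_root: Path) -> dict[str, Path]:
    """Return the purelib and platlib directories for a venv at venv_root."""
    scheme = venv_scheme_name()
    # Substitute the venv root for the interpreter's own base directories.
    overrides = {"base": str(venv_root), "platbase": str(venv_root)}
    return {
        key: Path(sysconfig.get_path(key, scheme, vars=overrides))
        for key in ("purelib", "platlib")
    }


if __name__ == "__main__":
    for name, path in venv_lib_dirs(Path(".venv").resolve()).items():
        print(f"{name}: {path}")
```

On most systems the two paths are identical, but copying from both (and de-duplicating) covers the purelib != platlib case mentioned above.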

@acatalucci-synth

acatalucci-synth commented Apr 22, 2024

Thanks for the extensive and fast answer! I'm basically trying to understand best practice for Docker multi-stage builds and uv. The script above won't quite do it, as I cannot activate virtualenvs straight from Dockerfiles. So far the only thing I managed to get working is:

FROM public.ecr.aws/docker/library/python:3.10-slim-bullseye AS build_artifacts
RUN pip install uv
RUN mkdir /workspace
RUN chown 1000 /workspace
WORKDIR /workspace
USER 1000
RUN uv venv
ENV PATH="/workspace/.venv/bin:$PATH"
COPY setup.py /workspace/
RUN uv pip install --no-cache-dir --quiet .

FROM public.ecr.aws/docker/library/python:3.10-slim-bullseye
RUN mkdir /workspace
RUN chown 1000 /workspace
WORKDIR /workspace
USER 1000
RUN mkdir /workspace/.venv
COPY --from=build_artifacts /workspace/.venv /workspace/.venv
ENV PATH="/workspace/.venv/bin:$PATH"
RUN python app.py

Is this the recommended way of doing it? Are there any caveats I might stumble onto? I've tried multiple avenues, but the lack of support for both --user and --target closed off all the alternatives I could think of.

@pfmoore
Contributor

pfmoore commented Apr 22, 2024

I can't speak for uv, but my feeling is that there's no "recommended" way. If you want --target support, you can

  1. Use pip, and live with the fact that it's slower.
  2. Wait for uv to implement --target, which I'm sure won't be that far off given the pace of development here 🙂
  3. Work out some sort of solution that does what you want for now, and accept that it's clumsy and might break in edge cases.

I'm not able to comment on your solution, as I don't really know what "i cannot activate virtualenvs straight from dockerfiles" means. ENV PATH="/workspace/.venv/bin:$PATH" is pretty much all you need to activate a virtual environment anyway, and you could even just do RUN /workspace/.venv/bin/python app.py. Activation is only a command line convenience, after all...
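The "activation is only a convenience" point can be checked directly: invoking the environment's interpreter by path is enough for it to see the environment. A small sketch using the standard-library `venv` module; the function names are illustrative only.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def venv_python(venv_root: Path) -> Path:
    """Path of the interpreter inside a virtual environment."""
    if os.name == "nt":
        return venv_root / "Scripts" / "python.exe"
    return venv_root / "bin" / "python"


def prefix_seen_by(venv_root: Path) -> str:
    """Run the venv's interpreter directly (no activation) and report sys.prefix."""
    result = subprocess.run(
        [str(venv_python(venv_root)), "-c", "import sys; print(sys.prefix)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def demo_venv_prefix() -> bool:
    """True if an unactivated venv interpreter reports the venv as its prefix."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / ".venv"
        # Symlinks mirror the default behaviour of `python -m venv` on POSIX.
        venv.create(root, with_pip=False, symlinks=(os.name != "nt"))
        return os.path.realpath(prefix_seen_by(root)) == os.path.realpath(root)
```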

@acatalucci-synth

acatalucci-synth commented Apr 22, 2024

i meant that i cannot source the activate script and that's why i do that ENV-setting. From your answer i understand that this should do it for now (which apparently isn't going to be long! 👍 )

@Wurstnase

You do not need to source the script. You can do export VIRTUAL_ENV=.venv to have something similar.

@konstin
Member

konstin commented Apr 22, 2024

I second RUN /workspace/.venv/bin/python app.py, this also ensures that all subprocess calls have the correct python environment. @acatalucci-synth As for general docker+python best practices, your dockerfile looks great.

@acatalucci-synth

thanks for the suggestions and compliments for the Dockerfile! I'll be waiting for --target support 👍

@Pixel-Minions

Pixel-Minions commented Apr 22, 2024

Hi,

I am bringing a thumbs up for a target or equivalent workflow here. In my company we store each package independently, because we have a different package resolver. I am loving UV and I hope a workflow that offers this freedom could be on the roadmap.

@charliermarsh
Member

I'm tempted to treat --target as, roughly, the root of a virtual environment, rather than its site-packages directory. I think that would lead to logical and consistent locations for binaries, purelib vs. platlib, etc.

@pfmoore
Contributor

pfmoore commented Apr 23, 2024

That’s more like pip’s --root option. In my experience (and for my use cases) the key benefit of --target is that the directory specified is the one that is put directly on sys.path. The --root option (and its similar but subtly different partner --prefix) is far less commonly used than --target.

With your proposal I would need to:

  1. Install to a temporary target.
  2. Locate the site-packages (which is in an OS and interpreter dependent location).
  3. Move that site-packages to my final target.

This is basically what pip does internally to implement --target. (And is why anything beyond “install to an empty target directory” is incompletely supported).
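To make the sys.path point concrete: a directory populated the way --target populates it needs nothing more than an entry on `sys.path` to be importable. The sketch below fakes such a directory with a hand-written package; `vendored_dep` is a hypothetical name, not a real distribution.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_target(target: Path, module_name: str):
    """Import module_name from target by putting the directory on sys.path."""
    sys.path.insert(0, str(target))
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(str(target))


def demo_target_import() -> str:
    """Create a stand-in target directory and import a package from it."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp)
        pkg = target / "vendored_dep"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("GREETING = 'hi from target'\n")
        module = import_from_target(target, "vendored_dep")
        return module.GREETING
```

With a virtual-environment-style root instead, the caller would first have to work out the interpreter-specific site-packages path below that root before it could do the same thing.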

@Scrat94

Scrat94 commented Apr 23, 2024

@charliermarsh I would also like to share our use case and why the approach you're tempted by might not be suitable for us.

For AWS Lambda Functions you need to bundle dependencies locally or in CI. For example if you need the requests library as part of your AWS Lambda Python Function, you need to install requests inside your Lambda Function Folder besides having it already in your .venv on root directory. That leads to the requirement of calling pip install -r "path-to-aws-function/requirements.txt" --target "path-to-aws-function".

We mainly use the --target in our CI to install the respective dependencies for the Lambda Function and deploy it to AWS. So locally we completely switched to UV already, but in our CI we rely on a hybrid approach right now (uv used to install all dependencies in order to run e.g. pytest and pip for installing dependencies in the target AWS Lambda Functions folder for deployment).

So it's the last piece of the puzzle for us to get rid of pip completely. Big thanks for your effort; UV is already amazing and a productivity booster!

@charliermarsh
Member

@pfmoore - is there anywhere that I can read about the differences between --prefix and --root, and what use-cases they're intended to support?

@pfmoore
Contributor

pfmoore commented Apr 23, 2024

Nothing much that I know of beyond the bare pip help information here. They both come from the original distutils install schemes, as far as I know. There's some discussion of the --prefix scheme in the old distutils docs, here. The --root option was in setuptools, but as the command line interface is deprecated, I couldn't find any documentation that said anything more than what pip install --help has.

From my understanding, --prefix is important for redistributors creating a standalone installation in a "fake root". Things like RPM builds use this, I think. It's possible --root is used in that situation as well. To my knowledge, neither is commonly used by end users, whereas --target is very commonly used by end users, for vendoring dependencies, creating library directories for deployment in situations like webapps or serverless environments, building embedded Python environments, etc.

My view is that --target is by far the most important case to support, and the key aspect of --target is installing all importable parts of a package (purelib and platlib) into the specified directory directly. I don't think installing scripts or headers is commonly needed, and while wheels can contain arbitrary data files, I think most packages these days put their runtime data in the package structure and use importlib.resources to access it.

This is why I'd suggest that you start with a minimal approach that targets [1] the known use cases, and extend it based on user feedback. So, for me, I'd go with:

  1. Install the purelib and platlib sections of the wheel to the target directory.
  2. Require the target to be empty before the install (you can create the target directory if it doesn't exist, but you should support an existing, but empty, directory).
  3. Don't support upgrades, uninstalls, or any other supporting features. No need for list, show, sync, or any other subcommands to grow a --target option.
  4. If you want to, install scripts/entry point wrappers to a bin directory in the target. That's purely for pip compatibility. I don't think it's needed, and the binaries almost certainly won't actually work, but it may make people a little more comfortable as they expect that from pip. But I'd personally prefer it if you took a stand and said these aren't needed or supported, so you won't include them.
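A rough sketch of what steps 1 and 2 of this list could look like, using only `zipfile`. This is an illustration of the minimal proposal, not how pip or uv install wheels: it skips everything under `*.data/` except purelib and platlib, and it does not verify hashes, write an INSTALLER file, or update RECORD. All function names are invented for the example.

```python
import tempfile
import zipfile
from pathlib import Path, PurePosixPath


def ensure_empty_target(target: Path) -> None:
    """Create target if missing; refuse to continue if it already has content."""
    target.mkdir(parents=True, exist_ok=True)
    if any(target.iterdir()):
        raise FileExistsError(f"target directory is not empty: {target}")


def target_relative_path(member: str):
    """Map a wheel member to its path under the target, or None to skip it."""
    parts = PurePosixPath(member).parts
    if parts and parts[0].endswith(".data"):
        # Only purelib/platlib are importable; skip scripts, headers and data.
        if len(parts) > 2 and parts[1] in ("purelib", "platlib"):
            return PurePosixPath(*parts[2:])
        return None
    return PurePosixPath(*parts)


def install_wheel_to_target(wheel: Path, target: Path) -> list[str]:
    """Unpack the importable parts of a wheel straight into an empty target."""
    ensure_empty_target(target)
    installed = []
    with zipfile.ZipFile(wheel) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            relative = target_relative_path(info.filename)
            # Skip non-importable members and anything that could escape target.
            if relative is None or relative.is_absolute() or ".." in relative.parts:
                continue
            destination = target.joinpath(*relative.parts)
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.read(info))
            installed.append(relative.as_posix())
    return sorted(installed)


def demo_wheel_install() -> list[str]:
    """Build a fake wheel with lib, platlib and scripts parts, then install it."""
    with tempfile.TemporaryDirectory() as tmp:
        wheel = Path(tmp) / "demo-1.0-py3-none-any.whl"
        with zipfile.ZipFile(wheel, "w") as archive:
            archive.writestr("demo/__init__.py", "")
            archive.writestr("demo-1.0.data/platlib/demo/_speedups.py", "")
            archive.writestr("demo-1.0.data/scripts/demo-cli", "#!python\n")
            archive.writestr("demo-1.0.dist-info/METADATA", "Name: demo\n")
        return install_wheel_to_target(wheel, Path(tmp) / "target")
```

In the demo, the script under `demo-1.0.data/scripts/` is dropped, which corresponds to taking the "these aren't needed or supported" stance from step 4.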

To be honest, just implementing that shouldn't be too difficult. The hardest part would probably be integrating it into the code, as it's nothing like a "normal" install [2].

Footnotes

  1. Excuse the pun 🙂

  2. My long term idea for pip was to replace all of --root, --prefix, --target with a very generalised approach that let the user specify exactly what to do with each of the different subtrees in the wheel (see the "Spread" part of the spec here). Then maybe add back the existing options as convenience wrappers over that. But there are so many special cases in the existing pip codebase that it may not ever be practical. You might be able to do something like that in uv, because you don't have all the historical baggage that pip does, but I haven't looked at that part of the uv code, so I don't know.

@acatalucci-synth

thanks, that sounds great! on 4 i'd recommend to still copy binaries. One example is "ddtrace", it does include a binary that one is supposed to call to wrap the python app to add telemetry - with my approach above that seems to work.

@charliermarsh
Member

Related to binaries, it looks like pip's --target will put anything in the data dir at the top-level (e.g., the notebook package has ../../share/jupyter/lab/schemas/@jupyter-notebook/application-extension/menus.json and friends, and those just end up at <target>/share/...).

@pfmoore
Contributor

pfmoore commented Apr 23, 2024

Related to binaries, it looks like pip's --target will put anything in the data dir at the top-level

Correct. But I don't know if that is right or helpful, which is why I suggest not doing so until there's feedback from real use cases that confirms it works and is needed.

For example, I have no idea how Jupyter notebook could correctly find that path in a vendored situation - a brief search suggests that jupyterlab_widgets\__init__.py assumes the share directory will be located in sys.prefix, which simply won't be true in a vendored situation (or any situation I know of where --target would be appropriate).

@pfmoore
Contributor

pfmoore commented Apr 23, 2024

Thinking further about binaries, and in particular script wrappers, I don't even see how those can work in the general case. Assuming uv works like pip, when it generates a script wrapper it hard codes the absolute path of the Python interpreter used to do the install. But in all of the key use cases for --target, that interpreter probably won't even be present at runtime:

  • Building a deployment for a cloud provider - the cloud provider's Python interpreter will probably be in a different location.
  • Vendoring dependencies to build a zipapp or a standalone application - the app will be distributed to users who won't necessarily have the interpreter in the same place as the maintainer (in the case of a standalone app, the user may not even have a standalone Python interpreter).
  • Populating a library directory for an embedded interpreter - the embedded interpreter may well not be invokable as a standalone process.

So I'm unclear why there's any value in installing script wrappers, let alone worrying about where to save them.
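To illustrate why such wrappers do not travel well, here is a simplified rendering of a console-script wrapper. Real installers generate something a little more elaborate, and the interpreter path, module and function names below are made up; the point is only that the first line pins an absolute interpreter path chosen at install time.

```python
def render_entry_point_wrapper(interpreter: str, module: str, function: str) -> str:
    """Render a simplified console-script wrapper with a hard-coded interpreter."""
    return (
        f"#!{interpreter}\n"
        "import sys\n"
        f"from {module} import {function}\n"
        "if __name__ == '__main__':\n"
        f"    sys.exit({function}())\n"
    )


# Hypothetical build-machine interpreter; it will not exist on the deployment host.
wrapper = render_entry_point_wrapper("/home/ci/.venv/bin/python", "myapp.cli", "main")
print(wrapper.splitlines()[0])
```

If that interpreter path does not exist where the target directory ends up (a cloud runtime, a user's machine, an embedded interpreter), the script fails before any Python code runs.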

By the way, I'm aware that all of these points apply equally to pip - the main constraint with pip is that even if there's clearly no use for these things, if we remove them, there's bound to be someone whose workflow will break - https://xkcd.com/1172/. But as a new application, uv doesn't need to worry about that (unless you insist on being bug-for-bug compatible with pip, which is obviously your prerogative, but I don't think is warranted in this case...)

@charliermarsh
Member

(By the way: I'm learning a lot from your comments and really appreciate you taking the time to provide input here. Thank you.)

@Tankanow

This has been a lovely thread filled with informative helpful people. Kudos to @pfmoore, @charliermarsh, and the rest of the pip and uv teams actively helping out here. ❤️

@charliermarsh
Member

@pfmoore - it could be okay though for scripts in (e.g.) distribution-1.0.data/scripts that don't contain the special shebang and thus won't be rewritten, right?

(As opposed to (1) entrypoints and (2) scripts that hit the "Rewrite #!python" path here: https://packaging.python.org/en/latest/specifications/binary-distribution-format/#recommended-installer-features.)

@pfmoore
Contributor

pfmoore commented Apr 23, 2024

Yeah, those should work. In general, they are discouraged in favour of entry points, though (for various reasons that would be a digression here). The question, I guess, is whether you want to support this for what is in my experience a very small minority of packages, while also installing the other cases which in general won't work as the user expects.

I'm trying to avoid over-complicating things here. You could include a "do you want to install scripts" option. Or even a "do you want to just install (possibly) safe scripts, or do you want to install everything" option. You could also offer options to install other sections of the wheel that aren't typically useful in a --target situation. Or maybe you could simply create a subdirectory of the target directory, and dump everything except purelib and platlib in there.

I'm not wedded to any particular option - even just "do what pip does" is an option. All I'm trying to do here is to explain that apart from the purelib/platlib behaviour, there's nothing particularly helpful about pip's behaviour, and in fact it may be actively misleading (in the sense that it looks like it works, but it doesn't).

The only behaviours I strongly advise are:

  1. Add a --target option 😉
  2. Require that the target is empty (or doesn't exist) in advance.

And a third, I guess - keep it simple and be guided by actual use cases, rather than assuming "if pip has it, it must be important".

(I'm also enjoying these discussions, by the way - it's nice to be able to take a fresh look at these sorts of design decisions, and actually think about what works best, rather than having to always think in terms of "what can we manage to do without breaking things" 🙂)

charliermarsh added a commit that referenced this issue Apr 25, 2024
## Summary

The approach taken here is to model `--target` as an install scheme in
which all the directories are just subdirectories of the `--target`.
From there, everything else... just works? Like, upgrade, uninstalls,
editables, etc. all "just work".

Closes #1517.
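
As a rough illustration of "an install scheme in which all the directories are just subdirectories of the `--target`", the mapping might look like the sketch below. The key names follow the standard wheel install-scheme keys; the specific choice of `bin` and `include` subdirectories is an assumption for the example, not something stated in the summary.

```python
from pathlib import Path


def target_scheme(target: Path) -> dict[str, Path]:
    """Map each install-scheme key to a location at or below the target."""
    return {
        "purelib": target,               # importable code lands directly in target
        "platlib": target,               # same directory, so purelib == platlib
        "scripts": target / "bin",       # assumed subdirectory name
        "data": target,                  # assumed: data files rooted at target
        "include": target / "include",   # assumed subdirectory name
    }


if __name__ == "__main__":
    for key, path in target_scheme(Path("vendor")).items():
        print(f"{key:8} -> {path}")
```

Because every key resolves to a concrete directory, the usual install, upgrade and uninstall machinery can treat the target like any other environment layout.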