Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize removing deps in poetry intall --sync #9263

Open
adriangb opened this issue Apr 3, 2024 · 8 comments
Open

Parallelize removing deps in poetry intall --sync #9263

adriangb opened this issue Apr 3, 2024 · 8 comments
Labels
kind/feature Feature requests/implementations status/triage This issue needs to be triaged

Comments

@adriangb
Copy link
Contributor

adriangb commented Apr 3, 2024

Issue Kind

Change in current behaviour

Description

Currently poetry install --sync removes dependencies one by one. It's also somewhat slow for each one (~1s, I'd think it should me ms). Uninstalling should be done in parallel and hopefully can be sped up.

Impact

This enables a workflow where you cache all of the dependencies for a monorepo and then uninstall the ones you don't need in each services' Docker layer.

Workarounds

None that I know of. Just don't use the pattern.

@adriangb adriangb added kind/feature Feature requests/implementations status/triage This issue needs to be triaged labels Apr 3, 2024
@adriangb
Copy link
Contributor Author

adriangb commented Apr 3, 2024

I see now that the source of this is #2658, but is seems like that was never fully investigated and a lot has changed in the internals of poetry since 2020 (e.g. not calling pip via subprocess, I think?). It seems prime candidate to re-evaluate.

@dimbleby
Copy link
Contributor

dimbleby commented Apr 3, 2024

This was certainly not an accident.

# Some operations are unsafe, we must execute them serially in a group
# https://github.com/python-poetry/poetry/issues/3086
# https://github.com/python-poetry/poetry/issues/2658

Uninstall is one of the few places where poetry does still call pip via a subprocess. I expect a pull request to reimplement uninstallation more directly would be desirable - but more work than just "don't act serially"

@adriangb
Copy link
Contributor Author

adriangb commented Apr 3, 2024

Yeah I saw those lines, I was just hoping that "reimplement uninstallation more directly" had already been done so this would be easy. I guess not. Seems like that's what this issue is asking for then.

@adriangb
Copy link
Contributor Author

@dimbleby if I wanted to start working on this, could you give me some pointers as to where to begin? I tried looking at how install does things but it's a lot of code. Some more general guidance around what to replace pip with would be helpful; I know at some point the internals of pip were broken out into libraries and I assume that's what's being used elsewhere but I don't know much more than that.

@Secrus
Copy link
Member

Secrus commented May 27, 2024

@dimbleby if I wanted to start working on this, could you give me some pointers as to where to begin? I tried looking at how install does things but it's a lot of code. Some more general guidance around what to replace pip with would be helpful; I know at some point the internals of pip were broken out into libraries and I assume that's what's being used elsewhere but I don't know much more than that.

I have started some PoC work on this by taking out the uninstaller from pip (there is no library separate for that yet) and putting it here. Didn't have much time to come back to it recently, but it might be a good start. Although I am still thinking that it could be a separate library one day.

@dimbleby
Copy link
Contributor

I guess conceptually the answer is: look at the package dist-info directory in the venv, and remove the items named in the RECORD file. Though perhaps there are edge cases where things are more complicated than that eg packages installed from git maybe? not sure

Practically I expect that Secrus is on the right lines in looking at how other projects do their uninstalling. Others that I might consider taking ideas from would include pdm - here I think - and uv - here, I suppose

I agree wibni this were pulled out into a nice library that we all could use

@adriangb
Copy link
Contributor Author

Thank you folks. @Secrus do you plan on continuing work?

For context I'd like to use this for better caching in a monorepo setup. The idea is that you build a base or builder image with all of the deps and cache that (and you can tag it with the hash of poetry.lock or similar), then for each service you copy over the .venv and do a sync to minimize image size. At some point I'd like to present this at a conference or group.

@Secrus
Copy link
Member

Secrus commented May 27, 2024

@adriangb Yes, but can't give any real timeline right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Feature requests/implementations status/triage This issue needs to be triaged
Projects
None yet
Development

No branches or pull requests

3 participants