Handling of PyTorch CPU versions #402

Open
PeterJCLaw opened this issue May 31, 2023 · 3 comments

PeterJCLaw commented May 31, 2023

PyTorch has versions like `2.0.0` but also `2.0.0+cpu`. I'm not sure what the term for the `+cpu` part is, though such versions don't seem to be affected by `--forbid-post`, so I'm guessing they're not treated as post-releases.
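For reference, PEP 440 calls the `+cpu` part a local version identifier, and the `packaging` library (which pip uses for version handling) confirms it is distinct from a post-release. A quick sketch:

```python
from packaging.version import Version

v = Version("2.0.0+cpu")
print(v.public)          # "2.0.0" -- the public part of the version
print(v.local)           # "cpu"   -- the local version label after the "+"
print(v.is_postrelease)  # False   -- +cpu is not a post-release
```

This matches the observation that `--forbid-post` has no effect on these versions.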

PyTorch uses the `+cpu` tag on Linux to provide CPU-only packages, which are considerably smaller than the equivalent GPU packages. These are available with `--extra-index-url https://download.pytorch.org/whl/cpu`.

Unfortunately those packages don't exist for macOS, so including `torch==2.0.0+cpu` in a requirements file breaks things for developers on Macs.

However, you can include just `--extra-index-url https://download.pytorch.org/whl/cpu` together with `torch == 2.0.0` in the requirements file, and things work fine for both Linux and macOS users. Notably, in this case Linux users get the CPU-optimised package from the custom index, which installs as `2.0.0+cpu`.

Clearly the way this works is a bit funky already. I'm not completely sure I like it; however, I suspect PyTorch is big enough that getting them to change it is unlikely.

pip-compile seems quite happy to leave `torch==2.0.0` in output requirements files where that is pinned in the input; however, pip-compile-multi does not do so if there is a sibling package which also pulls in torch.

Thus, given this `base.in`:

```
--extra-index-url https://download.pytorch.org/whl/cpu

transformers[torch]
torch == 2.0.0
```
With `pip-compile`:

```
#
# This file is autogenerated by pip-compile with Python 3.9
# by the following command:
#
#    pip-compile base.in
#
--extra-index-url https://download.pytorch.org/whl/cpu

accelerate==0.19.0
    # via transformers
certifi==2023.5.7
    # via requests
charset-normalizer==3.1.0
    # via requests
filelock==3.12.0
    # via
    #   huggingface-hub
    #   torch
    #   transformers
fsspec==2023.5.0
    # via huggingface-hub
huggingface-hub==0.14.1
    # via transformers
idna==3.4
    # via requests
jinja2==3.1.2
    # via torch
markupsafe==2.1.2
    # via jinja2
mpmath==1.3.0
    # via sympy
networkx==3.1
    # via torch
numpy==1.24.3
    # via
    #   accelerate
    #   transformers
packaging==23.1
    # via
    #   accelerate
    #   huggingface-hub
    #   transformers
psutil==5.9.5
    # via accelerate
pyyaml==6.0
    # via
    #   accelerate
    #   huggingface-hub
    #   transformers
regex==2023.5.5
    # via transformers
requests==2.31.0
    # via
    #   huggingface-hub
    #   transformers
sympy==1.12
    # via torch
tokenizers==0.13.3
    # via transformers
torch==2.0.0
    # via
    #   -r base.in
    #   accelerate
    #   transformers
tqdm==4.65.0
    # via
    #   huggingface-hub
    #   transformers
transformers[torch]==4.29.2
    # via -r base.in
typing-extensions==4.6.2
    # via
    #   huggingface-hub
    #   torch
urllib3==2.0.2
    # via requests
```
With `pip-compile-multi`:

```
# SHA1:f77d2efafbc3749515b31f61241a7689159cf347
#
# This file is autogenerated by pip-compile-multi
# To update, run:
#
#    pip-compile-multi
#
accelerate==0.19.0
    # via transformers
certifi==2023.5.7
    # via requests
charset-normalizer==3.1.0
    # via requests
filelock==3.12.0
    # via
    #   huggingface-hub
    #   torch
    #   transformers
fsspec==2023.5.0
    # via huggingface-hub
huggingface-hub==0.14.1
    # via transformers
idna==3.4
    # via requests
jinja2==3.1.2
    # via torch
markupsafe==2.1.2
    # via jinja2
mpmath==1.3.0
    # via sympy
networkx==3.1
    # via torch
numpy==1.24.3
    # via
    #   accelerate
    #   transformers
packaging==23.1
    # via
    #   accelerate
    #   huggingface-hub
    #   transformers
psutil==5.9.5
    # via accelerate
pyyaml==6.0
    # via
    #   accelerate
    #   huggingface-hub
    #   transformers
regex==2023.5.5
    # via transformers
requests==2.31.0
    # via
    #   huggingface-hub
    #   transformers
sympy==1.12
    # via torch
tokenizers==0.13.3
    # via transformers
torch==2.0.0+cpu
    # via
    #   -r base.in
    #   accelerate
    #   transformers
tqdm==4.65.0
    # via
    #   huggingface-hub
    #   transformers
transformers[torch]==4.29.2
    # via -r base.in
typing-extensions==4.6.2
    # via
    #   huggingface-hub
    #   torch
urllib3==2.0.2
    # via requests
```

I would ideally like to be able to specify that the non-`+cpu` variant of the package should always be the one listed in the output requirements file.

@PeterJCLaw (Contributor, Author)

Changing the torch line to `torch == 2.0.0, != 2.0.0+cpu` does seem to help here; however, this forces me to manually pin the version in the input. It also seems to make compilation of my actual project take much longer (though not noticeably so for the toy example here).
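For what it's worth, here's a sketch of why that specifier works, using the `packaging` library (which pip relies on): per PEP 440, a plain `==2.0.0` clause ignores local version labels and so matches `2.0.0+cpu` as well, which is why the explicit `!=2.0.0+cpu` exclusion is needed to filter out the CPU build:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

spec = SpecifierSet("==2.0.0,!=2.0.0+cpu")

# A bare ==2.0.0 clause would match 2.0.0+cpu too (local labels are
# ignored when the clause itself has no local label), so the explicit
# exclusion is what rules out the CPU variant.
print(Version("2.0.0") in spec)      # True
print(Version("2.0.0+cpu") in spec)  # False
```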

@peterdemin (Owner)

Thanks for the thorough explanation. This issue seems pretty complex, and I'm not sure I completely understand it.
Do you have a good understanding of how the fix would work?

@PeterJCLaw (Contributor, Author)

Not completely, mostly because I don't understand what kind of thing the `+cpu` adjustment to the version number is.

I'm guessing that pip-compile doesn't have this issue because it uses the requested version specifiers for the packages, rather than the resolved versions of what actually gets installed? If that is indeed what's happening, then following pip-compile's lead and using its naming of the packages would probably work... but without really knowing why that works, I don't feel confident it's a great fix.
Additionally, I can definitely see the argument that using the version of the actually installed package is more correct!

The fuller solution here is probably to support different versions on different platforms (modulo some checking that they're just different flavours of the same version), though that feels like a much bigger change.
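As an illustration of that per-platform direction (purely a hand-written sketch, not something the tooling generates today), PEP 508 environment markers can already express platform-specific pins in a requirements file:

```
torch==2.0.0+cpu ; sys_platform == "linux"
torch==2.0.0 ; sys_platform == "darwin"
```

A tool-level fix would presumably need to emit something like this automatically, after verifying the two pins are the same public version.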
