
Why does running this on Colab do something strange ("Uninstalling torch-1.13.1+cu116")? #4

Open
lumiamilk opened this issue Jan 12, 2023 · 9 comments


@lumiamilk

Today when I ran `!pip install https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl` on Google Colab, it printed some strange output, and afterwards xformers could not be loaded.

I use the free version of Colab and run stable-diffusion-webui on a Tesla T4.
Yesterday the command worked perfectly, but today something went wrong.

Can you give me some advice?

Here is the log:
```
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting xformers==0.0.15.dev0+4c06c79.d20221205
Downloading https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl (47.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.4/47.4 MB 12.5 MB/s eta 0:00:00
Requirement already satisfied: numpy in /usr/local/lib/python3.8/dist-packages (from xformers==0.0.15.dev0+4c06c79.d20221205) (1.21.6)
Collecting pyre-extensions==0.0.23
Downloading pyre_extensions-0.0.23-py3-none-any.whl (11 kB)
Collecting torch==1.13
Downloading torch-1.13.0-cp38-cp38-manylinux1_x86_64.whl (890.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 890.2/890.2 MB 171.6 MB/s eta 0:00:01tcmalloc: large alloc 1112711168 bytes == 0x380c8000 @ 0x7fd5b4111615 0x5d6f4c 0x51edd1 0x51ef5b 0x4f750a 0x4997a2 0x55cd91 0x5d8941 0x4997a2 0x55cd91 0x5d8941 0x4997a2 0x55cd91 0x5d8941 0x4997a2 0x55cd91 0x5d8941 0x4997a2 0x55cd91 0x5d8941 0x4997a2 0x5d8868 0x4997a2 0x55cd91 0x5d8941 0x49abe4 0x55cd91 0x5d8941 0x4997a2 0x55cd91 0x5d8941
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 890.2/890.2 MB 1.9 MB/s eta 0:00:00
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.8/dist-packages (from pyre-extensions==0.0.23->xformers==0.0.15.dev0+4c06c79.d20221205) (4.4.0)
Collecting typing-inspect
Downloading typing_inspect-0.8.0-py3-none-any.whl (8.7 kB)
Collecting nvidia-cublas-cu11==11.10.3.66
Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 317.1/317.1 MB 5.1 MB/s eta 0:00:00
Collecting nvidia-cuda-nvrtc-cu11==11.7.99
Downloading nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.0/21.0 MB 74.1 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu11==8.5.0.96
Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 557.1/557.1 MB 3.3 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu11==11.7.99
Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 849.3/849.3 KB 27.6 MB/s eta 0:00:00
Requirement already satisfied: setuptools in /usr/local/lib/python3.8/dist-packages (from nvidia-cublas-cu11==11.10.3.66->torch==1.13->xformers==0.0.15.dev0+4c06c79.d20221205) (57.4.0)
Requirement already satisfied: wheel in /usr/local/lib/python3.8/dist-packages (from nvidia-cublas-cu11==11.10.3.66->torch==1.13->xformers==0.0.15.dev0+4c06c79.d20221205) (0.38.4)
Collecting mypy-extensions>=0.3.0
Downloading mypy_extensions-0.4.3-py2.py3-none-any.whl (4.5 kB)
Installing collected packages: mypy-extensions, typing-inspect, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cublas-cu11, pyre-extensions, nvidia-cudnn-cu11, torch, xformers
Attempting uninstall: torch
Found existing installation: torch 1.13.1+cu116
Uninstalling torch-1.13.1+cu116:
Successfully uninstalled torch-1.13.1+cu116
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.14.1+cu116 requires torch==1.13.1, but you have torch 1.13.0 which is incompatible.
torchtext 0.14.1 requires torch==1.13.1, but you have torch 1.13.0 which is incompatible.
torchaudio 0.13.1+cu116 requires torch==1.13.1, but you have torch 1.13.0 which is incompatible.
Successfully installed mypy-extensions-0.4.3 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 pyre-extensions-0.0.23 torch-1.13.0 typing-inspect-0.8.0 xformers-0.0.15.dev0+4c06c79.d20221205
```

@vt-idiot

I encountered the same issue. Running `!pip install --no-deps https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl` fixed it. Note the `--no-deps`; it keeps pip from completely screwing up torch.
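For anyone copy-pasting, a sketch of the full cell; the verification line at the end is just an illustrative addition, not something from the comment above:

```
# Install the prebuilt xformers wheel without letting pip touch the existing torch stack
!pip install --no-deps https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl

# Sanity check: torch should still be the Colab-provided 1.13.1+cu116 build
!python -c "import torch, xformers; print(torch.__version__, xformers.__version__)"
```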

@brian6091 (Owner)

@vt-idiot can you confirm that training and inference still work with the `--no-deps` flag? I worked around it a different way, by uninstalling all the packages throwing version errors and reinstalling with `pip install --pre`, which seems to work. Your trick seems easier, though.
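For completeness, that alternative workaround might look something like the cell below; the exact package list and the cu116 index URL are assumptions, not something confirmed in the comment:

```
# Remove the packages that were throwing version errors in the log above (assumed list)
!pip uninstall -y torch torchvision torchtext torchaudio

# Reinstall with pre-release versions allowed; pointing at the cu116 wheel index is an assumption
!pip install --pre torch torchvision torchtext torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
```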

@vt-idiot

Colab was going haywire with just `!pip install https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl`, installing and uninstalling a bunch of other stuff, as you can see from the OP's log.

I'm out of credits for the month, so I felt "on the clock" to make it work as quickly as possible. When I initially popped the terminal open to manually uninstall/reinstall things, I was still running into version and dependency errors, which struck me as odd, since SD webui launches with `pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113` in its command line arguments.

I've never done any inference or training myself, only used it to speed up image generation.

If it helps, here are some stray, uncleared Colab logs I found dated January 7th, from installing this package for the first time, before the behavior changed, and without `--no-deps`. It might've been after launching webui once in that particular runtime, so anything from its requirements.txt (and any of their deps) might've already been installed. It was also an older downloaded copy of webui, so this would've been the correct commit of requirements.txt at that point in time.

```
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting xformers==0.0.15.dev0+4c06c79.d20221205
  Downloading https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl (47.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.4/47.4 MB 19.6 MB/s eta 0:00:00
Requirement already satisfied: pyre-extensions==0.0.23 in /usr/local/lib/python3.8/dist-packages (from xformers==0.0.15.dev0+4c06c79.d20221205) (0.0.23)
Requirement already satisfied: numpy in /usr/local/lib/python3.8/dist-packages (from xformers==0.0.15.dev0+4c06c79.d20221205) (1.22.4)
Requirement already satisfied: torch==1.13 in /usr/local/lib/python3.8/dist-packages (from xformers==0.0.15.dev0+4c06c79.d20221205) (1.13.0+cu116)
Requirement already satisfied: typing-inspect in /usr/local/lib/python3.8/dist-packages (from pyre-extensions==0.0.23->xformers==0.0.15.dev0+4c06c79.d20221205) (0.8.0)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.8/dist-packages (from pyre-extensions==0.0.23->xformers==0.0.15.dev0+4c06c79.d20221205) (4.4.0)
Requirement already satisfied: mypy-extensions>=0.3.0 in /usr/local/lib/python3.8/dist-packages (from typing-inspect->pyre-extensions==0.0.23->xformers==0.0.15.dev0+4c06c79.d20221205) (0.4.3)
Installing collected packages: xformers
  Attempting uninstall: xformers
    Found existing installation: xformers 0.0.13
    Uninstalling xformers-0.0.13:
      Successfully uninstalled xformers-0.0.13
Successfully installed xformers-0.0.15.dev0+4c06c79.d20221205
```

Also, people are training with xformers installed with `--no-deps`.

@vt-idiot

vt-idiot commented Feb 1, 2023

@brian6091 for what it's worth, running `!pip install --no-deps xformers==0.0.16rc425` (directly from PyPI) seems to be certifiably It Just Works™ territory. I haven't tried it with the release version that came out yesterday, or 0.0.17dev, or without `--no-deps`. Did you ever manage to figure out what was starting the Dependency Apocalypse on Colab instances?

@brian6091 (Owner)

@vt-idiot thanks for the confirmation. No idea what caused the problem; I think it was Colab updating torch, but I never verified. I've been running `!pip install xformers==0.0.16rc425`, which works without problems, with an A100 at least.
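For reference, a minimal sanity-check cell along those lines might look like this; the version-print line and the `python -m xformers.info` call are illustrative additions, and their exact output will depend on the runtime:

```
!pip install xformers==0.0.16rc425

# Confirm pip left the Colab-provided torch in place and that xformers imports cleanly
!python -c "import torch, xformers; print(torch.__version__, xformers.__version__)"

# xformers' own diagnostic entry point; lists which ops/kernels are available
!python -m xformers.info
```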

@vt-idiot

vt-idiot commented Feb 1, 2023

...actually, I think I might know.

The single wheel uploaded here is just a mirror from... this GitHub Action from Facebook's xformers repo, right? It's kind of silly that the wheel files for their Action all end up named `xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl` regardless of the torch or CUDA version specified at build time. Time to match CRCs! WinRAR shows them to me without having to unzip anyway.
D529F749 = the xformers-ubuntu-22.04-py3.8-torch1.13.0+cu116.whl artifact.

I guess Colab switched to including torch==1.13.1+cu116 by default, which meant they also upgraded to torchvision 0.14.1+cu116, torchtext 0.14.1, and torchaudio 0.13.1+cu116, all of which required torch==1.13.1.

`--no-deps` was making pip ignore the fact that the xformers wheel wanted torch==1.13.0+cu116, and I guess the changes between 1.13.0 and 1.13.1 weren't breaking, at least for xformers. pip was telling us the issue all along; I was just so distracted by it installing all those other NVIDIA packages that I didn't notice...
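A hedged way to see that mismatch for yourself in a fresh runtime (exact output will depend on the Colab image):

```
# Which torch build does this runtime currently ship?
!pip show torch | grep -i version

# After a --no-deps install of the wheel, pip check should surface the unmet torch==1.13 pin
# that --no-deps skipped over; per this thread, 1.13.0 -> 1.13.1 wasn't a breaking change
# for xformers, so the warning is expected and harmless here
!pip check
```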

> @vt-idiot thanks for the confirmation. No idea what caused the problem; I think it was Colab updating torch, but I never verified. I've been running `!pip install xformers==0.0.16rc425`, which works without problems, with an A100 at least.

Just saw your reply while I was being thorough to satisfy my curiosity. Yep, looks like it. So installing the newer builds works without `--no-deps`?

> `nvidia-cublas-cu11==11.10.3.66`, `nvidia-cuda-nvrtc-cu11==11.7.99`, `nvidia-cudnn-cu11==8.5.0.96`

It doesn't try to install ~1GB of NVIDIA wheels like before? If so, I have a notebook or two to go edit. I'll probably try the 0.0.16 release builds this evening. No point in trying the 0.0.17 dev build yet - the only commit after the release was a bot updating the version string. https://github.com/facebookresearch/xformers/commit/5df1f0b682a5b246577f0cf40dd3b15c1a04ce50

@vt-idiot

vt-idiot commented Feb 1, 2023

Probably safe to close this issue :)

@vt-idiot

vt-idiot commented Feb 1, 2023

Oh, and for context, I exclusively use the T4/"free" GPUs even on Pro, so it was working there as well.

@brian6091 (Owner)

brian6091 commented Feb 2, 2023

@vt-idiot thanks for the detective work!

> So installing the newer builds works without `--no-deps`?

Seems to work fine. I don't recall seeing the nvidia wheels getting installed.

Training and inference work with the 0.0.16 build as well.
