Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cyclic dependency created between torch 2.0 and triton #99622

Closed
ashishu007 opened this issue Apr 20, 2023 · 15 comments
Closed

Cyclic dependency created between torch 2.0 and triton #99622

ashishu007 opened this issue Apr 20, 2023 · 15 comments
Labels
module: inductor oncall: pt2 triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@ashishu007
Copy link

ashishu007 commented Apr 20, 2023

Issue description

Installing torch 2.0 creates cyclic dependency between torch and triton. However, earlier torch versions (1.x) work fine.

Code example

$ python -m pipdeptree

Warning!! Cyclic dependencies found:
* triton => torch => triton
* torch => triton => torch
------------------------------------------------------------------------
pip==23.0.1
pipdeptree==2.3.3

System Info

  • PyTorch or Caffe2: PyTorch
  • How you installed PyTorch (conda, pip, from source): pip
  • OS: Linux
  • PyTorch version: 2.0.0
  • Python version: 3.10.10
  • Versions of any other relevant libraries: triton==2.0.0

cc @ezyang @soumith @msaroufim @wconstab @ngimel @bdhirsh @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

@yanboliang yanboliang added module: inductor triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module labels Apr 20, 2023
@yanboliang
Copy link
Contributor

cc @jansel @ngimel

@ezyang
Copy link
Contributor

ezyang commented Apr 20, 2023

Unfortunately this is unavoidable. As long as the two libraries don't try to use each other on import (which they shouldn't) this is fine and won't break anything.

@milutter
Copy link

Can this cyclic dependency not be removed, or solved differently?

For some environments like pip, the cyclic dependency is not a problem but for other enviroments it is. For example, Bazel cannot handle the cyclic dependency in the pip package.

@ezyang
Copy link
Contributor

ezyang commented Apr 21, 2023

The easiest fix will be to split pytorch into two packages, one which is traditional eager mode pytorch, and the other which has inductor (which is what has the triton dep). I'm not familiar enough with Bazel to say more about how to do this.

@jansel
Copy link
Contributor

jansel commented Apr 21, 2023

The easiest fix will be to split pytorch into two packages, one which is traditional eager mode pytorch, and the other which has inductor (which is what has the triton dep). I'm not familiar enough with Bazel to say more about how to do this.

Wouldn't you still have a cycle of pytorch->inductor->triton->pytorch?

IMO this is a bug in Bazel.

@ezyang
Copy link
Contributor

ezyang commented Apr 21, 2023

the idea is inductor would depend on pytorch but not vice versa

@cphillippi-stripe
Copy link

I saw a change to Triton here that moved the torch dependency only to the "test" rule: triton-lang/triton#1389. This is not in the current release, but would just updating the release solve this?

@ngimel
Copy link
Collaborator

ngimel commented Apr 21, 2023

We cannot update pytorch 2.0.0 to next triton release, but on nightlies triton has been updated

@milutter
Copy link

milutter commented Apr 21, 2023

I just tried the nightly version and the cyclic dependency is gone with the new triton release. So everything should be good with the 2.1.0 version

@groodt
Copy link

groodt commented Apr 22, 2023

Will this fix make it into torch 2.0.1?

@ngimel
Copy link
Collaborator

ngimel commented Apr 22, 2023

No, we cannot update triton version used for 2.0.1 as 2.0.1 is just a bugfix release, and updating triton version is more risk and requires more significant changes.

@michael-quinlan
Copy link

michael-quinlan commented Jun 5, 2023

Can this cyclic dependency not be removed, or solved differently?

For some environments like pip, the cyclic dependency is not a problem but for other enviroments it is. For example, Bazel cannot handle the cyclic dependency in the pip package.

@milutter what was your solution for bazel? Using 2.1 nightly is tricky with some of the package constraints (senetence-transformers etc)

@milutter
Copy link

milutter commented Jun 8, 2023

As far as I know there is no solution other than waiting. I stayed on torch==1.12.1, which is also not great for package constraints.

@ngimel
Is there a release date for 2.1?

@georgevreilly
Copy link

georgevreilly commented Jun 23, 2023

As a gross hack, I have patched the wheel for our local usage at Stripe and it works. We are heavy Bazel users. YMMV.

I spent a few days trying to build the wheel from source. It was a nightmare. I ran out of disk space on the root partition trying to install system packages, so I had to bring up a custom instance. Then I ran out of space on the main partition, trying to compile, so I had to bring up another custom instance. Then I realized I had installed CUDA 12.1 and couldn't install CUDA 11.8 over it, so yet another instance. Then a long list of other problems. I was eventually able to get python setup.py develop to execute, but it took three hours! And I had little confidence that I was building the same thing that was in the official wheels.

Then I had a brainwave: what if I patch the official wheel and remove the requirement on Triton? That worked. Here are my notes.

Manually patching PyTorch wheel to remove dependency on Triton

See Binary distribution format for details on dist-info.

unzip -d torch201.2 torch-2.0.1-cp38-cp38-manylinux1_x86_64.whl
cd torch201.2

# Rename the `dist-info` directory to include '+stripe.2' as a suffix for `2.0.1`
mv torch-2.0.1{,+stripe.2}.dist-info/
cd torch-2.0.1+stripe.2.dist-info/

Update METADATA

Edit torch*.dist-info/METADATA.

  • Update Version to include +stripe.2
  • Remove Requires-Dist line for triton
  • Vim commands shown:
:%s/^\(Version:\)\(.*\)$/\1\2+stripe.2/
:g/Requires-Dist: triton /d

Calculate new values for METADATA's hash and size using this script, record_hash.py:

#!/usr/bin/env python3

import base64
import hashlib
import os
import sys

filename = sys.argv[1]

with open(filename, "rb") as f:
    digest = hashlib.sha256(f.read())
    safe_hash = base64.urlsafe_b64encode(digest.digest()).decode("us-ascii").rstrip("=")
print(f"{os.path.basename(filename)},sha256={safe_hash},{os.path.getsize(filename)}")

Run the script and remember the results:

../record_hash.py torch-2.0.1+stripe.2.dist-info/METADATA
METADATA,sha256=StmZkVzCWlHIxaIGVJocXv7JsDnlrSaNXwtuIlE_PKc,24703

Update RECORD

Edit torch*.dist-info/RECORD:

  • Update lines prefixed with torch*.dist-info/ to include new version, +stripe.2:
  • Vim commands shown
:%s/^\(torch-2.0.1\)\(\.dist-info\)/\1+stripe.2\2/
  • Update torch*.dist-info/METADATA line with the sha256 and filesize from record_hash.py,
    so that the contents of RECORD will pass file integrity checks.

Zip new wheel

zip ../torch-2.0.1+stripe.2-cp38-cp38-manylinux1_x86_64.whl -r .

Upload wheel to local repository

Details omitted

Installation

$ pip install torch==2.0.1+stripe.2
...

$ pip freeze | grep triton && echo "triton installed" || echo "no triton"
no triton

$ pip install pipdeptree triton==2.0.0

$ pipdeptree
pip==22.0.4
pipdeptree==2.7.1
triton==2.0.0
  - cmake [required: Any, installed: 3.26.3]
  - filelock [required: Any, installed: 3.12.2]
  - lit [required: Any, installed: 16.0.5.post0]
  - torch [required: Any, installed: 2.0.1+stripe.2]
    - filelock [required: Any, installed: 3.12.2]
    ...

Summary

Obviously, all of these steps could be automated in a script, but I didn't bother.

@betaboon
Copy link

I saw a change to Triton here that moved the torch dependency only to the "test" rule: openai/triton#1389. This is not in the current release, but would just updating the release solve this?

this PR has been merged a while ago.
it also seems to be contained in triton 2.1.0

PR #95896 has updated triton from 2.0.0 to 2.1.0.
it seems to be contained in torch 2.1.0.

thus i would assume this issue should be closed ?

@jansel jansel closed this as completed Oct 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: inductor oncall: pt2 triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

No branches or pull requests