
python3Packages.torch-bin: 1.13.1 -> 2.0.0 #221652

Closed
wants to merge 5 commits

Conversation

junjihashimoto (Member)

Description of changes

PyTorch 2.0 has been released.

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.05 Release Notes (or backporting 22.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@breakds (Contributor) commented Mar 21, 2023

Out of curiosity, I'd like to ask a few questions. It seems that torch.compile in PyTorch 2.0 requires Triton, which is not yet in Nixpkgs.

  1. Does torch.compile work with this PR?
  2. If so, how does it achieve that? My guess is that the wheel binary will have compiled triton shared libraries in it, but what about the triton python package that wraps it?

Thanks a lot!

@ConnorBaker (Contributor)

@breakds IIRC PyTorch distributes their own builds of Triton from the branch OpenAI maintains for (with?) them: https://github.com/openai/triton/tree/torch-inductor-stable.

I'll try it out later to see if it works.

To be clear, I personally don't think a broken torch.compile should prevent this from getting merged: from what I remember from trying to package Triton a few months ago, it needs MLIR and a few other things we don't have, as well as some manual patches.

@breakds (Contributor) commented Mar 21, 2023

> @breakds IIRC PyTorch distributes their own builds of Triton from the branch OpenAI maintains for them: https://github.com/openai/triton/tree/torch-inductor-stable. […]

Thanks a lot for the explanation, especially the information about the torch-inductor-stable branch. That makes sense to me.

Agreed that a broken torch.compile shouldn't be a blocker, though it is an important feature of torch and would be nice to have. I tried to package Triton and hit the problem that llvm.mlir is not yet in Nixpkgs. I assume #163878 is an ongoing effort to address that.

@ConnorBaker (Contributor)

@junjihashimoto I was unable to build the derivation because sympy wasn't added to its inputs. I tried to package the nightlies a month or two back, and IIRC we can drop a few inputs and need to add a few new ones (including sympy).

$ nix build --impure -L github:NixOS/nixpkgs/refs/pull/221652/head#python3Packages.torch-bin
python3.10-torch> Sourcing python-remove-tests-dir-hook
python3.10-torch> Sourcing python-catch-conflicts-hook.sh
python3.10-torch> Sourcing python-remove-bin-bytecode-hook.sh
python3.10-torch> Sourcing wheel setup hook
python3.10-torch> Using wheelUnpackPhase
python3.10-torch> Sourcing pip-install-hook
python3.10-torch> Using pipInstallPhase
python3.10-torch> Sourcing python-imports-check-hook.sh
python3.10-torch> Using pythonImportsCheckPhase
python3.10-torch> Sourcing python-namespaces-hook
python3.10-torch> Sourcing python-catch-conflicts-hook.sh
python3.10-torch> unpacking sources
python3.10-torch> Executing wheelUnpackPhase
python3.10-torch> Finished executing wheelUnpackPhase
python3.10-torch> patching sources
python3.10-torch> configuring
python3.10-torch> no configure script, doing nothing
python3.10-torch> building
python3.10-torch> no Makefile or custom buildPhase, doing nothing
python3.10-torch> installing
python3.10-torch> Executing pipInstallPhase
python3.10-torch> /build/dist /build
python3.10-torch> Processing ./torch-2.0.0-cp310-cp310-linux_x86_64.whl
python3.10-torch> Requirement already satisfied: typing-extensions in /nix/store/bndw06wps3i7xpqdk6ryq6wiqg11ggy8-python3.10-typing-extensions-4.5.0/lib/python3.10/site-packages (from torch==2.0.0) (4.5.0)
python3.10-torch> ERROR: Could not find a version that satisfies the requirement sympy (from torch) (from versions: none)
python3.10-torch> ERROR: No matching distribution found for sympy
python3.10-torch> 
error: builder for '/nix/store/wrbw0hl6p10cc0x230j8z7f655yal0x4-python3.10-torch-2.0.0.drv' failed with exit code 1

@ConnorBaker (Contributor) commented Mar 21, 2023

Ah shoot. So while Triton may be bundled with their distribution when installing via pip, Nixpkgs' Python builder prevents that, so we need to make sure we install it ourselves.

@junjihashimoto you'll probably have to add a triton-bin package: https://download.pytorch.org/whl/triton/.

After that's added you'll need to update the torch-bin derivation with it.

It seems that with the following overlay, torch and torchvision at least build successfully. I haven't tried any tests, though.

(final: prev: {
  python3Packages = prev.python3Packages.overrideScope (pfinal: pprev: {
    triton-bin = pprev.buildPythonPackage {
      version = "2.0.0";
      pname = "triton";
      format = "wheel";
      dontStrip = true;
      pythonRemoveDeps = [ "cmake" "torch" ];
      nativeBuildInputs = [
        prev.lit
        pprev.pythonRelaxDepsHook
      ];
      propagatedBuildInputs = [
        pprev.filelock
      ];
      src = prev.fetchurl {
        name = "triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl";
        url = "https://download.pytorch.org/whl/triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl";
        hash = "sha256-OIBu6WY/Sw981keQ6WxXk3QInlj0mqxKZggSGqVeJQU=";
      };
    };
    torch-bin = pprev.torch-bin.overrideAttrs (oldAttrs: {
      nativeBuildInputs = oldAttrs.nativeBuildInputs ++ [
        prev.lit
      ];
      propagatedBuildInputs = oldAttrs.propagatedBuildInputs ++ [
        pprev.sympy
        pprev.jinja2
        pprev.networkx
        pprev.filelock
        pfinal.triton-bin
      ];
    });
    torchvision-bin = pprev.torchvision-bin.overrideAttrs (oldAttrs: {
      nativeBuildInputs = oldAttrs.nativeBuildInputs ++ [
        prev.lit
      ];
    });
  });
})
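
As a rough usage sketch of the overlay above (the binding name and file layout here are my assumptions, not part of the PR):

let
  # Hypothetical: the anonymous overlay from the comment above, bound to a name.
  torchBinOverlay = final: prev: {
    # ... overlay body exactly as shown above ...
  };
  # Import the nixpkgs checkout with the overlay applied.
  pkgs = import ./default.nix { overlays = [ torchBinOverlay ]; };
in
  pkgs.python3Packages.torch-bin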

@ConnorBaker (Contributor)

@junjihashimoto those commits helped!

Trying to use torch.compile, I'm getting permission errors because the bundled ptxas isn't executable.

Try adding

chmod +x $out/lib/<whatever python version>/site-packages/triton/third_party/cuda/bin/ptxas

to whatever phase feels appropriate in the triton-bin derivation.
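
For instance, a minimal sketch assuming a postFixup phase and the wheel layout shown above (the exact site-packages path depends on the Python version):

postFixup = ''
  # The wheel ships NVIDIA's ptxas without the executable bit; restore it.
  chmod +x "$out/${python.sitePackages}/triton/third_party/cuda/bin/ptxas"
'';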

After I did that, I was able to use torch.compile! 🎉

@ConnorBaker self-assigned this Mar 22, 2023
@junjihashimoto force-pushed the feat/pytorch2 branch 3 times, most recently from ca58698 to 7d06d9d on March 22, 2023
@junjihashimoto (Member, Author)

@breakds @ConnorBaker
Thank you for your help!
I tested this sample. It works.
https://gist.github.com/junjihashimoto/76f73f118289a5c33f4e311b66a7677a

@katanallama (Member) commented Mar 22, 2023

> @breakds @ConnorBaker Thank you for your help! I tested this sample. It works. https://gist.github.com/junjihashimoto/76f73f118289a5c33f4e311b66a7677a

I gave this a shot along with #222273

Triton appeared to build, but when I ran the sample test above it resulted in this:

When trying to import triton:
ImportError: /nix/store/yiflcg7zmirny3654g8l8f85sz958gqk-gcc-11.3.0-lib/lib64/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /nix/store/m2i94jljh4bdjk06rqvfmzpxm3nwk2nm-python3-3.10.10-env/lib/python3.10/site-packages/triton/_C/libtriton.so)

@junjihashimoto (Member, Author)

@katanallama
In my case, libtriton.so is linked to /nix/store/wvm2hvqdbbsp1f11463mrw8nyv678ipm-gcc-12.2.0-lib/lib/libstdc++.so.6 .

I created the environment as follows.

nix-shell -p python3 'let pkgs=import ./default.nix {}; in pkgs.python3.withPackages (p: [p.torchvision-bin p.torch-bin])'

Why is it linked to a different file?

@junjihashimoto (Member, Author) commented Mar 22, 2023

The libz.so.1 link needed to be fixed. Fixed.

@junjihashimoto (Member, Author) commented Mar 22, 2023

ptxas links to /lib64/ld-linux-x86-64.so.2, so I'll fix it. Fixed.
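
For reference, a minimal sketch of the kind of fixup meant here (the phase name is my assumption; the ptxas path comes from the comment earlier in the thread, and patchelf needs to be available in nativeBuildInputs):

postFixup = ''
  # Point the bundled ptxas at the Nix dynamic linker instead of /lib64/ld-linux-x86-64.so.2.
  patchelf --set-interpreter "$(cat $NIX_CC/nix-support/dynamic-linker)" \
    "$out/${python.sitePackages}/triton/third_party/cuda/bin/ptxas"
'';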

url = "https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp38-cp38-linux_x86_64.whl";
hash = "sha256-u/lUbw0Ni1EmPKR5Y3tCaogzX8oANPQs7GPU0y3uBa8=";
name = "torch-2.0.0-cp38-cp38-linux_x86_64.whl";
url = "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp38-cp38-linux_x86_64.whl";
@breakds (Contributor) commented Mar 23, 2023

I had another question: it seems to me that the binaries here would only work when torch is used with CUDA 11.8, is that right?

I tried building it with CUDA 11.7, and it builds and runs:

>>> import torch
>>> torch.version.cuda
'11.7'

Could this be a potential problem, given that the original torch binary is built against CUDA 11.8?

Thanks!

@junjihashimoto (Member, Author)

@breakds There are two CUDA versions involved: the cudatoolkit version and the driver version. The driver supports multiple cudatoolkit versions, and a new GPU like the H100 needs a new cudatoolkit that supports its new instruction set (its compute capability / PTX).

torch-2.0.0+cu118-xxx.whl means the torch 2.0.0 binary was built with cudatoolkit 11.8. The important point is that a new GPU like the H100 is only supported by cudatoolkit 11.8 or later.
https://docs.nvidia.com/deploy/cuda-compatibility/index.html

We can just use the latest NVIDIA driver supporting CUDA 12, or any driver that supports the binary.

For example, torch-1.13.1+cu117 works with an A100 and a CUDA 12 driver, but it does not work with an H100 and a CUDA 12 driver.

@breakds (Contributor)

Thanks for the explanation, Junji!

license = licenses.bsd3;
# torch's license is BSD3.
# torch-bin includes CUDA and MKL binaries, therefore unfreeRedistributable is set.
license = licenses.unfreeRedistributable;
@junjihashimoto (Member, Author)

Changed the license of torch-bin to unfreeRedistributable.

@SomeoneSerge (Contributor)

I'm very much in support of this change! However, it definitely should go in a separate git commit, maybe even a separate pull request, so the affected people can land on it and leave comments.

@junjihashimoto (Member, Author)

@SomeoneSerge
I've created a separate git commit to update the license to unfreeRedistributable.

Contributor

The new comment overlaps with the old one; we need to merge them. Thoughts:

I actually didn't notice any components that refer to the Intel Opensource License, but we should keep the links for reference

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-46/26872/1

description = "A language and compiler for custom Deep Learning operations";
homepage = "https://github.com/openai/triton/";
changelog = "https://github.com/openai/triton/releases/tag/v${version}";
license = licenses.mit;
Contributor

Technically, it includes a copy of NVIDIA's ptxas.
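
A hypothetical way to reflect that in the meta block, should the reviewers want it encoded (the PR itself only sets licenses.mit):

# Hypothetical: acknowledge the bundled, unfree-redistributable ptxas next to the MIT-licensed code.
license = [ licenses.mit licenses.unfreeRedistributable ];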

pushd $out/${python.sitePackages}/torch/lib
LIBNVRTC=`ls libnvrtc-* |grep -v libnvrtc-builtins`
if [ ! -z "$LIBNVRTC" ] ; then
ln -s "$LIBNVRTC" libnvrtc.so
Contributor

I have a feeling that using the Nix-packaged libnvrtc ("${cudaPackages.cuda_nvrtc}/lib/libnvrtc.so") should be less fragile. At the very least, we have control over it.
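
A rough sketch of what that could look like (the phase name and the assumption that cudaPackages is available as an argument are mine, not the PR's):

postInstall = ''
  # Link against the Nix-packaged NVRTC instead of globbing for whichever
  # libnvrtc-* the wheel happens to ship.
  ln -sf ${cudaPackages.cuda_nvrtc}/lib/libnvrtc.so \
    "$out/${python.sitePackages}/torch/lib/libnvrtc.so"
'';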

Contributor

Disclaimer: there's an open issue about libnvrtc.so locating libnvrtc-builtins.so, #225240

postFixup = let
rpath = lib.makeLibraryPath [ stdenv.cc.cc.lib ];
in ''
find $out/${python.sitePackages}/torchaudio/lib -type f \( -name '*.so' -or -name '*.so.*' \) | while read lib; do
Contributor

We have autoPatchelfHook for exactly this kind of logic. It'll add libraries from buildInputs and things like $out/lib into the runpaths of libraries and executables, depending on what they declare as DT_NEEDED.
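
A sketch of that approach for the torchaudio-bin derivation above (attribute placement is an assumption):

# Hypothetical: declare the hook and the libraries, and drop the manual find/patchelf loop.
nativeBuildInputs = [ autoPatchelfHook ];
buildInputs = [ stdenv.cc.cc.lib ];  # supplies libstdc++ for the bundled .so files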

Contributor

Just an update: we have just merged a source build of Triton into a different location. It happens to be pkgs/development/python-modules/openai-triton/default.nix. I think moving all of this over to openai-triton/ is the easiest way to go from here.

Btw, I think we should keep a -bin version of Triton as well, because there are differences between it and our source build. For one thing, upstream has already moved to LLVM 17, while we're only using LLVM 15 at the moment. So it's good to have both.

@@ -11890,6 +11890,8 @@ self: super: with self; {

trove-classifiers = callPackage ../development/python-modules/trove-classifiers { };

triton-bin = callPackage ../development/python-modules/triton/bin.nix { };
Contributor

Similarly, I suggest we rename this to openai-triton-bin.

, jinja2
, networkx
, filelock
, triton-bin
Contributor

Hm... it seems that other derivations (e.g. torchvision-bin) do it this way as well. Nonetheless, I'd rather declare the formal parameter with the source build's name: I'd take openai-triton, but pass openai-triton-bin in the callPackage. That way the overrides to torch and torch-bin look exactly the same and I don't have to guess which naming scheme to use.
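
A sketch of the wiring being suggested (file paths mirror the existing layout and are assumptions):

# In python-packages.nix: the attribute keeps the -bin suffix...
openai-triton-bin = callPackage ../development/python-modules/openai-triton/bin.nix { };

# ...while torch-bin's formal parameter uses the source build's name, and the
# binary build is passed in at the call site:
torch-bin = callPackage ../development/python-modules/torch/bin.nix {
  openai-triton = self.openai-triton-bin;
};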

Feel free to ignore this comment though

patchelf --add-needed ${zlib.out}/lib/libz.so \
"$out/${python.sitePackages}/triton/_C/libtriton.so"
'';

Contributor

pythonImportsCheck (or does it still fail because of the circular dependency?)
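
For reference, a minimal sketch of what that would look like in the triton-bin derivation, assuming the circular dependency doesn't get in the way:

# Hypothetical: let the Python builder verify the module actually imports at build time.
pythonImportsCheck = [ "triton" ];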
