python3Packages.torch-bin: 1.13.1 -> 2.0.0 #221652

Status: Closed. Wants to merge 5 commits.
27 changes: 24 additions & 3 deletions pkgs/development/python-modules/torch/bin.nix
@@ -5,6 +5,7 @@
, isPy38
, isPy39
, isPy310
+, isPy311
, python
, addOpenGLRunpath
, future
@@ -14,13 +15,18 @@
, requests
, setuptools
, typing-extensions
+, sympy
+, jinja2
+, networkx
+, filelock
+, triton-bin
Review comment (Contributor):
Hm... it seems that other derivations (e.g. torchvision-bin) do it this way as well. Nonetheless, I'd rather declare the formal parameter under the source build's name, i.e. take openai-triton but pass openai-triton-bin in the callPackage. That way the overrides for torch and torch-bin look exactly the same, and I don't have to guess which naming scheme to use.

Feel free to ignore this comment, though.

}:

let
pyVerNoDot = builtins.replaceStrings [ "." ] [ "" ] python.pythonVersion;
srcs = import ./binary-hashes.nix version;
unsupported = throw "Unsupported system";
-version = "1.13.1";
+version = "2.0.0";
in buildPythonPackage {
inherit version;

@@ -29,7 +35,7 @@ in buildPythonPackage {

format = "wheel";

-disabled = !(isPy38 || isPy39 || isPy310);
+disabled = !(isPy38 || isPy39 || isPy310 || isPy311);

src = fetchurl srcs."${stdenv.system}-${pyVerNoDot}" or unsupported;

@@ -45,6 +51,12 @@
requests
setuptools
typing-extensions
+sympy
+jinja2
+networkx
+filelock
+] ++ lib.optionals stdenv.isx86_64 [
+triton-bin
];

postInstall = ''
@@ -55,6 +67,13 @@
postFixup = let
rpath = lib.makeLibraryPath [ stdenv.cc.cc.lib ];
in ''
+pushd $out/${python.sitePackages}/torch/lib
+LIBNVRTC=`ls libnvrtc-* |grep -v libnvrtc-builtins`
+if [ ! -z "$LIBNVRTC" ] ; then
+  ln -s "$LIBNVRTC" libnvrtc.so
Review comment (Contributor):
I have a feeling that using the Nix-packaged libnvrtc ("${cudaPackages.cuda_nvrtc}/lib/libnvrtc.so") would be less fragile. At the very least, we have control over it.

Review comment (Contributor):
Disclaimer: there's an open issue about libnvrtc.so locating libnvrtc-builtins.so, #225240

+fi
+popd

find $out/${python.sitePackages}/torch/lib -type f \( -name '*.so' -or -name '*.so.*' \) | while read lib; do
echo "setting rpath for $lib..."
patchelf --set-rpath "${rpath}:$out/${python.sitePackages}/torch/lib" "$lib"
@@ -74,7 +93,9 @@ in buildPythonPackage {
# Includes CUDA and Intel MKL, but redistributions of the binary are not limited.
# https://docs.nvidia.com/cuda/eula/index.html
# https://www.intel.com/content/www/us/en/developer/articles/license/onemkl-license-faq.html
-license = licenses.bsd3;
+# torch's license is BSD3.
+# torch-bin includes CUDA and MKL binaries, therefore unfreeRedistributable is set.
+license = licenses.unfreeRedistributable;
Review comment (Member, author):
Changed the license of torch-bin to unfreeRedistributable.

Review comment (Contributor):
I'm very much in support of this change! However, this should definitely go in a separate git commit, maybe even in a separate pull request, so the affected people can land on it and leave comments.

Review comment (Member, author):
@SomeoneSerge I've created a separate git commit to update the license to unfreeRedistributable.

Review comment (Contributor):
The new comment overlaps with the old one; we need to merge them. Thoughts: I actually didn't notice any components that refer to the Intel Open Source License, but we should keep the links for reference.

sourceProvenance = with sourceTypes; [ binaryNativeCode ];
platforms = [ "aarch64-darwin" "aarch64-linux" "x86_64-darwin" "x86_64-linux" ];
hydraPlatforms = []; # output size 3.2G on 1.11.0
94 changes: 57 additions & 37 deletions pkgs/development/python-modules/torch/binary-hashes.nix
@@ -6,66 +6,86 @@
# To add a new version, run "prefetch.sh 'new-version'" to paste the generated file as follows.

version : builtins.getAttr version {
-"1.13.1" = {
+"2.0.0" = {
x86_64-linux-38 = {
-  name = "torch-1.13.1-cp38-cp38-linux_x86_64.whl";
-  url = "https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp38-cp38-linux_x86_64.whl";
-  hash = "sha256-u/lUbw0Ni1EmPKR5Y3tCaogzX8oANPQs7GPU0y3uBa8=";
+  name = "torch-2.0.0-cp38-cp38-linux_x86_64.whl";
+  url = "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp38-cp38-linux_x86_64.whl";
Review comment (Contributor, @breakds, Mar 23, 2023):
Had another question. It seems to me that the binaries here would only work when torch is used with CUDA 11.8, is that right?

I tried building it with CUDA 11.7, and it builds and runs:

>>> import torch
>>> torch.version.cuda
'11.7'

Could this be a potential problem, if the original torch binary was built with CUDA 11.8?

Thanks!

Review comment (Member, author):
@breakds There are two CUDA versions to consider: the cudatoolkit version and the driver version. The driver supports multiple cudatoolkit versions, and a new GPU like the H100 needs a new cudatoolkit to support its new instruction set, known as its compute capability (PTX).

torch-2.0.0+cu118-xxx.whl means that the torch-2.0.0 binary was built with cudatoolkit-11.8. The important point is that the new H100 GPU is only supported by cudatoolkit 11.8 or later:
https://docs.nvidia.com/deploy/cuda-compatibility/index.html
[image: CUDA compatibility table]

We can simply use the latest NVIDIA driver, which supports CUDA 12, or any driver that supports the binary. For example, torch-1.13.1+cu117 works with an A100 and a CUDA 12 driver, but it does not work with an H100 and a CUDA 12 driver.
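As an aside, the +cuXYZ tagging convention described above can be parsed mechanically from the wheel filenames in this PR. A minimal sketch (the cuda_tag helper is hypothetical, not part of the PR):

```python
import re
from typing import Optional

def cuda_tag(wheel_name: str) -> Optional[str]:
    # A PyTorch wheel built against a cudatoolkit carries a local-version
    # suffix such as "+cu118" (URL-encoded as "%2Bcu118" in download URLs);
    # CPU-only wheels have no suffix. Returns e.g. "cu118", or None.
    m = re.match(r"torch-\d+(?:\.\d+)*(?:\+|%2B)(cu\d+)-", wheel_name)
    return m.group(1) if m else None

print(cuda_tag("torch-2.0.0+cu118-cp38-cp38-linux_x86_64.whl"))   # cu118
print(cuda_tag("torch-2.0.0-cp310-none-macosx_11_0_arm64.whl"))   # None
```

Note this only recovers the cudatoolkit the wheel was built with; as the reply explains, whether it runs on a given GPU also depends on the driver and the GPU's compute capability.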

Review comment (Contributor):
Thanks for the explanation, Junji!

+  hash = "sha256-H4766/y7fsOWL9jHw74CxmZu/1OhIEMAanSdZHZWFj4=";
};
x86_64-linux-39 = {
-  name = "torch-1.13.1-cp39-cp39-linux_x86_64.whl";
-  url = "https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp39-cp39-linux_x86_64.whl";
-  hash = "sha256-s6wTng1KCzA8wW9R63cUbsfRTAsecCrWOGE2KPUIavc=";
+  name = "torch-2.0.0-cp39-cp39-linux_x86_64.whl";
+  url = "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp39-cp39-linux_x86_64.whl";
+  hash = "sha256-6rl6n+WefjHWVisYb0NecXsd8zMcrcd25sBzIjmp7Tk=";
};
x86_64-linux-310 = {
-  name = "torch-1.13.1-cp310-cp310-linux_x86_64.whl";
-  url = "https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp310-cp310-linux_x86_64.whl";
-  hash = "sha256-FMXJ2wnfjPGzlCo0ecd52m4pOoShYtimrHHiveMOMMU=";
+  name = "torch-2.0.0-cp310-cp310-linux_x86_64.whl";
+  url = "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp310-cp310-linux_x86_64.whl";
+  hash = "sha256-S2kOK3fyEHNQDGXYu56pZWuMtOlp81c3C7yZKjsHR2Q=";
};
+x86_64-linux-311 = {
+  name = "torch-2.0.0-cp311-cp311-linux_x86_64.whl";
+  url = "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp311-cp311-linux_x86_64.whl";
+  hash = "sha256-I4Vz02LFZBE0UQRvZwjDuBWP5rG39sA7cnMyfZVd61Q=";
+};
x86_64-darwin-38 = {
-  name = "torch-1.13.1-cp38-none-macosx_10_9_x86_64.whl";
-  url = "https://download.pytorch.org/whl/cpu/torch-1.13.1-cp38-none-macosx_10_9_x86_64.whl";
-  hash = "sha256-M+Z+6lJuC7uRUSY+ZUF6nvLY+lPL5ijocxAGDJ3PoxI=";
+  name = "torch-2.0.0-cp38-none-macosx_10_9_x86_64.whl";
+  url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp38-none-macosx_10_9_x86_64.whl";
+  hash = "sha256-zHiMu7vG60yQ5SxVDv0GdYbCaTCSzzZ8E1s0iTpkrng=";
};
x86_64-darwin-39 = {
-  name = "torch-1.13.1-cp39-none-macosx_10_9_x86_64.whl";
-  url = "https://download.pytorch.org/whl/cpu/torch-1.13.1-cp39-none-macosx_10_9_x86_64.whl";
-  hash = "sha256-aTB5HvqHV8tpdK9z1Jlra1DFkogqMkuPsFicapui3a8=";
+  name = "torch-2.0.0-cp39-none-macosx_10_9_x86_64.whl";
+  url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp39-none-macosx_10_9_x86_64.whl";
+  hash = "sha256-bguXvrA3oWVmnDElkfJCOC6RCaJA4gBU1aV4LZI2ytA=";
};
x86_64-darwin-310 = {
-  name = "torch-1.13.1-cp310-none-macosx_10_9_x86_64.whl";
-  url = "https://download.pytorch.org/whl/cpu/torch-1.13.1-cp310-none-macosx_10_9_x86_64.whl";
-  hash = "sha256-OTpic8gy4EdYEGP7dDNf9QtMVmIXAZzGrOMYzXnrBWY=";
+  name = "torch-2.0.0-cp310-none-macosx_10_9_x86_64.whl";
+  url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp310-none-macosx_10_9_x86_64.whl";
+  hash = "sha256-zptaSb1RPf95UKWgfW4mWU3VGYnO4FujiLA+jjZv1dU=";
};
+x86_64-darwin-311 = {
+  name = "torch-2.0.0-cp311-none-macosx_10_9_x86_64.whl";
+  url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp311-none-macosx_10_9_x86_64.whl";
+  hash = "sha256-AYWGIPJfJeep7EtUf/OOXifJLTjsTMupz7+zHXBx7Zw=";
+};
aarch64-darwin-38 = {
-  name = "torch-1.13.1-cp38-none-macosx_11_0_arm64.whl";
-  url = "https://download.pytorch.org/whl/cpu/torch-1.13.1-cp38-none-macosx_11_0_arm64.whl";
-  hash = "sha256-7usgTTD9QK9qLYCHm0an77489Dzb64g43U89EmzJCys=";
+  name = "torch-2.0.0-cp38-none-macosx_11_0_arm64.whl";
+  url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp38-none-macosx_11_0_arm64.whl";
+  hash = "sha256-0pJkDw/XK3oxsqbjtjXrUGX8vt1EePnK0aHnqeyGHTU=";
};
aarch64-darwin-39 = {
-  name = "torch-1.13.1-cp39-none-macosx_11_0_arm64.whl";
-  url = "https://download.pytorch.org/whl/cpu/torch-1.13.1-cp39-none-macosx_11_0_arm64.whl";
-  hash = "sha256-4N+QKnx91seVaYUy7llwzomGcmJWNdiF6t6ZduWgSUk=";
+  name = "torch-2.0.0-cp39-none-macosx_11_0_arm64.whl";
+  url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp39-none-macosx_11_0_arm64.whl";
+  hash = "sha256-KXpJGa/xwPmKWOvpaSAPcTUKHU1PmG2/1gwC/854Dpk=";
};
aarch64-darwin-310 = {
-  name = "torch-1.13.1-cp310-none-macosx_11_0_arm64.whl";
-  url = "https://download.pytorch.org/whl/cpu/torch-1.13.1-cp310-none-macosx_11_0_arm64.whl";
-  hash = "sha256-ASKAaxEblJ0h+hpfl2TR/S/MSkfLf4/5FCBP1Px1LtU=";
+  name = "torch-2.0.0-cp310-none-macosx_11_0_arm64.whl";
+  url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp310-none-macosx_11_0_arm64.whl";
+  hash = "sha256-U+HDPGiWWDzbmlg2k+IumSZkRMSkM5Ld3FYmQNOeVCs=";
};
+aarch64-darwin-311 = {
+  name = "torch-2.0.0-cp311-none-macosx_11_0_arm64.whl";
+  url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp311-none-macosx_11_0_arm64.whl";
+  hash = "sha256-mi5TtXg+9YlqavM4s214LyjoPI3fwqxEtnsGbZ129Jg=";
+};
aarch64-linux-38 = {
-  name = "torch-1.13.1-cp38-cp38-manylinux2014_aarch64.whl";
-  url = "https://download.pytorch.org/whl/torch-1.13.1-cp38-cp38-manylinux2014_aarch64.whl";
-  hash = "sha256-34Q0sGlenOuMxwZQr8ExDYupSebbKgUl3dnDsrGB5f4=";
+  name = "torch-2.0.0-cp38-cp38-manylinux2014_aarch64.whl";
+  url = "https://download.pytorch.org/whl/torch-2.0.0-cp38-cp38-manylinux2014_aarch64.whl";
+  hash = "sha256-EbA4T+PBjAG4/FmS5w/FGc3mXkTFHMh74YOMGAPa9C8=";
};
aarch64-linux-39 = {
-  name = "torch-1.13.1-cp39-cp39-manylinux2014_aarch64.whl";
-  url = "https://download.pytorch.org/whl/torch-1.13.1-cp39-cp39-manylinux2014_aarch64.whl";
-  hash = "sha256-LDWBo/2B6x8PIpl83f/qVp/qU7r6NyssBHHbNzsmqvw=";
+  name = "torch-2.0.0-cp39-cp39-manylinux2014_aarch64.whl";
+  url = "https://download.pytorch.org/whl/torch-2.0.0-cp39-cp39-manylinux2014_aarch64.whl";
+  hash = "sha256-qDsmvWrjb79f7j1Wlz2YFuIALoo7fZIFUxFnwoqqOKc=";
};
aarch64-linux-310 = {
-  name = "torch-1.13.1-cp310-cp310-manylinux2014_aarch64.whl";
-  url = "https://download.pytorch.org/whl/torch-1.13.1-cp310-cp310-manylinux2014_aarch64.whl";
-  hash = "sha256-2f54XTdfLial1eul3pH4nmo75dEe+0l+dnBf35P6PC4=";
+  name = "torch-2.0.0-cp310-cp310-manylinux2014_aarch64.whl";
+  url = "https://download.pytorch.org/whl/torch-2.0.0-cp310-cp310-manylinux2014_aarch64.whl";
+  hash = "sha256-nwH+H2Jj8xvQThdXlG/WOtUxrjfyi7Lb9m9cgm7gifQ=";
};
+aarch64-linux-311 = {
+  name = "torch-2.0.0-cp311-cp311-manylinux2014_aarch64.whl";
+  url = "https://download.pytorch.org/whl/torch-2.0.0-cp311-cp311-manylinux2014_aarch64.whl";
+  hash = "sha256-1Dmuw0nJjxKBnoVkuMVACORhPdRChYKvDm4UwkyoWHA=";
+};
};
}
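For readers unfamiliar with the Nix side, the lookup that bin.nix performs against this attribute set, srcs."${stdenv.system}-${pyVerNoDot}" or unsupported, behaves like a plain dictionary lookup with a throwing fallback. A rough Python analogue (select_wheel is an illustrative name, not code from the PR):

```python
# Sketch of the "${stdenv.system}-${pyVerNoDot}" lookup in bin.nix, where
# pyVerNoDot strips the dot from python.pythonVersion ("3.11" -> "311").
def select_wheel(srcs, system, python_version):
    py_ver_no_dot = python_version.replace(".", "")
    key = f"{system}-{py_ver_no_dot}"
    try:
        return srcs[key]
    except KeyError:
        # mirrors `unsupported = throw "Unsupported system"` in bin.nix
        raise RuntimeError("Unsupported system") from None

srcs = {"x86_64-linux-311": {"name": "torch-2.0.0-cp311-cp311-linux_x86_64.whl"}}
print(select_wheel(srcs, "x86_64-linux", "3.11")["name"])
```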
6 changes: 5 additions & 1 deletion pkgs/development/python-modules/torch/prefetch.sh
@@ -5,7 +5,7 @@ set -eou pipefail

version=$1

-linux_cuda_version="cu117"
+linux_cuda_version="cu118"
linux_cuda_bucket="https://download.pytorch.org/whl/${linux_cuda_version}"
linux_cpu_bucket="https://download.pytorch.org/whl"
darwin_bucket="https://download.pytorch.org/whl/cpu"
@@ -14,15 +14,19 @@ url_and_key_list=(
"x86_64-linux-38 $linux_cuda_bucket/torch-${version}%2B${linux_cuda_version}-cp38-cp38-linux_x86_64.whl torch-${version}-cp38-cp38-linux_x86_64.whl"
"x86_64-linux-39 $linux_cuda_bucket/torch-${version}%2B${linux_cuda_version}-cp39-cp39-linux_x86_64.whl torch-${version}-cp39-cp39-linux_x86_64.whl"
"x86_64-linux-310 $linux_cuda_bucket/torch-${version}%2B${linux_cuda_version}-cp310-cp310-linux_x86_64.whl torch-${version}-cp310-cp310-linux_x86_64.whl"
+"x86_64-linux-311 $linux_cuda_bucket/torch-${version}%2B${linux_cuda_version}-cp311-cp311-linux_x86_64.whl torch-${version}-cp311-cp311-linux_x86_64.whl"
"x86_64-darwin-38 $darwin_bucket/torch-${version}-cp38-none-macosx_10_9_x86_64.whl torch-${version}-cp38-none-macosx_10_9_x86_64.whl"
"x86_64-darwin-39 $darwin_bucket/torch-${version}-cp39-none-macosx_10_9_x86_64.whl torch-${version}-cp39-none-macosx_10_9_x86_64.whl"
"x86_64-darwin-310 $darwin_bucket/torch-${version}-cp310-none-macosx_10_9_x86_64.whl torch-${version}-cp310-none-macosx_10_9_x86_64.whl"
+"x86_64-darwin-311 $darwin_bucket/torch-${version}-cp311-none-macosx_10_9_x86_64.whl torch-${version}-cp311-none-macosx_10_9_x86_64.whl"
"aarch64-darwin-38 $darwin_bucket/torch-${version}-cp38-none-macosx_11_0_arm64.whl torch-${version}-cp38-none-macosx_11_0_arm64.whl"
"aarch64-darwin-39 $darwin_bucket/torch-${version}-cp39-none-macosx_11_0_arm64.whl torch-${version}-cp39-none-macosx_11_0_arm64.whl"
"aarch64-darwin-310 $darwin_bucket/torch-${version}-cp310-none-macosx_11_0_arm64.whl torch-${version}-cp310-none-macosx_11_0_arm64.whl"
+"aarch64-darwin-311 $darwin_bucket/torch-${version}-cp311-none-macosx_11_0_arm64.whl torch-${version}-cp311-none-macosx_11_0_arm64.whl"
"aarch64-linux-38 $linux_cpu_bucket/torch-${version}-cp38-cp38-manylinux2014_aarch64.whl torch-${version}-cp38-cp38-manylinux2014_aarch64.whl"
"aarch64-linux-39 $linux_cpu_bucket/torch-${version}-cp39-cp39-manylinux2014_aarch64.whl torch-${version}-cp39-cp39-manylinux2014_aarch64.whl"
"aarch64-linux-310 $linux_cpu_bucket/torch-${version}-cp310-cp310-manylinux2014_aarch64.whl torch-${version}-cp310-cp310-manylinux2014_aarch64.whl"
+"aarch64-linux-311 $linux_cpu_bucket/torch-${version}-cp311-cp311-manylinux2014_aarch64.whl torch-${version}-cp311-cp311-manylinux2014_aarch64.whl"
)

hashfile="binary-hashes-$version.nix"
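The url_and_key_list entries above follow a fixed pattern per platform. A Python sketch generating the x86_64-linux triples for a given version (bucket URL and naming taken from the script; linux_cuda_entries is a made-up helper, not part of prefetch.sh):

```python
LINUX_CUDA_VERSION = "cu118"
LINUX_CUDA_BUCKET = f"https://download.pytorch.org/whl/{LINUX_CUDA_VERSION}"

def linux_cuda_entries(version, py_vers=("38", "39", "310", "311")):
    # One (key, url, name) triple per CPython version, mirroring the
    # "x86_64-linux-*" lines of url_and_key_list. "%2B" is the URL-encoded
    # "+" of the wheel's local-version tag.
    return [
        (
            f"x86_64-linux-{py}",
            f"{LINUX_CUDA_BUCKET}/torch-{version}%2B{LINUX_CUDA_VERSION}"
            f"-cp{py}-cp{py}-linux_x86_64.whl",
            f"torch-{version}-cp{py}-cp{py}-linux_x86_64.whl",
        )
        for py in py_vers
    ]

for key, url, name in linux_cuda_entries("2.0.0"):
    print(key, url)
```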
26 changes: 18 additions & 8 deletions pkgs/development/python-modules/torchaudio/bin.nix
@@ -6,15 +6,18 @@
, isPy38
, isPy39
, isPy310
+, isPy311
, python
, torch-bin
, pythonOlder
+, pythonAtLeast
+, patchelf
, addOpenGLRunpath
}:

buildPythonPackage rec {
pname = "torchaudio";
-version = "0.13.1";
+version = "2.0.1";
format = "wheel";

src =
@@ -23,7 +26,12 @@ buildPythonPackage rec {
srcs = (import ./binary-hashes.nix version)."${stdenv.system}-${pyVerNoDot}" or unsupported;
in fetchurl srcs;

-disabled = !(isPy38 || isPy39 || isPy310);
+disabled = !(isPy38 || isPy39 || isPy310 || isPy311);

+nativeBuildInputs = [
+  patchelf
+  addOpenGLRunpath
+];

propagatedBuildInputs = [
torch-bin
@@ -34,12 +42,14 @@

pythonImportsCheck = [ "torchaudio" ];

-postFixup = ''
-  # Note: after patchelf'ing, libcudart can still not be found. However, this should
-  # not be an issue, because PyTorch is loaded before torchvision and brings
-  # in the necessary symbols.
-  patchelf --set-rpath "${lib.makeLibraryPath [ stdenv.cc.cc.lib ]}:${torch-bin}/${python.sitePackages}/torch/lib:" \
-    "$out/${python.sitePackages}/torchaudio/_torchaudio.so"
+postFixup = let
+  rpath = lib.makeLibraryPath [ stdenv.cc.cc.lib ];
+in ''
+  find $out/${python.sitePackages}/torchaudio/lib -type f \( -name '*.so' -or -name '*.so.*' \) | while read lib; do
Review comment (Contributor):
We have autoPatchelfHook for exactly this kind of logic. It adds libraries from buildInputs and paths like $out/lib to the runpaths of libraries and executables, depending on what they declare as DT_NEEDED.

+    echo "setting rpath for $lib..."
+    patchelf --set-rpath "${rpath}:$out/${python.sitePackages}/torchaudio/lib" "$lib"
+    addOpenGLRunpath "$lib"
+  done
'';

meta = with lib; {