Add TensorRT packages #178397

aidalgol · 2022-06-21T03:44:33Z

Description of changes

Add derivations for TensorRT, a high-performance deep learning inference SDK from NVIDIA, which is at this point non-redistributable. The derivations use different upstream tarballs depending on the current CUDA version, similar to the cudnn derivations.
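
As a rough illustration of picking an upstream tarball by CUDA version, here is a minimal, self-contained Nix sketch. The release table, version numbers, CUDA ranges, and file names are all placeholders invented for this example and are not the values used in this PR; `pickRelease` and `supports` are hypothetical helper names.

```nix
# Sketch only: every version number, CUDA range, and file name here is a
# placeholder for illustration, not a value taken from this PR.
let
  # Hypothetical release table, listed newest first.
  releases = [
    { version = "2.0"; minCuda = "11.0"; maxCuda = "11.7"; tarball = "example-sdk-2.0-cuda-11.tar.gz"; }
    { version = "1.0"; minCuda = "10.2"; maxCuda = "10.2"; tarball = "example-sdk-1.0-cuda-10.2.tar.gz"; }
  ];

  # builtins.compareVersions returns -1, 0, or 1.
  atLeast = a: b: builtins.compareVersions a b >= 0;

  # True when release `r` covers the given CUDA version.
  supports = cudaVersion: r:
    atLeast cudaVersion r.minCuda && atLeast r.maxCuda cudaVersion;

  # Pick the first (newest) matching release, or fail loudly.
  pickRelease = cudaVersion:
    let
      candidates = builtins.filter (supports cudaVersion) releases;
    in
    if candidates == [ ]
    then throw "no example release supports CUDA ${cudaVersion}"
    else builtins.head candidates;
in
{
  forCuda11 = (pickRelease "11.6").tarball;
  forCuda10 = (pickRelease "10.2").tarball;
}
```

Saved to a file, this can be evaluated with nix-instantiate --eval --strict. Failing with throw when no release matches keeps an unsupported CUDA version from silently selecting the wrong tarball.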

Things done

SuperSandro2000

Missing meta

pkgs/development/python-modules/tensorrt/default.nix

pkgs/applications/science/math/tensorrt/default.nix

aidalgol · 2022-06-21T10:10:48Z

@SuperSandro2000 I appreciate the quick feedback, but this is nowhere near ready for review. I created this draft PR at the suggestion of @SomeoneSerge, after much back and forth in the Nix CUDA Maintainers Matrix room, to ease collaboration on this package. I am aware of the code quality issues, but I am trying to get this to a state where the derivations build and the Python module finds the necessary libraries at runtime. Once this actually works, I will bring the Nix code in line with nixpkgs conventions and parameterise the version, similar to the mathematica package.

SomeoneSerge · 2022-06-21T11:48:42Z

Good! Now let's expose the expression as an attribute, e.g. cudaPackages.tensorrt.
To do that, you'd need to write an extension for the cudaPackages namespace.
It's going to be a function of the form final: prev: { tensorrt = final.callPackage ... { }; }, where final is an instance of cudaPackages.
You'll append it to this list:

nixpkgs/pkgs/top-level/cuda-packages.nix

Line 61 in 473669d

composedExtension = composeManyExtensions [

For now, I think you could define it right there, next to cutensorExtension (which you can use as a template, by the way)
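
To make the shape of such an extension concrete, here is a minimal sketch that can be evaluated on its own. It assumes a `<nixpkgs>` channel is available; `base` is a toy scope standing in for the real cudaPackages, and the inline function passed to callPackage is a placeholder for an actual tensorrt expression. Only the `final: prev: { ... }` form and composeManyExtensions come from the discussion above.

```nix
# Sketch only: `base` is a toy stand-in for the real cudaPackages scope, and
# the inline "package" below is a placeholder for a real tensorrt expression.
let
  pkgs = import <nixpkgs> { };
  inherit (pkgs) lib;

  # Toy scope, written as a function of its own fixed point.
  base = final: {
    cudaVersion = "11.6";
    callPackage = lib.callPackageWith final;
  };

  # The extension has the shape `final: prev: { ... }`.
  tensorrtExtension = final: prev: {
    tensorrt = final.callPackage
      ({ cudaVersion }: { name = "tensorrt-for-cuda-${cudaVersion}"; })
      { };
  };

  # Same combinator as in cuda-packages.nix, here with a single extension.
  composedExtension = lib.composeManyExtensions [ tensorrtExtension ];

  # Tie the knot: apply the extension on top of the toy scope.
  scope = lib.fix (lib.extends composedExtension base);
in
scope.tensorrt.name
```

Because the package is called through final.callPackage, its cudaVersion argument is resolved from the extended scope rather than from top-level pkgs.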

Also, as Sandro suggests, it won't hurt to start bringing things into shape right away. Let's add a meta attribute that declares the unfree license, limits the platforms to (I suppose) x86_64-linux, and provides a link to some changelog so that it's easier to track updates, at least manually. Let's also move the version into a let ... in so that we can reuse it when forming the filenames.

I'd also suggest that you run nixpkgs-fmt prior to commits if you're not doing that yet
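
A minimal sketch of the meta and let ... in suggestions, written as a callPackage-style file. The tarball name, the changelog URL, the description text, and the use of requireFile with a fake hash are assumptions made for illustration only; the version string is the one from this PR's title commit.

```nix
# Sketch only: the tarball name, changelog URL, and hash are placeholders,
# and using requireFile here is an assumption, not this PR's actual code.
{ lib, stdenv, requireFile }:

let
  # Defined once so it can be reused when forming file names.
  version = "8.4.0.6";
  tarball = "example-TensorRT-${version}.tar.gz"; # hypothetical file name
in
stdenv.mkDerivation {
  pname = "tensorrt";
  inherit version;

  # Non-redistributable sources have to be fetched manually by the user.
  src = requireFile {
    name = tarball;
    sha256 = lib.fakeSha256; # placeholder hash
    message = "Download ${tarball} manually and add it to the Nix store.";
  };

  meta = {
    description = "Deep learning inference SDK from NVIDIA";
    homepage = "https://developer.nvidia.com/tensorrt";
    changelog = "https://example.invalid/tensorrt/release-notes"; # placeholder URL
    license = lib.licenses.unfree;
    platforms = [ "x86_64-linux" ];
  };
}
```

Such a file would be instantiated with something like final.callPackage ./tensorrt.nix { }, where the path is hypothetical.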

aidalgol · 2022-06-22T03:25:36Z

@SomeoneSerge With my latest changes, you should now be able to reproduce the error I was describing on matrix (from before creating this draft PR) by running NIXPKGS_ALLOW_UNFREE=1 nix-build -A python39Packages.tensorrt. I'm not sure whether the checkPhase I have written is appropriate for normal regression testing once we have this working, but it makes testing changes at this stage easier.

SuperSandro2000 · 2022-06-22T12:48:48Z

If you want to collaborate on very WIP things, I would suggest working in a PR or branch in your fork.

SomeoneSerge · 2022-06-22T17:47:13Z

@SuperSandro2000 thank you. It was explicitly my suggestion to @aidalgol that we move the discussion from Matrix straight into a draft PR in nixpkgs so that we can iterate faster. This is a messy non-redist package, which means there's some trivia about manually gluing pieces together to handle in the beginning.

With that done, we'll mark the draft as "ready for review" and tag you

Cheers

pkgs/development/python-modules/tensorrt/default.nix

pkgs/development/libraries/science/math/tensorrt/generic.nix

pkgs/development/python-modules/tensorrt/default.nix

aidalgol · 2022-06-23T00:24:04Z

@SomeoneSerge With your suggested changes, I am back to where I was before moving this to a nixpkgs branch.

$ NIX_PATH=nixpkgs=$PWD NIXPKGS_ALLOW_UNFREE=1 nix-shell -p 'python310.withPackages (pypkgs: [ pypkgs.tensorrt ])'
$ python3 -c 'import tensorrt; assert tensorrt.Builder(tensorrt.Logger())'
[06/23/2022-12:09:18] [TRT] [E] 6: [libLoader.h::DynamicLibrary::49] Error Code 6: Internal Error (Unable to load library: libnvinfer_builder_resource.so.8.4.0)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: pybind11::init(): factory function returned nullptr

From running the same python3 command under strace, something appears to be looking for libnvinfer_builder_resource.so.8.4.0 in the wrong places:

openat(AT_FDCWD, "/run/opengl-driver/lib/libnvinfer_builder_resource.so.8.4.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/wk3xam5pd0a0jwirggg5zpbjzw8zzaf3-gcc-11.3.0-lib/lib/libnvinfer_builder_resource.so.8.4.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/fz33c1mfi2krpg1lwzizfw28kj705yg0-glibc-2.34-210/lib/libnvinfer_builder_resource.so.8.4.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

I have no idea what is looking for it here, though.

samuela

Thanks for showing these packages some love @aidalgol! It's def a step forward.

I'm afraid I may not have enough context to be very useful here, since I don't use TensorRT personally. I just have a few notes re the cudaPackages migration that we've been working on.

pkgs/development/python-modules/tensorrt/default.nix

pkgs/development/libraries/science/math/tensorrt/generic.nix

pkgs/development/python-modules/tensorrt/default.nix

pkgs/development/libraries/science/math/tensorrt/generic.nix


samuela

I'm not part of the NVIDIA dev program, so I don't have a means to test building this but the changes LGTM.

Any remaining hold ups on merging this?

aidalgol · 2022-08-05T09:56:45Z

> Any remaining hold ups on merging this?

Nothing on my end. Just waiting for my last set of changes to be reviewed.

nixos-discourse · 2022-10-05T16:04:12Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-install-a-specific-version-of-cuda-and-cudnn/21725/4

github-actions bot added the 6.topic: python label Jun 21, 2022

ofborg bot added 10.rebuild-darwin: 0 10.rebuild-linux: 0 labels Jun 21, 2022

SomeoneSerge added the 6.topic: cuda label Jun 21, 2022

SuperSandro2000 requested changes Jun 21, 2022


aidalgol force-pushed the tensorrt branch from d5528a5 to 328b008 Compare June 21, 2022 10:10

aidalgol added 10.rebuild-linux: 0 and removed 10.rebuild-darwin: 0 10.rebuild-linux: 0 labels Jun 21, 2022

aidalgol requested review from a team and removed request for a team June 21, 2022 10:24

aidalgol self-assigned this Jun 21, 2022

aidalgol requested a review from SomeoneSerge June 22, 2022 03:25

ofborg bot added 8.has: package (new) 11.by: package-maintainer 10.rebuild-darwin: 0 10.rebuild-linux: 1-10 and removed 10.rebuild-linux: 0 labels Jun 22, 2022

SomeoneSerge reviewed Jun 22, 2022


samuela reviewed Jun 23, 2022


pkgs/development/python-modules/tensorrt/default.nix

pkgs/development/libraries/science/math/tensorrt/generic.nix

pkgs/development/python-modules/tensorrt/default.nix

aidalgol requested a review from SomeoneSerge June 23, 2022 04:59

SomeoneSerge reviewed Jun 23, 2022


pkgs/development/libraries/science/math/tensorrt/generic.nix

aidalgol force-pushed the tensorrt branch from e07c2ee to 5dfda10 Compare June 23, 2022 22:44

tensorrt: init at 8.4.0.6

d70b4df

Add derivation for TensorRT 8, a high-performance deep learning inference SDK from NVIDIA, which is at this point non-redistributable. The current version also requires CUDA 11, so this is left out of the cudaPackages_10* scopes.

aidalgol force-pushed the tensorrt branch from 5dfda10 to d70b4df Compare June 24, 2022 01:02

aidalgol changed the title from "Draft: Add TensorRT packages" to "Add TensorRT packages" Jun 24, 2022

aidalgol requested a review from a team June 24, 2022 01:06

aidalgol marked this pull request as ready for review June 24, 2022 01:06

aidalgol requested review from FRidh and jonringer as code owners June 24, 2022 01:06

aidalgol requested review from SomeoneSerge, SuperSandro2000 and samuela June 24, 2022 01:07

aidalgol removed the 10.rebuild-darwin: 0 label Jun 24, 2022

ofborg bot added the 10.rebuild-darwin: 0 label Jun 24, 2022

aidalgol added 2 commits July 2, 2022 14:12

tensorrt: support multiple CUDA versions

c8fba82

Refactor derivation to pick the version that supports the current CUDA version. Based on the implementation of the same concept in the cudnn derivation.

cudnn: fix incorrect hash

e28a8e0

Hash mismatch for cudnn_8_3_2 discovered when building cudaPackages_10_2.tensorrt, which depends on this version of cudnn.

samuela approved these changes Aug 5, 2022


SuperSandro2000 merged commit 12a7360 into NixOS:master Aug 5, 2022

aidalgol deleted the tensorrt branch May 12, 2023 18:49
