Add TensorRT packages #178397
Conversation
Missing meta
@SuperSandro2000 I appreciate the quick feedback, but this is nowhere near ready for review. I created this draft PR at the suggestion of @SomeoneSerge, after much back and forth in the Nix CUDA Maintainers Matrix room, to ease collaboration efforts on this package. I am aware of the code quality issues, but I am trying to get this to a state where the derivations will build and the Python module finds the necessary libraries at runtime. Once this actually works, I will bring the Nix code in line with nixpkgs conventions, and parameterise the version similarly to the mathematica package.
Good! Now let's expose the expression as an attribute, e.g. in `pkgs/top-level/cuda-packages.nix` (line 61 as of 473669d).
For now, I think you could define it right there, next to `cutensorExtension` (which you can use as a template, by the way).
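For readers following along, here is a rough sketch of what such an extension could look like, modeled loosely on the existing `cutensorExtension` in `pkgs/top-level/cuda-packages.nix`. The attribute name, the relative path, and the inherited arguments below are all assumptions for illustration, not the PR's actual code:

```nix
# Hypothetical sketch only: expose tensorrt inside the cudaPackages scope,
# next to cutensorExtension. Path and attribute names are placeholders.
tensorRTExtension = final: prev: {
  tensorrt = final.callPackage ../development/libraries/science/math/tensorrt {
    # Reuse the CUDA toolkit and cudnn from the enclosing package set.
    inherit (final) cudatoolkit cudnn;
  };
};
```

Defining it inside the scope means the derivation automatically sees the CUDA version of whichever `cudaPackages_*` set it is instantiated from.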
Also, as Sandro suggests, it won't hurt to start bringing things into shape right away: let's add a `meta`. I'd also suggest that you run […]
@SomeoneSerge With my latest changes, you should now be able to reproduce the error I was describing on Matrix (from before creating this draft PR) by running […]
If you want to collaborate on very WIP things, I would suggest working in a PR or branch in your fork.
@SuperSandro2000 Thank you. It explicitly was my suggestion to @aidalgol that we move the discussion from Matrix straight into a draft PR in nixpkgs so that we can iterate faster. This is a messy non-redist package, which means there is some trivia about manually gluing pieces together to handle in the beginning. With that done, we'll mark the draft as "ready for review" and tag you. Cheers
@SomeoneSerge With your suggested changes, I am back to where I was before moving this to a draft PR:

```
$ NIX_PATH=nixpkgs=$PWD NIXPKGS_ALLOW_UNFREE=1 nix-shell -p 'python310.withPackages (pypkgs: [ pypkgs.tensorrt ])'
$ python3 -c 'import tensorrt; assert tensorrt.Builder(tensorrt.Logger())'
[06/23/2022-12:09:18] [TRT] [E] 6: [libLoader.h::DynamicLibrary::49] Error Code 6: Internal Error (Unable to load library: libnvinfer_builder_resource.so.8.4.0)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: pybind11::init(): factory function returned nullptr
```

From running the same command under `strace`:

```
openat(AT_FDCWD, "/run/opengl-driver/lib/libnvinfer_builder_resource.so.8.4.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/wk3xam5pd0a0jwirggg5zpbjzw8zzaf3-gcc-11.3.0-lib/lib/libnvinfer_builder_resource.so.8.4.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/fz33c1mfi2krpg1lwzizfw28kj705yg0-glibc-2.34-210/lib/libnvinfer_builder_resource.so.8.4.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
```

I have no idea what is looking for it here, though.
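The `libLoader.h::DynamicLibrary` tag in the error suggests the failing load is a runtime `dlopen` issued from inside TensorRT itself, in which case only the calling library's own runpath and the default search paths are consulted. A common nixpkgs-style workaround is to extend the runpath of the library that does the loading so it can find its sibling. A hedged sketch, assuming `libnvinfer.so` is the caller and that the resource library ends up in the same `$out/lib` (neither is confirmed by this PR):

```nix
# Hypothetical fixup sketch: let libnvinfer dlopen its sibling
# libnvinfer_builder_resource from the package's own lib directory.
preFixup = ''
  patchelf --set-rpath "$out/lib:$(patchelf --print-rpath $out/lib/libnvinfer.so)" \
    "$out/lib/libnvinfer.so"
'';
```

Alternatively, `autoPatchelfHook` can often take care of this automatically when the dependency is a declared build input.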
Thanks for showing these packages some love @aidalgol! It's definitely a step forward.
I'm afraid I may not have enough context to be very useful here, since I don't use TensorRT personally. I just have a few notes regarding the cudaPackages migration that we've been working on.
Add derivation for TensorRT 8, a high-performance deep learning inference SDK from NVIDIA, which is at this point non-redistributable. The current version also requires CUDA 11, so this is left out of the cudaPackages_10* scopes.
Refactor derivation to pick the version that supports the current CUDA version. Based on the implementation of the same concept in the cudnn derivation.
Hash mismatch for cudnn_8_3_2 discovered when building cudaPackages_10_2.tensorrt, which depends on this version of cudnn.
I'm not part of the NVIDIA dev program, so I don't have a means to test building this, but the changes LGTM.
Any remaining hold-ups on merging this?
Nothing on my end. Just waiting for my last set of changes to be reviewed.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/how-to-install-a-specific-version-of-cuda-and-cudnn/21725/4
Description of changes
Add derivations for TensorRT, a high-performance deep learning inference SDK from NVIDIA, which is at this point non-redistributable. Uses different upstream tarballs depending on the current CUDA version, similar to the cudnn derivations.
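The tarball-selection idea described above can be sketched roughly as follows, in the style of the cudnn derivation. The version strings, hashes, and attribute names are illustrative placeholders, not the values used in this PR:

```nix
# Hypothetical sketch: pick the upstream TensorRT tarball that matches
# the CUDA version of the enclosing package set.
{ lib, cudaVersion }:

let
  # Placeholder release table; the real versions and hashes live in the PR.
  releases = {
    "10.2" = { version = "..."; sha256 = lib.fakeSha256; };
    "11.4" = { version = "..."; sha256 = lib.fakeSha256; };
  };
in
releases.${cudaVersion}
  or (throw "tensorrt: no release matching CUDA ${cudaVersion}")
```

Throwing on an unknown CUDA version keeps evaluation errors explicit instead of producing a derivation that fails to build.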
Things done
- `sandbox = true` set in `nix.conf`? (See Nix manual)
- Ran `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`? Note: all changes have to be committed, also see nixpkgs-review usage
- Tested basic functionality of all binary files (usually in `./result/bin/`)
- Ran `nixos/doc/manual/md-to-db.sh` to update generated release notes