Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NVIDIA updated their machine learning repos #642

Merged
merged 4 commits into from Jun 13, 2019
Merged

NVIDIA updated their machine learning repos #642

merged 4 commits into from Jun 13, 2019

Conversation

omen23
Copy link
Contributor

@omen23 omen23 commented Jun 5, 2019

fix wrong version numbers - libcudnn is still working with old numbers but libnvinfer5 is not
you can also use 7.6.0.64-1+cuda10.0 for libcudnn7 but it auto upgraded to 7.6.0.64-1+cuda10.1 - so who cares and libnvinfer5 will only install with libnvinfer5=5.1.5-1+cuda10.0 or libnvinfer5=5.1.2-1+cuda10.0 if you pass it the correct version number. Same for the development library. Just signed the CLA
Would be nice if I knew this earlier (using a Ubuntu / kubuntu 19.04 disco base system with repos for LTS 18.04.2)

fix wrong version numbers - libcudnn is still working with old numbers but libnvinfer5 is not
@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added the cla: no CLA has not been signed label Jun 5, 2019
@omen23
Copy link
Contributor Author

omen23 commented Jun 5, 2019

I signed it

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: yes CLA has been signed and removed cla: no CLA has not been signed labels Jun 5, 2019
site/en/install/gpu.md Outdated Show resolved Hide resolved
@lamberta lamberta requested a review from tfboyd June 5, 2019 18:33
</code>

# Install TensorRT. Requires that libcudnn7 is installed above.
<code class="devsite-terminal">sudo apt-get update && \
sudo apt-get install nvinfer-runtime-trt-repo-ubuntu1804-5.0.2-ga-cuda10.0 \
&& sudo apt-get update \
&& sudo apt-get install -y --no-install-recommends libnvinfer-dev=5.0.2-1+cuda10.0
&& sudo apt-get install -y --no-install-recommends libnvinfer-dev=5.1.5-1+cuda10.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newest library version

@omen23
Copy link
Contributor Author

omen23 commented Jun 7, 2019

hey i know you are busy but this is important information for people trying to install TensorRT - it won't work with wrong version numbrs of libnvinfer5

@tfboyd
Copy link
Member

tfboyd commented Jun 7, 2019

I see a cuDNN specific for CUDA 10 (I suspect same binary or similar) in the downloads. I did not check the repo; could you check again. I realized it is not upgrading to CUDA 10.1 but it would avoid any visual confusion if it was just CUDA 10.0. I am adding lambda who can verify what version of TensorRT we should be suggesting. We also will want to upgrade our docker images. I have been wanting to stay on cuDNN 7.5; but if we need to move forward for TensorRT then that is a valid reason.

@tfboyd
Copy link
Member

tfboyd commented Jun 7, 2019

@aaroey

  • Is this the TensorRT we need and do we need to upgrade to cuDNN 7.6? I think we spoke about this but I forgot.

Docs team, you may (not passive aggressive honestly asking) want to change some permissions because I could not add lambda to review this. Not sure if that is desired; it was a bit of a road block/speed bump.

@lamberta
Copy link
Member

lamberta commented Jun 7, 2019

@tfboyd I think github requires write access to the repo to assign reviewers. I'm happy to give that to you, but it's not something we want to set for everyone because not everyone understands the copybara sync workflow. the github permission model is not very flexible :/

Maybe @gunan knows better?

@gunan
Copy link
Contributor

gunan commented Jun 7, 2019

I think we can also add collaborators to the repo and assign as reviewers?

@lamberta
Copy link
Member

lamberta commented Jun 7, 2019

@gunan @tfboyd Yes, we have a number of collaborator teams already attached to this repo. I can assign folks in these teams, but I have write access to this repo.

@lamberta
Copy link
Member

lamberta commented Jun 7, 2019

I should note: even with write access, I can not assign users that are not in these collaborator teams

@gunan
Copy link
Contributor

gunan commented Jun 7, 2019

Just checked repo permissions.
You are right, editing pull requests requires "Write" permissions to the repository.

@omen23
Copy link
Contributor Author

omen23 commented Jun 8, 2019

sudo apt-get install --no-install-recommends \
        cuda-10-0 \
        libcudnn7=7.6.0.64-1+cuda10.0  \
        libcudnn7-dev=7.4.1.5-1+cuda10.0

works too its the same file - I explained it in my original post

you can also use 7.6.0.64-1+cuda10.0 for libcudnn7 but it (apt) auto upgraded to 7.6.0.64-1+cuda10.1

I fixed the libcudnn version of the library to use the 10.0 one
Atleast if the version numbers for TensorRT are not updated the manual is broken.

Regards

trying to avoid confusion
@aaroey
Copy link
Member

aaroey commented Jun 10, 2019

@tfboyd yes 7.5.1 is the TensorRT version we'd like to use but we need to remove the installation of nvinfer-runtime-trt-repo-ubuntu1804-5.0.2-ga-cuda10.0. Just commented.

Copy link
Member

@tfboyd tfboyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if we need to install the nvinfer-runti-trt-repo.... Then we are done and this is a nice improvement.

site/en/install/gpu.md Show resolved Hide resolved
libcudnn7=7.4.1.5-1+cuda10.0 \
libcudnn7-dev=7.4.1.5-1+cuda10.0
libcudnn7=7.6.0.64-1+cuda10.0 \
libcudnn7-dev=7.6.0.64-1+cuda10.0
</code>

# Install TensorRT. Requires that libcudnn7 is installed above.
<code class="devsite-terminal">sudo apt-get update && \
sudo apt-get install nvinfer-runtime-trt-repo-ubuntu1804-5.0.2-ga-cuda10.0 \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, last item. We need to test if the install of the libnvinfra can be done without needing to install the nvinfer-runti-trt-repo. I suspect Yes given the code below is install a version different than the repo; but I would like that validated. Easy to do if you have a local fresh ubuntu docker or VM. I may get to it but I suspect anyone would be faster than me.

@omen23
Copy link
Contributor Author

omen23 commented Jun 13, 2019

Hey,
libnvinfer5 and libnvinfer-dev are actually in the repo http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64
which gets added earlier in the instalaltion process!

  • so we actually don't need the nvinfer-runtime-trt-repo-ubuntu1804-5.0.2-ga-cuda10.0 repository.

Regards

PS: I think everything is fixed now – optically and logically .. so lets put it on the site so new users can install TensorRT… and the new versions of libcudnn7

@omen23
Copy link
Contributor Author

omen23 commented Jun 13, 2019

$ apt-cache policy libnvinfer-dev
libnvinfer-dev:
  Installed: 5.1.5-1+cuda10.0
  Candidate: 5.1.5-1+cuda10.1
  Version table:
     5.1.5-1+cuda10.1 500
        500 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Packages
 *** 5.1.5-1+cuda10.0 500
        500 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Packages
        100 /var/lib/dpkg/status
     5.1.2-1+cuda10.1 500
        500 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Packages
     5.1.2-1+cuda10.0 500
        500 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Packages

$ sudo apt-get install libnvinfer-dev=5.1.5-1+cuda10.0 --reinstall
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following package was automatically installed and is no longer required:
  libnvidia-common-418
Use 'sudo apt autoremove' to remove it.
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 8 not upgraded.
Need to get 44,3 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  libnvinfer-dev 5.1.5-1+cuda10.0 [44,3 MB]
Fetched 44,3 MB in 4s (12,3 MB/s)         
(Reading database ... 322083 files and directories currently installed.)
Preparing to unpack .../libnvinfer-dev_5.1.5-1+cuda10.0_amd64.deb ...
Unpacking libnvinfer-dev (5.1.5-1+cuda10.0) over (5.1.5-1+cuda10.0) ...
Setting up libnvinfer-dev (5.1.5-1+cuda10.0) ...

removed unnecessary repository
@omen23
Copy link
Contributor Author

omen23 commented Jun 13, 2019

ironed out everything – ready to be pushed on the manual website

Copy link
Member

@aaroey aaroey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve for TensorRT related changes, thanks!
Please wait for other approvals before merging.

@tfboyd
Copy link
Member

tfboyd commented Jun 13, 2019

@omen23 Thank you for doing all the hardwork. Testing this stuff takes time, I would know for sure, and I really appreciate it. LGTM from me.

Copy link
Member

@tfboyd tfboyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@lamberta lamberta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, everyone. Appreciate it!

@lamberta lamberta added the ready to pull Start merge process label Jun 13, 2019
@TensorFlow-Docs-Copybara TensorFlow-Docs-Copybara merged commit c561380 into tensorflow:master Jun 13, 2019
TensorFlow-Docs-Copybara pushed a commit that referenced this pull request Jun 13, 2019
PiperOrigin-RevId: 253061213
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes CLA has been signed ready to pull Start merge process
Projects
None yet
7 participants