tensorflow for Nvidia TX1 #851
Are you using JetPack 2?
No, JetPack does not support running directly on the L4T platform.
I meant: have you flashed the board with JetPack 2 to get CUDA 7 support?
Ah, yes, I have CUDA 7 support and used JetPack 2. To be more precise, the target is not actually a Jetson TX1 but a repurposed Nvidia Shield TV flashed to L4T 23.1 for Jetson.
@Yangqing FYI
I think there is a TX1 that I could use to take a look. I'll see what I can do.
In theory, can TensorFlow run usefully on the TK1? Or is the 2 GB of memory too small for, say, face verification?
@robagar It all depends on how large your network is and whether you intend to train the model on the TK1 or just run inference. Two GB of memory is plenty to run inference on almost any model.
I have worked around an issue that prevented nvcc from compiling the Eigen codebase on Tegra X1 (https://bitbucket.org/eigen/eigen/commits/d0950ac79c0404047379eb5a927a176dbb9d12a5).
That's good news ;) What's the problem with bazel? maxcuda's instructions for building bazel worked quite well for me.
For building bazel I had to use a special Java build which can cope with the 32-bit rootfs on a 64-bit machine.
There seems to be one Eigen issue I can't get around:
Can you have a look at TensorEvaluator.h please?
I still haven't been able to install bazel. That said, the assertion you're facing seems to be triggered by the variadic template at line 195 of ./tensorflow/core/lib/strings/strcat.h. I would just comment out this code and see how it goes.
When you say maxcuda has "been unable to repeatedly build it" since then, does that mean that TensorFlow is no longer working on the TK1 again? Because I just ordered the TK1 with the express purpose of being able to run TensorFlow :-/
Yes, I have been unable to recompile the latest versions. The wheel I built around Thanksgiving should still work, but it is quite an old version.
Commenting out the variadic template at line 195 helps a little, but at line 234 there is another template that seems to be required. Any hints on how to rewrite that in an nvcc-friendly manner?
@benoitsteiner
@damienmg FYI
Hi folks, I'm also working on building everything from scratch on the TX1. There are lots of discussions here and on the Nvidia developer forums, but so far I haven't seen any well-summarized instructions besides the TK1 ones. Can we start another repo or script file so people can work on this more efficiently?
IMHO we first have to solve the fundamental issue of variadic templates not working with nvcc. Either the developers would have to do without those templates, which is a step backwards and probably not going to happen, or Nvidia has to step up and make nvcc more compatible. In theory nvcc should already be able to deal with your own variadic templates, but external (e.g. STL) headers won't "just work" because of the need to annotate all functions called on the device with "host device". Maybe someone knows a good way to get around this issue...
@jmtatsch At the moment, the version of CUDA that ships with the Tegra X1 has problems with variadic templates. Nvidia is aware of this and working on a fix. I updated Eigen a few weeks ago to disable the use of variadic templates when compiling on Tegra X1, and that seems to fix the bulk of the problem. However, StrCat and StrAppend still rely on variadic templates. Until Nvidia releases a fix, the best solution is to comment out the variadic versions of StrCat and StrAppend, and create non-variadic versions of StrCat and StrAppend with up to 11 arguments (since that's what TensorFlow currently needs).
I have a build of TF 0.8, but it requires a new 7.0 compiler that is not yet available to the general public.
Good work @maxcuda! Will it build on the TX1 too?
Yes, it will build on the TX1 too. I fixed a problem with the new memory allocator to take into account the 32-bit OS. Some basic tests are passing, but the label_image test is giving wrong results, so there may be some other places with 32-bit issues.
@benoitsteiner, with the new compiler your change to Eigen is not required anymore (and it forces editing a bunch of files). Could you please remove the check and re-enable variadic templates?
@maxcuda Where can I download the new CUDA compiler? I'd like to make sure that I don't introduce new problems when I re-enable variadic templates.
@maxcuda Is the new 7.0 compiler you were referencing part of JetPack 2.2 that was just released?
@elirex Did you manage to compile?
@piotrchmiel Yes, I successfully completed the compilation. I added 8 GB of swap space and ran: bazel build -c opt --local_resources 3072,4.0,1.0 --verbose_failures --config=cuda //tensorflow/tools/pip_package:build_pip_package. While compiling, I used the free -h and top commands to watch memory usage; TensorFlow needs about 8 GB of memory to compile.
Thank you 👍 I will try to repeat your steps :-)
Question: For those that compiled TensorFlow 0.9 on the Jetson TX1, which options did you use during the TensorFlow
Error 1: I received Two bazel issue responders (1, 2) suggested people use the I ran bazel again, this time with the The remainder of the error said,
Error 2: I didn't use
Plan:
It worked. Following this StackOverflow guide, but with an 8 GB swap file, and using the following command successfully built TensorFlow 0.9 on the Jetson TX1 from a fresh install of JetPack 2.3:
I used the default settings for TensorFlow's My build took at least 6 hours. It'll be faster if you use an SSD instead of a USB drive. Thanks to Dwight Crow, @elirex, @tylerfox, everyone that helped them, and everyone in this thread for spending time on this problem.
Creating a swap file
Adapted from JetsonHacks' gist. I used this USB drive to store my swap file. The most memory I saw my system use was 7.7 GB (3.8 GB on Mem and 3.9 GB on Swap). The most swap memory I saw used at once was 4.4 GB. I used
Creating the pip package and installing
Adapted from the TensorFlow docs:
I use
Could anyone build TF r0.11 on the TX1 yet?
Thanks for all the information here; I got tensorflow r0.11.0 installed with JetPack 2.3.1 on the TX1. Follow @elirex's steps, making sure to use the exact versions of protobuf, grpc and bazel. I built tensorflow r0.11.0 instead of v0.11.0.rc2. When compiling, follow @MatthewKleinsmith's step to add a swap file; you need a big one: I tried 6 GB but failed in the middle with an out-of-memory error, and trying again with a 10 GB swap file worked. The compile took me about 5 hours with the swap file allocated on a USB drive.
Is TensorFlow working correctly on the TX1, i.e. able to run inference and get good results? When I installed TensorFlow on a TK1 it ran just fine, however the convolutional layers were producing bad results. I could train fully connected models on MNIST just fine, but when I tried to use conv layers it stopped converging. Does this problem persist in the TX1 build?
Continually get this when running If I pull 0.2.3 I don't get the error, only with 0.3.x
@zxwind How is TF 0.11 performance working for you on the TX1?
FYI, I've got a branch off r1.0 with some hacks to build the r1.0 release on the TX1 with JetPack 2.3.1. In addition to the previously mentioned issues, there is a change in Eigen after the revision used on the TF r0.11 branch that causes the CUDA compiler to crash with an internal error. I changed workspace.bzl on the r1.0 branch to point to the older Eigen revision. In order for that to build I had to remove the EXPM1 op that was added after r0.11. It's all rather ugly, but it got me up and running. Interesting to note: with the r1.0.0a build I'm able to run inference on a Resnet50-based network at 128x96 resolution that was running out of memory on r0.11. For anyone curious about benchmark numbers, I was getting approx 15 fps with single-frame batches. Link to a tag on my clone of TF with binary wheels for anyone interested. The wheels will likely only work on JetPack 2.3.1 (L4T 24.2.1). No guarantees there aren't some serious issues, but I've verified results on the networks I'm using right now.
Closing since @rwightman's / @MatthewKleinsmith's solution seems to work, though it is not quite a seamless out-of-the-box experience. Feel free to reopen.
@rwightman May I humbly ask you to provide another wheel for the r1.0 stable version?
@rwightman How were you able to build tensorflow without gRPC? Thanks! Edit: never mind, I saw your repo: https://github.com/jetsonhacks/installTensorFlowTX1/ Thanks for setting that up.
@sunsided Here's the Python 3.5.2 version for TF 1.0.1 that @dkopljar and I managed to build: https://drive.google.com/open?id=0B2jw9AHXtUJ_OFJDV19TWTEyaWc
Hello all, I was able to install TensorFlow v1.0.1 on the new Jetson TX2. I had to follow a similar process as mentioned above in this thread (protobuf, grpc, swapfile etc.). For bazel, I downloaded bazel-0.4.5-dist.zip and applied @dtrebbien's change. Here is the pip wheel of my installation if it helps anyone. It's for Python 2.7: https://drive.google.com/file/d/0Bxl-G9VJ61mBYmZPY0hLSlFaUDg/view?usp=sharing
Hello all, I was able to install TensorFlow v1.0.1 on Tegra X1 using the build by @Barty777
@Barty777 you wouldn't happen to have 3.6 wheels, would you? 🙏
@gvoysey Unfortunately no. :(
Here is the wheel file for TensorFlow 1.2, Nvidia TX1 and Python 2.7: https://drive.google.com/file/d/0B-Ljdh8jFZRbTnVNdGtGMHA2Ymc/view?usp=sharing
I've been able to build a TensorFlow wheel for Python 3.6 for the TX1, but I cannot build TensorFlow with GPU support successfully. See https://stackoverflow.com/questions/45825708/error-building-tensorflow-gpu-1-1-0-on-nvidia-jetson-tx1-aarch64 for details.
Sorry for the late comment; can anyone please help me with setting up TensorFlow on the Nvidia TK1?
This change, suggested by @tylerfox at tensorflow/tensorflow#851 (comment) allows Bazel 0.4.5 to be built on a Jetson TX1 with JetPack 3.0. The other of @tylerfox's suggested changes was made in 7c4afb6. Refs #1264 Closes #2703. PiperOrigin-RevId: 152498304
Hello,
@maxcuda recently got tensorflow running on the TK1, as documented in the blog post http://cudamusing.blogspot.de/2015/11/building-tensorflow-for-jetson-tk1.html, but has since been unable to build it repeatably. I am now trying to get tensorflow running on a TX1 Tegra platform and need some support.
Much of the trouble seems to come from Eigen's variadic templates and use of C++11 initializer lists, both of which should work according to http://devblogs.nvidia.com/parallelforall/cplusplus-11-in-cuda-variadic-templates/.
In theory -std=c++11 should be set according to the crosstool. Nevertheless, nvcc happily crashes on all of them. This smells as if the "-std=c++11" flag is not properly set.
How can I verify/enforce this?
Also, in tensorflow.bzl, variadic templates in Eigen are said to be disabled ("We have to disable variadic templates in Eigen for NVCC even though std=c++11 are enabled"). Is that still necessary?
Here is my build workflow: