Description
I'm completely new to AI and the Jetson ecosystem, but I love this outstanding resource you've created for people getting started. I recently got a 4GB Jetson devkit bundle from SparkFun that included a power supply etc., as it seemed to be the only place that had any Jetsons in stock.
Following the tutorial, I got to the section Using the ImageNet Program on Jetson, but classifying an image never worked. The Python code always failed with Exception: jetson.inference -- imageNet.Classify() encountered an error classifying the image. The C++ code simply did not label the output image.
I tried many things to figure out what was wrong. I'll spare you the details, but I overwrote the SD card with the factory image many times, thinking the problem might be update related. Sometimes classification did work if I ran imagenet right after initial setup, as long as I didn't reboot. Now that I know the issue is DVFS, this makes sense: the clock and voltage were probably still raised from the activity of the initial setup.
Eventually, after eliminating many other things, I found that with DVFS disabled the classification worked every time. As soon as I restored the default settings with DVFS turned on, or rebooted (which does the same thing), the classification would nearly always fail, unless other system activity had already pushed the clocks/voltage up enough to get lucky.
Is it expected behavior that the Nano fails to classify images when DVFS is on? If it is, then I would strongly suggest adding a warning or note telling people to run jetson_clocks before opening the docker container for the examples. It would save other people from this situation if the Jetson really is supposed to fail to classify images when DVFS is enabled.
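For anyone who wants to script the workaround, here is a minimal sketch, not something from the tutorial. It assumes jetson_clocks is available on the host (outside the docker container) and that its --store and --restore options accept a file path. The helper name max_clocks and the backup path are hypothetical.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def max_clocks(backup_path="/tmp/jetson_clocks_backup.conf",
               run=subprocess.check_call):
    """Pin clocks to maximum for the duration of a block, then restore.

    Assumption: run this on the host, not inside the container.
    `run` is injectable so the command sequence can be checked without
    a Jetson attached.
    """
    # Save the current (DVFS-managed) settings so they can be restored.
    run(["sudo", "jetson_clocks", "--store", backup_path])
    # Lock clocks at their maximum, which effectively disables DVFS scaling.
    run(["sudo", "jetson_clocks"])
    try:
        yield
    finally:
        # Put the saved settings back, even if the body raised.
        run(["sudo", "jetson_clocks", "--restore", backup_path])
```

Usage would look like `with max_clocks(): ...` wrapped around whatever launches the container or the classification run.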
I'm assuming this probably isn't expected behavior or I would find more people asking about this in the closed issues. I did find some issues with similar classification problems, but never saw any mention of jetson_clocks or DVFS.
Is this some kind of hardware defect that only becomes apparent when DVFS scales the voltage down too aggressively, at least for my board, to the point where it's making incorrect calculations?
Here are two examples showing a run with and without DVFS on. This is just with the default options downloaded/installed for the docker container. To save on unnecessary lines of text in the logs, the initial TensorRT network optimization was pre-run with DVFS disabled, so I could be sure it wasn't introducing any errors at that point. I have tested many times, including deleting the cached network and even starting with a fresh SD card image. Once I realized DVFS was the issue, it has been consistently reproducible.
I have attached two files that show the full output from running the docker container and attempting the first imagenet example with DVFS on and off, including the output of tegrastats to confirm the clock frequency before running the docker container.
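The clock check can also be automated. The sketch below pulls the per-core CPU clocks out of one tegrastats line. The sample line is illustrative only and is not copied from the attached logs. The field layout varies between L4T releases, so the regular expression and the 1000 MHz threshold are assumptions, and the function names are hypothetical. It reads only the CPU field, which is at best a rough proxy for whether the board is scaled down.

```python
import re

# Illustrative tegrastats line (an assumption, not taken from the attached logs).
SAMPLE = ("RAM 1100/3956MB (lfb 200x4MB) SWAP 0/1978MB (cached 0MB) "
          "CPU [3%@102,2%@102,0%@102,1%@102] EMC_FREQ 0% GR3D_FREQ 0% "
          "PLL@24C CPU@26C GPU@25C")


def cpu_clocks_mhz(line):
    """Return the per-core CPU clocks (MHz) found in one tegrastats line."""
    match = re.search(r"CPU \[([^\]]*)\]", line)
    if match is None:
        return []
    clocks = []
    for core in match.group(1).split(","):
        # Each entry is expected to look like "3%@102"; anything else
        # (for example an offline core) is skipped.
        m = re.fullmatch(r"\s*\d+%@(\d+)\s*", core)
        if m:
            clocks.append(int(m.group(1)))
    return clocks


def looks_scaled_down(line, threshold_mhz=1000):
    """True if every reported core is below the (assumed) threshold."""
    clocks = cpu_clocks_mhz(line)
    return bool(clocks) and max(clocks) < threshold_mhz


print(cpu_clocks_mhz(SAMPLE))
print(looks_scaled_down(SAMPLE))
```

Feeding it a line captured just before launching the container gives a quick yes/no on whether the clocks were still low at that point.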
dvfs-off.txt
dvfs-on.txt
The TL;DR is that when DVFS is on, it fails with:
Traceback (most recent call last):
  File "./imagenet.py", line 68, in <module>
    class_id, confidence = net.Classify(img)
Exception: jetson.inference -- imageNet.Classify() encountered an error classifying the image